The existing condition didn't make sense if typedbytes was not installed
as an egg, since it would make Hadoop think the typedbytes module was on
HDFS. The new method is the same as what's in backends/common.
Fix distributing typedbytes module when it's not installed as an egg
Add a docstring about envdef()
I'm afraid I'm not following completely here. When typedbytes is not installed as an egg, the old code falls back to opts.add('file', modpath), which will make sure the .py file is sent along and thus available on HDFS, right? Not sure what's left to fix then...
Unfortunately I forgot exactly what was happening, but I definitely tested dumbo after installing typedbytes through pip, and it didn't work with the original code but it worked with this change.
I think the problem might have been that opts['file'] would be re-interpreted by the code starting at line 176. The module path didn't start with file://, so it wasn't actually getting passed to streaming as a -file. But if you add it as a libegg, it does get sent along with the job.
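To make the suspected failure mode concrete, here is a minimal sketch of the behaviour I'm describing. It is not dumbo's actual code: `streaming_file_args` and the example paths are hypothetical, and the assumption is only that the re-interpretation step ships paths prefixed with file:// and treats everything else as already being on HDFS.

```python
def streaming_file_args(file_opts):
    """Hypothetical stand-in for the step that re-interprets opts['file'].

    Assumption: only paths with an explicit file:// prefix are treated as
    local files and turned into -file arguments for Hadoop streaming.
    """
    args = []
    for path in file_opts:
        if path.startswith("file://"):
            # Explicitly local: strip the scheme and ship it with the job.
            args += ["-file", path[len("file://"):]]
        # Anything else is assumed to already be on HDFS, so it is not shipped.
    return args


# A bare module path (hypothetical location) yields no -file argument at all.
bare = streaming_file_args(["/usr/lib/python2.7/site-packages/typedbytes.py"])

# The same path with a file:// prefix is passed through to streaming.
prefixed = streaming_file_args(
    ["file:///usr/lib/python2.7/site-packages/typedbytes.py"]
)
```

If that is what happens, a bare modpath added through opts.add('file', modpath) would be dropped without any error, which would match what I saw after installing typedbytes through pip.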
typedbytes is installed on every node in my cluster; I don't need or want it distributed to HDFS. Does this patch remove that "feature"?
Think this is a better solution: