One frustrating problem I keep running into is that any "print
statements" in code called by my mapper/reducer break the pipe used by
my streaming job, because Hadoop streaming reads the task's stdout as
its key/value output.
It seems like a simple change to dumbo can fix this.
In core.py change
This way all we have to do is redirect stdout to stderr and extraneous
print statements will no longer cause problems.
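
To illustrate the idea of redirecting stdout to stderr, here is a minimal sketch. It is not dumbo's actual core.py code; run_with_protected_stdout, noisy_mapper, and demo are hypothetical names made up for this example. The runner keeps its own handle on the real output stream, points sys.stdout at stderr while user code runs, and restores it afterwards.

```python
import io
import sys


def run_with_protected_stdout(records, mapper, out=None, err=None):
    """Write mapper results to `out`; send stray prints to `err`.

    Hypothetical helper for illustration only, not part of dumbo.
    """
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    saved = sys.stdout
    sys.stdout = err  # any print() in user code now lands on err
    try:
        for record in records:
            for key, value in mapper(record):
                # Only the runner writes to the real output stream.
                out.write("%s\t%s\n" % (key, value))
    finally:
        sys.stdout = saved  # always restore the original stdout
        out.flush()


def noisy_mapper(line):
    """Example mapper that emits a debug print alongside its output."""
    print("debug: %s" % line)
    yield line, 1


def demo():
    """Run the noisy mapper against in-memory streams and return both."""
    out, err = io.StringIO(), io.StringIO()
    run_with_protected_stdout(["a", "b"], noisy_mapper, out=out, err=err)
    return out.getvalue(), err.getvalue()
```

With this arrangement the debug lines end up in the task's stderr log, and the stream the streaming job reads contains only tab-separated key/value pairs.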
I've tried this out and it seems to work for me.
I apologize for the cross-post, but this is how I fixed the same problem in Hadoopy.