run scripts written in other languages #24

davidmarin opened this Issue · 6 comments

2 participants


It wouldn't actually be that difficult for MRJob to run scripts written in other languages if they implemented the MRJob protocol (--steps, --mapper, --reducer, and --step-num). Instead of prepending python to our command inside Hadoop streaming, we'd prepend ruby or java or (for shell scripts) nothing. We'd probably run them like:

python mrjob.job.MRJob --mr-job-script mr_perform_awesomeness.rb

or alternately:

mrjob mr_perform_awesomeness.rb
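
To make the idea concrete, here is a minimal sketch of how a runner might assemble the per-step command it hands to Hadoop streaming once the interpreter is no longer hard-coded to python. The function name `build_step_command` and its parameters are hypothetical, not part of mrjob; the sketch only assumes the script accepts the protocol switches named above (plus the --combiner switch mentioned later in this thread).

```python
import shlex

# Roles a script would have to understand to speak the protocol.
ROLES = ('mapper', 'combiner', 'reducer')


def build_step_command(script, step_num, role, interpreter=None):
    """Return the command line for one role of one step.

    interpreter is a command string such as 'ruby' or 'java -jar';
    None means the script is executable on its own (e.g. a shell script).
    This helper is hypothetical and only illustrates the proposal.
    """
    if role not in ROLES:
        raise ValueError('unknown role: %r' % (role,))

    args = []
    if interpreter:
        # Allow interpreters that carry their own arguments.
        args.extend(shlex.split(interpreter))
    args.append(script)
    args.append('--step-num=%d' % step_num)
    args.append('--%s' % role)

    # Quote each piece so the result is safe to pass through a shell.
    return ' '.join(shlex.quote(arg) for arg in args)


print(build_step_command('mr_perform_awesomeness.rb', 0, 'mapper',
                         interpreter='ruby'))
print(build_step_command('./mr_wc.sh', 1, 'reducer'))
```

With an interpreter the command is prefixed by it; without one the script path comes first, which is the "prepend nothing" case for shell scripts.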

The main thing is, I'm not sure there's any demand for such a feature.

Tell you what, you write the base MRJob class in your favorite language and put it up on github, and I'll hook it up to mrjob for you. :)


Also --combiner. (Just keeping this up to date)


Other things involved:

  • Rename python_bin to interpreter and support --no-interpreter and interpreter: null
  • Passthrough options: ???
  • Recompiling compiled executables for different contexts: ???
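
A rough sketch of the first bullet, assuming the renamed option is parsed with argparse and defaults to python. The option names come from the bullet; the helper `interpreter_args` and the default value are assumptions for illustration, not mrjob's actual option handling.

```python
import argparse
import shlex


def make_parser():
    parser = argparse.ArgumentParser()
    # Both switches write to the same destination; the last one given wins.
    parser.add_argument('--interpreter', dest='interpreter',
                        help='command used to run the job script')
    parser.add_argument('--no-interpreter', dest='interpreter',
                        action='store_const', const=None,
                        help='run the job script directly')
    # Assumed default: keep today's behavior of running with python.
    parser.set_defaults(interpreter='python')
    return parser


def interpreter_args(argv):
    """Return the argument list to put in front of the script path."""
    opts = make_parser().parse_args(argv)
    if opts.interpreter is None:
        # Same meaning as "interpreter: null" in a config file.
        return []
    return shlex.split(opts.interpreter)


print(interpreter_args([]))
print(interpreter_args(['--interpreter', 'ruby']))
print(interpreter_args(['--no-interpreter']))
```

Treating "no interpreter" as None rather than an empty string keeps the command-line switch and a null config value on the same code path.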

We need to incorporate Hadoop input/output format, partitioner, and jobconf into the steps format.



Default runner is inline.

MRJobLauncher does not support inline.

We'll have to have different defaults, I suppose.


Yeah, that's true. But reasonable.

I can imagine people not even realizing that inline and local modes have names, and just thinking: the job runs itself inline, and if I want to simulate Hadoop, I can run it from mrjob-launch (or whatever we call the binary).


This pretty much works now, but the interface for invoking it is pretty hacky. The rest of the issue is in #225.

@irskep irskep closed this
@irskep irskep was assigned
@irskep irskep was unassigned by tomelm