Martian stage code adapter
This readme describes the interface between mrp and the job monitor process, as well as the interface between the job monitor and the stage code, which must be obeyed by any language-specific adapter.
To support a new language, one must implement the adapter interface, which mostly amounts to reading and writing the right files.
Stage adapter interface
Command line arguments
The arguments to the adapter shell process are whatever was specified in the MRO for the stage (e.g. a path to the stage code), followed by:
- The run type: either
- The path to the stage metadata files (see below).
- The path to the stage files directory. This is the directory where data files are written, and should also be the working directory.
- The journal file prefix (see below).
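As a minimal sketch of the argument layout described above, an adapter could separate the MRO-specified arguments from the four trailing ones as follows. The names `AdapterArgs` and `parse_adapter_args` are hypothetical and chosen only for illustration; they are not part of Martian.

```python
from typing import List, NamedTuple


class AdapterArgs(NamedTuple):
    """Hypothetical container for the adapter's command line arguments."""
    stage_args: List[str]  # whatever the MRO specified for the stage
    run_type: str          # the run type passed by the monitor
    metadata_path: str     # path to the stage metadata files
    files_path: str        # stage files directory (also the working directory)
    journal_prefix: str    # prefix used when creating journal files


def parse_adapter_args(argv: List[str]) -> AdapterArgs:
    """Split argv (without the program name) into MRO arguments
    and the four trailing arguments listed above."""
    if len(argv) < 4:
        raise ValueError("expected at least 4 arguments, got %d" % len(argv))
    run_type, metadata_path, files_path, journal_prefix = argv[-4:]
    return AdapterArgs(list(argv[:-4]), run_type, metadata_path,
                       files_path, journal_prefix)
```

A Python adapter would typically call this as `parse_adapter_args(sys.argv[1:])`. Taking the last four entries, rather than fixed positions, keeps the sketch independent of how many arguments the MRO supplies.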
Open file descriptors
In addition to standard output and standard error (standard in is null), the monitor will leave two file handles open:
- The stage log file. Generally this contains timestamps for the progress of the stage process. This should not be closed by the adapter.
- The error reporting pipe. In the event of a fatal error, write the error message to this descriptor and then close the pipe. If the message is prefixed with "ASSERT:" then the message is treated as an assertion. Once the pipe is closed, the adapter process will likely be killed shortly thereafter.
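The error reporting protocol above (write the message, then close the pipe) might look like this in a Python adapter. This is a sketch under assumptions: the descriptor number is passed in as `error_fd` because this section does not state it, the UTF-8 encoding is assumed, and the function name is hypothetical.

```python
import os


def report_fatal_error(error_fd: int, message: str,
                       is_assertion: bool = False) -> None:
    """Write a fatal error message to the error pipe, then close the pipe.

    error_fd is the error reporting descriptor left open by the monitor
    (its number is an assumption supplied by the caller).
    """
    if is_assertion and not message.startswith("ASSERT:"):
        # Messages prefixed with "ASSERT:" are treated as assertions.
        message = "ASSERT:" + message
    data = message.encode("utf-8")  # encoding is an assumption
    try:
        while data:
            # os.write may write fewer bytes than requested.
            written = os.write(error_fd, data)
            data = data[written:]
    finally:
        # Closing the pipe signals the monitor that the message is complete.
        os.close(error_fd)
```

Because the process is likely to be killed soon after the pipe closes, any cleanup the adapter needs should happen before calling this function.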
Metadata files
A metadata file named "name" is placed under the stage metadata path. To notify mrp of the existence of a new file for a split or join, create a journal file named [journal prefix].<run type>_<name>. For chunks, a simpler form of the journal name is used.
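For the split and join case, the journal naming pattern `[journal prefix].<run type>_<name>` can be sketched as below. The function names are hypothetical, and creating the journal entry as an empty file is an assumption; the chunk form is not covered here.

```python
def journal_file_name(journal_prefix: str, run_type: str, name: str) -> str:
    """Build the journal file name for a split or join metadata file,
    following the pattern [journal prefix].<run type>_<name>."""
    return "%s.%s_%s" % (journal_prefix, run_type, name)


def notify_metadata_written(journal_prefix: str, run_type: str,
                            name: str) -> str:
    """Create the journal file so that mrp learns about a new metadata file.

    Assumption: an empty file is sufficient as a journal entry.
    Only valid for split and join; chunks use a different form.
    """
    path = journal_file_name(journal_prefix, run_type, name)
    with open(path, "w"):
        pass  # nothing is written; only the file's existence matters here
    return path
```

The metadata file itself should be fully written before the journal file is created, so that mrp never sees a notification for a partially written file.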
The args file contains the JSON-serialized arguments to the stage or chunk.
Other common files depend on the phase.
- Split phases must output a chunk_defs file. This file should contain a JSON-serialized array of chunk definitions, each of which is a dictionary containing the per-chunk arguments, and possibly a key named __mem_gb to specify the memory reservation for that job.
- Chunk phases create an outs file with the JSON-serialized dictionary of output values. If the stage splits, the args and outs are per-chunk. If it does not, they are for the stage overall.
- Join phases, in addition to the stage outs files, may read the chunk_defs. Additionally, mrp aggregates the outs files for each chunk into an array that the join phase can read.
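To make the chunk_defs layout concrete, the sketch below builds the JSON array described for split phases. The helper name `make_chunk_defs` and the argument names in the usage note are hypothetical; only the array-of-dictionaries shape and the `__mem_gb` key come from the description above.

```python
import json
from typing import Any, Dict, List, Optional


def make_chunk_defs(per_chunk_args: List[Dict[str, Any]],
                    mem_gb: Optional[float] = None) -> str:
    """Serialize chunk definitions as a JSON array of dictionaries.

    Each dictionary holds the per-chunk arguments. If mem_gb is given,
    it is added to every chunk under the __mem_gb key.
    """
    chunk_defs = []
    for args in per_chunk_args:
        chunk = dict(args)  # copy so the caller's dictionaries are untouched
        if mem_gb is not None:
            chunk["__mem_gb"] = mem_gb
        chunk_defs.append(chunk)
    return json.dumps(chunk_defs, indent=2)
```

For example, `make_chunk_defs([{"start": 0}, {"start": 100}], mem_gb=4)` yields a two-element array in which each chunk carries its own `start` value plus `"__mem_gb": 4`. The returned string would then be written to the chunk_defs metadata file.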
Any stage may wish to access the jobinfo file. In some cases it may be appropriate to add to it. Be sure, however, that updates to that file do not corrupt it.
A stage may update its progress file at any time. If it also updates the journal for progress, then mrp will read the progress file and print its contents to the main mrp log the next time it checks. Note that if mrp is restarted, or the message is overwritten by a newer progress message, before mrp sees it, the older message will not make it into the mrp log.
The adapter process should generally not catch any signals except possibly those generated by the process itself, such as SIGSEGV. Catching signals is the responsibility of the monitor process.
The monitor process, mrjob, is maintained with Martian. Maintainers of language adapters do not need to understand its contracts unless they are also planning to maintain the monitor itself. Note that, unlike the API between mrjob and the adapter child process, mrjob is tightly coupled to the mrp version it was built with, and no assumptions should be made about the stability of the interface between them.
The job monitor process has several duties. If monitoring is enabled in the jobinfo file, the monitor will periodically inspect the process for its memory usage and kill it if it exceeds its memory reservation. In addition, it makes sure that the correct files are created and journaled so that mrp tracks progress correctly. In particular:
- The log file is created when the job starts. It gets a timestamp when the job starts and when it completes, and may get additional messages through the file descriptor passed to the adapter.
- If the stage code writes to the errors pipe, the monitor will read up to 8 KB and put that in the errors metadata file (or the assert file if the message begins with ASSERT:) to indicate stage failure. If the stage process returns a nonzero exit code but does not write to the errors pipe, the monitor records the exit code instead.
- If the stage process exits with a return code of 0, the monitor will write a file marking the job as successfully completed.
- When the job completes, the jobinfo file is updated with performance information including rusage.
- While the job is running, the journal for the heartbeat file is periodically written to make sure that mrp knows the job is still running.
- If the profile mode indicates that an external profiler should be attached to the stage code, the monitor process handles that, including ensuring the profiler is given the correct output paths.
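The failure-handling rules in the list above can be summarized as a small decision function. This is an illustrative sketch, not the monitor's actual code: the function name and the "exit_code" and "success" labels are hypothetical, and reading "8kb" as 8192 bytes is an assumption.

```python
from typing import Optional, Tuple

MAX_ERROR_BYTES = 8 * 1024   # assumption: "8kb" means 8192 bytes
ASSERT_PREFIX = b"ASSERT:"


def classify_outcome(pipe_data: bytes,
                     exit_code: int) -> Tuple[str, Optional[bytes]]:
    """Decide how a finished stage process should be recorded.

    Returns a (kind, payload) pair where kind is one of:
      "assert"    - message from the errors pipe beginning with ASSERT:
      "errors"    - any other message from the errors pipe
      "exit_code" - nonzero exit with nothing written to the pipe
      "success"   - exit code 0 with nothing written to the pipe
    """
    message = pipe_data[:MAX_ERROR_BYTES]  # only the first 8 KB is kept
    if message:
        if message.startswith(ASSERT_PREFIX):
            return "assert", message
        return "errors", message
    if exit_code != 0:
        return "exit_code", str(exit_code).encode("ascii")
    return "success", None
```

A message on the errors pipe takes precedence over the exit code in this sketch, matching the order in which the two cases are described.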
The monitor process catches certain signals. If a signal is caught, the
errors file is written and the stage code is killed.
The arguments to the monitor process are passed through to the adapter except for the monitor executable itself.