
submission: auxiliary one-off data as files mounted into job container #45

Open
lukasheinrich opened this issue May 29, 2017 · 1 comment
@lukasheinrich (Member) commented May 29, 2017

There is a need to mount data into the job container that is not part of the workflow work directory, but rather encapsulated information needed only by the specific job.

Examples:

  1. Normally, we submit a container and a cmd to the job controller, where the cmd is prepared by the workflow controller (it constructs the cmd from a template and workflow-specific data, such as file paths that are only known at run-time). Sometimes the cmd is quite long and a one-off multi-line script is a better choice. The script can be constructed by the workflow controller, but needs to be mounted into the container by the job controller.

Example:

cat /path/only/known/at/runtime/by/wflowcontroller/input.txt
echo some
echo very
echo long
echo script
cat /path/only/known/at/runtime/by/wflowcontroller/output.txt

We would like this script to be mounted at some well-defined location in the container, say /reana/script, so that we can submit a job with the command bash /reana/script.

The job manifest could look like this:

experiment: ATLAS
docker_img: my_atlas_analysis
cmd: bash /reana/script
aux_mounts: 
   -  mountpath: /reana/script
      data: |
         echo some
         echo very
         echo long
         echo script
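On the job-controller side, a manifest entry like the one above could be translated into a Kubernetes ConfigMap plus a volume mount. The following is a minimal sketch of that translation using plain dicts; the helper name `aux_mounts_to_configmap`, the ConfigMap naming scheme, and the `aux-N` key convention are all assumptions for illustration, not part of any existing REANA API.

```python
# Hypothetical sketch: translate `aux_mounts` entries from the job manifest
# into a Kubernetes ConfigMap and the pod volume/volumeMount specs that
# expose each entry as a single file inside the container.

def aux_mounts_to_configmap(job_name, aux_mounts):
    """Build a ConfigMap plus volume/volumeMount specs (as plain dicts)."""
    cm_name = f"{job_name}-aux"  # naming scheme is an assumption
    data, volume_mounts = {}, []
    for i, mount in enumerate(aux_mounts):
        key = f"aux-{i}"
        data[key] = mount["data"]
        volume_mounts.append({
            "name": "aux",
            "mountPath": mount["mountpath"],
            "subPath": key,  # mount a single key as a file, not a directory
        })
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": cm_name},
        "data": data,
    }
    volume = {"name": "aux", "configMap": {"name": cm_name}}
    return configmap, volume, volume_mounts


configmap, volume, mounts = aux_mounts_to_configmap(
    "job-123",
    [{"mountpath": "/reana/script", "data": "echo some\necho very\n"}],
)
```

Using `subPath` keeps unrelated files at the mount point intact, since each aux entry is projected as one file rather than replacing the whole directory.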

  2. A related example deals with situations where the command/script becomes too large and we'd like to mount some of the data into the container. Take the example of merging 500 ROOT files into a single output file. For a few files this is possible via hadd merged.root inputA.root inputB.root. For large lists (of absolute paths) this becomes unworkable, and we'd rather write a script invoked as merge.py merged.root inputfiles.json. The inputfiles.json can be constructed by the workflow controller and submitted like so:
experiment: ATLAS
docker_img: my_atlas_analysis
cmd: merge.py /reana/inputfiles.json /workdir/location/merged.root
aux_mounts: 
   -  mountpath: /reana/inputfiles.json
      data: |
         {"inputfiles": [
            "/one/very/long/path/to/a/file",
            "/one/very/long/path/to/a/file",
            "/one/very/long/path/to/a/file",
            "/one/very/long/path/to/a/file",
           ... 100s of more file paths
            "/one/very/long/path/to/a/file",
            "/one/very/long/path/to/a/file"
           ]
        }
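The workflow-controller side of this second example can be sketched in a few lines: serialise the long file list into a small JSON payload instead of passing hundreds of absolute paths on the command line. The paths below are placeholders, not real dataset locations.

```python
import json

# Sketch: the workflow controller serialises a (potentially very long)
# input file list into a JSON payload. Paths are placeholders.
input_paths = [f"/eos/atlas/dataset/file_{i}.root" for i in range(500)]
payload = json.dumps({"inputfiles": input_paths})

# The job controller would mount `payload` at /reana/inputfiles.json;
# inside the container, merge.py reads it back instead of parsing argv:
parsed = json.loads(payload)
n_files = len(parsed["inputfiles"])  # 500, independent of ARG_MAX limits
```

This sidesteps the kernel's command-line length limit entirely, since the file list travels as mounted data rather than as arguments.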

Implementation:

Kubernetes should support this transparently via either Secrets or ConfigMaps.

@diegodelemos diegodelemos added this to the Someday milestone May 29, 2017
@lukasheinrich (Member, Author) commented:

this is also relevant for reanahub/reana-workflow-engine-serial#17 (comment)

If we could mount the desired stdin as a one-off file, we could have a nicer command without the base64 hack, e.g.

command: ['sh','-c','root < /my/mounted/script']
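The stdin-redirection idea above can be demonstrated without a cluster: write the one-off script to a file and feed it to the interpreter on stdin, exactly as the mounted file would be. This is a local sketch only; `sh` stands in for `root`, and the temp-file path stands in for the mount point.

```python
import os
import subprocess
import tempfile

# Sketch of the comment's idea: instead of base64-encoding a script into
# the command line, place it in a one-off file and redirect it to the
# interpreter's stdin. `sh` is a stand-in for `root` here.
script = "echo merged\n"
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
    f.write(script)
    path = f.name

# Equivalent of command: ['sh', '-c', 'root < /my/mounted/script']
result = subprocess.run(
    ["sh", "-c", f"sh < {path}"], capture_output=True, text=True
)
os.unlink(path)
# result.stdout now holds the script's output ("merged\n")
```

The container variant would be identical except that the file arrives via the aux mount rather than a temp file.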

@diegodelemos diegodelemos removed this from the Someday milestone Oct 6, 2019
@diegodelemos diegodelemos added this to To triage in Triage Oct 6, 2019