Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rome example fails for --backend local #51

Open
matthewfeickert opened this issue Sep 20, 2021 · 1 comment
Open

rome example fails for --backend local #51

matthewfeickert opened this issue Sep 20, 2021 · 1 comment
Assignees

Comments

@matthewfeickert
Copy link
Member

matthewfeickert commented Sep 20, 2021

In the current RECAST docs, one of the first things a new users sees is the rome example:

pip install recast-atlas
recast run examples/rome  # using `--backend docker` by default

This command will work and produce the expected output from the docs. However, running this example with the --backend local option will fail.

Minimal Failing Example

Expand for full console output:
$ cd /tmp
feickert@ThinkPad-X1:/tmp$ pyenv virtualenv 3.8.11 example
feickert@ThinkPad-X1:/tmp$ pyenv activate example
(example) feickert@ThinkPad-X1:/tmp$ python -m pip --quiet install --upgrade pip setuptools wheel
(example) feickert@ThinkPad-X1:/tmp$ python -m pip --quiet install recast-atlas
(example) feickert@ThinkPad-X1:/tmp$ pip show recast-atlas
Name: recast-atlas
Version: 0.1.7
Summary: RECAST for ATLAS at the LHC
Home-page: UNKNOWN
Author: Lukas Heinrich
Author-email: lukas.heinrich@cern.ch
License: UNKNOWN
Location: /home/feickert/.pyenv/versions/3.8.11/envs/example/lib/python3.8/site-packages
Requires: click, yadage-schemas, jsonschema, pyyaml
Required-by: 
(example) feickert@ThinkPad-X1:/tmp$ recast catalogue ls
NAME                               DESCRIPTION                                                 EXAMPLES            TAGS                
atlas/atlas-conf-2018-041          ATLAS MBJ                                                   default                                 
examples/checkmate1                CheckMate Tutorial Example (Herwig + CM1)                   default                                 
examples/checkmate2                CheckMate Tutorial Example (Herwig + CM2)                   default                                 
examples/rome                      Example from ATLAS Exotics Rome Workshop 2018               default,newsignal                       
testing/busyboxtest                Simple, lightweight Functionality Test                      default
(example) feickert@ThinkPad-X1:/tmp$ recast run examples/rome  # using `--backend docker` by default
Unable to find image 'recast/recastatlas:v0.1.7' locally
v0.1.7: Pulling from recast/recastatlas
ca3cd42a7c95: Already exists 
fbd7def92be5: Already exists 
071c71d4725b: Already exists 
f725c26e6c96: Already exists 
3ca6f85a1371: Already exists 
f46e76fe2b43: Already exists 
ea4bb38bf23d: Already exists 
37e1a0b67691: Pull complete 
42618669bd89: Pull complete 
c56cd0b9b5fe: Pull complete 
871520166c21: Pull complete 
e7e91df4609c: Pull complete 
0290390f308a: Pull complete 
Digest: sha256:a5b26672db39fa6fc8b7a620d42271cab6fed47e231b93d5463491f229c11040
Status: Downloaded newer image for recast/recastatlas:v0.1.7
2021-09-20 20:41:06,930 | packtivity.asyncback |   INFO | configured pool size to 12
2021-09-20 20:41:07,384 |      yadage.creators |   INFO | initializing workflow with initdata: {'did': 404958, 'dxaod_file': 'https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root', 'xsec_in_pb': 0.00122} discover: True relative: True
2021-09-20 20:41:07,384 |    adage.pollingexec |   INFO | preparing adage coroutine.
2021-09-20 20:41:07,384 |                adage |   INFO | starting state loop.
2021-09-20 20:41:07,491 |     yadage.wflowview |   INFO | added </init:0|defined|unknown>
2021-09-20 20:41:08,660 |     yadage.wflowview |   INFO | added </eventselection:0|defined|unknown>
2021-09-20 20:41:10,262 |     yadage.wflowview |   INFO | added </statanalysis:0|defined|unknown>
2021-09-20 20:41:12,247 |    adage.pollingexec |   INFO | submitting nodes [</init:0|defined|known>]
2021-09-20 20:41:12,946 |       pack.init.step |   INFO | publishing data: <TypedLeafs: {'did': 404958, 'dxaod_file': 'https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root', 'xsec_in_pb': 0.00122}>
2021-09-20 20:41:12,946 |                adage |   INFO | unsubmittable: 0 | submitted: 0 | successful: 0 | failed: 0 | total: 3 | open rules: 0 | applied rules: 3
2021-09-20 20:41:14,961 |           adage.node |   INFO | node ready </init:0|success|known>
2021-09-20 20:41:14,961 |    adage.pollingexec |   INFO | submitting nodes [</eventselection:0|defined|known>]
2021-09-20 20:41:14,962 | pack.eventselection. |   INFO | starting file logging for topic: step
2021-09-20 20:41:28,979 |           adage.node |   INFO | node ready </eventselection:0|success|known>
2021-09-20 20:41:28,980 |    adage.pollingexec |   INFO | submitting nodes [</statanalysis:0|defined|known>]
2021-09-20 20:41:28,981 | pack.statanalysis.st |   INFO | starting file logging for topic: step
2021-09-20 20:41:35,157 |           adage.node |   INFO | node ready </statanalysis:0|success|known>
2021-09-20 20:41:35,178 | adage.controllerutil |   INFO | no nodes can be run anymore and no rules are applicable
2021-09-20 20:41:35,178 | adage.controllerutil |   INFO | no nodes can be run anymore and no rules are applicable
2021-09-20 20:41:35,178 |                adage |   INFO | unsubmittable: 0 | submitted: 0 | successful: 3 | failed: 0 | total: 3 | open rules: 0 | applied rules: 3
2021-09-20 20:41:38,746 |                adage |   INFO | adage state loop done.
2021-09-20 20:41:38,746 |                adage |   INFO | execution valid. (in terms of execution order)
2021-09-20 20:41:38,747 |                adage |   INFO | workflow completed successfully.
2021-09-20 20:41:38,747 |  yadage.steering_api |   INFO | done. dumping workflow to disk.
2021-09-20 20:41:38,749 |  yadage.steering_api |   INFO | visualizing workflow.
/usr/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)
/usr/lib/python3.8/subprocess.py:842: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdin = io.open(p2cwrite, 'wb', bufsize)
/usr/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)
/usr/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)
/usr/lib/python3.8/subprocess.py:842: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdin = io.open(p2cwrite, 'wb', bufsize)
/usr/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)
2021-09-20 15:41:39,542 | recastatlas.subcomma |   INFO | RECAST run finished.

RECAST result examples/rome recast-03fef822:
--------------
- name: CLs 95% based upper limit on poi
  value:
    exp: 0.8924846399964371
    exp_m1: 0.6377501820447065
    exp_m2: 0.4731739008380644
    exp_p1: 1.2720762961732819
    exp_p2: 1.7545752712294322
    obs: 1.3352971254860764
- name: CLs 95% at nominal poi
  value:
    exp: 0.25999040745937085
    exp_m1: 0.10547655600578199
    exp_m2: 0.03889040527686523
    exp_p1: 0.5345040672498215
    exp_p2: 0.8276574946063575
    obs: 0.574709475331039

(example) feickert@ThinkPad-X1:/tmp$ python -m pip --quiet install --upgrade "git+https://github.com/recast-hep/recast-atlas.git@4eba02ea6678f253a9bb578cf70385d01b581f32#egg=recast-atlas[local]"  # To have local extra not fail on install
(example) feickert@ThinkPad-X1:/tmp$ recast run examples/rome  # Rerun of the same command with upgraded release still works
(example) feickert@ThinkPad-X1:/tmp$ recast run examples/rome --backend local
2021-09-20 15:43:21,961 | packtivity.asyncback |   INFO | configured pool size to 12
2021-09-20 15:43:22,353 |      yadage.creators |   INFO | initializing workflow with initdata: {'did': 404958, 'dxaod_file': 'https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root', 'xsec_in_pb': 0.00122} discover: True relative: True
2021-09-20 15:43:22,353 |    adage.pollingexec |   INFO | preparing adage coroutine.
2021-09-20 15:43:22,353 |                adage |   INFO | starting state loop.
2021-09-20 15:43:22,421 |     yadage.wflowview |   INFO | added </init:0|defined|unknown>
2021-09-20 15:43:23,476 |     yadage.wflowview |   INFO | added </eventselection:0|defined|unknown>
2021-09-20 15:43:24,965 |     yadage.wflowview |   INFO | added </statanalysis:0|defined|unknown>
2021-09-20 15:43:26,822 |    adage.pollingexec |   INFO | submitting nodes [</init:0|defined|known>]
2021-09-20 15:43:27,427 |       pack.init.step |   INFO | publishing data: <TypedLeafs: {'did': 404958, 'dxaod_file': 'https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root', 'xsec_in_pb': 0.00122}>
2021-09-20 15:43:27,427 |                adage |   INFO | unsubmittable: 0 | submitted: 0 | successful: 0 | failed: 0 | total: 3 | open rules: 0 | applied rules: 3
2021-09-20 15:43:29,369 |           adage.node |   INFO | node ready </init:0|success|known>
2021-09-20 15:43:29,369 |    adage.pollingexec |   INFO | submitting nodes [</eventselection:0|defined|known>]
2021-09-20 15:43:29,370 | pack.eventselection. |   INFO | starting file logging for topic: step
/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)
/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/subprocess.py:842: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdin = io.open(p2cwrite, 'wb', bufsize)
/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)
2021-09-20 15:43:31,511 | pack.eventselection. |  ERROR | non-zero return code raising exception
2021-09-20 15:43:31,511 | pack.eventselection. |  ERROR | subprocess failed. code: 134,  command docker run --rm -i  --cidfile /tmp/recast-64ee8b66/eventselection/_packtivity/eventselection.cid      -v /tmp/recast-64ee8b66/eventselection:/tmp/recast-64ee8b66/eventselection:rw -v /tmp/recast-64ee8b66/init:/tmp/recast-64ee8b66/init:rw reanahub/reana-demo-atlas-recast-eventselection:1.0 sh -c bash
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/handlers/execution_handlers.py", line 332, in execute_and_tail_subprocess
    raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command 'docker run --rm -i  --cidfile /tmp/recast-64ee8b66/eventselection/_packtivity/eventselection.cid      -v /tmp/recast-64ee8b66/eventselection:/tmp/recast-64ee8b66/eventselection:rw -v /tmp/recast-64ee8b66/init:/tmp/recast-64ee8b66/init:rw reanahub/reana-demo-atlas-recast-eventselection:1.0 sh -c bash' returned non-zero exit status 134.
2021-09-20 15:43:31,511 | pack.eventselection. |  ERROR | job execution if job {'name': 'eventselection', 'wflow_node_id': '734229ac-e75a-4f21-b412-456a2db08ff7', 'wflow_offset': '', 'wflow_stage': 'eventselection', 'wflow_stage_node_idx': 0, 'wflow_hints': {'is_purepub': False}} raise exception exception
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/handlers/execution_handlers.py", line 332, in execute_and_tail_subprocess
    raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command 'docker run --rm -i  --cidfile /tmp/recast-64ee8b66/eventselection/_packtivity/eventselection.cid      -v /tmp/recast-64ee8b66/eventselection:/tmp/recast-64ee8b66/eventselection:rw -v /tmp/recast-64ee8b66/init:/tmp/recast-64ee8b66/init:rw reanahub/reana-demo-atlas-recast-eventselection:1.0 sh -c bash' returned non-zero exit status 134.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/syncbackends.py", line 192, in run_packtivity
    run_in_env(job, env, state, metadata, pack_config, exec_config)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/syncbackends.py", line 130, in run_in_env
    return handler(exec_config, environment, state, job, metadata)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/handlers/execution_handlers.py", line 460, in docker_enc_handler
    result = run(config, state, log, metadata, rspec)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/handlers/execution_handlers.py", line 404, in run_containers_in_docker_runtime
    execute_and_tail_subprocess(
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/packtivity/handlers/execution_handlers.py", line 340, in execute_and_tail_subprocess
    raise RuntimeError("failed container execution subprocess. %s", command_string)
RuntimeError: ('failed container execution subprocess. %s', 'docker run --rm -i  --cidfile /tmp/recast-64ee8b66/eventselection/_packtivity/eventselection.cid      -v /tmp/recast-64ee8b66/eventselection:/tmp/recast-64ee8b66/eventselection:rw -v /tmp/recast-64ee8b66/init:/tmp/recast-64ee8b66/init:rw reanahub/reana-demo-atlas-recast-eventselection:1.0 sh -c bash')
2021-09-20 15:43:33,371 |           adage.node |   INFO | node ready </eventselection:0|failed|known>
2021-09-20 15:43:33,391 | adage.controllerutil |   INFO | no nodes can be run anymore and no rules are applicable
2021-09-20 15:43:33,392 | adage.controllerutil |   INFO | no nodes can be run anymore and no rules are applicable
2021-09-20 15:43:33,392 |                adage |  ERROR | some weird exception caught in adage process loop
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/adage/__init__.py", line 51, in run_polling_workflow
    for stepnum, controller in enumerate(coroutine):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/adage/pollingexec.py", line 89, in adage_coroutine
    raise RuntimeError('workflow finished but failed')
RuntimeError: workflow finished but failed
2021-09-20 15:43:33,393 |                adage |  ERROR | node: </eventselection:0|failed|known> failed. reason: unknown
2021-09-20 15:43:33,393 |                adage |   INFO | unsubmittable: 1 | submitted: 0 | successful: 1 | failed: 1 | total: 3 | open rules: 0 | applied rules: 3
2021-09-20 15:43:35,650 |  yadage.steering_api |   INFO | done. dumping workflow to disk.
2021-09-20 15:43:35,651 | recastatlas.subcomma |  ERROR | caught exception
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/recastatlas/backends/local.py", line 20, in run_workflow
    run_workflow(**spec)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/yadage/steering_api.py", line 20, in run_workflow
    pass
  File "/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/yadage/steering_api.py", line 110, in steering_ctx
    execute_steering(
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/yadage/steering_api.py", line 60, in execute_steering
    ys.run_adage(backend)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/yadage/steering_object.py", line 100, in run_adage
    adage.rundag(controller=self.controller, **self.adage_kwargs)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/adage/__init__.py", line 137, in rundag
    run_polling_workflow(controller, coroutine, update_interval, trackerlist, maxsteps)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/adage/__init__.py", line 51, in run_polling_workflow
    for stepnum, controller in enumerate(coroutine):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/adage/pollingexec.py", line 89, in adage_coroutine
    raise RuntimeError('workflow finished but failed')
RuntimeError: workflow finished but failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/recastatlas/subcommands/run.py", line 52, in run
    run_sync(name, spec, backend=backend)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/recastatlas/backends/__init__.py", line 69, in run_sync
    BACKENDS[backend].run_workflow(name,spec)
  File "/home/feickert/.pyenv/versions/example/lib/python3.8/site-packages/recastatlas/backends/local.py", line 22, in run_workflow
    raise FailedRunException
recastatlas.exceptions.FailedRunException
Error: Workflow failed
Exception ignored in: <function Pool.__del__ at 0x7fc93d79c8b0>
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
    self._change_notifier.put(None)
  File "/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/multiprocessing/queues.py", line 368, in put
    self._writer.send_bytes(obj)
  File "/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/feickert/.pyenv/versions/3.8.11/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor
(example) feickert@ThinkPad-X1:/tmp$

The resulting recast directory is attached as the following zip file (recast-64ee8b66.zip), but perhaps relevant is also the output of the eventselection.run.log

(example) feickert@ThinkPad-X1:/tmp$ cat recast-64ee8b66/eventselection/_packtivity/eventselection.run.log 
2021-09-20 15:43:30,556 | pack.eventselection. |   INFO | starting file logging for topic: run
2021-09-20 15:43:31,059 | pack.eventselection. |   INFO | b'Configured GCC from: /opt/lcg/gcc/6.2.0binutils/x86_64-slc6'
2021-09-20 15:43:31,060 | pack.eventselection. |   INFO | b'Configured AnalysisBase from: /usr/AnalysisBase/21.2.51/InstallArea/x86_64-slc6-gcc62-opt'
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b'xAOD::Init                INFO    Environment initialised for data access'
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b'SampleHandler with 1 files'
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b'Sample:name=sample,tags=()'
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b'https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root'
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b''
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b''
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b'/build2/atnight/localbuilds/nightlies/21.2/AnalysisBase/athena/PhysicsAnalysis/D3PDTools/EventLoop/Root/Driver.cxx:107:exception: could not create output directory /tmp/recast-64ee8b66/eventselection/submitDir'
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b"terminate called after throwing an instance of 'RCU::ExceptionMsg'"
2021-09-20 15:43:31,182 | pack.eventselection. |   INFO | b'what():  /build2/atnight/localbuilds/nightlies/21.2/AnalysisBase/athena/PhysicsAnalysis/D3PDTools/EventLoop/Root/Driver.cxx:107:exception: could not create output directory /tmp/recast-64ee8b66/eventselection/submitDir'
2021-09-20 15:43:31,284 | pack.eventselection. |   INFO | b'bash: line 10:   227 Aborted                 (core dumped) myEventSelection /tmp/recast-64ee8b66/eventselection/submitDir recast_inputs.txt recast_xsecs.txt 30.0'
@matthewfeickert
Copy link
Member Author

This is the same issue as Issue #118.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant