When performance logging fails, ReFrame crashes with a confusing dump #2747

@vkarak

The resulting output doesn't even help you debug the problem:

ERROR: run session stopped: attribute error: 'AsynchronousExecutionPolicy' object has no attribute '_advance_retired'
ERROR: Traceback (most recent call last):
  File "/users/eirinik/repos/reframe-test/reframe/frontend/cli.py", line 1267, in main
    runner.runall(testcases, restored_cases)
  File "/users/eirinik/repos/reframe-test/reframe/core/logging.py", line 892, in _fn
    return fn(*args, **kwargs)
  File "/users/eirinik/repos/reframe-test/reframe/frontend/executors/__init__.py", line 518, in runall
    self._runall(testcases)
  File "/users/eirinik/repos/reframe-test/reframe/frontend/executors/__init__.py", line 579, in _runall
    self._policy.exit()
  File "/users/eirinik/repos/reframe-test/reframe/frontend/executors/policies.py", line 340, in exit
    self._advance_all(self._current_tasks, timeout)
  File "/users/eirinik/repos/reframe-test/reframe/frontend/executors/policies.py", line 411, in _advance_all
    bump_state = getattr(self, f'_advance_{t.state}')
AttributeError: 'AsynchronousExecutionPolicy' object has no attribute '_advance_retired'

The backtrace is the same as in #2715. The root cause is that errors raised inside the on_task_{success,failure} execution policy callbacks are not handled correctly, so execution proceeds with _current_tasks in an inconsistent state. ReFrame then tries to advance a test case that has already finished (RETIRED state) and crashes later with the above error. Although the crash was caused by an earlier failure, nothing in the output lets you trace it back to the root cause.

This crash has appeared repeatedly when something was wrong with the performance logging, but it is not limited to that case.
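
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not ReFrame's actual code: ToyPolicy, Task, failing_perflog, run and the guard_callbacks flag are all hypothetical names, and only the getattr-based dispatch mirrors the line shown in the traceback. It assumes the callback runs after the task has been marked as retired but before it is removed from the list of current tasks.

```python
class Task:
    def __init__(self, name):
        self.name = name
        self.state = 'running'
        self.error = None


class ToyPolicy:
    """Toy stand-in for an execution policy (hypothetical, not ReFrame's)."""

    def __init__(self, tasks, on_task_success, guard_callbacks):
        self.current_tasks = list(tasks)
        self.on_task_success = on_task_success
        self.guard_callbacks = guard_callbacks

    def _advance_running(self, task):
        task.state = 'retired'
        if self.guard_callbacks:
            try:
                self.on_task_success(task)
            except Exception as err:
                # Keep the original error so it can be reported later
                task.error = err
            finally:
                # The bookkeeping always runs, even if the callback failed
                self.current_tasks.remove(task)
        else:
            self.on_task_success(task)       # may raise
            self.current_tasks.remove(task)  # skipped if the callback raised

    def advance_all(self):
        for task in list(self.current_tasks):
            # Same dispatch style as in the traceback above
            bump_state = getattr(self, f'_advance_{task.state}')
            bump_state(task)


def failing_perflog(task):
    # Stands in for a performance logging failure
    raise OSError(f'cannot write perflog for {task.name}')


def run(guard_callbacks):
    task = Task('t0')
    policy = ToyPolicy([task], failing_perflog, guard_callbacks)
    try:
        policy.advance_all()
    except OSError:
        pass  # the original error is lost here and execution continues

    try:
        policy.advance_all()  # e.g. a final pass over the remaining tasks
    except AttributeError as err:
        return f'crash: {err}'

    return f'ok, recorded: {task.error}'
```

In this sketch, run(False) ends in an AttributeError about a missing _advance_retired method, because the retired task was never removed from current_tasks. run(True) finishes cleanly and keeps the original OSError on the task, where it could be reported to the user.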

Status

Done

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions