Bug in failure statistics #2057

@ekouts

Even though all the tests had finished and ReFrame should not have had any jobs left, printing the failure statistics caused the following exception:

==============================================================================
FAILURE STATISTICS

Total number of test cases: 39
Total number of failures: 1

Phase         #     Failing test cases
------------- ----- ------------------------------------------------------------
performance   1     [CPMDCheck_small, builtin, daint:gpu]
------------------------------------------------------------------------------
./bin/reframe: run session stopped: type error: conversion from NoneType to Decimal is not supported
./bin/reframe: Traceback (most recent call last):
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/cli.py", line 1006, in main
    runner.runall(testcases, restored_cases)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/executors/__init__.py", line 434, in runall
    self._retry_failed(testcases + restored_cases)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/executors/__init__.py", line 475, in _retry_failed
    self._runall(failed_cases)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/executors/__init__.py", line 504, in _runall
    self._policy.exit()
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/executors/policies.py", line 532, in exit
    self._poll_tasks()
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/executors/policies.py", line 446, in _poll_tasks
    part.scheduler.poll(*part_jobs)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/core/schedulers/slurm.py", line 434, in poll
    self._cancel_if_blocked(job)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/core/schedulers/slurm.py", line 467, in _cancel_if_blocked
    completed = _run_strict('squeue -h -j %s -o %%r' % job.jobid)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/utility/osext.py", line 72, in run_command
    completed.returncode)
reframe.core.exceptions.SpawnedProcessError: command 'squeue -h -j 32385747 -o %r' failed with exit code 1:
--- stdout ---
--- stdout ---
--- stderr ---
slurm_load_jobs error: Socket timed out on send/recv operation

--- stderr ---

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/cli.py", line 1067, in main
    junit_xml = runreport.junit_xml_report(json_report)
  File "/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe/frontend/runreport.py", line 216, in junit_xml_report
    'time': str(decimal.Decimal(tc['time_total'])),
TypeError: conversion from NoneType to Decimal is not supported
Log file(s) saved in: '/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe.log', '/scratch/snx3000/jenscscs/daint102/workspace/reframe-daint-production-daily/reframe.out'

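At the top, the output reports a type error ("conversion from NoneType to Decimal is not supported"), but the traceback also contains the message "slurm_load_jobs error: Socket timed out on send/recv operation". The squeue timeout is the first exception; the TypeError was raised while that exception was being handled, during generation of the JUnit XML report.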
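The second exception in the traceback comes from passing `None` to `decimal.Decimal` when the report is built. Below is a minimal sketch of how such a conversion could be guarded, assuming that `time_total` can be `None` for a test case whose timing was never finalized (for example because the run was interrupted). The helper `safe_time_str` and its `default` parameter are hypothetical names for illustration and are not part of ReFrame; this is not necessarily how the project would or did fix the problem.

```python
import decimal


def safe_time_str(value, default='0'):
    """Return `value` as a Decimal string, or `default` if it is None.

    Hypothetical helper for illustration only; not part of ReFrame.
    """
    if value is None:
        # Assumption: a missing total time is reported as `default`
        # instead of aborting the whole report.
        return default

    return str(decimal.Decimal(value))


# Reproduce the reported TypeError in isolation.
try:
    decimal.Decimal(None)
except TypeError as err:
    print(f'unguarded conversion failed: {err}')

# The guarded conversion falls back to the default instead of raising.
print(safe_time_str(None))
print(safe_time_str('12.5'))
```

A guard like this would only address the secondary TypeError. The underlying squeue failure ("Socket timed out on send/recv operation") would still stop the run session and would need to be handled separately.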
