@183amir (Collaborator) commented Jul 18, 2024

fix: handle sacct failures
fix: retrieve job status from scontrol when sacct is not available
fix: do not delete the logs folder when the database is not empty
fix: do not recommend installing with pixi


📚 Documentation preview 📚: https://gridtk--6.org.readthedocs.build/en/6/

183amir and others added 4 commits July 17, 2024 16:06
Some Slurm installations have accounting disabled, so job information is only available through scontrol show job <slurm-job-id>.
For now, we just handle the failed sacct call here.
The logs folder can be empty (all jobs are pending) while the database is not. We should not delete the logs folder in this case.
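
A minimal sketch of the guard described above, assuming the manager knows how many jobs remain in its database. The helper name remove_logs_dir_if_unused and its parameters are hypothetical and are not taken from gridtk's code.

```python
from pathlib import Path


def remove_logs_dir_if_unused(logs_dir: Path, jobs_in_database: int) -> bool:
    """Delete logs_dir only if it is empty and no jobs remain in the database.

    Hypothetical helper for illustration. Returns True if the folder was removed.
    """
    if jobs_in_database > 0:
        # Pending jobs have written no logs yet, but they will need the folder.
        return False
    if not logs_dir.is_dir() or any(logs_dir.iterdir()):
        # Nothing to remove, or the folder still contains log files.
        return False
    logs_dir.rmdir()
    return True
```

Checking the database count first means an empty folder alone is never treated as a reason to delete it.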
When installing with pixi global install, the CONDA_PREFIX and PATH environment variables are shadowed (see prefix-dev/pixi#1382), which breaks binary discovery (e.g. python) when a job is submitted.
sacct can be unavailable on Slurm installations where accounting is disabled.
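
As a rough illustration of the fallback described in these commits, the sketch below queries sacct first and falls back to scontrol show job when the sacct call fails or returns nothing. It is not the code from this PR: the names job_state and parse_scontrol_state and the injectable run parameter are hypothetical, and the sacct flags shown are one reasonable choice, not necessarily the ones gridtk uses.

```python
import re
import subprocess
from typing import Optional


def parse_scontrol_state(output: str) -> Optional[str]:
    """Extract the JobState value from the text printed by `scontrol show job`."""
    match = re.search(r"\bJobState=(\S+)", output)
    return match.group(1) if match else None


def job_state(job_id: int, run=subprocess.run) -> Optional[str]:
    """Return the job state, preferring sacct and falling back to scontrol.

    `run` is injectable so the logic can be exercised without a Slurm cluster.
    """
    try:
        result = run(
            ["sacct", "-j", str(job_id), "-X", "--noheader",
             "--parsable2", "--format=State"],
            capture_output=True, text=True, check=True,
        )
        lines = result.stdout.strip().splitlines()
        if lines:
            return lines[0].strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Assumption: sacct exits non-zero (or is missing) when accounting is off.
        pass
    try:
        result = run(
            ["scontrol", "show", "job", str(job_id)],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None
    return parse_scontrol_state(result.stdout)
```

Note that scontrol only knows about jobs still held in the controller's memory, so this fallback may return None for jobs that finished long ago.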
@183amir requested a review from Yannick-Dayer on July 18, 2024 10:25
@github-actions commented

Coverage report

File: src/gridtk/manager.py
Lines missing: 41-55, 60-67, 78-83

This report was generated by python-coverage-comment-action

@Yannick-Dayer (Member) left a comment

Looks good. 👍🏼
I don't have the setup to try that specific case, but the logic of using scontrol instead of sacct and adapting the output seems OK.

@183amir merged commit 3e2fc62 into main on Jul 18, 2024