Jobs from "completed" moved to "shouldrun" #413

Closed
LudvigOlsen opened this issue Feb 19, 2024 · 3 comments

Comments

@LudvigOlsen

LudvigOlsen commented Feb 19, 2024

Last night, I ran the gwf status summary twice in a row on my workflow, without changing anything in between. Jobs that were marked as completed in the first summary were marked as shouldrun in the second.

[image]

gwfss is just an alias that prints the summary with timestamps around it:

alias gwfss="echo '----------------------------'; echo 'gwf status'; date; gwf status -f summary; date; echo '----------------------------'"

I am not sure how to reproduce this. It seems kind of random since I didn't change any files or anything.

@dansondergaard
Collaborator

That sounds weird. I haven't heard of any other instances of this happening. It could be many, many things not related to gwf at all (the filesystem, Slurm reporting intermittent states, or jobs that are not completely finished but are counted as "completed" by gwf for a short while).

Have you experienced this since you reported it?

@LudvigOlsen
Author

LudvigOlsen commented Jun 19, 2024

Hi Dan,
I haven't experienced it since, no. Not sure what went wrong. I see how it could be a problem in Slurm and not gwf.

I have recently started using the following instead of gwf status, since it returns instantly (gwf status takes 45 min with my giant workflow :-) ):

alias sqs='squeue -u $USER -o "%.10i %.9P %.8j %.8u %.2t %.10M %.6D %.6R" | awk '\''
NR > 1 {
    state = $5
    if (state == "R") state = "Running"
    else if (state == "PD") state = "Pending"
    else if (state == "CG") state = "Completing"
    else if (state == "CD") state = "Completed"
    else if (state == "F") state = "Failed"
    else if (state == "TO") state = "Timeout"
    # Add more state mappings as needed
    count[state]++
}
END {
    for (status in count) print status, count[status]
}'\'
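
The same counting logic can also be written as a shell function that reads state codes from stdin, which makes it easy to try on sample input without a cluster. This is only a sketch: `count_states` is a hypothetical name, it assumes the compact state code is the first field on each line, and it passes unknown codes through unchanged.

```shell
# Hypothetical helper: count Slurm job states from lines on stdin.
# Assumes the compact state code (R, PD, ...) is the first field of each line.
count_states() {
    awk '
    BEGIN {
        # Same mappings as the alias above; extend as needed.
        name["R"]  = "Running"
        name["PD"] = "Pending"
        name["CG"] = "Completing"
        name["CD"] = "Completed"
        name["F"]  = "Failed"
        name["TO"] = "Timeout"
    }
    {
        # Fall back to the raw code when no mapping exists.
        state = ($1 in name) ? name[$1] : $1
        count[state]++
    }
    END {
        for (s in count) print s, count[s]
    }' | sort   # awk iterates arrays in arbitrary order, so sort for stable output
}

# Possible usage against a live queue (-h drops the header, %t is the compact state):
#   squeue -u "$USER" -h -o "%t" | count_states
```

Asking squeue for only the state column avoids depending on field positions, which could shift if a job name contained whitespace.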

@dansondergaard
Collaborator

I'll close this for now then :-)

(gwf status needs to ask for metadata on every single file in your workflow to determine if everything is up-to-date, while squeue just gives you the list of jobs and their status).
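
To illustrate why a per-file check gets expensive, here is a minimal sketch of a timestamp-based up-to-date test. It is not gwf's implementation: `should_run` is a hypothetical function, and the rule it applies (rerun if the output is missing or any input is newer than the output) is an assumption made for illustration.

```shell
# Hypothetical check: succeed (exit 0) if OUTPUT needs to be regenerated.
# Usage: should_run OUTPUT INPUT...
should_run() {
    local output=$1
    shift
    # A missing output always means the target has to run.
    [ -e "$output" ] || return 0
    local input
    for input in "$@"; do
        # -nt compares modification times: one metadata lookup per file.
        [ "$input" -nt "$output" ] && return 0
    done
    return 1
}

# Example call with placeholder file names:
#   should_run results.txt reads.fastq reference.fa && echo "shouldrun"
```

Every call asks the filesystem for metadata on each file it is given, so the total cost grows with the number of files in the workflow, whereas squeue answers from the scheduler's job list in a single request.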
