Restart from iteration 4 stopping when entering main loop for Iteration 5 #406

Open
djbradshaw2 opened this issue May 1, 2024 · 6 comments

Comments

@djbradshaw2

Dear Gubbins Creators,

Thank you for such a great tool! I am having trouble restarting a run that was stopped by Slurm before it finished. The input is the core.full.aln from Snippy results, comparing 4,149 bacterial short-read assemblies. As far as I can tell, the run was in the middle of the 5th iteration, so I restarted from the completed 4th iteration. Attached is the stdout, which stops right after it says "Entering the main loop". To assist with the restart, I moved all files related to iteration 4 from the temporary directory of the initial failed run (screenshot below) to the main directory. I did not have access to any outputs in the main directory, because the script was requeued and restarted, which apparently wiped those files. Is there something wrong with my command or setup? Can I not restart without the non-temporary directory files? Please let me know if you need any other information.

Thank you for your time and help.

Sincerely,

David Bradshaw

Gubbins Version: 3.3.0

Scripts:
snippy-clean_full_aln core.full.aln > SX519_Chromosomal_Ref_clean.core.full.aln

run_gubbins.py --first-tree-builder rapidnj --first-model JC -p gubbins -c 32 -v SX519_Chromosomal_Ref_clean.core.full.aln

run_gubbins.py --first-tree-builder rapidnj --first-model JC -p gubbins -c 32 -v SX519_Chromosomal_Ref_clean.core.full.aln --resume SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre.rooted

stderr:
Resuming Gubbins analysis at iteration 5
Initial tree builder is not used if a starting tree is provided

stdout: Full version attached, summary below
gubbins_restart_iteration_4_stderr.txt

Checking dependencies and input files...
File exists: ./SX519_Chromosomal_Ref_clean.core.full.iteration_4.internal.joint.tre

Checking input alignment file...

Filtering input alignment...

###Bunch of Lines###

Running Gubbins to detect SNPs...
gubbins /90daydata/fsepru113/dbradshaw/senftenberg/Senftenberg_020924/SX519_chromo_gubbins/tmpv0er2vw5/SX519_Chromosomal_Ref_clean.core.full.aln
...done. Run time: 4120.56 s

Entering the main loop.

*** Iteration 5 ***

Temporary Directory from Latest Failed Run:
SX519_Chromosomal_Ref_clean.core.full.aln
SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre.rooted

Temporary Directory from Initial Failed Run:
[screenshot of the directory listing; image not preserved]
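
The choice of iteration 4 above was made by inspecting the files left in the temporary directory. Below is a minimal shell sketch of that inspection, assuming the `<prefix>.iteration_<N>.tre*` file naming seen in this thread. The function name `latest_iteration` and its two arguments are hypothetical, and the existence of a tree file is only a hint that the iteration completed, not proof.

```shell
# Print the highest iteration number N for which a file named
# <prefix>.iteration_<N>.tre* exists in the given directory (0 if none).
# Assumption: the naming pattern is inferred from the file names in this thread.
latest_iteration() {
    # $1: directory to search, $2: output prefix (both hypothetical arguments)
    li_max=0
    for li_f in "$1"/"$2".iteration_*.tre*; do
        # An unmatched glob stays literal, so skip anything that does not exist
        [ -e "$li_f" ] || continue
        li_n=${li_f##*.iteration_}   # drop everything up to "iteration_"
        li_n=${li_n%%.*}             # keep only the leading number
        case $li_n in
            ''|*[!0-9]*) continue ;; # skip names without a numeric iteration
        esac
        if [ "$li_n" -gt "$li_max" ]; then
            li_max=$li_n
        fi
    done
    echo "$li_max"
}

# Example call (prefix taken from the commands in this thread):
# latest_iteration . SX519_Chromosomal_Ref_clean.core.full
```

The comparison is numeric, so iteration 10 ranks above iteration 4 even though it sorts earlier alphabetically.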

@nickjcroucher
Owner

Sorry, this is a gap in the documentation and arguably a bug. It should work if you run:

cp SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre.rooted SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre
run_gubbins.py --first-tree-builder rapidnj --first-model JC -p gubbins -c 32 -v SX519_Chromosomal_Ref_clean.core.full.aln --resume SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre

The problem is that Gubbins expects [tree file].phylip to exist, which is not the case if (a) the tree file name ends in .rooted, or (b) the file is still in the temporary directory. Apologies, I will try to fix this, or at least put in a more informative error message.
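
The two conditions above can be checked before launching a resume. This is a minimal sketch, not part of Gubbins: the function name `check_resume_tree` is hypothetical, and it assumes that "[tree file].phylip" means the tree path with `.phylip` appended. Passing the check does not guarantee that the resume will proceed.

```shell
# Pre-flight check for a tree file intended for `run_gubbins.py --resume`.
# Assumption: Gubbins looks for "<tree file>.phylip" next to the tree,
# as described in the comment above; this helper is not part of Gubbins.
check_resume_tree() {
    # $1: path to the tree file that will be passed to --resume
    crt_tree=$1
    case $crt_tree in
        *.rooted)
            # Condition (a): the name still carries the .rooted suffix
            echo "tree name ends in .rooted; copy it to ${crt_tree%.rooted} first" >&2
            return 1 ;;
    esac
    if [ ! -f "$crt_tree" ]; then
        echo "missing tree file: $crt_tree" >&2
        return 1
    fi
    if [ ! -f "$crt_tree.phylip" ]; then
        # Condition (b): the alignment may still be in the temporary directory
        echo "missing alignment: $crt_tree.phylip" >&2
        return 1
    fi
    return 0
}

# Example call (file name taken from the commands in this thread):
# check_resume_tree SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre
```

The function returns non-zero and prints the reason to stderr, so it can guard the `run_gubbins.py` call in an sbatch script with `&&`.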

@djbradshaw2
Author

Thank you for the quick response!

I reran as you suggested, but unfortunately it did the same thing and stopped in the same spot. The details are below. Please let me know if you need any other information.

If it helps, as far as I can tell the [tree file].phylip from iteration 4 was not even in the temporary directory, although, as you said, it would not exist if the tree was rooted. When the Slurm job was requeued and rerun in the middle of the night, before I could stop it, it appears to have replaced everything that was not in the temporary directory. So if any of those files were necessary for the restart, I no longer have them.

Thanks for your time and help.

Command run via Slurm/sbatch per your advice, after copying [tree file].tre.rooted to [tree file].tre:
run_gubbins.py --first-tree-builder rapidnj --first-model JC -p gubbins -c 32 -v SX519_Chromosomal_Ref_clean.core.full.aln --resume SX519_Chromosomal_Ref_clean.core.full.iteration_4.tre

Attached is the stdout file
post_github_advice_stdout.txt

Output of ls -lah (anything created on May 2nd is from the latest runs, after your advice):
[screenshot of the directory listing; image not preserved]

@djbradshaw2
Author

My other run completed, and I was able to get all the expected outputs. So I no longer need to restart from the Slurm run that was killed and restarted and whose intermediates outside the temporary directory were deleted. If you do not need this thread as a reminder of the documentation issue/bug you mentioned, please feel free to close it. Thank you very much for your time, expertise, and wonderful tool!

@nickjcroucher
Owner

Thanks, I'm glad that run worked! Sorry for the lack of response - I will keep this open to remind me this needs fixing.

@djbradshaw2
Author

No worries, can imagine you being very busy. Aye aye to keeping it open.

@IanaAmke

I'm having the same problem resuming my run, even though I have the .tre and .phylip files in my main folder.
