-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update prefetch #970
update prefetch #970
Conversation
@@ -33,6 +33,7 @@ rule run2sra: | |||
# and remove the top level folder since prefetch will assume we are done otherwise | |||
mkdir -p {params.outdir}; cd {params.outdir}; rm -r {wildcards.run} | |||
|
|||
# TODO: for loop can be removed if we dont see the debug message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't understand what the original code does actually haha
Im not sure actually why this lock is here. I guess some filelock issues with many parallel starts. But why only this rule specifically? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefetch can fail (it even says so in the docs) so apparently we gave it additional retries, in addition to the rule having retries. No idea if it is still needed, that's why I added some DEBUG messages + this note to self.
The filelock... im not so sure about.
# TODO: this is the strangest bug, in that on some machines (ocimum) prefetch downloads | ||
# to a different location. Not sure what causes this, but this should fix that. Could simply | ||
# be a global setting that we haven't discovered yet... | ||
# TODO: section can be removed if we dont see the debug message | ||
# bug report: https://github.com/ncbi/sra-tools/issues/533 | ||
if [[ -f "{params.outdir}/{wildcards.run}.sra" ]]; then | ||
echo "DEBUG: moving output to correct directory" >> {log} | ||
mkdir -p {params.outdir}/{wildcards.run} | ||
mv {params.outdir}/{wildcards.run}.sra {output} | ||
fi | ||
|
||
# TODO: section can be removed if we dont see the debug message | ||
# If an sralite file was downloaded instead of a sra file, just rename it | ||
if [[ -f "{params.outdir}/{wildcards.run}.sralite" ]]; then | ||
echo "DEBUG: renaming SRAlite" >> {log} | ||
mkdir -p {params.outdir}/{wildcards.run} | ||
mv {params.outdir}/{wildcards.run}.sralite {output} | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't be so sure that just because we don't have these issues with our test data, that we don't get them in general. The sralite one used to be specific to some accessions if I remember correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the top issue should be fixed in prefetch now. I left it in because I agree ;)
Nope, you really can't pipe
prefetch
output. But we can update the tool! Prefetch received bugfixes and can now accept and unlimited max-size. And we can clean up the yaml a bit (pigz isn't used anymore?)