READS ARE NOT AT LEAST 75 BP LONG error #17

Open
cooketho opened this issue Jan 27, 2019 · 2 comments

@cooketho

cooketho commented Jan 27, 2019

ORP fails on the test data set when I trim the reads to 75 bp, with the following error:

IT LOOKS LIKE YOUR READS ARE NOT AT LEAST 75 BP LONG,
 PLEASE EDIT YOUR COMMAND USING THE SPADES2_KMER=INT FLAGS,
 SETTING THE ASSEMBLY KMER LENGTH LESS THAN YOUR READ LENGTH 

/bin/bash: line 8: shell: command not found
make: *** [readcheck] Error 127

The culprit seems to be the following line from oyster.mk:

if [ $$(gzip -cd $${READ1} | head -n 400 | awk '{if(NR%4==2) {count++; bases += length} } END{print int(bases/count)}') -gt 75 ] && [ $$(gzip -cd $${READ2} | head -n 400 | awk '{if(NR%4==2) {count++; bases += length} } END{print int(bases/count)}') -gt 75 ];\

After some digging, it looks like rnaspades.py actually does need the k-mer size (-k) to be an odd number less than the read length. The odd-number requirement should also be added to the error message, which should read "76 bp", not "75 bp". Furthermore, the SPAdes developers evidently recommend a -k of about half the read length (ablab/spades#215). There should probably be some discussion of this in the ORP documentation.

My workaround is to change that line to:

if [ $$(gzip -cd $${READ1} | head -n 400 | awk '{if(NR%4==2) {count++; bases += length} } END{print int(bases/count)}') -gt ${SPADES2_KMER} ] && [ $$(gzip -cd $${READ2} | head -n 400 | awk '{if(NR%4==2) {count++; bases += length} } END{print int(bases/count)}') -gt ${SPADES2_KMER} ];\

so that "75" is not hard-coded in there.
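
A minimal sketch of the check the workaround is aiming for, written as standalone shell functions rather than as the actual oyster.mk recipe. The function names `mean_read_length` and `kmer_fits_reads` are hypothetical, and the odd-k test reflects the rnaspades.py requirement described above. The awk logic mirrors the quoted line, sampling the first 100 reads (400 lines) of each file.

```shell
# Mean length of the first 100 reads (400 lines) of a gzipped FASTQ file.
mean_read_length() {
    gzip -cd "$1" | head -n 400 |
        awk 'NR % 4 == 2 { count++; bases += length($0) }
             END { if (count) print int(bases / count); else print 0 }'
}

# Succeeds only if the k-mer size is odd and strictly smaller than the
# mean read length of every FASTQ file given after it.
kmer_fits_reads() {
    local kmer=$1
    shift
    [ $((kmer % 2)) -eq 1 ] || return 1
    local fq
    for fq in "$@"; do
        [ "$(mean_read_length "$fq")" -gt "$kmer" ] || return 1
    done
    return 0
}
```

With 75 bp reads, a call such as `kmer_fits_reads 55 reads_1.fq.gz reads_2.fq.gz` (file names are placeholders) would succeed, while a k of 75 (not smaller than the reads) or 54 (even) would fail.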

@macmanes
Contributor

Hi,

Thanks for your reports - I will respond to each. In this case, the error message has been updated to 76 bp, and I will add the part about odd numbers. I know the SPAdes recommendation is 50% of read length, but with this second SPAdes assembly we are trying to recover specifically higher-expression transcripts.

Of note, you could have avoided editing the code by passing the SPADES2_KMER=INT flag, if I'm understanding your issue properly.

@cooketho
Author

I could be wrong, but I don't think the issue can be fixed by just passing the SPADES2_KMER=INT flag. I tried this, but couldn't get it to work. The reason is: when the read-length check fails (which it does if your reads are 75 bp, regardless of what you set SPADES2_KMER to), the recipe prints the error message and then calls $$(shell exit);\, ending the program. The check should test whether the read length is compatible with the current SPADES2_KMER setting, not whether it is > 75.
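
The `shell: command not found` line and `Error 127` in the log at the top of this issue are consistent with how make hands that recipe line to bash: `$$` becomes a literal `$`, so bash sees `$(shell exit)` and tries to run a program named `shell`. The sketch below reproduces that outside of make. `recipe_exit_status` is a hypothetical helper, and the example assumes no program called `shell` is on the PATH.

```shell
# Run a snippet the way make hands a recipe line to bash,
# then print the exit status bash reported for it.
recipe_exit_status() {
    local status=0
    bash -c "$1" 2>/dev/null || status=$?
    echo "$status"
}

# What bash receives once make has turned "$$(shell exit)" into "$(shell exit)".
recipe_exit_status '$(shell exit)'

# A plain shell exit, for comparison.
recipe_exit_status 'exit 1'
```

Replacing `$$(shell exit)` with a plain `exit 1` in the recipe would presumably stop make with an ordinary non-zero status instead. That is a suggestion only, not something checked against oyster.mk.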
