Discarding reads based on primer mismatch on just one read #242
I am using cutadapt to trim the primers from my Illumina sequences (very handy, thank you for developing it!). The data was given to me with reads 1 and 2 interleaved. I looked for the primers and found that the reverse primer is not present in read 1, but the forward primer is present at the beginning of read 2. So what I want to do now is trim the forward primer from read 2 and discard both reads of the pair if the primer is not present in read 2. What happens instead is that everything is discarded, I think because nothing is found in read 1. Is there any way to discard read pairs based only on the presence or absence of the forward primer in read 2?
Thank you!
Hello,

I used this loop:

    for i in *_renamed.fastq
    do
        cutadapt --interleaved --discard-untrimmed -G GGACTACHVGGGTWTCTAAT --overlap 20 -e 0.1 -o trimmed2/$i $i
    done

And here is an extract of my data:

    @MISEQ08:252:000000000-AFN7U:1:1101:16276:1658 1:N:0:TCTGGAACGCT
    CTGGAGTGCCAGCAGCCGCGGTAATACGTAGGGTGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGGGCTCGTAGGCGGTCCGTCACGTCGGGAGTGAAAACTCGGGGCTTAACCCCGAGCC
    +
    6A@--CF<CF<FF@<FFFG>:F:,C@F@F9<;7E7C,,;D;FEFECFB7,+CF,CE,,,:6@ +,6,,,6CC=@7,,:++
    @MISEQ08:252:000000000-AFN7U:1:1101:16276:1658 2:N:0:TCTGGAACGCT
    CCGGACTACCGGGGTTTCTAATCCCGTTTGCTACCCTAGCTTT
    +

Best regards,
Maria Rovisco Monteiro

2017-05-03 16:51 GMT+12:00 Marcel Martin <firstname.lastname@example.org>:
> Can you please let me know which command you are currently using?
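For readers who want to check how often the primer actually occurs in read 2 before filtering, here is a minimal sketch. It is not part of cutadapt. It assumes the interleaved file uses strict 4-line FASTQ records with each R1 immediately followed by its R2, and that the primer starts within the first 5 bases of R2. The function name `count_r2_primer` and the 5-base window are hypothetical choices for illustration. The IUPAC codes in GGACTACHVGGGTWTCTAAT are expanded by hand into a regular expression (H = A/C/T, V = A/C/G, W = A/T), and only exact matches are counted, so this is stricter than the `-e 0.1` error tolerance in the command above.

```shell
# Count read pairs whose R2 sequence contains the primer near its start.
# Assumption: 8 lines per pair (4 for R1, then 4 for R2).
count_r2_primer() {
  awk -v primer='GGACTAC[ACT][ACG]GGGT[AT]TCTAAT' '
    NR % 8 == 6 {                      # 6th line of each pair is the R2 sequence
      total++
      # Exact regex match only; RSTART is the 1-based match position.
      if (match($0, primer) && RSTART <= 5) hits++
    }
    END { printf "%d of %d\n", hits, total }
  ' "$1"
}
```

Calling `count_r2_primer sample_renamed.fastq` (file name hypothetical) prints the number of pairs with the primer in R2 followed by the total number of pairs.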
Thanks, this is possibly a bug. See also the discussion at issue #243. Unfortunately, the workaround I suggest there, which is to swap the R1 and R2 input and output files, does not apply in your case since you have interleaved data. However, you can instead add the option
As an explanation: The bug is that R1 reads are counted as "untrimmed" since no adapter was removed from them, even though no adapters to be removed from R1 were provided. By default, a read pair is considered "untrimmed" if either of its two reads is untrimmed. Here, that is always the case, so all read pairs are discarded.
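The pair rule described above can be illustrated with a toy model. This is a sketch of the logic only, not cutadapt's code. The function `pair_is_discarded` and the `both` rule are hypothetical, added here to contrast with the default "any" behaviour described in this comment.

```shell
# Toy model of discarding untrimmed pairs.
# Usage: pair_is_discarded RULE R1_STATE R2_STATE
# RULE is "any" (the default described above) or "both" (hypothetical contrast).
# Each state is "trimmed" or "untrimmed". Exit status 0 means the pair is discarded.
pair_is_discarded() {
  rule=$1; r1=$2; r2=$3
  case "$rule" in
    any)  [ "$r1" = untrimmed ] || [ "$r2" = untrimmed ] ;;
    both) [ "$r1" = untrimmed ] && [ "$r2" = untrimmed ] ;;
    *)    return 2 ;;                  # unknown rule
  esac
}

# In this issue R1 is never trimmed, because no R1 adapter was given.
if pair_is_discarded any untrimmed trimmed; then
  echo "any: pair discarded although the primer was found in R2"
fi
```

Under the "any" rule the pair is discarded no matter what happens to R2, which matches the behaviour reported in this issue.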
Please do not close this issue even if the workaround helps.