Quality trimming with --nextseq-trim not applied to second read? #244

Closed
flo-compbio opened this Issue May 5, 2017 · 4 comments


flo-compbio commented May 5, 2017

I'm working with paired-end reads. In the second FASTQ file (containing the second mates of the pairs), I have the following read:

@XYZ
AAANAAAAAAAAAAAAAAAAAAAAAAAAAAGAGGGGAAAGCGGGGGTGGGGGGAAGGGGGGAAGGGAGGGGGGGGGGCGAAAA
+
/FF#FFFFFFFFFFFAFF/FAF//FF/A///////////////F/F////F///////////////////F/AAA//6/////

The quality scores of this read are low: the end of the read is almost entirely / (ASCII 47 - 33 = Phred score 14). However, when I run cutadapt with --nextseq-trim=20, the read is not quality-trimmed at all, while the first read (not shown) is trimmed properly.
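For readers who want to check the arithmetic, here is a minimal sketch of decoding a FASTQ quality string into Phred scores, assuming the standard offset of 33. The helper name phred_scores is made up for illustration and is not part of cutadapt.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores.

    Assumes Phred+33 encoding (hypothetical helper, not a cutadapt function).
    """
    return [ord(ch) - offset for ch in quality_string]


# The last eight quality characters of the read shown above.
tail = "//6/////"
print(phred_scores(tail))  # [14, 14, 21, 14, 14, 14, 14, 14]
```

Every score in that tail except the single 6 (Phred 21) is below the cutoff of 20.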

In contrast, when I run cutadapt with -q 20, the second read is trimmed properly:

@XYZ
AAANAAAAAAAAAAAAAAAAAAAAAAAAAAGAGGGGAAAGCGGGGGTGGGGGGAAGGGGGGAAGGGAGGGGGGGG
+
/FF#FFFFFFFFFFFAFF/FAF//FF/A///////////////F/F////F///////////////////F/AAA

(I know that in this case, it would probably make sense to trim more aggressively. This is just an example to demonstrate this particular behavior of cutadapt.)

If this is the desired behavior, then the documentation should be updated to reflect that. Currently, it incorrectly states:

This works like regular quality trimming (where one would use -q 20 instead), except that the qualities of G bases are ignored.
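To make the quoted sentence concrete, here is a rough sketch of 3' quality trimming with running partial sums, plus a variant in which G bases do not count as good quality. This reflects my understanding of the general approach and is not cutadapt's actual code; the function name trim_index, the ignore_g flag, and the choice to treat each G as having quality cutoff - 1 are assumptions made for illustration.

```python
def trim_index(bases, qualities, cutoff, ignore_g=False, offset=33):
    """Return the index at which to cut the 3' end of a read.

    Walks from the 3' end, summing (cutoff - quality), and cuts where the
    running sum is largest. With ignore_g=True, every G is treated as if its
    quality were cutoff - 1 (an assumption of this sketch), so a G can never
    stop the trimming, whatever its reported quality.
    """
    running = 0
    best = 0
    cut = len(qualities)
    for i in reversed(range(len(qualities))):
        q = ord(qualities[i]) - offset
        if ignore_g and bases[i] == "G":
            q = cutoff - 1
        running += cutoff - q
        if running < 0:
            break  # enough good-quality bases seen, stop scanning
        if running > best:
            best = running
            cut = i
    return cut


# The second read from the post, regular quality trimming with cutoff 20.
seq = "AAANAAAAAAAAAAAAAAAAAAAAAAAAAAGAGGGGAAAGCGGGGGTGGGGGGAAGGGGGGAAGGGAGGGGGGGGGGCGAAAA"
quals = "/FF#FFFFFFFFFFFAFF/FAF//FF/A///////////////F/F////F///////////////////F/AAA//6/////"
idx = trim_index(seq, quals, 20)
print(quals[idx:])  # //6/////

# A synthetic read with a high-quality poly-G tail (I = Phred 40).
print(trim_index("ACGTGGGG", "IIIIIIII", 20))                 # 8 (nothing trimmed)
print(trim_index("ACGTGGGG", "IIIIIIII", 20, ignore_g=True))  # 4 (poly-G removed)
```

Under these assumptions, the regular cutoff removes the same eight bases as the -q 20 output above, and the G-ignoring variant additionally removes a high-quality poly-G tail that regular trimming would keep.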


Owner

marcelm commented May 5, 2017

Can you please provide the full command-line that you are using?

marcelm closed this in 1cc8470 May 5, 2017

Owner

marcelm commented May 5, 2017

Ok, I think I found it. You are correct, Nextseq trimming was only done on the first read - I had not tested it on paired-end data. Should be fixed now. Thanks for reporting!

flo-compbio commented May 5, 2017

Great, thanks for fixing it!

kkanger commented Mar 27, 2018

Hi,
I seem to have the same problem as described above using cutadapt v.1.16 and paired-end data. Poly-G tails in reverse reads were not removed although I specified the --nextseq-trim=20 option. My full command was:
cutadapt --nextseq-trim=20 --max-n 0 --minimum-length 35 -o trimmed/sample.R1_trimmed.fastq -p trimmed/sample.R2_trimmed.fastq sample.R1.fastq sample.R2.fastq
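One way to quantify a report like this is to count how many reads in the trimmed output still end in a run of Gs. The sketch below assumes plain four-line FASTQ records; the function name count_poly_g_tails and the default minimum run length of 10 are arbitrary choices for illustration, not anything cutadapt provides.

```python
import re


def count_poly_g_tails(fastq_lines, min_run=10):
    """Count reads whose sequence ends in at least min_run consecutive Gs.

    Assumes four-line FASTQ records (header, sequence, '+', qualities).
    Returns (reads_with_poly_g_tail, total_reads).
    """
    pattern = re.compile(r"G{%d,}$" % min_run)
    hits = 0
    total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record is the sequence
            total += 1
            if pattern.search(line.strip()):
                hits += 1
    return hits, total


# Tiny in-memory example with two made-up records.
records = [
    "@r1", "ACGT" + "G" * 12, "+", "I" * 16,
    "@r2", "ACGTACGT", "+", "I" * 8,
]
print(count_poly_g_tails(records))  # (1, 2)
```

Passing an open file handle for the R2 output (for example trimmed/sample.R2_trimmed.fastq from the command above) instead of the list would give the counts for the real data.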
