Cutadapt outputting some untrimmed second reads when demultiplexing paired-end reads ("--untrimmed-paired-output" argument seems not to work) #347
See biostars post here:
I am attempting to demultiplex barcoded 100 bp paired-end Illumina short-read sequencing data with cutadapt, following these instructions: https://cutadapt.readthedocs.io/en/stable/guide.html#demultiplexing
I only want to retain read pairs where the barcodes were found and trimmed/removed in both reads of a pair.
However cutadapt is outputting untrimmed second reads despite having specified the "--untrimmed-paired-output" argument.
Full details of my analysis are as follows (using cutadapt 1.17 installed with pip; Python 3.6.5):
Before cutadapt demultiplexing (example read1 FASTQ entry in Input_1.fastq.gz):
Before cutadapt demultiplexing (example read2 FASTQ entry in Input_2.fastq.gz):
After cutadapt demultiplexing (example read1 FASTQ entry in Input_Rep1_read1.fastq):
After cutadapt demultiplexing (example read2 FASTQ entry in Input_Rep1_read2.fastq):
As you can see, the barcode was not matched in the second read but both reads nevertheless still appear in the supposedly trimmed output files (Input_Rep1_read1.fastq and Input_Rep1_read2.fastq).
I have tried specifying "--pair-filter=any" although this is the default setting. Neither specifying "any" nor "both" makes any difference to this read pair being retained despite the second read being untrimmed.
Any help would be appreciated!
I ran into the same problem.
Perhaps this is the intended behavior, as described in the documentation: https://cutadapt.readthedocs.io/en/stable/guide.html#demultiplexing
Therefore, I think currently
However, I think that make
Hi, and sorry for the delay. Yes, the current behavior is as designed, as @qifei9 pointed out.
So currently, only the adapter matches in R1 are used as the demultiplexing criterion. Additionally, whatever happens in R2 is completely independent of R1. So you can trim whichever adapters you want from R2 or none at all, it doesn’t influence the demultiplexing.
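To make the behavior described above concrete, here is a minimal toy model in Python. It is not cutadapt's implementation: it only does exact prefix matching, and the barcode sequences, bin names, and function names are all made up for illustration. It shows why a pair can land in a named output even though R2 was left untrimmed.

```python
# Toy model of R1-only demultiplexing. NOT cutadapt code: exact prefix
# matching only, with hypothetical barcodes and names.
BARCODES_R1 = {"Rep1": "ACGT", "Rep2": "TTAG"}  # hypothetical R1 barcodes
BARCODES_R2 = {"Rep1": "GGCA", "Rep2": "CATC"}  # hypothetical R2 barcodes


def match_prefix(seq, barcodes):
    """Return (name, trimmed_seq) for the first barcode that prefixes seq,
    or (None, seq) if no barcode matches."""
    for name, barcode in barcodes.items():
        if seq.startswith(barcode):
            return name, seq[len(barcode):]
    return None, seq


def demultiplex_by_r1(r1, r2):
    """Pick the output bin from the R1 match alone; trim R2 independently."""
    name, r1_trimmed = match_prefix(r1, BARCODES_R1)
    # The R2 result is ignored when choosing the bin.
    _, r2_trimmed = match_prefix(r2, BARCODES_R2)
    bin_name = name if name is not None else "untrimmed"
    return bin_name, r1_trimmed, r2_trimmed


# R1 carries the Rep1 barcode, R2 carries no known barcode:
print(demultiplex_by_r1("ACGTAAAA", "NNNNCCCC"))
```

In this sketch the pair is assigned to the Rep1 bin and R2 comes back unchanged, which mirrors the situation reported at the top of this thread.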
So what would the desired functionality actually be? I could think of adding a new option, let’s call it
I think this would solve the problem. Demultiplexing would still only look at the R1 adapter match, but then you could be sure that it has been found together with a matching adapter in R2. All the pairs where the adapter in R1 does not match the one in R2 would end up in the
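The proposed pairing rule can be sketched as a small toy function in Python. This is only an illustration under stated assumptions, not how cutadapt implements it: matching is by exact prefix, the barcode pairs and the function name are hypothetical, and a pair with no joint match simply gets `None` as its bin so the caller can decide where it goes.

```python
# Toy model of paired-adapter demultiplexing. NOT cutadapt code: exact
# prefix matching only, with hypothetical barcode pairs and names.
BARCODE_PAIRS = {
    "Rep1": ("ACGT", "GGCA"),  # (R1 barcode, R2 barcode), hypothetical
    "Rep2": ("TTAG", "CATC"),
}


def demultiplex_paired(r1, r2, pairs=BARCODE_PAIRS):
    """Assign a pair to a bin only if the R1 barcode and its partner R2
    barcode are both found; otherwise return the reads untouched."""
    for name, (bc1, bc2) in pairs.items():
        if r1.startswith(bc1) and r2.startswith(bc2):
            return name, r1[len(bc1):], r2[len(bc2):]
    return None, r1, r2


# Both barcodes of Rep1 present: the pair is assigned and both reads trimmed.
print(demultiplex_paired("ACGTAAAA", "GGCACCCC"))
# R1 says Rep1 but R2 carries the Rep2 barcode: no bin is assigned.
print(demultiplex_paired("ACGTAAAA", "CATCCCCC"))
```

The design point is that a bin is chosen only when both halves of a barcode pair agree, so a mismatched or missing R2 barcode can never leak into a named output.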
I’ve implemented the --pair-adapters option now. The --pair-filter option is orthogonal, so I haven’t touched it.
Please see the documentation here and let me know whether that is what you need: https://cutadapt.readthedocs.io/en/latest/guide.html#paired-adapters-dual-indices
Great, that sounds perfect! Exactly what I need. Thanks, Andre…