Trouble demultiplexing dual indexed reads #378

Closed
thermokarst opened this issue Apr 24, 2019 · 5 comments

@thermokarst
commented Apr 24, 2019

Thanks for the great tool, @marcelm! I am trying out the recently added dual-index demultiplexing functionality. Dual-index trimming works, but demultiplexing does not: no reads end up in per-sample output files. I am hoping this is just user error on my end, so any assistance would be greatly appreciated!

Possibly related to #347.

  • cutadapt 2.1
  • python 3.6

Minimum working example

forward.fastq.gz
@id1
AAAAACGTACGT
+
zzzzzzzzzzzz
@id2
CCCCACGTACGT
+
zzzzzzzzzzzz
@id3
AAAAACGTACGT
+
zzzzzzzzzzzz
@id4
CCCCACGTACGT
+
zzzzzzzzzzzz
@id5
CCCCACGTACGT
+
zzzzzzzzzzzz
@id6
GGGGACGTACGT
+
zzzzzzzzzzzz

reverse.fastq.gz
@id1
GGGGTGCATGCA
+
zzzzzzzzzzzz
@id2
TTTTTGCATGCA
+
zzzzzzzzzzzz
@id3
GGGGTGCATGCA
+
zzzzzzzzzzzz
@id4
TTTTTGCATGCA
+
zzzzzzzzzzzz
@id5
TTTTTGCATGCA
+
zzzzzzzzzzzz
@id6
TTTTTGCATGCA
+
zzzzzzzzzzzz
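
To make the example easier to reproduce, here is a small sketch that writes the two input files shown above. The sequences and read IDs are copied from the listings; the helper names (`write_fastq_gz`, `write_example`) are made up for this illustration and are not part of Cutadapt.

```python
import gzip
from pathlib import Path

# (read ID, sequence) records copied from the forward.fastq.gz listing
FORWARD = [
    ("id1", "AAAAACGTACGT"),
    ("id2", "CCCCACGTACGT"),
    ("id3", "AAAAACGTACGT"),
    ("id4", "CCCCACGTACGT"),
    ("id5", "CCCCACGTACGT"),
    ("id6", "GGGGACGTACGT"),
]

# (read ID, sequence) records copied from the reverse.fastq.gz listing
REVERSE = [
    ("id1", "GGGGTGCATGCA"),
    ("id2", "TTTTTGCATGCA"),
    ("id3", "GGGGTGCATGCA"),
    ("id4", "TTTTTGCATGCA"),
    ("id5", "TTTTTGCATGCA"),
    ("id6", "TTTTTGCATGCA"),
]


def write_fastq_gz(path, records):
    """Write (id, sequence) records as gzipped FASTQ with constant quality 'z'."""
    with gzip.open(path, "wt") as handle:
        for read_id, sequence in records:
            handle.write(f"@{read_id}\n{sequence}\n+\n{'z' * len(sequence)}\n")


def write_example(directory="."):
    """Create forward.fastq.gz and reverse.fastq.gz in the given directory."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    write_fastq_gz(directory / "forward.fastq.gz", FORWARD)
    write_fastq_gz(directory / "reverse.fastq.gz", REVERSE)


if __name__ == "__main__":
    write_example()
```

Running the script in an empty directory should leave both files ready for the command below.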

command run:

cutadapt \
  -g sample1=AAAA \
  -G sample1=GGGG \
  -g sample2=CCCC \
  -G sample2=TTTT \
  --pair-adapters \
  --error-rate 0 \
  -o output/{name}.1.fastq.gz \
  -p output/{name}.2.fastq.gz \
  --untrimmed-output output/untrimmed.1.fastq.gz \
  --untrimmed-paired-output output/untrimmed.2.fastq.gz \
  forward.fastq.gz \
  reverse.fastq.gz

This results in two "untrimmed" files:

output/untrimmed.1.fastq.gz
@id1
ACGTACGT
+
zzzzzzzz
@id2
ACGTACGT
+
zzzzzzzz
@id3
ACGTACGT
+
zzzzzzzz
@id4
ACGTACGT
+
zzzzzzzz
@id5
ACGTACGT
+
zzzzzzzz
@id6
GGGGACGTACGT
+
zzzzzzzzzzzz

output/untrimmed.2.fastq.gz
@id1
TGCATGCA
+
zzzzzzzz
@id2
TGCATGCA
+
zzzzzzzz
@id3
TGCATGCA
+
zzzzzzzz
@id4
TGCATGCA
+
zzzzzzzz
@id5
TGCATGCA
+
zzzzzzzz
@id6
TTTTTGCATGCA
+
zzzzzzzzzzzz

There are two surprising things here:

  • There are no demultiplexed files, even though the barcodes were clearly found and removed (id1 through id5 are 4 nt shorter in both files).
  • The trimmed reads are showing up in the untrimmed output. My guess is that this is only because they weren't demultiplexed, but I am not sure.
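
For comparison, the result I would expect can be sketched in plain Python. This is only an illustration of the intended pairing logic, not Cutadapt's implementation. It assumes exact prefix matching of each barcode, which is stricter than how Cutadapt treats unanchored `-g`/`-G` adapters, and the function names are hypothetical.

```python
# Barcode pairs from the command above: sample name -> (R1 barcode, R2 barcode)
BARCODES = {
    "sample1": ("AAAA", "GGGG"),
    "sample2": ("CCCC", "TTTT"),
}

# (read ID, forward sequence, reverse sequence) from the example input files
READ_PAIRS = [
    ("id1", "AAAAACGTACGT", "GGGGTGCATGCA"),
    ("id2", "CCCCACGTACGT", "TTTTTGCATGCA"),
    ("id3", "AAAAACGTACGT", "GGGGTGCATGCA"),
    ("id4", "CCCCACGTACGT", "TTTTTGCATGCA"),
    ("id5", "CCCCACGTACGT", "TTTTTGCATGCA"),
    ("id6", "GGGGACGTACGT", "TTTTTGCATGCA"),
]


def assign(r1, r2, barcodes=BARCODES):
    """Return (sample, trimmed_r1, trimmed_r2) for a read pair.

    A pair is assigned only if BOTH barcodes of the same sample match.
    Assumption: exact match at the very start of each read (no errors,
    no partial or internal matches).
    """
    for sample, (bc1, bc2) in barcodes.items():
        if r1.startswith(bc1) and r2.startswith(bc2):
            return sample, r1[len(bc1):], r2[len(bc2):]
    # No complete barcode pair matched: leave the reads untouched
    return "untrimmed", r1, r2


def demultiplex(read_pairs):
    """Group read pairs by assigned sample, keeping input order."""
    bins = {}
    for read_id, r1, r2 in read_pairs:
        sample, trimmed1, trimmed2 = assign(r1, r2)
        bins.setdefault(sample, []).append((read_id, trimmed1, trimmed2))
    return bins


if __name__ == "__main__":
    for sample, reads in demultiplex(READ_PAIRS).items():
        print(sample, [read_id for read_id, _, _ in reads])
```

Under these assumptions id1 and id3 land in sample1, id2, id4 and id5 in sample2, and only id6 (GGGG forward, TTTT reverse, which is not a configured pair) stays untrimmed.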

Did I misunderstand this part of the documentation:

[screenshot of the relevant documentation section]

If I run this command sans --pair-adapters the reads appear to demultiplex as expected.

Thanks for reading this far, please let me know if you need additional details from me.

@marcelm

Owner

commented Apr 25, 2019

I confirm what you’re seeing. I believe my tests weren’t sufficient. I’ll investigate.

@marcelm closed this in 57656f0 on Apr 25, 2019

@marcelm

Owner

commented Apr 25, 2019

Ok, this was a bit embarrassing: the --pair-adapters option was explicitly designed for demultiplexing (as the documentation states), but I only tested whether trimming (without demultiplexing) worked properly.

Thanks, by the way, for the really nice test! It made the problem very easy to reproduce. I have taken the liberty of integrating it as a unit test into Cutadapt.

I have fixed the problem now.

@marcelm

Owner

commented Apr 25, 2019

Cutadapt 2.3 with the fix is now on PyPI. Bioconda package coming soon.

@thermokarst

Author

commented Apr 25, 2019

Awesome, thanks @marcelm! Tested 2.3 out and everything works as expected. Thanks for the quick turnaround!
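
One way to check a result like this is to list the read IDs in each output file and compare them with the assignment implied by the barcodes. The sketch below is hypothetical and not necessarily how the release was tested. The file names follow the `-o output/{name}.1.fastq.gz` pattern from the original command, and the expected IDs are my own reading of the example data.

```python
import gzip
from pathlib import Path

# Expected read IDs per output name, derived by hand from the example barcodes
EXPECTED = {
    "sample1": ["id1", "id3"],
    "sample2": ["id2", "id4", "id5"],
    "untrimmed": ["id6"],
}


def read_ids(path):
    """Return the read IDs (header lines without '@') of a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        # FASTQ records span four lines; the header is the first line of each
        return [line.rstrip("\n")[1:] for i, line in enumerate(handle) if i % 4 == 0]


def check_outputs(output_dir="output", expected=EXPECTED):
    """Return a list of problems found; an empty list means everything matches."""
    problems = []
    for name, ids in expected.items():
        for mate in ("1", "2"):
            path = Path(output_dir) / f"{name}.{mate}.fastq.gz"
            if not path.exists():
                problems.append(f"missing: {path}")
                continue
            found = read_ids(path)
            # Compare as sorted lists so read order does not matter
            if sorted(found) != sorted(ids):
                problems.append(f"unexpected reads in {path}: {found}")
    return problems


if __name__ == "__main__":
    for problem in check_outputs():
        print(problem)
```

If the script prints nothing, every expected file exists and holds exactly the expected reads.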

@marcelm

Owner

commented Apr 25, 2019

Great to hear, glad I could help!
