All reads ending up in none bin when demultiplexing #44

Open
jverlouw opened this issue Feb 1, 2018 · 6 comments

jverlouw commented Feb 1, 2018

Hello,

We're currently testing nanopore runs from our GridION. However, the live basecalling software, Guppy, does not seem to support demultiplexing at the moment. As such, I wanted to give Porechop a go on this data to compare it with Albacore basecalling and demultiplexing.
However, I'm running into the problem that Porechop recognizes the correct barcodes at good identity levels but keeps binning all reads into the none bin, no matter how lenient I set the barcode recognition. Do you have any idea what might be causing this behaviour? It occurs on both the Guppy-called fastq files and the Albacore fastq files.

Thanks in advance!
Joost

Example commands:
porechop -i total.fastq -b Porechop_Test/
porechop -i total.fastq -b Porechop_Test/ --barcode_threshold 60 --barcode_diff 1 --threads 20
Log:

Loading reads
total.fastq
1,255,086 reads loaded


Looking for known adapter sets
10,000 / 10,000 (100.0%)
                                        Best
                                        read       Best
                                        start      read end
  Set                                   %ID        %ID
  SQK-NSK007                                82.8       78.3
  Rapid                                     70.4        0.0
  SQK-MAP006                                80.0       78.3
  SQK-MAP006 Short                          84.0       79.2
  PCR adapters 1                            78.3       77.3
  PCR tail 1                                75.0       75.0
  PCR tail 2                                73.3       75.9
  1D^2 part 1                               71.0       71.4
  1D^2 part 2                               79.4       78.1
  Barcode 1 (reverse)                       76.0      100.0
  Barcode 2 (reverse)                       76.9      100.0
  Barcode 3 (reverse)                       76.0       79.2
  Barcode 4 (reverse)                       72.0       73.3
  Barcode 5 (reverse)                       76.9       75.0
  Barcode 6 (reverse)                       76.0       76.9
  Barcode 7 (reverse)                       76.0       75.0
  Barcode 8 (reverse)                       77.8       79.2
  Barcode 9 (reverse)                       76.0       73.1
  Barcode 10 (reverse)                      75.0       80.0
  Barcode 11 (reverse)                      79.2       75.0
  Barcode 12 (reverse)                      79.2       79.2
  Barcode 1 (forward)                      100.0      100.0
  Barcode 2 (forward)                      100.0      100.0
  Barcode 3 (forward)                       79.2       79.2
  Barcode 4 (forward)                       83.3       76.0
  Barcode 5 (forward)                       76.0       76.0
  Barcode 6 (forward)                       77.8       76.9
  Barcode 7 (forward)                       76.0       75.0
  Barcode 8 (forward)                       75.0       72.0
  Barcode 9 (forward)                       73.1       76.0
  Barcode 10 (forward)                      75.0       74.1
  Barcode 11 (forward)                      77.8       75.0
  Barcode 12 (forward)                      79.2       79.2
  Barcode 13 (forward)                      76.0       76.0
  Barcode 14 (forward)                      75.0       74.1
  Barcode 15 (forward)                      76.0       76.9
  Barcode 16 (forward)                      75.0       76.0
  Barcode 17 (forward)                      79.2       76.0
  Barcode 18 (forward)                      72.0       72.0
  Barcode 19 (forward)                      79.2       76.0
  Barcode 20 (forward)                      76.9       76.9
  Barcode 21 (forward)                      76.0       76.9
  Barcode 22 (forward)                      76.0       76.9
  Barcode 23 (forward)                      76.0       76.9
  Barcode 24 (forward)                      75.0       79.2
  Barcode 25 (forward)                      76.9       76.0
  Barcode 26 (forward)                      79.2       76.9
  Barcode 27 (forward)                      77.8       76.0
  Barcode 28 (forward)                      75.0       74.1
  Barcode 29 (forward)                      76.9       76.0
  Barcode 30 (forward)                      76.9       76.9
  Barcode 31 (forward)                      80.0       79.2
  Barcode 32 (forward)                      75.0       80.0
  Barcode 33 (forward)                      76.9       80.0
  Barcode 34 (forward)                      75.0       74.1
  Barcode 35 (forward)                      73.1       75.0
  Barcode 36 (forward)                      76.9       77.8
  Barcode 37 (forward)                      79.2       76.9
  Barcode 38 (forward)                      75.0       79.2
  Barcode 39 (forward)                      76.0       76.9
  Barcode 40 (forward)                      76.9       73.1
  Barcode 41 (forward)                      75.0       80.0
  Barcode 42 (forward)                      76.9       76.9
  Barcode 43 (forward)                      76.9       76.0
  Barcode 44 (forward)                      74.1       75.0
  Barcode 45 (forward)                      76.0       76.0
  Barcode 46 (forward)                      76.0       76.9
  Barcode 47 (forward)                      76.9       76.9
  Barcode 48 (forward)                      76.0       76.9
  Barcode 49 (forward)                      76.0       76.9
  Barcode 50 (forward)                      75.0       76.0
  Barcode 51 (forward)                      80.0       76.0
  Barcode 52 (forward)                      76.9       79.2
  Barcode 53 (forward)                      76.0       75.0
  Barcode 54 (forward)                      76.9       76.9
  Barcode 55 (forward)                      73.1       76.0
  Barcode 56 (forward)                      76.0       76.0
  Barcode 57 (forward)                      76.9       77.8
  Barcode 58 (forward)                      80.8       79.2
  Barcode 59 (forward)                      76.0       76.0
  Barcode 60 (forward)                      76.0       76.9
  Barcode 61 (forward)                      73.1       79.2
  Barcode 62 (forward)                      75.0       75.0
  Barcode 63 (forward)                      79.2       76.0
  Barcode 64 (forward)                      76.9       79.2
  Barcode 65 (forward)                      76.9       80.8
  Barcode 66 (forward)                      79.2       76.9
  Barcode 67 (forward)                      79.2       76.9
  Barcode 68 (forward)                      79.2       76.0
  Barcode 69 (forward)                      74.1       75.0
  Barcode 70 (forward)                      80.0       80.0
  Barcode 71 (forward)                      76.0       76.0
  Barcode 72 (forward)                      79.2       80.0
  Barcode 73 (forward)                      76.0       76.9
  Barcode 74 (forward)                      75.0       76.9
  Barcode 75 (forward)                      77.8       73.1
  Barcode 76 (forward)                      76.0       76.9
  Barcode 77 (forward)                      76.0       76.0
  Barcode 78 (forward)                      76.0       75.0
  Barcode 79 (forward)                      76.0       76.9
  Barcode 80 (forward)                      80.0       76.0
  Barcode 81 (forward)                      76.9       76.9
  Barcode 82 (forward)                      79.2       76.0
  Barcode 83 (forward)                      80.0       79.2
  Barcode 84 (forward)                      76.9       76.9
  Barcode 85 (forward)                      76.0       76.0
  Barcode 86 (forward)                      72.0       75.0
  Barcode 87 (forward)                      76.9       76.9
  Barcode 88 (forward)                      77.8       79.2
  Barcode 89 (forward)                      75.9       75.0
  Barcode 90 (forward)                      74.1       73.1
  Barcode 91 (forward)                      76.0       76.9
  Barcode 92 (forward)                      75.0       76.9
  Barcode 93 (forward)                      79.2       83.3
  Barcode 94 (forward)                      74.1       76.0
  Barcode 95 (forward)                      76.0       76.9
  Barcode 96 (forward)                      75.0       79.2


Trimming adapters from read ends
  BC01_rev: CACAAAGACACCGACAACTTTCTT
      BC01: AAGAAAGTTGTCGGTGTCTTTGTG
  BC02_rev: ACAGACGACTACAAACGGAATCGA
      BC02: TCGATTCCGTTTGTAGTCGTCTGT
      BC01: AAGAAAGTTGTCGGTGTCTTTGTG
  BC01_rev: CACAAAGACACCGACAACTTTCTT
      BC02: TCGATTCCGTTTGTAGTCGTCTGT
  BC02_rev: ACAGACGACTACAAACGGAATCGA

1,255,086 / 1,255,086 (100.0%)

1,084,784 / 1,255,086 reads had adapters trimmed from their start (89,476,779 bp removed)
  954,999 / 1,255,086 reads had adapters trimmed from their end (32,552,203 bp removed)


Discarding reads containing middle adapters
1,255,086 / 1,255,086 (100.0%)

15,082 / 1,255,086 reads were discarded based on middle adapters


Saving untrimmed reads to barcode-specific files

  Barcode  Reads      Bases          File
  none     1,240,004  1,878,862,050  Porechop_Test/none.fastq

rrwick commented Feb 2, 2018

I think you've uncovered an interesting bug here!

The problem stems from the fact that Porechop tries to determine whether you are using forward or reverse barcodes (different kits add the barcodes onto different ends of the reads). After Porechop determines the barcode orientation, it ignores barcodes in the other direction when demultiplexing. In your case, there is a tie between the two orientations: barcodes 1 and 2 both scored 100% in the forward and the reverse direction. This made Porechop fail to recognise either.
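
A minimal sketch of how a tie like this can leave every read unassigned. This is not Porechop's actual code: the function names and the strict-comparison rule are assumptions made for illustration, and only the %ID values come from the log above.

```python
from typing import Optional


def pick_orientation_strict(best_forward: float, best_reverse: float) -> Optional[str]:
    """Hypothetical strict rule: one orientation must score strictly higher."""
    if best_forward > best_reverse:
        return "forward"
    if best_reverse > best_forward:
        return "reverse"
    return None  # tie: neither orientation is chosen


def pick_orientation_tie_forward(best_forward: float, best_reverse: float) -> str:
    """Hypothetical tie-tolerant rule: fall back to forward when scores are equal."""
    return "forward" if best_forward >= best_reverse else "reverse"


# Best %ID (read start, read end) for barcodes 1 and 2, taken from the log.
forward_best = max(100.0, 100.0)
reverse_best = max(76.0, 100.0)

print(pick_orientation_strict(forward_best, reverse_best))       # None
print(pick_orientation_tie_forward(forward_best, reverse_best))  # forward
```

With the strict rule, no orientation is selected, so no barcode can be matched and everything would land in the none bin. Falling back to forward on a tie is one possible remedy, shown here only because forward is assumed to be the correct orientation for this read set.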

Also, I'm confused about your results. Specifically, I'm wondering why you ended up with both forward and reverse barcodes on read ends. I'm assuming the correct orientation is forward barcodes, but that's weird too. I thought only the rapid and PCR kits used the forward barcodes, but you're not hitting either rapid or PCR adapters. In fact, you didn't hit any other adapters, besides the barcodes! What kit did you use?

Assuming that 'forward' is the correct orientation for your read set, I've pushed a change up to Porechop's development branch that will hopefully fix it for you. Install from the development branch like this:

git clone https://github.com/rrwick/Porechop.git
cd Porechop
python3 setup.py install

Check you've got the right one with porechop --version. You should see 0.2.4-beta. Then see if Porechop bins your reads correctly. Let me know how you go!

Ryan


jverlouw commented Feb 5, 2018

Hi Ryan,

The change seems to have solved the issue: the vast majority of the reads are now assigned to barcodes. That is slightly fewer than with Albacore demultiplexing, but close enough. The similarity between barcodes did cross my mind, but I didn't think it would be that much of an issue for the algorithm.

  Barcode  Reads    Bases
  BC01     498,943  764,406,691
  BC02     724,452  1,097,253,177
  none     18,490   22,584,836
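
As a quick check of "vast majority", the assigned fraction can be worked out from the read counts in the table above. This is a small sketch; the dictionary and variable names are only illustrative.

```python
# Read counts per bin, copied from the table above.
reads = {"BC01": 498_943, "BC02": 724_452, "none": 18_490}

total = sum(reads.values())
assigned = total - reads["none"]
assigned_fraction = assigned / total

print(f"{assigned:,} of {total:,} reads assigned ({assigned_fraction:.1%})")
# 1,223,395 of 1,241,885 reads assigned (98.5%)
```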

The lack of recognizable adapters could possibly be caused by the kit used, which was not included in the Albacore database either. It was the SQK-RAB204 rapid 16S amplicon kit, which I think is fairly new.

Cheers,
Joost


rrwick commented Feb 5, 2018

Okay, thanks.

If there aren't data privacy issues, is there any chance you could share some of those reads? It might help me to get the new kit adapters into Porechop. I could share a Dropbox directory with you. But no worries if this isn't possible.

Ryan


jverlouw commented Feb 6, 2018

Unfortunately, as usual, there are some privacy issues with these samples. I've made sure that one of our controls will be included in the next run, sometime this week. I can freely share the data from that sample, so I'll update you once it's been sequenced.

Joost

@timacwalker

Do you still need some reads to help with this?


gwaldbieser commented Jun 7, 2018

I'm having the same problem with a multiplexed barcoded run on the GridION. Porechop recognizes the adapters and barcodes but doesn't demultiplex them. When I followed the instructions from Feb 1, I only get "0.2.3".
