Question - Split on adapter #29

jagos01 · 2022-12-19T04:23:59Z

Hello,
I am duplex basecalling with dorado. Can split_on_adapter accept unmapped bam files for input/output?
Thanks

onordesjo · 2022-12-19T04:39:29Z

Hi,

Thanks for the question. It's not yet possible, though I suspect it would be useful. We hope to release an improved version of template/complement splitting today, which should work better than adapter splitting for duplex.

jagos01 · 2022-12-19T05:03:59Z

Thanks for your quick reply. I will try it out when it is released.

onordesjo · 2022-12-19T13:32:19Z

Hi @jagos01, v0.2.20 is now out, and you can use this to recover reads which are non-split.

To try it out, first do simplex calling with move tables (the fast model is ok):

$ dorado basecaller dna_r10.4.1_e8.2_400bps_fast@v4.0.0 pod5s/ --emit-moves > unmapped_reads_with_moves.sam

Then run split_pairs like this:

$ duplex_tools split_pairs unmapped_reads_with_moves.sam pod5s/ pod5s_splitduplex/

This should give you new pod5 files in the pod5s_splitduplex directory (with new read IDs), together with the pair_ids that correspond to the new read IDs.

Feel free to try it out and let me know how things are working.

jagos01 · 2022-12-19T22:00:31Z

Hello @onordesjo, I followed the directions outlined in the readme for duplex calling with dorado. I generated the pair_id files for both step 2a and 2b. They contained 4667 and 7867 pairs respectively. When stereo basecalling those reads, dorado only basecalled 4114 and 1338 reads. Why is the number of stereo basecalled reads less than the number of read pairs?
Thanks

onordesjo · 2022-12-19T22:08:33Z

Hi @jagos01. Can I ask what type of data you have been looking at? Whole genome? Any amplification? Dorado applies some filtering to ensure that bad pairs don't get through, so some loss is to be expected. I would expect fewer pairs to be generated in step 2b than in 2a, but with greater retention of good pairs. 2a would also necessarily have to be generated without a subset (or alternatively a selection of channels).

Any of this information would help to explain what you are seeing.
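For reference, the retention implied by the counts reported above can be worked out with a quick shell snippet. This is only a sketch: `retention` is a hypothetical helper name, not part of dorado or duplex_tools, and the numbers are simply the pair and read counts quoted in this thread.

```shell
# Hypothetical helper (not part of dorado or duplex_tools):
# prints the percentage of pairs that ended up stereo basecalled.
retention() {
    LC_ALL=C awk -v called="$1" -v pairs="$2" \
        'BEGIN { printf "%.1f\n", 100 * called / pairs }'
}

# Counts as reported in this thread (stereo reads, then pairs).
echo "step 2a: $(retention 4114 4667)% of pairs basecalled"
echo "step 2b: $(retention 1338 7867)% of pairs basecalled"
```

On these numbers, roughly 88% of the 2a pairs and 17% of the 2b pairs produced a stereo read.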

jagos01 · 2022-12-19T22:49:24Z

Hello @onordesjo. This is bacterial whole genome sequence data. No amplification was carried out. The data is split over two runs (had to restart the sequencer a couple hours into the run). I was also expecting less pairs from 2b. 2a was generated from the complete data set.

ollenordesjo · 2022-12-19T22:54:28Z

Ah, then my next question is whether they were basecalled at the same time (did the SAM contain reads from both of the runs?)

jagos01 · 2022-12-19T23:47:28Z

I inspected the pod5 reads for each run and the unmapped BAM file contains reads from both runs.

ollenordesjo · 2022-12-20T07:15:02Z

Thanks, that helps. Would be keen to take a look at the bam. If you're happy to share it, feel free to email me at olle.nordesjo at nanoporetech.com and I can take a closer look at it.

jagos01 · 2022-12-20T18:33:25Z

Thanks, I have emailed a link to the bam file.
