Feature request: ONT Chimeric read splitting #747

rhpvorderman · 2023-12-12T07:49:57Z

I am currently looking into using cutadapt for ONT data, and the most basic functionality seems achievable. Unfortunately, even after the adapters have been adequately cut, sequali still finds adapter sequences.

These are most likely due to chimeric reads, in which two or more reads are joined by adapter sequences. Such reads should be split. With the newest chemistry the proportion of chimeric reads is estimated at 10% (previously around 2%). Chimeric reads are not always split by the sequence provider, and historic data may still contain the roughly 2% of chimeric reads because splitting was not available back then.

Since cutadapt already has a decent alignment algorithm that can detect sequences anywhere in the read, it should be possible to write a routine that detects chimeric reads.
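A minimal sketch of what such a detection routine could look like, written in plain Python rather than against cutadapt's internals. It assumes a mismatch-only sliding window as a stand-in for cutadapt's alignment (which also handles indels). The names `find_adapter_hits`, `is_chimeric` and `min_flank`, as well as the adapter sequence in the usage note, are hypothetical.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_adapter_hits(
    read: str, adapter: str, max_mismatches: int = 1
) -> List[Tuple[int, int]]:
    """Return (start, end) spans where the adapter occurs in the read.

    Assumption: substitutions only. A real aligner would also allow indels.
    """
    k = len(adapter)
    hits = []
    i = 0
    while i + k <= len(read):
        if hamming(read[i:i + k], adapter) <= max_mismatches:
            hits.append((i, i + k))
            i += k  # skip past this hit so spans never overlap
        else:
            i += 1
    return hits


def is_chimeric(
    read: str, adapter: str, min_flank: int = 20, max_mismatches: int = 1
) -> bool:
    """Flag a read when an adapter hit has read sequence on both sides."""
    return any(
        start >= min_flank and len(read) - end >= min_flank
        for start, end in find_adapter_hits(read, adapter, max_mismatches)
    )
```

For example, with a made-up adapter `ACGTTGCA`, a read built as `"A" * 30 + "ACGTTGCA" + "C" * 30` is flagged, while a read that only starts with the adapter is not. The `min_flank` threshold is what separates an internal (chimeric) hit from an ordinary terminal adapter.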

The hard part, I guess, will be the actual splitting, where one read becomes two or more reads that are fed back into the pipeline. I can imagine that this wasn't a consideration when cutadapt was designed.
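To make the one-to-many aspect concrete, here is a rough sketch of splitting a single record into several, assuming the adapter spans have already been located. It is not based on cutadapt's pipeline code. The `Record` tuple, the `split_record` function and the `_1`, `_2` name suffixes are all hypothetical choices for illustration.

```python
from typing import List, Tuple

# (name, sequence, qualities); a simplified stand-in for a FASTQ record
Record = Tuple[str, str, str]


def split_record(
    record: Record, adapter_spans: List[Tuple[int, int]], min_length: int = 1
) -> List[Record]:
    """Cut the adapter spans out of a record and return the remaining pieces."""
    name, seq, qual = record
    if not adapter_spans:
        return [record]  # nothing to split, pass the read through unchanged

    pieces = []
    prev_end = 0
    # A zero-length sentinel span at the read end flushes the final piece.
    for start, end in sorted(adapter_spans) + [(len(seq), len(seq))]:
        if start - prev_end >= min_length:
            pieces.append((prev_end, start))
        prev_end = max(prev_end, end)

    # Sequence and qualities are sliced together so they stay in sync.
    return [
        (f"{name}_{i}", seq[a:b], qual[a:b])
        for i, (a, b) in enumerate(pieces, start=1)
    ]
```

A function like this returns a list instead of a single read, which is exactly the shape that a one-read-in, one-read-out pipeline step does not expect.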

rhpvorderman · 2023-12-19T08:27:55Z

I did some thinking and research. The best way to approach this is as follows:

1. Publish the user guide with the current cutadapt code. Chimeric reads are detected with adapter detection and thrown away using --discard.
2. Make a dedicated read splitter. Rather than splitting the read, the longest segment is presented as canonical.
3. Look into how read splitting can be incorporated in the cutadapt single-end pipeline.

Step 3 is quite challenging, but by following the steps in order, cutadapt will already be useful for nanopore data with chimeric reads at step 1, without requiring extra code.
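The "longest segment is canonical" idea from the second step could be sketched as follows. This is only an illustration under the assumption that adapter spans are already known. The function name `longest_segment` is hypothetical and nothing here reflects an existing cutadapt API.

```python
from typing import List, Tuple


def longest_segment(
    seq: str, qual: str, adapter_spans: List[Tuple[int, int]]
) -> Tuple[str, str]:
    """Return the longest adapter-free stretch of a read with its qualities."""
    best = (0, 0)
    prev_end = 0
    # A zero-length sentinel span at the read end lets the last piece compete.
    for start, end in sorted(adapter_spans) + [(len(seq), len(seq))]:
        if start - prev_end > best[1] - best[0]:
            best = (prev_end, start)
        prev_end = max(prev_end, end)
    a, b = best
    return seq[a:b], qual[a:b]
```

Because one read goes in and one read comes out, this variant keeps the one-to-one shape of a single-end pipeline, at the cost of discarding the shorter segments.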

rhpvorderman mentioned this issue Dec 12, 2023

Write a user guide specifically for ONT data #742

Open

marcelm mentioned this issue Apr 25, 2024

Nanopore adapter detection #782

Closed
