Seq error correction after dedup #23

peterch405 · 2016-04-22T09:29:16Z

I was wondering: how do you merge duplicate reads, or do you simply discard them? I would be interested in looking at the full sequences and possibly deriving a consensus to correct for PCR/sequencing errors.

TomSmithCGAT · 2016-04-22T09:39:50Z

Hi Perch, we just output a single "best" read. This read should be the consensus since it will have the highest counts.

If you really want to retain all the duplicates but have them marked in some way so you can manually derive the consensus, I guess we could add this as an option?

peterch405 · 2016-04-22T10:44:30Z

That sounds good.

IanSudbery · 2016-05-03T13:18:03Z

Hi guys, I'm a bit worried about what this might do to memory usage. We keep a buffer of reads to ensure that we get all the reads from a region before outputting (because we use the start of the read in read orientation rather than genome orientation, i.e. for a read on the reverse strand we care about the 3'-most coordinate, not the BAM pos field). At the moment we only retain the representative read. If we keep all reads, memory usage could be hundreds of times higher. We could still add this as an option, but with a warning that memory usage might shoot through the roof.
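To illustrate the orientation point, here is a minimal sketch (not the actual UMI-tools code) of how a read-orientation start coordinate could be computed from the BAM pos field and the CIGAR string. The function names `reference_end` and `read_start` are hypothetical, and the sketch ignores soft-clipping adjustments that a real implementation would likely need.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 0-based exclusive reference end of an alignment.

    `pos` is the 0-based leftmost aligned position; `cigar` is a CIGAR string.
    """
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    ref_len = sum(int(n) for n, op in ops if op in REF_CONSUMING)
    return pos + ref_len


def read_start(pos, cigar, is_reverse):
    """Return the coordinate where the read starts in read orientation.

    Forward-strand reads start at `pos`; reverse-strand reads start at the
    rightmost aligned reference base (hypothetical simplification).
    """
    if is_reverse:
        return reference_end(pos, cigar) - 1
    return pos


# Two reverse-strand reads with different pos values but the same start:
# both would belong to the same position group, which is why reads cannot
# simply be processed in BAM pos order without a buffer.
a = read_start(100, "50M", is_reverse=True)
b = read_start(95, "10M5D40M", is_reverse=True)
print(a, b)
```

Because reverse-strand reads sharing a start can appear at different pos values in a coordinate-sorted file, every read kept in the buffer adds to memory until the region has been fully read.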

TomSmithCGAT · 2017-01-19T09:42:50Z

The group command can now be used to group reads by their UMI for downstream processing, such as deriving a consensus to correct for PCR/sequencing errors.
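As a rough sketch of what such downstream processing might look like, the following derives a per-group consensus by majority vote. It assumes the group identifiers and read sequences have already been extracted from the grouped output into `(group_id, sequence)` pairs, and that all sequences within a group are the same length and already in register. The names `consensus` and `consensus_by_group` are hypothetical and are not part of UMI-tools.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Majority-vote consensus of equal-length sequences.

    Columns with a tie for the most common base are reported as 'N'.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("expected one or more sequences of equal length")
    bases = []
    for column in zip(*seqs):
        counts = Counter(column).most_common()
        tied = len(counts) > 1 and counts[0][1] == counts[1][1]
        bases.append("N" if tied else counts[0][0])
    return "".join(bases)


def consensus_by_group(records):
    """Map each group id to the consensus of its sequences.

    `records` is an iterable of (group_id, sequence) pairs (assumed input).
    """
    groups = defaultdict(list)
    for group_id, seq in records:
        groups[group_id].append(seq)
    return {group_id: consensus(seqs) for group_id, seqs in groups.items()}


records = [
    ("g1", "ACGT"),
    ("g1", "ACGT"),
    ("g1", "ACTT"),  # one read carrying a likely PCR/sequencing error
    ("g2", "TTTT"),
]
print(consensus_by_group(records))
```

A simple majority vote ignores base qualities and indels; a quality-weighted vote would be a natural refinement for real data.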

IanSudbery added the enhancement label May 3, 2016

peterch405 mentioned this issue Nov 10, 2016

Output read groups rather than deduplicated BAM #54

Closed

TomSmithCGAT closed this as completed Jan 19, 2017

avivdemorgan mentioned this issue Dec 19, 2021

Problems with dedup output and RSEM #465

Closed
