learnErrors #1730

Pablo-UY · 2023-04-21T02:00:45Z

Hi everyone.
I'm exploring a GBS metagenomics dataset, where the sequences are SE from Illumina, and I'm getting this error profile plot.
Reads have a wide range distribution from 40bp to 90bp, but mostly 90 bp.
I only trimmed by quality score and minimum length, and then ran:
errF <- learnErrors(filtered_R, multithread = TRUE)
Could this be the main cause of these results, or is it due to the nature of the sequences?

Thanks.
Pablo

benjjneb · 2023-04-21T16:55:30Z

What sequencing instrument and read size are you using? (e.g. HiSeq 1x100)

> Reads have a wide range distribution from 40bp to 90bp, but mostly 90 bp.

Do the raw reads vary in size, or is this after some sort of trimming step?
What does plotQualityProfile look like for an example sample?

Pablo-UY · 2023-04-21T17:04:27Z

It was sequenced on a NovaSeq 1x91.
Yes. Raw reads vary in size (15-91) and I trimmed to keep only sequences longer than 40 bp.
plotQualityProfile looks like this:

benjjneb · 2023-04-21T18:34:31Z

Part of what you are seeing is probably the poorer fit of the error model to binned quality score data such as NovaSeq produces; see #1307 for more.

> Yes. Raw reads vary in size (15-91)

This is still concerning to me though. Typically Illumina raw reads are all of uniform length. If there is a step that is introducing random length variation (e.g. read-by-read truncation at a specific quality score threshold) then this will reduce the sensitivity of dada2 and hurt its ability to accurately infer the error model.
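
A minimal sketch of the fixed-length alternative implied above, not taken from the thread: truncate every read to one common length instead of cutting each read at a quality threshold, then learn the error model from the uniform-length reads. The helper name `learn_errors_fixed_length`, the arguments `raw_files` and `filt_dir`, and the values `trunc_len = 90`, `truncQ = 0` and `maxEE = 2` are illustrative assumptions, not recommendations from the participants. Only `filterAndTrim` and `learnErrors` are actual dada2 functions.

```r
# Hypothetical helper (not from the thread): filter single-end reads to one
# fixed length, then learn the dada2 error model from the filtered files.
learn_errors_fixed_length <- function(raw_files, filt_dir, trunc_len = 90) {
  # Write filtered reads to filt_dir under the same file names.
  filt_files <- file.path(filt_dir, basename(raw_files))

  read_counts <- dada2::filterAndTrim(
    raw_files, filt_files,
    truncLen = trunc_len,  # cut reads to trunc_len; shorter reads are discarded
    truncQ = 0,            # assumption: avoid per-read truncation at a quality score
    maxEE = 2,             # assumption: expected-error cutoff, tune for your data
    multithread = TRUE
  )

  # Samples with no surviving reads produce no output file, so keep only
  # the filtered files that were actually written.
  filt_files <- filt_files[file.exists(filt_files)]

  err <- dada2::learnErrors(filt_files, multithread = TRUE)
  list(read_counts = read_counts, err = err)
}

# Example use, assuming raw_files holds the paths to the raw FASTQ files:
# res <- learn_errors_fixed_length(raw_files, "filtered")
# dada2::plotErrors(res$err, nominalQ = TRUE)
```

The trade-off is that every read shorter than `trunc_len` is dropped, so this only makes sense if full-length reads are the large majority. It also does not address the binned quality scores discussed in #1307.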

hhollandmoritz mentioned this issue May 10, 2023

Binned quality scores and their effect on (non-decreasing) trans rates #1307

Open

benjjneb closed this as completed May 27, 2024
