learnErrors #1730
Comments
What sequencing instrument and read size are you using? (e.g. HiSeq 1x100)
Do the raw reads vary in length, or is this after some sort of trimming step?
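One quick way to answer the length question is to tabulate read lengths straight from the FASTQ file. The following is a minimal base-R sketch, not part of dada2: the function name `read_length_table` is hypothetical, and it assumes the standard four-lines-per-record FASTQ layout with no wrapped sequence lines.

```r
# Hypothetical helper (not a dada2 function): tabulate read lengths in a FASTQ file.
# Assumes 4 lines per record, so sequences sit on lines 2, 6, 10, ...
read_length_table <- function(fastq_path) {
  # gzfile() reads gzip-compressed files and also plain uncompressed text files.
  con <- gzfile(fastq_path, "rt")
  on.exit(close(con))
  lines <- readLines(con)
  if (length(lines) < 4) {
    # No complete record: return an empty table.
    return(table(integer(0)))
  }
  seqs <- lines[seq(2, length(lines), by = 4)]
  # Counts of reads at each observed length.
  table(nchar(seqs))
}
```

Running it on a raw file and on the corresponding filtered file shows whether the length variation is already present off the instrument or is introduced by trimming.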
Part of what you are seeing is probably the poorer fit of the error model to binned quality score data like NovaSeq produces; see more here: #1307
This is still concerning to me, though. Raw Illumina reads are typically all of uniform length. If some step is introducing read-to-read length variation (e.g. truncating each read at a specific quality score threshold), this will reduce the sensitivity of dada2 and hurt its ability to accurately infer the error model.
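As an illustration of the alternative to read-by-read quality truncation, here is a hedged sketch that truncates every read to one fixed length with `filterAndTrim` before calling `learnErrors`. The wrapper name `fixed_length_errors`, the file paths, and the values `trunc_len = 90` and `max_ee = 2` are assumptions for illustration only (90 is taken from the "mostly 90 bp" description in the question); note that `truncLen` discards reads shorter than the chosen length.

```r
# Hypothetical wrapper around dada2 calls; parameter values are illustrative assumptions.
fixed_length_errors <- function(raw_fastq, filt_fastq,
                                trunc_len = 90, max_ee = 2,
                                multithread = TRUE) {
  # Truncate all reads to the same length and filter on expected errors,
  # rather than cutting each read at its own quality threshold.
  # Reads shorter than trunc_len are discarded by filterAndTrim.
  dada2::filterAndTrim(raw_fastq, filt_fastq,
                       truncLen = trunc_len, maxEE = max_ee,
                       multithread = multithread)
  # Learn the error model from the uniform-length filtered reads.
  dada2::learnErrors(filt_fastq, multithread = multithread)
}

# Usage (paths are placeholders):
# err <- fixed_length_errors("raw/sample.fastq.gz", "filtered/sample.fastq.gz")
# dada2::plotErrors(err, nominalQ = TRUE)
```

Whether a fixed truncation length suits this dataset depends on how many reads fall below it, so it is worth checking the read counts that `filterAndTrim` reports before and after filtering.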
Hi everyone.
I'm exploring a GBS metagenomics dataset where the sequences are single-end (SE) Illumina reads, and I'm getting this error profile plot.
Read lengths range widely, from 40 bp to 90 bp, but most reads are 90 bp.
I only trimmed by quality (Q) and minimum length, and then ran:
errF = learnErrors(filtered_R, multithread = T)
Could this be the main cause of these results, or is it due to the nature of the sequences?
Thanks.
Pablo