Simple Error Model #5

HadrienG · 2016-11-21T15:32:11Z

Issue to track the progress on the Roadmap item "Add a simple error model"

I guess that the simplest would be to:

Add 2 parameters: mean_quality and std_dev*
Generate random quality scores following a normal distribution
Eventually modify the nucleotides whose quality isn't perfect (it never is)
Write docstrings and tests

A few unknowns:

Should the length of the sequences vary?
Which base should we switch to if we get an erroneous call? A random other nucleotide? I would guess a random nucleotide is good enough for a simple error model.

* I haven't added a standard deviation parameter. It is hardcoded to 0.01, but this can be discussed.
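
The steps listed above could be sketched roughly as follows. This is a minimal illustration and not the project's actual implementation: the function names, the use of the Phred scale for quality scores, and the clamping range of 1 to 41 are all assumptions made for the example. Only `mean_quality` and `std_dev` come from the list above.

```python
import random

NUCLEOTIDES = "ACGT"


def phred_to_error_prob(quality):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def simulate_qualities(length, mean_quality, std_dev, rng):
    """Draw one integer quality score per base from a normal distribution."""
    scores = []
    for _ in range(length):
        q = round(rng.gauss(mean_quality, std_dev))
        # Assumption: clamp to a plausible Phred range so the error
        # probability stays strictly between 0 and 1.
        scores.append(min(max(q, 1), 41))
    return scores


def introduce_errors(sequence, qualities, rng):
    """Replace each base by a random *other* base with the probability
    implied by its quality score (substitutions only, no indels)."""
    mutated = []
    for base, q in zip(sequence, qualities):
        if rng.random() < phred_to_error_prob(q):
            base = rng.choice([n for n in NUCLEOTIDES if n != base])
        mutated.append(base)
    return "".join(mutated)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are reproducible
    read = "ACGT" * 19       # a 76 bp toy read
    quals = simulate_qualities(len(read), mean_quality=30, std_dev=2, rng=rng)
    print(introduce_errors(read, quals, rng))
```

Passing the random generator in explicitly keeps the functions easy to test. Picking the substitute among the three other nucleotides guarantees that an "error" always changes the base.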

Ackia · 2016-11-21T15:34:47Z

A few unknowns:

Should the length of the sequences vary?

Yes, they should vary, at least to a certain degree. In all sequencing technologies read lengths vary, and they are often not normally distributed.

Which base should we switch to if we get an erroneous call? A random other nucleotide? I would guess a random nucleotide is good enough for a simple error model.

Random should be good enough. Possibly also include indels?

HadrienG · 2016-11-29T08:02:54Z

@Ackia the last HiSeq reads I received are all 76 bp. Also, the BEAR article states that "Illumina reads are generally uniform in length, reads from other technologies can vary greatly in length", which makes sense since X cycles should give you X base pairs.

Indels occur at a really low rate in Illumina data: 2.8 × 10^−6 errors per base for R1 insertions and 5.1 × 10^−6 errors per base for R1 deletions, according to doi.org/10.1186/s12859-016-0976-y
I'm gonna leave them out of the simple error model, which is really more of a test for me to create reads than a model we're gonna use.
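
To put those rates in perspective, here is a quick back-of-the-envelope calculation. It assumes the per-base rates quoted above apply uniformly and independently along a 76 bp R1 read, which is a simplification. The function name is made up for the example.

```python
READ_LENGTH = 76         # bp, the HiSeq read length mentioned above
INSERTION_RATE = 2.8e-6  # R1 insertions per base, as quoted above
DELETION_RATE = 5.1e-6   # R1 deletions per base, as quoted above


def expected_indels_per_read(read_length, insertion_rate, deletion_rate):
    """Expected number of indels in one read, assuming a uniform,
    independent per-base rate (a simplifying assumption)."""
    return read_length * (insertion_rate + deletion_rate)


expected = expected_indels_per_read(READ_LENGTH, INSERTION_RATE, DELETION_RATE)
print(f"expected indels per read: {expected:.2e}")
print(f"about one indel every {1 / expected:.0f} reads")
```

Under these assumptions this works out to roughly 6 × 10^−4 indels per read, so skipping indels changes very few reads.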

Ackia · 2016-11-29T08:14:24Z

I agree. I was mixing Illumina up with IonTorrent. My bad. Good progress!

HadrienG · 2016-11-29T09:32:31Z

Closed with 031454b! 🚀

HadrienG self-assigned this Nov 21, 2016

HadrienG mentioned this issue Nov 21, 2016 in Roadmap #1 (Open, 20 tasks)

HadrienG closed this as completed Nov 29, 2016
