Issue with Lighter performance #12

flashton2003 · 2014-12-23T16:57:57Z

Hello,

I'm not sure that a GitHub issue is the best place for this, but it is the suggested channel for support, so I will give it a go.

I had some good initial experiences with Lighter, so ran it on a larger number of samples (n = 2000). The hypothesis of the experiment was that Lighter would help to reduce errors that were causing 'mixed positions', where the consensus base at a position had the support of less than 90% of the reads that mapped there.
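To make the 'mixed position' definition concrete, here is a minimal sketch of how such positions could be counted from per-position base counts. The function names, the `pileup` structure, and the example counts are hypothetical and only illustrate the 90% threshold described above; they are not taken from the actual pipeline.

```python
from collections import Counter

def is_mixed(base_counts, min_support=0.9):
    """Return True if the consensus base has less than min_support of the reads.

    base_counts maps each base to the number of reads supporting it
    at one reference position (hypothetical input format).
    """
    total = sum(base_counts.values())
    if total == 0:
        # No reads mapped here, so there is no consensus to call mixed.
        return False
    consensus_support = max(base_counts.values()) / total
    return consensus_support < min_support

def count_mixed_positions(pileup, min_support=0.9):
    """Count the positions in a list of per-position base counts that are mixed."""
    return sum(is_mixed(counts, min_support) for counts in pileup)

# Illustrative data only: three positions, each covered by 50 reads.
pileup = [
    Counter("A" * 50),            # 100% A, not mixed
    Counter("A" * 44 + "G" * 6),  # 88% A, mixed
    Counter("C" * 46 + "T" * 4),  # 92% C, not mixed
]
mixed = count_mixed_positions(pileup)
print(mixed)
```

Running the same count on the trimmed-only and the trimmed-plus-corrected alignments gives the two values compared per sample.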

However, my initial good experience did not hold up. The image below shows 100 randomly selected samples from our 2000. It shows the number of mixed positions obtained when reads that have only been quality trimmed (uncor_trimmed) and reads that have been quality trimmed and Lighter corrected (cor_trimmed) are mapped against the reference.

As you can see, the general trend is for there to be more mixed positions in the alignments of Lighter-corrected reads than in those of reads that were only trimmed. This was not expected!

When I looked more closely at the positions that were mixed after Lighter, but not before, I saw something like this:

Before

After

I was initially using an alpha of 0.05 and k = 17. Changing this to alpha = 0.1 and k = 25 made no difference to this phenomenon. Do you have any insight into what might be causing this?

OS is Red Hat Enterprise Linux Server release 6.4 (Santiago).

mourisl · 2014-12-23T18:01:02Z

What is the average coverage for these data sets? It looks like the depth of coverage is much lower than in the figures shown before.

flashton2003 · 2014-12-24T11:10:26Z

All of these have an average coverage across the whole genome of greater than 30 fold (average 55.5).

mourisl · 2014-12-24T15:54:29Z

I just added a "-K" option, which infers alpha from the total number of bases and the genome size (a very naive method), so it can account for differences in coverage between samples. Can you give this a try?
If the problem persists, it may be that all the reads covering the miscorrected region have low quality, in which case Lighter will not store those k-mers. If so, you can try the "-noQual" parameter.
If that does not solve the problem, you can try a more conservative correction by setting "-maxcor" to 2 or even 1.
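As a rough illustration of inferring alpha from the total number of bases and the genome size, the sketch below estimates coverage and applies the rule of thumb alpha = 7/C (C being the average coverage) from Lighter's documentation. Whether "-K" computes exactly this is an assumption here, and the function name and the 5 Mb genome size are hypothetical.

```python
def estimate_alpha(total_bases, genome_size, factor=7.0):
    """Estimate a sampling rate alpha from sequencing depth.

    Assumes the rule of thumb alpha = factor / coverage, where
    coverage = total_bases / genome_size. This is a sketch, not
    Lighter's actual implementation.
    """
    if total_bases <= 0 or genome_size <= 0:
        raise ValueError("total_bases and genome_size must be positive")
    coverage = total_bases / genome_size
    # alpha is a sampling probability, so cap it at 1.0 for very low coverage.
    return min(1.0, factor / coverage)

# Hypothetical example: a 5 Mb genome sequenced to 55.5-fold average coverage.
genome_size = 5_000_000
total_bases = 55.5 * genome_size
alpha = estimate_alpha(total_bases, genome_size)
print(round(alpha, 3))
```

Under this assumption, samples with lower coverage get a higher alpha, which is how a per-sample estimate could compensate for coverage differences across a large batch.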

tseemann · 2016-05-13T23:01:29Z

@flashton2003 did you ever go back and try the auto-alpha mode?

flashton2003 · 2016-05-16T08:01:58Z

No, I never did :-(
