How many short reads do I need? And how can I get chimera-trimmed sequences without LQ trimming? #75

52teth · 2016-08-25T01:59:12Z

Hi thackl,
Sorry to bother you, but I have several questions about proovread usage.
Q1. I have read the introduction to proovread. The short read section says "The recommended coverage for short reads data is around 30-50X and should be specified with --coverage." How can I figure out the coverage of my short reads? I normalized the short reads with "normalize-by-median.py" from the khmer package. Is the coverage the same as the --cutoff value passed to normalize-by-median.py (when the median k-mer coverage of a read is above this number, the read is not kept)?
Q2. My long reads are the Quiver-polished result of ICE, so they are all full-length isoform sequences. Is there a way to get corrected reads with only chimeras trimmed, but without trimming low-quality bases?
Q3. I used the normalized short reads to correct my long consensus reads after ICE/Quiver. The command I used is: ./proovread -l ./LR/data/split.001.fq -s ./SR/normalize/interleaved.fastq --prefix ./LR/result/split.001 --threads 6 --coverage=50 --overwrite --no-sampling
Here are the statistics:
[Wed Aug 24 17:51:32 2016] Running mode: sr
[Wed Aug 24 17:51:49 2016] Running task bwa-sr-1
[Wed Aug 24 18:44:18 2016] Masked : 61.6%
[Wed Aug 24 18:44:18 2016] Running task bwa-sr-2
[Wed Aug 24 19:35:54 2016] Masked : 69.0%
[Wed Aug 24 19:35:54 2016] Running task bwa-sr-3
[Wed Aug 24 20:17:53 2016] Masked : 71.6%
[Wed Aug 24 20:17:53 2016] Running task bwa-sr-finish
[Wed Aug 24 20:27:13 2016] Masked : 60.7%
Does this mean that I do not have enough short reads?

Thanks a lot for your help!

thackl · 2016-08-26T20:29:59Z

Q1. If I understand correctly, you are working with RNA-seq reads, not genomic data? In that case the best way is to run proovread with --no-sampling, and make no assumptions about coverage.
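
For genomic data, a nominal coverage figure is usually estimated as the total number of short-read bases divided by the size of the target sequence. The following is only a minimal sketch of that arithmetic, assuming plain 4-line FASTQ records; the function names are hypothetical and not part of proovread or khmer. For RNA-seq, coverage varies per transcript with expression level, so a single number like this is not very meaningful, which is why --no-sampling is the safer route.

```python
import io


def total_bases(fastq_handle):
    """Sum the sequence lengths in a FASTQ stream (assumes 4 lines per record)."""
    total = 0
    for i, line in enumerate(fastq_handle):
        # Line 2 of every 4-line record holds the sequence.
        if i % 4 == 1:
            total += len(line.strip())
    return total


def nominal_coverage(short_read_bases, target_size):
    """Nominal coverage = sequenced bases / target size (genome size, assumed known)."""
    return short_read_bases / target_size


# Tiny in-memory example; with real data, pass an open file handle instead.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
bases = total_bases(example)
print(bases, nominal_coverage(bases, target_size=5))
```

With real data, `target_size` would be the estimated genome size in bases, which is an input you have to supply yourself.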

Q2. It is possible. I have drafted a new FAQ section on chimeras etc., available for now here:
https://github.com/BioInf-Wuerzburg/proovread/tree/doc/refactor_and_new_faqs_TH#chimeras-siamaeras-and-so-on
later here:
https://github.com/BioInf-Wuerzburg/proovread#chimeras-siamaeras-and-so-on
Let me know if the explanations and commands help.

Q3. Your reads are probably enough but not well used because of --coverage settings. This should improve with --no-sampling.
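
To compare runs before and after changing the settings, the "Masked" percentages can be pulled out of the log per task. This is a rough sketch that assumes the log line format shown in the question above; `masked_by_task` is a hypothetical helper, not something proovread provides.

```python
import re

# Patterns assume log lines like the ones quoted in this thread.
TASK_RE = re.compile(r"Running task (\S+)")
MASKED_RE = re.compile(r"Masked\s*:\s*([\d.]+)%")


def masked_by_task(log_text):
    """Return a list of (task, masked_percent) pairs in log order."""
    results = []
    current_task = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current_task = task.group(1)
            continue
        masked = MASKED_RE.search(line)
        if masked and current_task is not None:
            results.append((current_task, float(masked.group(1))))
    return results


log = """\
[Wed Aug 24 17:51:32 2016] Running mode: sr
[Wed Aug 24 17:51:49 2016] Running task bwa-sr-1
[Wed Aug 24 18:44:18 2016] Masked : 61.6%
[Wed Aug 24 18:44:18 2016] Running task bwa-sr-2
[Wed Aug 24 19:35:54 2016] Masked : 69.0%
"""
for task, pct in masked_by_task(log):
    print(task, pct)
```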

52teth · 2016-08-26T22:18:18Z

Thanks a lot! It really does help 👍

thackl closed this as completed Oct 4, 2016
