question about polishing in wtpoa-cns #55

Closed
TinaH10 opened this issue Dec 17, 2018 · 8 comments

Comments

@TinaH10

TinaH10 commented Dec 17, 2018

Hello, I have a question. Can I also use wtpoa-cns to polish my contigs with paired-end Illumina reads?

More specifically, can I use this command

minimap2 -ax sr prefix.ctg.lay.fa read1.fq read2.fq

instead of this

minimap2 -t 16 -x map-pb -a prefix.ctg.lay.fa reads.fa.gz

in the polishing step?

Thank you very much for your help.

@ruanjue
Owner

ruanjue commented Dec 18, 2018

For short reads, wtpoa-cns can build the multiple alignment successfully, but the consensus code may not cope with short reads. Thanks for your question. I will try to find time to make it work on short-read SAM output from bwa or other alignment tools.

Jue

@TinaH10
Author

TinaH10 commented Dec 19, 2018

I just ran it with my Illumina data as the polishing input, and it ran all the way through. It is hard to judge what it did, as there is no stats file or similar that summarizes the changes made by polishing. Or is there one? I will align some of the contigs and see what differences polishing with Illumina reads makes. I'll update with what I find.
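In the absence of a summary file, one rough first check is to compare per-contig lengths before and after polishing. The sketch below is not part of wtdbg2; the function names and file names are hypothetical, and it assumes contig names (the first word of each FASTA header) are unchanged by polishing. It only reveals net insertions and deletions. Substitutions leave the length unchanged, so base-level differences still need an alignment of the two assemblies.

```shell
# Print "<contig name><TAB><sequence length>" for each record in a FASTA file.
fasta_lengths() {
    awk '
        /^>/ { if (name != "") print name "\t" len
               name = substr($1, 2); len = 0; next }
             { len += length($0) }
        END  { if (name != "") print name "\t" len }
    ' "$1"
}

# Print contigs whose length changed, as:
#   name <TAB> length before <TAB> length after <TAB> difference
# $1 = unpolished FASTA, $2 = polished FASTA (placeholder names).
compare_lengths() {
    {
        fasta_lengths "$1" | awk '{ print "before\t" $0 }'
        fasta_lengths "$2" | awk '{ print "after\t" $0 }'
    } | awk -F '\t' '
        $1 == "before" { before[$2] = $3; next }
        ($2 in before) && before[$2] != $3 {
            print $2 "\t" before[$2] "\t" $3 "\t" ($3 - before[$2])
        }'
}
```

A possible call, with placeholder file names: `compare_lengths prefix.ctg.lay.fa prefix.polished.fa`.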

@ruanjue
Owner

ruanjue commented Dec 21, 2018

Sorry for the last email, I made a mistake.

@ruanjue
Owner

ruanjue commented Dec 24, 2018

Please update to 7aa6f87.

  • View the multiple alignments
wtpoa-cns -t 1 -d dbg.ctg.lay.fa -i <(samtools view dbg.ctg.lay.fa.srt.bam) -w 200 -j 150 -rS 2 -R 0 -N 40 -vv 2>&1 | less -S
  • Run consensus
wtpoa-cns -t 32 -d dbg.ctg.lay.fa -i <(samtools view dbg.ctg.lay.fa.srt.bam) -w 200 -j 150 -rS 2 -R 0 -N 40 -fo dbg.ctg.lay.polished.fa

Please note that the current code makes no effort to handle heterozygosity, so you will find it makes wrong decisions at heterozygous sites. Please give me more time to find a consensus-calling method that performs well on both long noisy reads and short accurate reads.
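The commands above start from a sorted BAM file. A minimal sketch of the whole short-read path, from mapping to consensus, might look like the following. It combines the `minimap2 -ax sr` preset from the original question with the wtpoa-cns options quoted above. The wrapper functions, the intermediate file names, and the thread counts are assumptions for illustration, and intermediate files are used instead of pipes so that each step can be printed or run on its own.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = contigs FASTA, $2/$3 = paired-end reads, $4 = polished output FASTA.
# All file names are placeholders chosen by the caller.
polish_with_short_reads() {
    contigs=$1 r1=$2 r2=$3 polished=$4
    sam="$contigs.sr.sam"        # raw alignments
    bam="$contigs.srt.bam"       # coordinate-sorted alignments
    srt_sam="$contigs.srt.sam"   # headerless SAM, as produced by `samtools view`

    run minimap2 -t 16 -ax sr -o "$sam" "$contigs" "$r1" "$r2" &&
    run samtools sort -@ 4 -o "$bam" "$sam" &&
    run samtools view -o "$srt_sam" "$bam" &&
    run wtpoa-cns -t 32 -d "$contigs" -i "$srt_sam" \
        -w 200 -j 150 -rS 2 -R 0 -N 40 -fo "$polished"
}
```

To preview the commands without running anything: `DRY_RUN=1 polish_with_short_reads dbg.ctg.lay.fa read1.fq read2.fq dbg.ctg.lay.polished.fa`.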

Jue

@ruanjue
Owner

ruanjue commented Jan 4, 2019

New consensus mode: dp-call_cns now works for short-read polishing.

wtdbg-2.3/wtpoa-cns -t 64 -d dbg.ctg.lay.fa -i <(samtools view dbg.ctg.lay.fa.srt.bam) -fo dbg.ctg.lay.fa.samlay.fa -b 1 -c 1 -w 200 -j 150 -N 40 -R 0 -rS 2

In my test on yeast, quast5 reported improved base accuracy.

                             before        after
# mismatches per 100 kbp     161.80        52.48
# indels per 100 kbp         129.97        23.67
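To put these figures in relative terms, the reduction can be worked out directly. This assumes the first column is the unpolished assembly and the second the polished one, which the thread implies but does not state. The helper name is hypothetical.

```shell
# Percentage reduction from a "before" value ($1) to an "after" value ($2).
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", 100 * (before - after) / before }'
}

pct_reduction 161.80 52.48    # mismatches per 100 kbp -> 67.6
pct_reduction 129.97 23.67    # indels per 100 kbp     -> 81.8
```

Under that assumption, polishing removed roughly two thirds of the mismatches and about four fifths of the indels in this test.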

@TinaH10
Author

TinaH10 commented Jan 4, 2019

Excellent, thank you!

@TinaH10
Author

TinaH10 commented Jan 4, 2019

Oh, one more question: will it also correct assembled contigs based on paired-end information? For example, if the two mates of Illumina PE reads consistently map to different contigs, will the contigs be broken up accordingly? Or will it only correct errors in the sequence?

@ruanjue
Owner

ruanjue commented Jan 5, 2019

That is interesting, but better to leave it to other tools.
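Before reaching for a dedicated tool, a quick diagnostic is to count read pairs whose mates map to different contigs. The sketch below is not a wtdbg2 feature and does not break or join anything. It only tallies such pairs from SAM text, using the standard SAM columns (2 = FLAG, 3 = RNAME, 7 = RNEXT). The function name is hypothetical.

```shell
# Read SAM text on stdin and print, per pair of contigs, the number of read
# pairs whose two mates map to different contigs:
#   count <TAB> contig A <TAB> contig B   (highest count first)
cross_contig_pairs() {
    awk -F '\t' '
        /^@/ { next }                                    # skip header lines
        $3 == "*" || $7 == "*" { next }                  # read or mate unmapped
        $7 == "=" || $7 == $3 { next }                   # mate on the same contig
        int($2 / 64) % 2 != 1 { next }                   # count each pair once, via read 1 (0x40)
        int($2 / 256) % 2 == 1 || int($2 / 2048) % 2 == 1 { next }  # skip secondary/supplementary
        { pair = ($3 < $7) ? ($3 "\t" $7) : ($7 "\t" $3); n[pair]++ }
        END { for (p in n) print n[p] "\t" p }
    ' | sort -k1,1nr
}
```

It could be fed from the sorted BAM used for polishing, for example `samtools view dbg.ctg.lay.fa.srt.bam | cross_contig_pairs`. A high count for a contig pair is only a hint that deserves a closer look, since repeats and mapping ambiguity produce such pairs as well.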

ruanjue closed this as completed Jan 5, 2019