What is *.unitigs.fasta? #286
Comments
'contigs' will span repeats, as long as the repeat is unambiguous. 'unitigs' are derived from contigs: wherever a contig end intersects the middle of another contig, the contig is split. 'bubbles' are deprecated and will be removed in the next release; treat them as contigs for now. 'unassembled' contains mostly reads that failed to assemble into a contig. It will also contain some assembled sequences, but these will be short and nearly the same as the longest read in them. The relevant section of the docs is out of date, and will probably move when it is updated, but for now it is at http://canu.readthedocs.io/en/latest/quick-start.html#find-the-output
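As a toy illustration of the splitting rule described above (not Canu's actual implementation), a contig can be cut at every coordinate where another contig's end intersects it. The function name `split_contig` and the breakpoint coordinates are hypothetical, chosen only for this sketch.

```python
def split_contig(sequence, breakpoints):
    """Split `sequence` at each interior breakpoint and return the pieces in order.

    Hypothetical helper: `breakpoints` stands in for the positions where
    another contig's end intersects the middle of this contig.
    """
    # Keep only breakpoints strictly inside the sequence, sorted and deduplicated.
    cuts = sorted({b for b in breakpoints if 0 < b < len(sequence)})
    pieces = []
    start = 0
    for cut in cuts:
        pieces.append(sequence[start:cut])
        start = cut
    pieces.append(sequence[start:])
    return pieces


contig = "ACGTACGTACGT"
print(split_contig(contig, [4, 8]))
```

Under this simplified model the pieces always concatenate back to the original contig, so splitting alone never changes the total number of bases.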
Brian, you say that 'unitigs' are derived from 'contigs', but how come the unitigs in my case total almost 900Mb, while the contigs total only 510Mb? My genome is 620Mb.
Hi
Thank you in advance. Michal
@StefanoLonardi: The unitigs are unfiltered contigs. The contigs are filtered to remove ones composed primarily of a single read; there is no such filter on the unitigs (see the option described at http://canu.readthedocs.io/en/latest/faq.html#my-asm-contigs-fasta-is-empty-why). @mictadlo: PBJelly is primarily designed to close gaps in scaffolds, and Canu doesn't currently produce scaffolds, only contigs. PBJelly can join some contigs, but it is unlikely to make much difference. If you have scaffolded the contigs with another technology, then you can run PBJelly. For PBJelly you would use the uncorrected fastq reads (the same ones you input to Canu), not the trimmedReads or the correctedReads.
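To compare the sizes of the two outputs discussed above, the total number of bases in each FASTA file can be summed. This is a minimal sketch using plain Python rather than any Canu tooling; the helper names and the `asm` file prefix in the usage comment are assumptions.

```python
def fasta_lengths(lines):
    """Return {record_name: sequence_length} for FASTA text given as an iterable of lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def total_bases(path):
    """Sum the sequence lengths of every record in the FASTA file at `path`."""
    with open(path) as handle:
        return sum(fasta_lengths(handle).values())


# Example usage (file names are hypothetical; substitute your own prefix):
# print(total_bases("asm.contigs.fasta"), total_bases("asm.unitigs.fasta"))
```

Because the unitigs are not filtered, their total can exceed both the contig total and the expected genome size.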
What other technology would you recommend to scaffold contigs?
There are many options; here is a brief, non-exhaustive list. Our goat publication (http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3802.html) covers a few of them (Bionano optical maps and Phase HiC). There is also Dovetail, which provides both HiC and Chicago libraries for scaffolding, and 10X can be used for scaffolding as well.
Thank you
P.S. Why would you not use the trimmedReads or the correctedReads for PBJelly?
Hi, Did you use the You wrote in your paper Thank you in advance Michal |
How can I annotate intergenic variants and variants from horizontally-acquired genes in the DBGWAS visualization, so that we can distinguish different types of variants easily? |
Hi,
I'm very excited about the Canu assembler, as I'm looking into repeat-rich regions in a fish genome.
I found *.unitigs.fasta in the output directory. What is it, and how does it relate to *.contigs.fasta, *.unassembled.fasta and *.bubbles.fasta?
Thank you for your help.