Typical assembly workflow for ALLHiC scaffolding #15

Closed
baozg opened this issue Aug 9, 2019 · 3 comments

@baozg

baozg commented Aug 9, 2019

Hi @tangerzhang ,

I have a highly heterozygous polyploid genome to assemble. After reading the ALLHiC manuscript and the issues on GitHub, I want to confirm the typical workflow for ALLHiC scaffolding. Here are my thoughts. Do you think any steps need to be added or corrected?

  • Using Canu to assemble allelic contigs from PB / ONT reads (100x of the polyploid genome size). Should the Canu parameters be canu-poly or others?
  • purge_haplotigs to remove alternative haplotigs
  • 10x, BioNano, BAC, etc. to improve the contig assembly
  • 3D-DNA pipeline to correct misassemblies in the contigs
  • repeat masking and gene annotation on the contig assembly
  • get the allelic table by blastn against the progenitor diploid genome species (if there are two progenitor diploid genomes, do I need to align to each genome to get two allelic tables?)
  • ALLHiC pipeline

Best,
Zhigui

@tangerzhang
Owner

Hi Zhigui,
Please see my suggestions below:
Using Canu to assemble allelic contigs from PB / ONT reads (100x of the polyploid genome size).

- Yes, please use Canu to assemble the allelic contigs. PacBio reads are highly recommended. I have not tested ONT reads and have no idea how ONT will perform on a polyploid genome.

Should the Canu parameters be canu-poly or others?

- Using canu-poly parameters will get better results, but this step is time-consuming. I do not recommend canu-poly parameters if your genome size is big (let's say more than 2 Gb).

purge_haplotigs to remove alternative haplotigs

- ALLHiC is designed for haplotype phasing, and it performs better when as many haplotigs as possible are assembled. Therefore, please DO NOT use purge_haplotigs to remove alternative haplotigs.

10x, BioNano, BAC, etc. to improve the contig assembly

- 10x, BioNano and BAC will definitely improve contig continuity but may introduce assembly errors. I do not have much experience with 10x or BioNano assembly.

3D-DNA pipeline to correct misassemblies in the contigs

- Correction of misassembled contigs will improve Hi-C scaffolding.

repeat masking and gene annotation on the contig assembly

- Yes, you can mask the genome and predict gene structure.

get the allelic table by blastn against the progenitor diploid genome species (if there are two progenitor diploid genomes, do I need to align to each genome to get two allelic tables?)

- Yes, you can get the allelic table using a BLAST-based method. One reference genome with a chromosome-level assembly should be good enough.
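
As a rough illustration of the BLAST-based step above, the sketch below prints the two BLAST+ commands one might run to align CDS sequences predicted on the contigs against a single chromosome-level reference. The file names (`ref.cds.fasta`, `contigs.cds.fasta`), the thread count and the E-value cutoff are assumptions for illustration and do not come from this thread. Turning the tabular hits into the allelic table itself is left to the helper scripts shipped with ALLHiC, which are not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: prints the BLAST+ commands instead of running them.
# All file names and cutoffs below are hypothetical placeholders.
set -euo pipefail

REF_CDS="${REF_CDS:-ref.cds.fasta}"          # CDS of the chromosome-level reference (assumed name)
QUERY_CDS="${QUERY_CDS:-contigs.cds.fasta}"  # CDS predicted on the Canu contigs (assumed name)
THREADS="${THREADS:-4}"                      # assumed thread count

allele_blast_cmds() {
    # Build a nucleotide BLAST database from the reference CDS.
    echo "makeblastdb -in ${REF_CDS} -dbtype nucl"
    # Align contig CDS to the reference; tabular output (outfmt 6),
    # reporting one subject sequence per query.
    echo "blastn -query ${QUERY_CDS} -db ${REF_CDS} -out contigs.vs.ref.blast.out -evalue 0.001 -outfmt 6 -num_threads ${THREADS} -max_target_seqs 1"
}

allele_blast_cmds
```

Printing the commands first makes it easy to check paths and options before committing cluster time; piping the output to `bash` would actually run them, provided BLAST+ is installed.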

@baozg
Author

baozg commented Aug 10, 2019

Hi @tangerzhang

Thanks for the reply.

  1. canu-poly is not recommended for genomes >2 Gb. So what other parameters do you recommend? Or would you mind telling me which parameters were used for the sugarcane genome?
  2. purge_haplotigs is only suited to diploid genomes. So after the Canu assembly, can I use the Hi-C data directly?

Best,
Zhigui

@tangerzhang
Owner

Hi Zhigui,
Below are the parameters that I used to assemble a polyploid genome:

canu -s spec.txt -p AP -d run1 genomeSize=3.2g -pacbio-raw input.fasta

Information in spec.txt file:

gridOptions="-N canu -l walltime=30000:00:00 -q all.q"
corOutCoverage=100
ovbMemory=8g
maxMemory=500g
maxThreads=48
ovsMemory=8-500g
ovsThreads=4
oeaMemory=32g
ovsMethod=parallel
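
The `genomeSize=3.2g` and `corOutCoverage=100` settings only make sense relative to how much raw data is available, so a quick coverage estimate before launching Canu can be useful. The following sketch counts the bases in a FASTA file and divides by the genome size. The function names and the tiny demo file are hypothetical and are not part of Canu or ALLHiC.

```shell
#!/usr/bin/env bash
# Sketch: estimate raw read coverage for a Canu run.
# Function names and the demo input are illustrative, not part of Canu.
set -euo pipefail

# Convert a Canu-style genome size (e.g. 3.2g, 500m, 40k, 1000) to bases.
size_to_bases() {
    awk -v s="$1" 'BEGIN {
        unit = tolower(substr(s, length(s), 1))
        mult = 1
        if (unit == "k") mult = 1e3
        else if (unit == "m") mult = 1e6
        else if (unit == "g") mult = 1e9
        if (mult > 1) s = substr(s, 1, length(s) - 1)
        printf "%.0f\n", s * mult
    }'
}

# Count sequence bases in a FASTA file (header lines starting with ">" are skipped).
fasta_bases() {
    awk '!/^>/ { n += length($0) } END { printf "%.0f\n", n }' "$1"
}

# Coverage = total read bases / genome size, printed with one decimal.
coverage() {
    awk -v b="$(fasta_bases "$1")" -v g="$(size_to_bases "$2")" \
        'BEGIN { printf "%.1f\n", b / g }'
}

# Demo on a toy file: 2 reads x 10 bases against a 10-base "genome".
demo=$(mktemp)
printf '>r1\nACGTACGTAC\n>r2\nACGTACGTAC\n' > "$demo"
coverage "$demo" 10   # prints 2.0
rm -f "$demo"
```

On real data one would call something like `coverage input.fasta 3.2g`. For instance, 100x of a 3.2 Gb genome corresponds to roughly 320 Gb of raw reads.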

For your second question, please use the Hi-C data directly once you finish the Canu assembly and Pilon polishing.
