How can I get reads which used for trio assembly from HiFiasm #44

Closed
LQHHHHH opened this issue Sep 28, 2020 · 4 comments

Comments

@LQHHHHH

LQHHHHH commented Sep 28, 2020

Hi,
Recently, I used hifiasm to generate haplotype-resolved assemblies with trio binning. I would like to know the names of the reads that were used to assemble each of the two haplotype-resolved genomes, so that I can extract these two read sets from the raw PacBio HiFi reads and feed them to HiCanu to compare the results.

Can you suggest an approach or a parameter for this?

@chhylp123
Owner

chhylp123 commented Sep 28, 2020

In the GFA files, the 'HG:A' field of each A-line indicates which haplotype the read comes from.

'HG:A:m': from maternal haplotype
'HG:A:p': from paternal haplotype
'HG:A:a': not binnable
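
As a rough illustration of the above, the read names can be grouped by their 'HG:A' value with a few lines of Python. This is only a sketch: the 'HG:A' tag and its m/p/a values come from the reply above, but the position of the read name (taken here as the 5th tab-separated column of an A-line) and the example lines themselves are assumptions, so check them against your own GFA file.

```python
from collections import defaultdict


def reads_by_haplotype(gfa_lines):
    """Group read names by the HG:A tag found on GFA A-lines."""
    groups = defaultdict(set)
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # Only A-lines describe reads; skip everything else.
        if len(fields) < 5 or fields[0] != "A":
            continue
        read_name = fields[4]  # assumption: read name is the 5th column
        for tag in fields[5:]:
            if tag.startswith("HG:A:"):
                groups[tag[len("HG:A:"):]].add(read_name)
                break
    return groups


# Made-up example lines, for illustration only.
example = [
    "S\tutg000001l\t*\tLN:i:30000",
    "A\tutg000001l\t0\t+\tread_1\t0\t15000\tid:i:1\tHG:A:p",
    "A\tutg000001l\t5000\t-\tread_2\t0\t14000\tid:i:2\tHG:A:m",
    "A\tutg000001l\t9000\t+\tread_3\t0\t16000\tid:i:3\tHG:A:a",
]

groups = reads_by_haplotype(example)
for label in sorted(groups):
    print(label, sorted(groups[label]))
```

To run it on a real file, pass an open file handle instead of the list, e.g. `reads_by_haplotype(open("assembly.gfa"))`, where the file name is a placeholder. The resulting name lists can then be used to subset the raw reads with whatever tool you prefer.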

HiCanu has its own trio-binning and read-partition modes, so it would be better to use HiCanu directly, since there might otherwise be some bias. I guess the read partition steps of HiCanu and hifiasm are comparable; the major difference is how the final contigs are generated. BTW, if you want to evaluate the phasing results, I'd suggest evaluating with both yak and merqury, instead of merqury only. HiCanu shares the same partitioning method with merqury, so a merqury-only evaluation may be biased.

@LQHHHHH
Author

LQHHHHH commented Sep 29, 2020

Yes, I also ran yak triobin and found that the contigs belonging to one haplotype genome were marked as "a", "m" and "0", but "p" was also found in this genome. Since I provided the parental NGS data to yak and then ran hifiasm, why does this genome still contain reads from the other haplotype?

Here are my yak trio results:

h1tg000111l	p	63	0	64	2	0	0	27001	423
h1tg000112l	p	10552	0	13600	0	16	1	101022	169
h1tg000113l	p	2035	0	2057	2	4	7	39954	864
h1tg000114l	a	17	0	18	10	3	0	25055	256
h1tg000115l	p	121	0	183	0	0	0	37708	93
h1tg000116l	p	122	0	208	34	19	26	49258	1694
h1tg000117l	p	307	0	459	0	4	1	59201	89
h1tg000118l	m	0	19	1	23	4	0	353381	23
h1tg000119l	p	410	0	545	4	0	0	46710	98
h1tg000120l	p	11214	0	14826	8	5	1	114096	137
h1tg000121l	p	105	0	176	0	0	0	38161	70
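
A quick way to summarise a table like this is to count how many contigs carry each label. The sketch below assumes only that the first column is the contig name and the second is the assigned label, as the rows above suggest; the remaining columns are ignored, and the rows are copied from the output above.

```python
from collections import Counter

# A few rows copied from the yak output above (tab-separated).
rows = [
    "h1tg000111l\tp\t63\t0\t64\t2\t0\t0\t27001\t423",
    "h1tg000114l\ta\t17\t0\t18\t10\t3\t0\t25055\t256",
    "h1tg000118l\tm\t0\t19\t1\t23\t4\t0\t353381\t23",
    "h1tg000120l\tp\t11214\t0\t14826\t8\t5\t1\t114096\t137",
]


def label_counts(lines):
    """Count contigs per label (column 2) and collect contig names per label."""
    counts = Counter()
    contigs = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        name, label = fields[0], fields[1]
        counts[label] += 1
        contigs.setdefault(label, []).append(name)
    return counts, contigs


counts, contigs = label_counts(rows)
print(dict(counts))
print("not labelled p:", contigs.get("a", []) + contigs.get("m", []))
```

Listing the contigs whose label differs from the expected haplotype makes it easier to inspect them one by one.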

Another question: the numbers of reads used to assemble the two haplotypes are 213719 and 220259, and 72583 reads were not binnable. But my total number of HiFi CCS reads is 3703697.
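
To make the size of this gap concrete, the arithmetic can be written out as below. The four numbers are the ones quoted above; the variable names are just for illustration.

```python
hap1_reads = 213719      # reads assigned to the first haplotype
hap2_reads = 220259      # reads assigned to the second haplotype
unbinned_reads = 72583   # reads reported as not binnable
total_reads = 3703697    # total HiFi CCS reads

listed = hap1_reads + hap2_reads + unbinned_reads
unaccounted = total_reads - listed
fraction = listed / total_reads

print(f"listed: {listed}")
print(f"unaccounted: {unaccounted}")
print(f"fraction listed: {fraction:.1%}")
```

So only about 14% of the input reads appear in the three categories, which is the discrepancy being asked about.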

@chhylp123
Owner

Just to make sure: are you passing only the paternal index to hifiasm for the assembly, and then evaluating with yak trioeval using both the paternal and maternal indexes?

@LQHHHHH
Author

LQHHHHH commented Oct 8, 2020

Sorry, I used the wrong command. But I have evaluated phasing completeness using merqury, and both hifiasm and Canu-trio got good results. I also used yak trioeval but could not fully understand the results, even after reading this issue. Thanks for your help!
