Merge SVs with high percentage of overlap #13
Comments
Hi Chengcheng, Thanks a lot for your interest in using Jasmine! As for your first question, the best approach would be to have the distance thresholds depend on the length of the variants using the max_dist_linear parameter. While it doesn't explicitly look at overlap, it will give these large variants large distance thresholds so that they can be correctly merged with each other. I recommend something like this (though the exact values depend on the organism being studied and the upstream pipeline you are using):
For your second question, that format is correct. Jasmine can infer the length from the REF and ALT fields if they are filled out (so if they are e.g. A and ATGTATGCGT it will automatically use 9 as the SVLEN value). But if not, it falls back to the SVLEN field. I hope that helps, and please don't hesitate to reach out with any other questions! Best,
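To make the REF/ALT fallback described above concrete, here is a minimal Python sketch. It is not Jasmine's code: the helper names `parse_info` and `infer_sv_length` are hypothetical, and the rule for deciding that REF and ALT are "filled out" (both are plain nucleotide strings) is an assumption made for illustration.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict of key -> value (flag-only fields are skipped)."""
    fields = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


def is_sequence(allele):
    """Assumption: an allele counts as 'filled out' if it is a plain nucleotide string."""
    return bool(allele) and set(allele.upper()) <= set("ACGTN")


def infer_sv_length(ref, alt, info):
    """Use len(ALT) - len(REF) when both are explicit sequences, else fall back to SVLEN."""
    if is_sequence(ref) and is_sequence(alt):
        return len(alt) - len(ref)
    if "SVLEN" in info:
        return int(info["SVLEN"])
    return None  # no length information available


# The example from the reply: REF=A, ALT=ATGTATGCGT gives 9.
print(infer_sv_length("A", "ATGTATGCGT", {}))

# A symbolic ALT such as <INS> is not a sequence, so SVLEN is used instead.
print(infer_sv_length("N", "<INS>", parse_info("END=498768;SVLEN=102;SVTYPE=INS;STRANDS=+")))
```

With a symbolic allele like `<INS>`, as in the insertion record in the original question, the length has to come from the SVLEN field.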
Hi Melanie, Thanks a lot for your clear explanation. I will try based on your suggestions. Best regards,
Hi Melanie, I see you have added a new parameter (min_overlap) in Jasmine to set the minimum reciprocal overlap. I'm wondering how it works? If two variants have reciprocal overlap greater than "min_overlap", will Jasmine still take "max_dist_linear" or "max_dist" into account to decide whether to merge or not? Best,
Hi Chengcheng,
When using this parameter, the overlap requirement is in addition to the breakpoint distance requirement. So Jasmine checks only variant pairs with breakpoints which are within the required merging distance of one another, and then among those only merges those with sufficient overlap.
I would still recommend using the max_dist_linear parameter to merge variant pairs which have high overlap but also large breakpoint distances, but this new setting is available in case you also want to avoid merging variant pairs with small breakpoint distances but little overlap.
Best,
Melanie
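
The two-stage logic described in this reply (a breakpoint distance check first, then an overlap check) can be sketched in Python as below. This only illustrates the idea and is not Jasmine's implementation: the function names are hypothetical, and two details are assumptions, namely that breakpoint distance is the Euclidean distance between (start, end) pairs and that the length-dependent threshold is max_dist_linear times the shorter variant's length. The coordinates are the first two inversion records from the original question.

```python
import math


def reciprocal_overlap(a, b):
    """Fraction of each interval covered by the other; returns the smaller of the two."""
    a_len, b_len = a[1] - a[0], b[1] - b[0]
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0 or a_len <= 0 or b_len <= 0:
        return 0.0
    return min(overlap / a_len, overlap / b_len)


def would_merge(a, b, max_dist_linear, min_overlap):
    """a and b are (start, end) tuples on the same chromosome with the same SV type."""
    # Stage 1: breakpoint distance (assumed Euclidean over start and end).
    distance = math.hypot(a[0] - b[0], a[1] - b[1])
    # Assumed threshold: proportional to the shorter variant's length.
    threshold = max_dist_linear * min(a[1] - a[0], b[1] - b[0])
    if distance > threshold:
        return False
    # Stage 2: the overlap requirement applies in addition to the distance requirement.
    return reciprocal_overlap(a, b) >= min_overlap


inv_a = (10180346, 10361415)  # 0_INV27953
inv_b = (10192856, 10361415)  # 1_INV34939

print(would_merge(inv_a, inv_b, max_dist_linear=0.5, min_overlap=0.9))   # True
print(would_merge(inv_a, inv_b, max_dist_linear=0.05, min_overlap=0.9))  # False: fails the distance check
print(would_merge(inv_a, inv_b, max_dist_linear=0.5, min_overlap=0.95))  # False: overlap is about 0.93
```

Under these assumptions, a high overlap on its own is not enough: the pair must first pass the distance check, which is why a length-scaled threshold matters for large variants like these.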
Hi,
I'm trying your pipeline to merge my SVs, which were generated by whole genome comparisons among several de novo assemblies, into a single vcf file. I'm wondering:
C3 10180346 0_INV27953 N <INV> . PASS END=10361415;SVLEN=181069;SVTYPE=INV;AVG_LEN=181069.000000;AVG_START=10180346.000000;AVG_END=10361414.000000;SUPP_VEC_EXT=10;IDLIST_EXT=INV27953;SUPP_EXT=1;SUPP_VEC=10;SUPP=1;SVMETHOD=JASMINE;IDLIST=INV27953
C3 10192856 1_INV34939 N <INV> . PASS END=10361415;SVLEN=168559;SVTYPE=INV;AVG_LEN=168559.000000;AVG_START=10192856.000000;AVG_END=10361414.000000;SUPP_VEC_EXT=01;IDLIST_EXT=INV34939;SUPP_EXT=1;SUPP_VEC=01;SUPP=1;SVMETHOD=JASMINE;IDLIST=INV34939
C3 29342378 0_INV27963 N <INV> . PASS END=29948423;SVLEN=606045;SVTYPE=INV;AVG_LEN=606045.000000;AVG_START=29342378.000000;AVG_END=29948422.000000;SUPP_VEC_EXT=10;IDLIST_EXT=INV27963;SUPP_EXT=1;SUPP_VEC=10;SUPP=1;SVMETHOD=JASMINE;IDLIST=INV27963
C3 29342378 1_INV34950 N <INV> . PASS END=29973346;SVLEN=630968;SVTYPE=INV;AVG_LEN=630968.000000;AVG_START=29342378.000000;AVG_END=29973345.000000;SUPP_VEC_EXT=01;IDLIST_EXT=INV34950;SUPP_EXT=1;SUPP_VEC=01;SUPP=1;SVMETHOD=JASMINE;IDLIST=INV34950
C1 498768 INS37 N <INS> . PASS END=498768;ChrB=C1;StartB=496550;EndB=496651;Parent=SYN44;VarType=ShV;DupType=.;SVLEN=102;SVTYPE=INS;STRANDS=+
Thank you very much in advance for your help.
Best regards,
Chengcheng