vcf verbose output #27
base: master
If `--output_all_positions` is passed to `merge_to_vcf`, the output VCF will include all positions covered by the reference: both sites with intrahost variants and sites with consensus calls but no sub-consensus variation. Per-sample consensus calls (relative to the reference) will be included. For each sample, positions not represented in its assembly will be omitted. The flag is off by default. A unit test is added as `test_intrahost.py::test_output_all_positions`.
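As a rough illustration of the behavior described above (not the actual `merge_to_vcf` code), the row selection could be sketched like this. The function `build_rows` and its data layout are hypothetical, and it assumes one reading of "omitted": a sample is left out of a row when its assembly does not cover that position.

```python
def build_rows(ref_seq, sample_consensus, isnv_positions, output_all_positions=False):
    """Hypothetical sketch: pick which reference positions get a row.

    ref_seq: reference sequence as a string
    sample_consensus: dict of sample name -> {1-based ref position: consensus base};
        a position missing from a sample's dict is not covered by its assembly
    isnv_positions: set of 1-based positions with intrahost variants
    """
    rows = []
    for pos in range(1, len(ref_seq) + 1):
        # Default behavior: only sites with intrahost variants are emitted.
        if not output_all_positions and pos not in isnv_positions:
            continue
        ref_base = ref_seq[pos - 1]
        # Assumption: samples whose assembly lacks this position are skipped.
        calls = {
            sample: consensus[pos]
            for sample, consensus in sample_consensus.items()
            if pos in consensus
        }
        if calls:
            rows.append((pos, ref_base, calls))
    return rows


consensus = {
    "s1": {1: "A", 2: "C", 3: "C", 4: "T"},
    "s2": {1: "A", 2: "C", 3: "G"},  # position 4 not in this assembly
}
default_rows = build_rows("ACGT", consensus, {3})
all_rows = build_rows("ACGT", consensus, {3}, output_all_positions=True)
```

With the flag off, only position 3 is emitted; with it on, all four positions are emitted, and the row for position 4 carries a call for `s1` only.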
….0, not . (missing)
Call iSNVs via vPhaser2 within assemble_refbased.wdl. iSNV calling should be scheduled to occur in parallel with align-to-self. Depends on the merge of this PR upstream and the release of the docker image: broadinstitute/viral-phylo#27
Oof... this is wading into the thick of some tricky code. I think I might benefit from a live discussion on it as I try to remember all the bits going on here. One quick clarification: this code path is only (currently) used for intrahost variant calling via vphaser, right? It's not used for assembly consensus calling and, in theory, is unnecessary for iSNV calling by other variant callers that speak VCF more natively?
self.assertEqual(rows[3].ref, 'G')
self.assertEqual(rows[3].alt, 'C')
self.assertEqual(':'.join(rows[3][0].split(':')[:2]), '1:1.0')
self.assertEqual(':'.join(rows[3][1].split(':')[:2]), '0:0.0')
This should maybe be `self.assertEqual(':'.join(rows[3][1].split(':')[:2]), '0:.')` for positions we are imputing from the reference, since the absence of iSNVs in the vphaser output does not necessarily mean that iSNVs were absent, only that they did not warrant inclusion in the output.
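To make the distinction in this comment concrete, here is a small sketch of how a per-sample field might be formatted so that an imputed reference call carries a missing frequency (`.`, the VCF convention for a missing value) instead of `0.0`. The helper `format_sample_field` is hypothetical, and it assumes the first two colon-separated subfields are an allele index and an allele frequency, as the values in the test suggest.

```python
def format_sample_field(allele_index, freq=None):
    """Hypothetical helper: format '<allele index>:<frequency>'.

    freq=None means the frequency was not measured (e.g. the call was
    imputed from the reference), so '.' is written instead of a number.
    """
    freq_str = "." if freq is None else str(float(freq))
    return f"{allele_index}:{freq_str}"


observed_alt = format_sample_field(1, 1.0)   # consensus call backed by data
imputed_ref = format_sample_field(0)         # imputed from the reference
explicit_zero = format_sample_field(0, 0.0)  # frequency actually measured as zero
```

Here `imputed_ref` is `0:.` while `explicit_zero` is `0:0.0`, which keeps "no iSNV reported" separate from "iSNV frequency measured as zero".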
Sure thing; happy to connect about this. To answer your question: yes, this code path is currently only for vphaser. It's not used for assembly consensus calling, though the changes here are perhaps a step in that direction (outputting all positions in the VCF if this new toggle is set). We may still want to parse/interpret the output of other variant callers ourselves, either for consensus calling or for merging sample data (as an example,
sorry this is so old but I like this!