Recreate vcf from variants in scout #6
Very good! I think you can split the code up a bit, see comments
mutacc_auto/parse/parse_scout.py
Outdated
@@ -2,6 +2,7 @@
from datetime import datetime, timedelta

from mutacc_auto.commands.scout_command import ScoutExportCases
from mutacc_auto.parse.vcf_constants import *
Does this mean import all? If so, it is usually better to be explicit. Otherwise your namespace might be polluted without you knowing it.
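To illustrate the concern: a wildcard import binds every public name of the module, including ones you did not ask for. A minimal sketch, using a stand-in module built with `types.ModuleType` because the real contents of `vcf_constants` are not shown in this diff. The names `HEADER`, `samples`, `_private` and `star_import_names` are hypothetical.

```python
import types

# Hypothetical stand-in for mutacc_auto.parse.vcf_constants; its contents are assumed.
vcf_constants = types.ModuleType("vcf_constants")
vcf_constants.HEADER = ("##fileformat=VCFv4.2",)
vcf_constants.samples = "leftover helper value"  # a name the importer may not expect
vcf_constants._private = "skipped by wildcard imports"


def star_import_names(module):
    """Return the names that `from module import *` would bind."""
    if hasattr(module, "__all__"):
        return sorted(module.__all__)
    # Without __all__, every name that does not start with an underscore is imported.
    return sorted(name for name in vars(module) if not name.startswith("_"))


print(star_import_names(vcf_constants))
```

With an explicit `from mutacc_auto.parse.vcf_constants import HEADER`, only the listed names would enter the namespace.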
mutacc_auto/parse/parse_scout.py
Outdated
#Write header of vcf
for header_line in HEADER:
    vcf_string += header_line + '\n'
Looks like a named constant + NEWLINE
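Something along these lines, as a rough sketch. `NEWLINE` is the suggested named constant; the `HEADER` contents and the `write_header` helper are placeholders for illustration, not the real ones from `vcf_constants`.

```python
NEWLINE = "\n"

# Placeholder meta-information lines; the real HEADER lives in vcf_constants.
HEADER = ("##fileformat=VCFv4.2", "##source=mutacc_auto")


def write_header(header_lines):
    """Join the meta-information lines, each one terminated by NEWLINE."""
    return "".join(line + NEWLINE for line in header_lines)


vcf_string = write_header(HEADER)
```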
mutacc_auto/parse/parse_scout.py
Outdated
#Get samples
samples = [sample['sample_id'] for sample in scout_vcf_output[0]['samples']]
samples = '\t'.join(samples)
Same here: TAB
mutacc_auto/parse/parse_scout.py
Outdated
samples = [sample['sample_id'] for sample in scout_vcf_output[0]['samples']]
samples = '\t'.join(samples)

vcf_string += f"#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t{samples}\n"
I would like a simple "TAB.join(#CHROM, POS, ...etc)" if possible or even better
header_column_names = ["#CHROM", "POS", ..etc]
TAB.join(header_column_names, samples) + NEWLINE
That is a better description of what the parts are, and more readable. You will have to excuse my Python, but I think you get the idea.
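In runnable form the idea could look roughly like this. Note that `str.join` takes a single iterable, so the fixed column names and the sample IDs are concatenated first. `HEADER_COLUMN_NAMES` and `column_header_line` are illustrative names, not existing code in the PR.

```python
TAB = "\t"
NEWLINE = "\n"

# The nine fixed VCF columns, taken from the f-string in the diff above.
HEADER_COLUMN_NAMES = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]


def column_header_line(sample_ids):
    """Build the '#CHROM ...' line: fixed columns followed by one column per sample."""
    return TAB.join(HEADER_COLUMN_NAMES + list(sample_ids)) + NEWLINE
```

It would then be called with the list of sample IDs before they are joined, e.g. `vcf_string += column_header_line(samples)`.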
mutacc_auto/parse/parse_scout.py
Outdated
#Write variants
for variant in scout_vcf_output:

    entry= []
entry = []
And a line in a VCF is usually called a "record", I think...
mutacc_auto/parse/parse_scout.py
Outdated
for variant in scout_vcf_output:

    entry= []
    entry.append(str(variant['chromosome'])) #CHROM
Why not make this into a for loop and append on each iteration, using the key as the iterator?
Something like:
entry = []
for column in scout_header_column_names:
    if column in variant:
        entry.append(str(variant[column] or '.'))  # '.' since it was the only uninitialized place
Handle the PASS outside the loop if it is not part of the variant.
Make this into a def
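Putting the three comments together, a sketch of such a def could look like the following. Only the `'chromosome'` key appears in the diff; the other keys in `SCOUT_COLUMN_KEYS` are assumptions about the Scout export and would need to be checked against the real JSON. Unlike the loop suggested above, this version writes `.` for an absent key instead of skipping it, so that later columns do not shift.

```python
MISSING = "."  # VCF placeholder for a missing value

# Hypothetical Scout keys in VCF column order (CHROM, POS, ID, REF, ALT, QUAL).
# Only 'chromosome' is confirmed by the diff; the rest are assumed names.
SCOUT_COLUMN_KEYS = ("chromosome", "position", "dbsnp_id", "reference", "alternative", "quality")


def build_fixed_columns(variant, column_keys=SCOUT_COLUMN_KEYS, filter_value="PASS"):
    """Return the CHROM..FILTER fields of one record as a list of strings."""
    record = []
    for key in column_keys:
        value = variant.get(key)
        # Keep the column even when the key is absent or empty.
        record.append(MISSING if value is None or value == "" else str(value))
    # FILTER is not part of the variant, so it is added after the loop.
    record.append(filter_value)
    return record
```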
mutacc_auto/parse/parse_scout.py
Outdated
entry.append('PASS') #FILTER

#write INFO
info = []
make this into a def
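For example, something like the sketch below. The INFO-building code is not shown in this hunk, so the input shape (a plain dict of INFO keys to values) and the name `build_info` are assumptions. The output format follows the VCF convention of `KEY=value` pairs joined by `;`, with flags written as a bare key.

```python
def build_info(info_fields):
    """Format a dict as a VCF INFO string: 'KEY=value' pairs joined by ';'."""
    parts = []
    for key, value in info_fields.items():
        if value is True:
            parts.append(key)  # flag field: key only, no value
        elif value is None or value is False:
            continue  # nothing to report for this key
        elif isinstance(value, (list, tuple)):
            parts.append(f"{key}={','.join(str(v) for v in value)}")
        else:
            parts.append(f"{key}={value}")
    # An empty INFO column is written as '.'
    return ";".join(parts) if parts else "."
```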
mutacc_auto/parse/parse_scout.py
Outdated
entry.append(format)

samples = []
make this into a def
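A possible shape for that def, as a sketch only. The per-sample code is not visible in this hunk, so the mapping `FORMAT_TO_SAMPLE_KEY` and the keys `genotype_call` and `read_depth` are hypothetical; only `sample_id` and the `samples` list are confirmed by the diff.

```python
# Hypothetical mapping from VCF FORMAT keys to keys in a Scout sample dict.
FORMAT_TO_SAMPLE_KEY = {"GT": "genotype_call", "DP": "read_depth"}


def build_sample_columns(samples, format_map=FORMAT_TO_SAMPLE_KEY):
    """Return the FORMAT column followed by one colon-separated column per sample."""
    columns = [":".join(format_map)]  # e.g. 'GT:DP'
    for sample in samples:
        values = []
        for sample_key in format_map.values():
            value = sample.get(sample_key)
            values.append("." if value is None else str(value))
        columns.append(":".join(values))
    return columns
```

Returning the FORMAT string from the helper would also avoid a local variable named `format`, which shadows the Python builtin.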
Now recreates vcf files from the 'scout export variants --json --case-id ...' output