Recreate vcf from variants in scout #6

Merged
merged 3 commits into from
Mar 13, 2019

Conversation

adrosenbaum
Contributor

Now recreates vcf files from the 'scout export variants --json --case-id ...' output

Collaborator

@henrikstranneheim left a comment


Very good! I think you can split the code up a bit, see comments

@@ -2,6 +2,7 @@
from datetime import datetime, timedelta

from mutacc_auto.commands.scout_command import ScoutExportCases
from mutacc_auto.parse.vcf_constants import *
Collaborator

Does this mean import all? If so, it is usually better to be explicit. Otherwise your namespace might be polluted without you knowing it.
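
To make the concern concrete: a star import binds every public name of a module (or everything listed in its `__all__`, if that is defined) into the importing namespace. The helper below is only an illustration of that rule. It uses the standard-library `math` module as a stand-in, because the contents of `vcf_constants` are not shown in this thread, and the function name is made up for the example.

```python
import math


def names_bound_by_star_import(module):
    """Return the names that `from <module> import *` would bind."""
    public = getattr(module, "__all__", None)
    if public is None:
        # Without __all__, every name not starting with an underscore is imported.
        public = [name for name in vars(module) if not name.startswith("_")]
    return sorted(public)


star_names = names_bound_by_star_import(math)
print(len(star_names), "names, including:", star_names[:5])

# The explicit form binds only what is asked for, e.g. (assuming HEADER
# is defined in vcf_constants, which this thread does not confirm):
#   from mutacc_auto.parse.vcf_constants import HEADER
```

Any of those names can silently replace a variable of the same name defined earlier in the importing module, which is the pollution described above.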


#Write header of vcf
for header_line in HEADER:
    vcf_string += header_line + '\n'
Collaborator

Looks like a named constant + NEWLINE
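
One way to read this suggestion, as a minimal sketch: give the newline a name and build the header from it. The `HEADER` contents and the `format_header` helper are placeholders invented for the example; the real `HEADER` is defined elsewhere in the project.

```python
NEWLINE = "\n"

# Placeholder meta-information lines; the real HEADER is not shown in this thread.
HEADER = ["##fileformat=VCFv4.2", "##source=example"]


def format_header(header_lines):
    """Join meta-information lines, terminating each one with NEWLINE."""
    return "".join(line + NEWLINE for line in header_lines)


vcf_string = format_header(HEADER)
```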


#Get samples
samples = [sample['sample_id'] for sample in scout_vcf_output[0]['samples']]
samples = '\t'.join(samples)
Collaborator

Same here: use a TAB constant.

samples = [sample['sample_id'] for sample in scout_vcf_output[0]['samples']]
samples = '\t'.join(samples)

vcf_string += f"#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t{samples}\n"
Collaborator

I would like a simple "TAB.join(#CHROM, POS, ...etc)" if possible, or even better:
header_column_names = ["#CHROM", "POS", ..etc]
TAB.join(header_column_names, samples) + NEWLINE

That is a better description of what the parts are, and more readable. You will have to excuse my Python, but I think you get the idea.
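
In runnable Python the idea might look like the sketch below. `str.join` takes a single iterable, so the column names and the sample IDs are concatenated before joining. The column names come from the f-string in the diff above; `format_column_header` and the sample IDs are made up for illustration.

```python
TAB = "\t"
NEWLINE = "\n"

HEADER_COLUMN_NAMES = [
    "#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
]


def format_column_header(sample_ids):
    """Build the tab-separated column header line, ending in NEWLINE."""
    # str.join accepts one iterable, so the two lists are concatenated first.
    return TAB.join(HEADER_COLUMN_NAMES + list(sample_ids)) + NEWLINE


column_header = format_column_header(["sample_1", "sample_2"])
```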

#Write variants
for variant in scout_vcf_output:

    entry= []
Collaborator

entry = []

Collaborator

And a line in a VCF is usually called a "record", I think.

for variant in scout_vcf_output:

    entry= []
    entry.append(str(variant['chromosome'])) #CHROM
Collaborator

Why not make this into a for loop and append on each iteration, using the key as the iterator? Something like:
entry = []
for column in scout_header_column_names:
    if column in variant:
        entry.append(str(variant[column] or '.')) # Since it was the only uninitialized place
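
A slightly expanded sketch of that loop follows. Only the `'chromosome'` key appears in the diff; every other key name, the `MISSING` constant, and `build_fixed_columns` are hypothetical. The sketch also departs from the suggestion in two small ways: it emits `.` for a missing key rather than skipping it, so later columns cannot shift left, and it tests for `None` explicitly, so a legitimate value of `0` is not turned into `.`.

```python
MISSING = "."

# Hypothetical key names in VCF column order; only 'chromosome' is taken from the diff.
SCOUT_COLUMN_KEYS = [
    "chromosome", "position", "dbsnp_id", "reference", "alternative", "quality",
]


def build_fixed_columns(variant, column_keys=SCOUT_COLUMN_KEYS):
    """Return the leading columns of one VCF record as a list of strings."""
    record = []
    for key in column_keys:
        value = variant.get(key)
        # Emit '.' for absent or None values so the column count stays fixed.
        record.append(MISSING if value is None else str(value))
    return record


variant = {
    "chromosome": 1,
    "position": 12345,
    "reference": "A",
    "alternative": "T",
    "quality": None,
}
fixed = build_fixed_columns(variant)
```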

Collaborator

Handle PASS outside the loop if it is not part of the variant.

Collaborator

Make this into a def

entry.append('PASS') #FILTER

#write INFO
info = []
Collaborator

make this into a def


entry.append(format)

samples = []
Collaborator

make this into a def
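
Taken together, the three "make this into a def" comments point toward one small function per part of the record. The sketch below shows one possible split; it is not the code that was merged. The INFO and sample-handling code is not visible in this thread, so all function names, parameters, and input shapes here are assumptions, and flag-style INFO fields (keys without a value) are not handled.

```python
TAB = "\t"
NEWLINE = "\n"
FILTER_PASS = "PASS"


def format_info(info_fields):
    """Render a dict as a VCF INFO string such as 'END=100;TYPE=SNV'."""
    return ";".join(f"{key}={value}" for key, value in info_fields.items()) or "."


def format_samples(samples, format_keys):
    """Render one colon-separated column per sample, in FORMAT key order."""
    return [
        ":".join(str(sample.get(key, ".")) for key in format_keys)
        for sample in samples
    ]


def format_record(fixed_columns, info_fields, samples, format_keys):
    """Assemble one tab-separated VCF record ending in NEWLINE."""
    columns = list(fixed_columns)
    columns.append(FILTER_PASS)  # not part of the variant, so added outside any loop
    columns.append(format_info(info_fields))
    columns.append(":".join(format_keys))
    columns.extend(format_samples(samples, format_keys))
    return TAB.join(columns) + NEWLINE


record = format_record(
    ["1", "12345", ".", "A", "T", "."],
    {"END": 12345, "TYPE": "SNV"},
    [{"GT": "0/1", "DP": 30}, {"GT": "0/0"}],
    ["GT", "DP"],
)
```

Each helper can then be unit-tested on its own, which is the main benefit of splitting the code up.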

@adrosenbaum adrosenbaum merged commit b38d31a into master Mar 13, 2019
@adrosenbaum adrosenbaum deleted the vcf_from_json branch March 13, 2019 13:41