Run post process variants #319
Comments
There are several questions here, and I can answer them all individually, but first can you help me understand overall what you are trying to do? For context, when you run run_deepvariant, it already includes make_examples, call_variants, and postprocess_variants. So are you trying to re-run postprocess_variants?
@dhwani2410 My main aim is to get a list of all variant as well as non-variant sites in VCF format, not g.vcf format. I thought running postprocess_variants might help me with this.
Okay, I see. The run_deepvariant script you ran already includes postprocess_variants as the last step, which is the stage that produced the VCF and optionally gVCF. As I answered in your other issue (#318), there unfortunately aren't any parameters in postprocess_variants that will generate a VCF of every base without variants. I'm including the below for reference in case you or others want to run individual steps or pass specific parameters into the make_examples, call_variants, or postprocess_variants stages.
How to get usage information for run_deepvariant and other runner scripts
If you need to add parameters for postprocess_variants for another reason, you can add a
And for the specific flags for postprocess_variants.py
Again, I want to reiterate that I'm including this for reference only. For your specific request, we already know that no parameters will give a VCF with all the non-variant bases in the genome.
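As a minimal sketch of how the usage information for each stage could be looked up, the snippet below prints (rather than runs) one `docker run ... --help` command per tool. It assumes the binaries live under /opt/deepvariant/bin inside the image, as in the commands quoted in this thread, and that they accept the standard `--help` flag. The `help_cmd` function name and the default `BIN_VERSION` of 0.10.0 are placeholders chosen for illustration, not something stated in the thread.

```shell
# Sketch only: print the commands so they can be inspected before running.
# BIN_VERSION is an assumed placeholder; set it to the release you actually use.
BIN_VERSION="${BIN_VERSION:-0.10.0}"

# help_cmd is a hypothetical helper: $1 is the tool name inside the image.
help_cmd() {
  printf 'sudo docker run google/deepvariant:%s /opt/deepvariant/bin/%s --help\n' \
    "${BIN_VERSION}" "$1"
}

# One help command per tool mentioned in this thread.
for tool in run_deepvariant postprocess_variants; do
  help_cmd "${tool}"
done
```

Copying one of the printed lines into a terminal should list the flags that the corresponding binary accepts in that release.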
@dhwani2410 To do this manually, you can take the gVCF output by DeepVariant and use the following command from
Documentation at http://samtools.github.io/bcftools/bcftools.html#convert
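Assuming the comment refers to the `--gvcf2vcf` mode of `bcftools convert` (described in the linked documentation, and requiring `--fasta-ref`), a minimal sketch might look like the following. The `gvcf2vcf_cmd` function and all file names are hypothetical placeholders, and the command is printed rather than executed so it can be checked first. It is worth verifying the result on a small region, since gVCF conventions differ between callers.

```shell
# Sketch only: build a bcftools command that expands gVCF reference blocks
# into per-site records. gvcf2vcf_cmd and the file names are placeholders.
gvcf2vcf_cmd() {
  local ref="$1" gvcf="$2" out="$3"
  # --gvcf2vcf needs the reference FASTA; -Oz writes bgzip-compressed VCF.
  printf 'bcftools convert --gvcf2vcf --fasta-ref %s -Oz -o %s %s\n' \
    "${ref}" "${out}" "${gvcf}"
}

# Print the command for a hypothetical sample.
gvcf2vcf_cmd ref.fasta sample.g.vcf.gz sample.all_sites.vcf.gz
```

The reference passed to `--fasta-ref` should be the same FASTA that was given to DeepVariant via `--ref`.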
This is awesome, but are you sure this does what you think it does? The gVCF from DeepVariant is different from other gVCFs...
I used this command to run DeepVariant and generate a VCF file:
sudo docker run -v `pwd`:`pwd` -w `pwd` google/deepvariant:"${BIN_VERSION}" /opt/deepvariant/bin/run_deepvariant --model_type=WGS --ref=${base_path}/${ref_file_name}.fasta --reads=${base_path}/base_recalib/$i_vqsr.bam --output_vcf=deep_variant_results/$i.vcf.gz --output_gvcf=deep_variant_results/$i.g.vcf.gz
I want to run postprocess_variants but cannot see how to do it with the above command.
Is there some way to add parameters to the above command?
I found two links related to this:
a) https://github.com/google/deepvariant/blob/r0.10/docs/deepvariant-gvcf-support.md
GVCF_TFRECORDS="${OUTPUT_DIR}/HG002.gvcf.tfrecord@${N_SHARDS}.gz"
( time seq 0 $((N_SHARDS-1)) | \
  parallel --halt 2 --joblog "${LOG_DIR}/log" --res "${LOG_DIR}" \
    python "${BIN_DIR}"/make_examples.zip \
      --mode calling \
      --ref "${REF}" \
      --reads "${BAM}" \
      --examples "${EXAMPLES}" \
      --gvcf "${GVCF_TFRECORDS}" \
      --task {} \
) >"${LOG_DIR}/make_examples.log" 2>&1
There is no make_examples.zip in the bin directory. Also, what should be supplied to the --examples parameter? Can you please give more details about the variables?
b) #103
sudo docker run -v ${HOME}:${HOME} gcr.io/deepvariant-docker/deepvariant:0.7.0 /opt/deepvariant/bin/postprocess_variants \
  --ref ${OUTDIR}/data/hg19.fa \
  --infile ${OUTDIR}/output/cvo.tfrecord.gz \
  --outfile ${OUTDIR}/output/output.vcf.gz \
  --nonvariant_site_tfrecord_path ${OUTDIR}/output/gvcf.tfrecord@8.gz \
  --gvcf_outfile ${OUTDIR}/output/output.gvcf.gz
This is another approach, and its parameters are different. How should the infile be defined here?