Key SOMATIC in field INFO is not defined in the VCF file header of indel calls #38

Closed
hongenxu opened this issue Jul 18, 2016 · 7 comments


@hongenxu

hongenxu commented Jul 18, 2016

Hi,

I used SortVcf from picard-tools-1.141 to sort a VCF file produced by lofreq (version 2.1.2). The sorted VCF file will then be combined with indel calls from other algorithms (as suggested in the somaticseq pipeline). Picard gave me this error:
"Exception in thread "main" java.lang.IllegalStateException: Key SOMATIC found in VariantContext field INFO at 1:11113181 but this key isn't defined in the VCFHeader. We require all VCFs to have complete VCF headers by default."

My VCF file example:
`##fileformat=VCFv4.0
##fileDate=20160713
##source=lofreq call -d 101000 -f /home/proj/MDW_genomics/xu/galgal5/galgal5.fa --verbose --no-default-filter -b 1 --call-indels -a 0.010000 -C 6 -s -S /scratch/xu/MDV_project/bqsr/S10_normal_stringent.snvs.vcf.gz,/scratch/xu/MDV_project/bqsr/S10_normal_stringent.indels.vcf.gz --no-default-filter -r 1:1-98101272 -o /tmp/3464340.1.lofn.q/lofreq2_call_parallel1xcqSq/0.vcf.gz /scratch/xu/MDV_project/bqsr/918-3_S10.tumor.bam
##reference=/home/proj/MDW_genomics/xu/galgal5/galgal5.fa
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw Depth">
##INFO=<ID=AF,Number=1,Type=Float,Description="Allele Frequency">
##INFO=<ID=SB,Number=1,Type=Integer,Description="Phred-scaled strand bias at this position">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="Counts for ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=CONSVAR,Number=0,Type=Flag,Description="Indicates that the variant is a consensus variant (as opposed to a low frequency variant).">
##INFO=<ID=HRUN,Number=1,Type=Integer,Description="Homopolymer length to the right of report indel position">
##FILTER=<ID=min_dp_6,Description="Minimum Coverage 6">
##FILTER=<ID=max_dp_100000,Description="Maximum Coverage 100000">
##FILTER=<ID=sb_fdr,Description="Strand-Bias Multiple Testing Correction: fdr corr. pvalue > 0.001000">
##FILTER=<ID=indelqual_bonf,Description="Indel Quality Multiple Testing Correction: bonf corr. pvalue < 0.010000">
##INFO=<ID=UNIQ,Number=0,Type=Flag,Description="Unique, i.e. not detectable in paired sample">
##INFO=<ID=UQ,Number=1,Type=Integer,Description="Phred-scaled uniq score at this position">
##FILTER=<ID=uq_fdr,Description="Uniq Multiple Testing Correction: fdr corr. pvalue < 0.000100">
#CHROM POS ID REF ALT QUAL FILTER INFO
1 5177961 . GA G 110 PASS DP=12;AF=0.416667;SB=0;DP4=2,5,2,3;INDEL;HRUN=9;SOMATIC;UQ=63`

My question is: how can I modify my VCF file header to solve this small problem?

Best regards,
Hongen
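To see which keys trigger this kind of error, one can compare the INFO keys used in the records against those declared in `##INFO` header lines. The following is only a rough sketch of such a check, not Picard's actual implementation; the function name `undefined_info_keys` is hypothetical, and the input is assumed to be tab-separated VCF lines already read into memory.

```python
import re


def undefined_info_keys(vcf_lines):
    """Return INFO keys used in records but not declared in ##INFO header lines."""
    declared = set()
    used = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO="):
            # Header declaration, e.g. ##INFO=<ID=DP,Number=1,...>
            match = re.match(r"##INFO=<ID=([^,>]+)", line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            # Data record: the INFO column is the 8th tab-separated field.
            fields = line.split("\t")
            if len(fields) < 8 or fields[7] == ".":
                continue
            for entry in fields[7].split(";"):
                if entry:
                    # Flags have no "=", key/value pairs do.
                    used.add(entry.split("=", 1)[0])
    return sorted(used - declared)


# Shortened version of the example above (tabs written explicitly).
example = [
    '##fileformat=VCFv4.0',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw Depth">',
    '##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
    '1\t5177961\t.\tGA\tG\t110\tPASS\tDP=12;INDEL;SOMATIC',
]
print(undefined_info_keys(example))  # ['SOMATIC']
```

For a real file, the lines could be read with `open(path)` (or `gzip.open(path, "rt")` for `.vcf.gz`) and passed in directly.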

@andreas-wilm
Contributor

Hey Hongen, sorry about this.

You can fix existing files by adding the following line to the vcf header: ##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic event">

Commit 7086354 fixed the issue.

Thanks for reporting,
Andreas
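The header fix suggested above can be applied to existing files with a few lines of Python. This is a minimal sketch, assuming the VCF fits in memory and that the new line should go directly before the `#CHROM` column header; the function name `add_somatic_header` is hypothetical, while the header line itself is the one quoted in the reply.

```python
SOMATIC_HEADER = '##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic event">'


def add_somatic_header(vcf_lines):
    """Insert the SOMATIC INFO declaration before the #CHROM line if it is missing."""
    already_declared = any(
        line.startswith("##INFO=<ID=SOMATIC,") for line in vcf_lines
    )
    fixed = []
    for line in vcf_lines:
        if line.startswith("#CHROM") and not already_declared:
            fixed.append(SOMATIC_HEADER)
        fixed.append(line)
    return fixed


# Lines are given without trailing newlines, e.g. from text.splitlines().
example = [
    '##fileformat=VCFv4.0',
    '##INFO=<ID=UQ,Number=1,Type=Integer,Description="Phred-scaled uniq score at this position">',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
    '1\t5177961\t.\tGA\tG\t110\tPASS\tINDEL;SOMATIC;UQ=63',
]
fixed = add_somatic_header(example)
print("\n".join(fixed))
```

Running the function a second time on its own output leaves the lines unchanged, so it should be safe to apply to a mix of fixed and unfixed files.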

@NitinMandloi

lofreq call-parallel --pp-threads 8 --call-indels -l file.bed --ref hg19.fa -o out.vcf input.bam

WARNING [2016-10-24 13:24:13,952]: Regions getting too small to be efficiently processed
INFO [2016-10-24 13:24:13,963]: Adding 19 commands to mp-pool
Number of substitution tests performed: 4476
Number of indel tests performed: 285
INFO [2016-10-24 13:24:28,884]: Executing lofreq filter -i /tmp/lofreq2_call_parallelbAdZ7k/concat.vcf.gz -o 30277_10000x_7985.vcf --snvqual-thresh 57 --indelqual-thresh 45

Why is lofreq not reporting any indels for me?

@NitinMandloi

Is any specific format required for the BAM file?

@andreas-wilm
Contributor

Did you pre-process the BAM file with GATK's BQSR or lofreq indelqual so that indel qualities were inserted?

@NitinMandloi

I just aligned the reads with BWA-MEM, then did the SAM-to-BAM conversion and sorting with samtools.

@NitinMandloi

Got it. I need to use the --dindel option.

Thank you

Regards

@andreas-wilm
Contributor

This won't work. You need GATK's BQSR or lofreq indelqual, as mentioned in the documentation.

On 24 October 2016 at 17:00, NitinMandloi wrote:

I just aligned the reads with BWA-MEM, then did the SAM-to-BAM conversion and sorting with samtools.
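The preprocessing discussed above (inserting indel qualities before calling) might be scripted roughly as follows. This is a sketch under assumptions: the flags shown (`lofreq indelqual --dindel -f ... -o ...`, and `-f` as the short form of `--ref` for `call-parallel`) are taken from lofreq's command-line help as I understand it and should be checked against the installed version; the function name and all file names are hypothetical. The code only builds and prints the commands, it does not run them.

```python
import shlex


def indel_calling_commands(ref, in_bam, bam_with_quals, out_vcf, threads=8):
    """Build the two lofreq invocations: add indel qualities, then call variants."""
    # Step 1: insert indel qualities into the BAM (assumed flags, see lead-in).
    indelqual = [
        "lofreq", "indelqual", "--dindel",
        "-f", ref, "-o", bam_with_quals, in_bam,
    ]
    # Step 2: call SNVs and indels on the BAM that now carries indel qualities.
    call = [
        "lofreq", "call-parallel", "--pp-threads", str(threads),
        "--call-indels", "-f", ref, "-o", out_vcf, bam_with_quals,
    ]
    return [indelqual, call]


for cmd in indel_calling_commands(
    "hg19.fa", "input.bam", "input.indelqual.bam", "out.vcf"
):
    # Print shell-ready commands; pass each list to subprocess.run to execute.
    print(shlex.join(cmd))
```

The intermediate BAM will presumably need to be indexed (for example with `samtools index`) before `call-parallel` can process it by region.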

