Cosmic Annotation is not parsed correctly #83

Closed
roselucia opened this issue Nov 10, 2019 · 2 comments

Comments

@roselucia

Dear Kai,

I used your new ANNOVAR version successfully. However, I think I have run into a small parsing bug: the COSMIC annotation does not seem to be parsed correctly:
"cosmic90_coding=ID\x3dCOSV63870864\x3bOCCURENCE\x3d2(haematopoietic_and_lymphoid_tissue),1(large_intestine)"

(1) command line argument
perl table_annovar.pl /Users/rosefroehlich/Desktop/TST170_SnpEffAnnotation/TST170_32a_SnpEffAnnotation.vcf humandb/ -buildver hg19 -out /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a -remove -protocol refGene,ensGene,cytoBand,exac03,gnomad211_genome,gnomad211_exome,1000g2015aug_all,1000g2015aug_eur,avsnp150,dbnsfp35a,cosmic90_coding,cosmic90_noncoding,clinvar_20190305 -operation g,g,r,f,f,f,f,f,f,f,f,f,f -nastring . -vcfinput -polish

(2) Content of Terminal Window:
$ cd /Users/rosefroehlich/Desktop/Annovar_Safari_Download/annovar
Rose-Frohlichs-MacBook-Pro:annovar rosefroehlich$ perl table_annovar.pl /Users/rosefroehlich/Desktop/TST170_SnpEffAnnotation/TST170_32a_SnpEffAnnotation.vcf humandb/ -buildver hg19 -out /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a -remove -protocol refGene,ensGene,cytoBand,exac03,gnomad211_genome,gnomad211_exome,1000g2015aug_all,1000g2015aug_eur,avsnp150,dbnsfp35a,cosmic90_coding,cosmic90_noncoding,clinvar_20190305 -operation g,g,r,f,f,f,f,f,f,f,f,f,f -nastring . -vcfinput -polish

NOTICE: Running with system command <convert2annovar.pl -includeinfo -allsample -withfreq -format vcf4 /Users/rosefroehlich/Desktop/TST170_SnpEffAnnotation/TST170_32a_SnpEffAnnotation.vcf > /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput>
NOTICE: Finished reading 4858 lines from VCF file
NOTICE: A total of 4798 locus in VCF file passed QC threshold, representing 4192 SNPs (2019 transitions and 2173 transversions) and 606 indels/substitutions
NOTICE: Finished writing allele frequencies based on 4192 SNP genotypes (2019 transitions and 2173 transversions) and 606 indels/substitutions for 1 samples

NOTICE: Running with system command <table_annovar.pl /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/ -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a -remove -protocol refGene,ensGene,cytoBand,exac03,gnomad211_genome,gnomad211_exome,1000g2015aug_all,1000g2015aug_eur,avsnp150,dbnsfp35a,cosmic90_coding,cosmic90_noncoding,clinvar_20190305 -operation g,g,r,f,f,f,f,f,f,f,f,f,f -nastring . -polish -otherinfo>

NOTICE: Processing operation=g protocol=refGene

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg19 -dbtype refGene -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.refGene -exonsort -nofirstcodondel /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: Output files are written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.refGene.variant_function, /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.refGene.exonic_variant_function
NOTICE: Reading gene annotation from humandb/hg19_refGene.txt ... Done with 72212 transcripts (including 17527 without coding sequence annotation) for 28250 unique genes
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Reading FASTA sequences from humandb/hg19_refGeneMrna.fa ... Done with 455 sequences
WARNING: A total of 446 sequences will be ignored due to lack of correct ORF annotation

NOTICE: Running with system command <coding_change.pl /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.refGene.exonic_variant_function.orig humandb//hg19_refGene.txt humandb//hg19_refGeneMrna.fa -alltranscript -out /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.refGene.fa -newevf /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.refGene.exonic_variant_function>

NOTICE: Processing operation=g protocol=ensGene

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg19 -dbtype ensGene -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.ensGene -exonsort -nofirstcodondel /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: Output files are written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.ensGene.variant_function, /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.ensGene.exonic_variant_function
NOTICE: Reading gene annotation from humandb/hg19_ensGene.txt ... Done with 196501 transcripts (including 101155 without coding sequence annotation) for 57905 unique genes
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Reading FASTA sequences from humandb/hg19_ensGeneMrna.fa ... Done with 586 sequences
WARNING: A total of 6780 sequences will be ignored due to lack of correct ORF annotation

NOTICE: Running with system command <coding_change.pl /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.ensGene.exonic_variant_function.orig humandb//hg19_ensGene.txt humandb//hg19_ensGeneMrna.fa -alltranscript -out /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.ensGene.fa -newevf /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.ensGene.exonic_variant_function>

NOTICE: Processing operation=r protocol=cytoBand

NOTICE: Running with system command <annotate_variation.pl -regionanno -dbtype cytoBand -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: Output file is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_cytoBand
NOTICE: Reading annotation database humandb/hg19_cytoBand.txt ... Done with 862 regions
NOTICE: Finished region-based annotation on 4798 genetic variants

NOTICE: Processing operation=f protocol=exac03
NOTICE: Finished reading 8 column headers for '-dbtype exac03'

NOTICE: Running system command <annotate_variation.pl -filter -dbtype exac03 -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/ -otherinfo>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_exac03_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_exac03_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 749886 and the number of bins to be scanned is 472
NOTICE: Scanning filter database humandb/hg19_exac03.txt...Done

NOTICE: Processing operation=f protocol=gnomad211_genome
NOTICE: Finished reading 17 column headers for '-dbtype gnomad211_genome'

NOTICE: Running system command <annotate_variation.pl -filter -dbtype gnomad211_genome -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/ -otherinfo>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_gnomad211_genome_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_gnomad211_genome_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 28119985 and the number of bins to be scanned is 917
NOTICE: Scanning filter database humandb/hg19_gnomad211_genome.txt...Done

NOTICE: Processing operation=f protocol=gnomad211_exome
NOTICE: Finished reading 17 column headers for '-dbtype gnomad211_exome'

NOTICE: Running system command <annotate_variation.pl -filter -dbtype gnomad211_exome -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/ -otherinfo>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_gnomad211_exome_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_gnomad211_exome_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 773145 and the number of bins to be scanned is 474
NOTICE: Scanning filter database humandb/hg19_gnomad211_exome.txt...Done

NOTICE: Processing operation=f protocol=1000g2015aug_all

NOTICE: Running system command <annotate_variation.pl -filter -dbtype 1000g2015aug_all -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_ALL.sites.2015_08_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_ALL.sites.2015_08_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 2824642 and the number of bins to be scanned is 669
NOTICE: Scanning filter database humandb/hg19_ALL.sites.2015_08.txt...Done

NOTICE: Processing operation=f protocol=1000g2015aug_eur

NOTICE: Running system command <annotate_variation.pl -filter -dbtype 1000g2015aug_eur -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_EUR.sites.2015_08_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_EUR.sites.2015_08_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 2812033 and the number of bins to be scanned is 668
NOTICE: Scanning filter database humandb/hg19_EUR.sites.2015_08.txt...Done

NOTICE: Processing operation=f protocol=avsnp150

NOTICE: Running system command <annotate_variation.pl -filter -dbtype avsnp150 -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_avsnp150_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_avsnp150_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 28258790 and the number of bins to be scanned is 917
NOTICE: Scanning filter database humandb/hg19_avsnp150.txt...Done

NOTICE: Processing operation=f protocol=dbnsfp35a
NOTICE: Finished reading 70 column headers for '-dbtype dbnsfp35a'

NOTICE: Running system command <annotate_variation.pl -filter -dbtype dbnsfp35a -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/ -otherinfo>
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_dbnsfp35a_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_dbnsfp35a_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 550512 and the number of bins to be scanned is 456
NOTICE: Scanning filter database humandb/hg19_dbnsfp35a.txt...Done

NOTICE: Processing operation=f protocol=cosmic90_coding

NOTICE: Running system command <annotate_variation.pl -filter -dbtype cosmic90_coding -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: the --dbtype cosmic90_coding is assumed to be in generic ANNOVAR database format
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_cosmic90_coding_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_cosmic90_coding_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Scanning filter database humandb/hg19_cosmic90_coding.txt...Done

NOTICE: Processing operation=f protocol=cosmic90_noncoding

NOTICE: Running system command <annotate_variation.pl -filter -dbtype cosmic90_noncoding -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/>
NOTICE: the --dbtype cosmic90_noncoding is assumed to be in generic ANNOVAR database format
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_cosmic90_noncoding_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_cosmic90_noncoding_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Scanning filter database humandb/hg19_cosmic90_noncoding.txt...Done

NOTICE: Processing operation=f protocol=clinvar_20190305
NOTICE: Finished reading 5 column headers for '-dbtype clinvar_20190305'

NOTICE: Running system command <annotate_variation.pl -filter -dbtype clinvar_20190305 -buildver hg19 -outfile /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.avinput humandb/ -otherinfo>
NOTICE: the --dbtype clinvar_20190305 is assumed to be in generic ANNOVAR database format
NOTICE: Output file with variants matching filtering criteria is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_clinvar_20190305_dropped, and output file with other variants is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_clinvar_20190305_filtered
NOTICE: Processing next batch with 4798 unique variants in 4798 input lines
NOTICE: Database index loaded. Total number of bins is 45822 and the number of bins to be scanned is 317
NOTICE: Scanning filter database humandb/hg19_clinvar_20190305.txt...Done

NOTICE: Multianno output file is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_multianno.txt
NOTICE: Reading from /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_multianno.txt

NOTICE: VCF output is written to /Users/rosefroehlich/Desktop/TST170_Multianno/TST170_32a.hg19_multianno.vcf

(4) I use a MacBook Pro (13-inch, Early 2011), 2.3 GHz Intel Core i5, 8 GB 1333 MHz DDR3, Samsung SSD 840 EVO, Intel HD Graphics 3000 512 MB, C02FG0ENDH2L, macOS High Sierra 10.13.6.

Thanks a lot for your help.

All the best,
Rose

@hsiaoyi0504

I think this has been discussed previously; see #41. Check this one as well: https://www.biostars.org/p/266798/
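
For what it's worth, the `\x3d` and `\x3b` in the quoted value look like hex escapes for `=` (ASCII 0x3d) and `;` (ASCII 0x3b), both of which are reserved characters in a VCF INFO field. Assuming that is what is happening here, the value can be decoded in post-processing. Below is a minimal Python sketch; the function names `decode_hex_escapes` and `parse_cosmic` are made up for illustration and are not part of ANNOVAR.

```python
import re


def decode_hex_escapes(value: str) -> str:
    """Replace each literal \\xHH sequence with the character it encodes."""
    return re.sub(
        r"\\x([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        value,
    )


def parse_cosmic(value: str) -> dict:
    """Decode the escapes, then split 'KEY=VAL;KEY=VAL' into a dict.

    Assumes the decoded value is a ';'-separated list of KEY=VALUE pairs,
    as in the example from this issue.
    """
    decoded = decode_hex_escapes(value)
    return dict(item.split("=", 1) for item in decoded.split(";") if "=" in item)


# The value quoted in this issue, without the leading 'cosmic90_coding='.
raw = (
    r"ID\x3dCOSV63870864\x3bOCCURENCE\x3d"
    r"2(haematopoietic_and_lymphoid_tissue),1(large_intestine)"
)

print(decode_hex_escapes(raw))
print(parse_cosmic(raw)["ID"])
```

The first `print` should show the value with `=` and `;` restored, and the second the COSMIC ID on its own.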

@roselucia

Yes, thank you.
