This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Oncotator input VCF REF/ALT alleles do not match output alleles #340

Closed
crutching opened this issue Feb 11, 2016 · 6 comments

Comments

@crutching

We ran across this issue while working with a bare-bones input VCF (only CHROM, POS, REF, and ALT populated; every other field is '.'). For instance:

1 162740326 . GT C . . .

results in:

1 162740326 . CT C .

We see similar behavior for C/TCT, which becomes C/CCT. All of the examples of this behavior I have found so far are indels.

@dwking2000

The version of Oncotator we are running shows up in the output file as:

##oncotator_version=Oncotator_v1.8.0.0_|Flat_File_Reference_hg19|GENCODE_v19_CANONICAL|UniProt_AAxform_2014_12|COSMIC_v62_291112|dbNSFP_v2.4|1000gp3_20130502|dbSNP_build_142|ESP_6500SI-V2|ESP_6500SI-V2|ClinVar_12.03.20|UniProt_AA_2014_12|CCLE_By_GP_09292010|ORegAnno_UCSC_Track|Ensembl_ICGC_MUCOPA|TCGAScape_110405|HGNC_Sept172014|MutSig_Published_Results_20110905|Familial_Cancer_Genes_20110905|CCLE_By_Gene_09292010|gencode_xref_refseq_metadata_v19|HumanDNARepairGenes_20110905|COSMIC_FusionGenes_v62_291112|UniProt_2014_12|COSMIC_Tissue_291112|TUMORScape_20100104|CGC_full_2012-03-15|ACHILLES_Lineage_Results_110303

@LeeTL1220
Contributor

Is this on the website or the standalone?

@dwking2000

Standalone

@LeeTL1220
Contributor

What you've shown there is not a valid VCF. The REF and ALT alleles must start with the reference base that precedes the actual mutation. Oncotator prepends the reference base, which is correct.
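
As a rough illustration of the convention described in this comment, the following Python sketch checks whether an indel-style REF/ALT pair starts with a shared base. The function name `shares_leading_base` is hypothetical and is not part of Oncotator; the check covers only the leading-base convention, not full VCF validation.

```python
def shares_leading_base(ref: str, alt: str) -> bool:
    """Return True if REF and ALT satisfy the leading-base convention.

    Hypothetical helper for illustration only (not an Oncotator function).
    Alleles of equal length (SNVs, MNPs) are accepted as-is; alleles of
    different length are expected to begin with the same base.
    """
    if len(ref) == len(alt):
        # Same-length alleles need no padding base.
        return True
    # Different lengths: both alleles must be non-empty and share base 1.
    return bool(ref) and bool(alt) and ref[0].upper() == alt[0].upper()


# Allele pairs quoted in this issue: the two inputs, then the two outputs.
for ref, alt in [("GT", "C"), ("C", "TCT"), ("CT", "C"), ("C", "CCT")]:
    print(ref, alt, shares_leading_base(ref, alt))
```

Under this check, the two input pairs from the report (GT/C and C/TCT) fail and the two output pairs (CT/C and C/CCT) pass. Passing only means the alleles share a first base; it does not show that the base agrees with the reference genome.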

@crutching
Author

@LeeTL1220 Ah, yes, I overlooked that. I will have to bring this back to the developers of this particular variant caller.

I would say that changing GT/C to CT/C does not completely make sense. If you look at the reference, the base immediately preceding GT is G (AGGGTGT). So, you would write this as GGT/GC, certainly not CT/C. I understand that Oncotator is assuming the input VCF follows the spec, but I would think some quick validation should occur before modifying these fields and potentially grabbing incorrect annotations.
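
A minimal sketch of the kind of quick validation suggested above, written in Python under stated assumptions: the reference is a plain in-memory string, positions are 1-based, and the names `ref_matches` and `pad_with_preceding_base` are hypothetical (they are not Oncotator functions). The toy reference is the AGGGTGT context quoted above, with toy coordinates rather than the real genomic position.

```python
def ref_matches(reference: str, pos: int, ref: str) -> bool:
    """True if `ref` equals the reference sequence starting at 1-based `pos`."""
    start = pos - 1
    return reference[start:start + len(ref)].upper() == ref.upper()


def pad_with_preceding_base(reference: str, pos: int, ref: str, alt: str):
    """Left-pad REF and ALT with the reference base just before `pos`.

    Returns (new_pos, new_ref, new_alt). Raises ValueError instead of
    rewriting the record when REF disagrees with the reference or when
    there is no preceding base to use.
    """
    if not ref_matches(reference, pos, ref):
        raise ValueError(f"REF {ref!r} does not match reference at {pos}")
    if pos < 2:
        raise ValueError("no preceding base available for padding")
    base = reference[pos - 2]  # base immediately before `pos` (1-based)
    return pos - 1, base + ref, base + alt


# Toy reference context from the comment; GT starts at toy position 4.
TOY_REFERENCE = "AGGGTGT"

print(pad_with_preceding_base(TOY_REFERENCE, 4, "GT", "C"))  # (3, 'GGT', 'GC')
print(ref_matches(TOY_REFERENCE, 3, "CT"))  # False
```

With the toy coordinates, GT/C is padded to GGT/GC, while a REF of CT fails the reference check rather than being passed on for annotation.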

@dwking2000

You can close this issue; we have determined that the input VCF is not valid. It would be good to open a separate issue about Oncotator accepting and processing invalid input.
