Incorrect translation when the HGVS string does not contain a reference or alt allele #32

Open
melissacline opened this issue Feb 15, 2017 · 1 comment

Comments

@melissacline

I've come across this problem with strings such as NM_007294.3:c.1209dup, which IMHO should really be NM_007294.3:c.1209dupT (that is how ClinVar represents the variant), although Mutalyzer reports NM_007294.3:c.1209dup as valid HGVS. When I parse the name with

chrom, offset, ref, alt = hgvs.parse_hgvs_name(variant, genome, get_transcript=get_transcript)

I get 'C' for both ref and alt, whereas alt should be 'CC'. If there's a way around this, please let me know!
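To make the expected result concrete, here is a minimal sketch of the left-padded VCF-style alleles for a duplication, using a toy sequence rather than the real transcript. The helper `dup_to_vcf_alleles` is hypothetical and not part of pyhgvs; it assumes forward-strand, 1-based inclusive coordinates and represents the duplication as an insertion anchored on the base to its left.

```python
def dup_to_vcf_alleles(sequence, dup_start, dup_end):
    """Return (pos, ref, alt) for duplicating sequence[dup_start..dup_end].

    Hypothetical helper for illustration only. Coordinates are 1-based and
    inclusive, and the sequence is assumed to be on the forward strand.
    """
    if not (2 <= dup_start <= dup_end <= len(sequence)):
        raise ValueError("duplication needs a padding base to its left "
                         "and must lie inside the sequence")
    padding = sequence[dup_start - 2]              # base just left of the duplication
    duplicated = sequence[dup_start - 1:dup_end]   # the duplicated bases
    # The reference allele is the padding base alone; the alternate allele
    # is the padding base followed by the inserted copy.
    return dup_start - 1, padding, padding + duplicated


# Toy example: duplicating the 'C' at position 3 of "ACCG".
pos, ref, alt = dup_to_vcf_alleles("ACCG", 3, 3)
print(pos, ref, alt)  # 2 C CC
```

With the duplicated base and its left neighbour both 'C', this gives ref 'C' and alt 'CC', the shape of result described above.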

Thanks!

@mbosio85

Hi,
I had the same issue and edited the __init__.py file in two places.
A disclaimer: it works for me because I have a small set to convert, and I use genomic coordinates with no transcript in my conversion. If you apply these changes, double-check that they work before using them in production.
1- Insert a clause in get_vcf_allele() (the if hgvs.mutation_type == 'dup' part):

def get_vcf_allele(hgvs, genome, transcript=None):
    """Get a VCF-style allele from a HGVSName, a genome, and a transcript."""
    chrom, start, end = hgvs.get_vcf_coords(transcript)
    _, alt = hgvs.get_ref_alt(
        transcript.tx_position.is_forward_strand if transcript else True)
    ref = get_genomic_sequence(genome, chrom, start, end)
    # Added clause: the HGVS name gave no duplicated bases (e.g. c.1209dup),
    # so build the alternate allele from the reference sequence instead.
    if hgvs.mutation_type == 'dup' and alt == '':
        alt = 2 * ref[1:]
    if hgvs.mutation_type in _indel_mutation_types:
        # Left-pad alternate allele.
        alt = ref[0] + alt
    return chrom, start, end, ref, alt
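As a way to sanity-check the added clause before relying on it, the allele arithmetic can be replayed on plain strings. This is only a sketch: `patched_alt` is a hypothetical stand-in, not library code, and it assumes the left-padding branch always runs for a dup. Whether `ref` actually contains the duplicated bases depends on the coordinates that `get_vcf_coords` returns for a dup, which is worth verifying on your own data.

```python
def patched_alt(ref, alt=""):
    """Replay the allele arithmetic of the patched branch on plain strings.

    Hypothetical stand-in for illustration; `ref` plays the role of the
    sequence fetched from the genome, `alt` the allele parsed from the name.
    """
    if alt == "":
        # Added clause: duplicate everything after the padding base.
        alt = 2 * ref[1:]
    # Left-pad the alternate allele with the first reference base.
    return ref[0] + alt


# ref holds a padding base plus the duplicated base: the copy is added.
print(patched_alt("CT"))  # CTT

# ref holds only the padding base: ref[1:] is empty, so alt equals ref.
print(patched_alt("C"))   # C
```

The second case reproduces the reported symptom (ref and alt identical), so it is a useful thing to test for after applying the change.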

2- Comment out the duplication handling in get_ref_alt():

def get_ref_alt(self, is_forward_strand=True):
    """Return reference and alternate alleles."""
    if self.kind == 'p':
        raise NotImplementedError(
            'get_ref_alt is not implemented for protein HGVS names')
    alleles = [self.ref_allele, self.alt_allele]

    # Represent duplications as inserts.
    #if self.mutation_type == "dup":
    #    alleles[0] = ""
    #    alleles[1] = alleles[1][:len(alleles[1]) / 2]

    if is_forward_strand:
        return alleles
    else:
        return tuple(map(revcomp, alleles))


If you plan to do the same, remember to reinstall the package after modifying it.
Hope it helps,
Mattia
