# 4 Exploring the AlleleTranslator
The vrs-python *AlleleTranslator* supports four variant nomenclatures: SPDI, gnomAD/VCF, Beacon, and HGVS. In this notebook we translate one variant from each nomenclature to VRS, and translate back to the original nomenclature where that direction is supported.

#### Step 1 - Setup Data Proxy Access
The *DataProxy* provides access to sequence references.

In [1]:
from ga4gh.vrs.dataproxy import create_dataproxy
seqrepo_rest_service_url = "seqrepo+https://services.genomicmedlab.org/seqrepo"
seqrepo_dataproxy = create_dataproxy(uri=seqrepo_rest_service_url)

Import the *AlleleTranslator* class.

In [2]:
from ga4gh.vrs.extras.translator import AlleleTranslator

Translating from or to HGVS requires access to a UTA (Universal Transcript Archive) database, so set the UTA_DB_URL environment variable before translating.

In [3]:
import os
os.environ["UTA_DB_URL"] = "postgresql://anonymous:anonymous@uta.biocommons.org:5432/uta/uta_20210129b"

#### From/To HGVS
This example translates an HGVS variant to VRS using the *AlleleTranslator* *translate_from* method. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/989417/?oq=989417).

In [4]:
allele_translator = AlleleTranslator(data_proxy=seqrepo_dataproxy)
allele = allele_translator.translate_from("NC_000017.11:g.61860483_61863459delinsTGCGCCACCACGCCCAGCTAATTTTGGATTTTTAG", "hgvs")
allele.model_dump(exclude_none=True)

{'id': 'ga4gh:VA._Dy-JtNisSoxwmkOonFAjywkrVrtBz1d',
 'type': 'Allele',
 'digest': '_Dy-JtNisSoxwmkOonFAjywkrVrtBz1d',
 'location': {'id': 'ga4gh:SL.ggzAgF2Nwx-oLD2NFXtSX4lEjtVN-YPK',
  'type': 'SequenceLocation',
  'digest': 'ggzAgF2Nwx-oLD2NFXtSX4lEjtVN-YPK',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.dLZ15tNO1Ur0IcGjwc3Sdi_0A6Yf4zm7'},
  'start': 61860482,
  'end': 61863459},
 'state': {'type': 'LiteralSequenceExpression',
  'sequence': 'TGCGCCACCACGCCCAGCTAATTTTGGATTTTTAG'}}

The output above is the VRS representation of the *Allele*. Using the *AlleleTranslator* *translate_to* method, we can convert it back to the HGVS representation.

In [5]:
allele_translator.translate_to(allele, "hgvs")

['NC_000017.11:g.61860483_61863459delinsTGCGCCACCACGCCCAGCTAATTTTGGATTTTTAG']
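
Note that the VRS *start* (61860482) is one less than the HGVS start (61860483), while *end* (61863459) is unchanged. HGVS positions are 1-based and inclusive, whereas VRS uses inter-residue coordinates (0-based, end-exclusive). The sketch below illustrates that conversion for a simple range; the helper names are hypothetical and are not part of vrs-python, which handles this internally.

```python
def hgvs_range_to_interresidue(start: int, end: int) -> tuple[int, int]:
    """Convert a 1-based, inclusive HGVS range to inter-residue coordinates."""
    if start < 1 or end < start:
        raise ValueError(f"invalid HGVS range: {start}_{end}")
    # Only the start shifts: the 1-based inclusive end equals the exclusive end.
    return start - 1, end


def interresidue_to_hgvs_range(start: int, end: int) -> tuple[int, int]:
    """Convert a non-empty inter-residue interval back to a 1-based, inclusive range."""
    if start < 0 or end <= start:
        raise ValueError(f"interval must be non-empty: [{start}, {end})")
    return start + 1, end


# Coordinates taken from the delins example above.
print(hgvs_range_to_interresidue(61860483, 61863459))  # (61860482, 61863459)
print(interresidue_to_hgvs_range(61860482, 61863459))  # (61860483, 61863459)
```

The same shift explains the single-base substitutions elsewhere in this notebook: a 1-based position p becomes the interval (p - 1, p).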

By default, the *AlleleTranslator* uses "GRCh38" as its assembly. For an HGVS variant, however, the assembly actually used is inferred from the reference sequence accession in the expression, so the default does not override it. To choose a different default assembly, pass the keyword argument "default_assembly_name" when creating the *AlleleTranslator*:
> AlleleTranslator(data_proxy=seqrepo_dataproxy, default_assembly_name="GRCh37")

This example uses the GRCh37 representation of the variant (accession NC_000022.10), even though the translator was created with the GRCh38 default. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/2461826).

In [6]:
allele = allele_translator.translate_from("NC_000022.10:g.24379392C>T", "hgvs")
allele.model_dump(exclude_none=True)

{'id': 'ga4gh:VA.4XLIRag1TPKSGV8YkIBtOJTOH5ETOg0p',
 'type': 'Allele',
 'digest': '4XLIRag1TPKSGV8YkIBtOJTOH5ETOg0p',
 'location': {'id': 'ga4gh:SL.GcUa9awkL-Tw3vwvnKO19bO_DpatpIbr',
  'type': 'SequenceLocation',
  'digest': 'GcUa9awkL-Tw3vwvnKO19bO_DpatpIbr',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.XOgHwwR3Upfp5sZYk6ZKzvV25a4RBVu8'},
  'start': 24379391,
  'end': 24379392},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'T'}}
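
In each output, the *id* field is the *digest* prefixed with a type-specific GA4GH namespace: "ga4gh:VA." for an Allele and "ga4gh:SL." for a SequenceLocation. The following sketch checks that relationship on the dictionary printed above. The function name and prefix table are hypothetical helpers written for this illustration, and the check does not recompute the digest itself.

```python
# Type prefixes as they appear in the outputs of this notebook.
GA4GH_PREFIXES = {"Allele": "ga4gh:VA.", "SequenceLocation": "ga4gh:SL."}


def id_matches_digest(obj: dict) -> bool:
    """Return True if obj['id'] equals the type prefix followed by obj['digest']."""
    prefix = GA4GH_PREFIXES[obj["type"]]
    return obj["id"] == prefix + obj["digest"]


# Copied from the GRCh37 example output above (fields not needed here are omitted).
allele_dict = {
    "id": "ga4gh:VA.4XLIRag1TPKSGV8YkIBtOJTOH5ETOg0p",
    "type": "Allele",
    "digest": "4XLIRag1TPKSGV8YkIBtOJTOH5ETOg0p",
    "location": {
        "id": "ga4gh:SL.GcUa9awkL-Tw3vwvnKO19bO_DpatpIbr",
        "type": "SequenceLocation",
        "digest": "GcUa9awkL-Tw3vwvnKO19bO_DpatpIbr",
    },
}

print(id_matches_digest(allele_dict))              # True
print(id_matches_digest(allele_dict["location"]))  # True
```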

#### From/To SPDI
This example translates a SPDI representation of a variant to VRS and back. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/2161661).

In [7]:
allele = allele_translator.translate_from("NC_000014.9:95116662:G:A","spdi")
allele.model_dump(exclude_none=True)

{'id': 'ga4gh:VA.VC-fz_GLT49WeUj2umwzGPxgldJHsq85',
 'type': 'Allele',
 'digest': 'VC-fz_GLT49WeUj2umwzGPxgldJHsq85',
 'location': {'id': 'ga4gh:SL.mjTPtghQ7gXU8J-S7q1c8ksTaVRTADXn',
  'type': 'SequenceLocation',
  'digest': 'mjTPtghQ7gXU8J-S7q1c8ksTaVRTADXn',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.eK4D2MosgK_ivBkgi6FVPg5UXs1bYESm'},
  'start': 95116662,
  'end': 95116663},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'A'}}

In [8]:
allele_translator.translate_to(allele, "spdi")

['NC_000014.9:95116662:1:A']
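
The round trip returns "1" rather than "G" in the deletion field. SPDI allows the deleted part to be given either as a sequence or as its length, and the two forms describe the same variant. SPDI positions are already 0-based, which is why the VRS *start* equals the SPDI position here. The sketch below parses both forms and shows that they yield the same interval; the *Spdi* tuple and the function names are hypothetical and not part of vrs-python.

```python
from typing import NamedTuple


class Spdi(NamedTuple):
    sequence: str        # reference sequence accession
    position: int        # 0-based position where the deletion starts
    deleted_length: int  # number of reference bases deleted
    inserted: str        # inserted sequence


def parse_spdi(expr: str) -> Spdi:
    """Parse 'sequence:position:deletion:insertion' into its four parts."""
    sequence, position, deletion, insertion = expr.split(":")
    # The deletion field is either a length ("1") or the deleted bases ("G").
    deleted_length = int(deletion) if deletion.isdigit() else len(deletion)
    return Spdi(sequence, int(position), deleted_length, insertion)


def spdi_interval(spdi: Spdi) -> tuple[int, int]:
    """Return the inter-residue (start, end) covered by the deletion."""
    return spdi.position, spdi.position + spdi.deleted_length


by_sequence = parse_spdi("NC_000014.9:95116662:G:A")
by_length = parse_spdi("NC_000014.9:95116662:1:A")
print(by_sequence == by_length)    # True
print(spdi_interval(by_sequence))  # (95116662, 95116663)
```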

#### From Beacon (VCF-like)
For variants represented in the Beacon nomenclature, the *AlleleTranslator* currently only supports *translate_from* to convert to VRS. *translate_to* is not yet supported.

In [9]:
allele = allele_translator.translate_from("13 : 32936732 G > C", "beacon")
allele.model_dump(exclude_none=True)

{'id': 'ga4gh:VA.GJ2JySBMXePcV2yItyvCfbGBUoawOBON',
 'type': 'Allele',
 'digest': 'GJ2JySBMXePcV2yItyvCfbGBUoawOBON',
 'location': {'id': 'ga4gh:SL.28YsnRvD40gKu1x3nev0gRzRz-5OTlpS',
  'type': 'SequenceLocation',
  'digest': '28YsnRvD40gKu1x3nev0gRzRz-5OTlpS',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ._0wi-qoDrvram155UmcSC-zA5ZK4fpLT'},
  'start': 32936731,
  'end': 32936732},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'C'}}
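
A Beacon expression carries a chromosome name, a 1-based position, a reference allele, and an alternate allele. The sketch below pulls those fields out and derives the inter-residue interval seen in the output above. The regular expression is an assumption written for this illustration and is not the pattern vrs-python uses. It also does not resolve the chromosome name to a sequence accession, a step that requires an assembly and the data proxy.

```python
import re

# Hypothetical pattern: "<chrom> : <pos> <ref> > <alt>" with optional whitespace.
BEACON_PATTERN = re.compile(
    r"^\s*(?P<chrom>\w+)\s*:\s*(?P<pos>\d+)\s*"
    r"(?P<ref>[A-Za-z]+)\s*>\s*(?P<alt>[A-Za-z]+)\s*$"
)


def parse_beacon(expr: str) -> tuple[str, int, int, str]:
    """Return (chrom, start, end, alt) with start/end in inter-residue coordinates."""
    match = BEACON_PATTERN.match(expr)
    if match is None:
        raise ValueError(f"not a Beacon-style expression: {expr!r}")
    start = int(match["pos"]) - 1    # 1-based position to 0-based start
    end = start + len(match["ref"])  # interval spans the reference allele
    return match["chrom"], start, end, match["alt"]


print(parse_beacon("13 : 32936732 G > C"))  # ('13', 32936731, 32936732, 'C')
```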

#### From gnomAD-style VCF
For variants represented in the gnomAD nomenclature, the *AlleleTranslator* currently only supports *translate_from* to convert to VRS. *translate_to* is not yet supported.
This variant can be viewed in [gnomAD](https://gnomad.broadinstitute.org/variant/1-55051215-G-GA).

In [10]:
# Uncomment the next line to translate against GRCh37 instead of the default GRCh38:
# allele_translator = AlleleTranslator(data_proxy=seqrepo_dataproxy, default_assembly_name="GRCh37")
allele = allele_translator.translate_from("1-55051215-G-GA", "gnomad")
allele.model_dump(exclude_none=True)

{'id': 'ga4gh:VA.FgKdbB-uC6xIU2j8pVlCw04f8KsJUBcg',
 'type': 'Allele',
 'digest': 'FgKdbB-uC6xIU2j8pVlCw04f8KsJUBcg',
 'location': {'id': 'ga4gh:SL.N8koWyjagSVChg4LKVmq9XepNlOrIPt6',
  'type': 'SequenceLocation',
  'digest': 'N8koWyjagSVChg4LKVmq9XepNlOrIPt6',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO'},
  'start': 55051215,
  'end': 55051215},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'A'}}
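
The gnomAD expression "1-55051215-G-GA" follows the VCF convention of anchoring an insertion with the preceding reference base, yet the VRS output has an empty location (*start* equals *end*) and the state "A". The shared leading "G" has been trimmed away. The sketch below reproduces only that trimming step, under the assumption that nothing further is needed for this variant. The function names are hypothetical, and full VRS normalization in vrs-python does more than this (for example, it handles alleles in repeated sequence).

```python
def parse_gnomad(expr: str) -> tuple[str, int, str, str]:
    """Split 'chrom-pos-ref-alt' into its parts; pos is 1-based as in VCF."""
    chrom, pos, ref, alt = expr.split("-")
    return chrom, int(pos), ref, alt


def trim_shared_bases(pos: int, ref: str, alt: str) -> tuple[int, int, str]:
    """Trim bases shared by ref and alt; return inter-residue (start, end, alt)."""
    start = pos - 1  # 1-based position to 0-based start
    # Trim the common prefix, moving the start to the right.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        start += 1
    # Trim the common suffix; this shortens the interval from the right.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    return start, start + len(ref), alt


chrom, pos, ref, alt = parse_gnomad("1-55051215-G-GA")
print(trim_shared_bases(pos, ref, alt))  # (55051215, 55051215, 'A')
```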