# 5. Exploring the CnvTranslator
The vrs-python model supports two classes of copy number variation: 
* CopyNumberChange - an assessment of loss or gain relative to a location within a system, where the loss or gain is represented by one of the following [EMBL-EBI Experimental Factor Ontology](https://www.ebi.ac.uk/efo/) (EFO) codes:
    * EFO:0030064 - regional base ploidy
    * EFO:0030067 - loss
    * EFO:0030068 - low-level loss
    * EFO:0030069 - complete genomic loss
    * EFO:0030070 - gain
    * EFO:0030071 - low-level gain
    * EFO:0030072 - high-level gain
    * EFO:0020073 - high-level loss 
* CopyNumberCount - an absolute count of discrete copies of a location within a gene or system
The *CnvTranslator* only accepts variation described in HGVS nomenclature.
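
As a quick reference, the EFO codes listed above can be kept in a lookup table and checked before they are handed to the translator. This is a minimal sketch; `EFO_COPY_CHANGE_LABELS` and `describe_copy_change` are hypothetical names used for illustration and are not part of vrs-python.

```python
# Hypothetical lookup table built from the EFO codes listed in this section.
EFO_COPY_CHANGE_LABELS = {
    "EFO:0030064": "regional base ploidy",
    "EFO:0030067": "loss",
    "EFO:0030068": "low-level loss",
    "EFO:0030069": "complete genomic loss",
    "EFO:0030070": "gain",
    "EFO:0030071": "low-level gain",
    "EFO:0030072": "high-level gain",
    "EFO:0020073": "high-level loss",
}


def describe_copy_change(code: str) -> str:
    """Return the label for a listed EFO code, or raise ValueError otherwise."""
    try:
        return EFO_COPY_CHANGE_LABELS[code]
    except KeyError:
        raise ValueError(f"Unsupported copy change code: {code!r}") from None


print(describe_copy_change("EFO:0030067"))
```

Checking the code up front gives a clearer error message than waiting for model validation to fail later.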

#### Step 1 - Setup Data Proxy Access
The *DataProxy* provides access to sequence references.

In [1]:
from ga4gh.vrs.dataproxy import create_dataproxy
seqrepo_rest_service_url = "seqrepo+https://services.genomicmedlab.org/seqrepo"
seqrepo_dataproxy = create_dataproxy(uri=seqrepo_rest_service_url)

Import the *CnvTranslator* class.

In [2]:
from ga4gh.vrs.extras.translator import CnvTranslator

Translating HGVS expressions requires access to a UTA database, so the *UTA_DB_URL* environment variable must be set.

In [3]:
import os
os.environ["UTA_DB_URL"] = "postgresql://anonymous:anonymous@uta.biocommons.org:5432/uta/uta_20210129b"
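
Before running the translator, it can help to confirm what the connection string actually points at. The sketch below uses only the Python standard library. `summarize_uta_url` is a hypothetical helper, and reading the URL path as `/<database>/<schema>` is an assumption based on the shape of the URL above.

```python
import os
from urllib.parse import urlparse


def summarize_uta_url(url: str) -> dict:
    """Split a UTA connection URL into its parts (credentials are omitted)."""
    parsed = urlparse(url)
    # Assumption: the path has the form "/<database>/<schema>".
    parts = [p for p in parsed.path.split("/") if p]
    return {
        "scheme": parsed.scheme,
        "host": parsed.hostname,
        "port": parsed.port,
        "database": parts[0] if parts else None,
        "schema": parts[1] if len(parts) > 1 else None,
    }


# Fall back to the URL used in this notebook if the variable is not set.
uta_url = os.environ.get(
    "UTA_DB_URL",
    "postgresql://anonymous:anonymous@uta.biocommons.org:5432/uta/uta_20210129b",
)
print(summarize_uta_url(uta_url))
```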

#### Step 2 - CopyNumberChange examples

This example depicts a *CopyNumberChange* representing a deletion, or copy number loss. The Experimental Factor Ontology code specifying the type of copy number change is passed as a keyword argument "copy_change" to *translate_from*. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/984438).

In [4]:
cnv_translator = CnvTranslator(data_proxy=seqrepo_dataproxy)
cnc = cnv_translator.translate_from("NC_000014.9:g.45002867_45015056del", "hgvs", copy_change="EFO:0030067")
cnc.model_dump(exclude_none=True)

{'id': 'ga4gh:CX.XQt04FoCIptvgp6GtE2qjEaUJC7cr1wo',
 'type': 'CopyNumberChange',
 'digest': 'XQt04FoCIptvgp6GtE2qjEaUJC7cr1wo',
 'location': {'id': 'ga4gh:SL.GSJAEJXFDz7Nq6VlJj5NTEku48MmteUU',
  'type': 'SequenceLocation',
  'digest': 'GSJAEJXFDz7Nq6VlJj5NTEku48MmteUU',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.eK4D2MosgK_ivBkgi6FVPg5UXs1bYESm'},
  'start': 45002866,
  'end': 45015056},
 'copyChange': 'EFO:0030067'}
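
Note that the output reports `start` as 45002866 while the HGVS expression begins at 45002867, and `end` is unchanged. This is consistent with converting a 1-based, inclusive HGVS range to the 0-based interbase coordinates used by VRS. A minimal sketch of that arithmetic follows; `hgvs_range_to_interbase` is a hypothetical helper, not a vrs-python function.

```python
def hgvs_range_to_interbase(start: int, end: int) -> tuple[int, int]:
    """Convert a 1-based inclusive range to 0-based interbase coordinates."""
    if start < 1 or end < start:
        raise ValueError(f"Invalid 1-based inclusive range: {start}_{end}")
    # Only the start shifts; the interbase end equals the inclusive end.
    return start - 1, end


ib_start, ib_end = hgvs_range_to_interbase(45002867, 45015056)
print(ib_start, ib_end)   # interbase coordinates
print(ib_end - ib_start)  # number of bases covered by the location
```

With interbase coordinates, the length of the region is simply `end - start`, with no off-by-one adjustment.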

This example depicts a *CopyNumberChange* representing a duplication, or copy number gain. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/549625).

In [6]:
cnx = cnv_translator.translate_from("NC_000009.12:g.75502958_76045032dup", "hgvs", copy_change="EFO:0030070")
cnx.model_dump(exclude_none=True)

{'id': 'ga4gh:CX.3eGz_p3ufUDGtk87RYBI22dfihLInCOa',
 'type': 'CopyNumberChange',
 'digest': '3eGz_p3ufUDGtk87RYBI22dfihLInCOa',
 'location': {'id': 'ga4gh:SL.tydo6UFL8Y60L5Me3k8AJfljURO9vYn9',
  'type': 'SequenceLocation',
  'digest': 'tydo6UFL8Y60L5Me3k8AJfljURO9vYn9',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.KEO-4XBcm1cxeo_DIQ8_ofqGUkp4iZhI'},
  'start': 75502957,
  'end': 76045032},
 'copyChange': 'EFO:0030070'}
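
The two examples above pair a `del` expression with EFO:0030067 (loss) and a `dup` expression with EFO:0030070 (gain). The sketch below encodes that pairing in a hypothetical helper, `suggest_copy_change`, which is not part of vrs-python. More specific codes, such as EFO:0030072 for a high-level gain, still have to be chosen by the caller.

```python
def suggest_copy_change(hgvs_expr: str) -> str:
    """Suggest a generic EFO code from the edit type of an HGVS expression."""
    if hgvs_expr.endswith("del"):
        return "EFO:0030067"  # loss
    if hgvs_expr.endswith("dup"):
        return "EFO:0030070"  # gain
    raise ValueError(f"Expected a del or dup expression, got: {hgvs_expr!r}")


print(suggest_copy_change("NC_000014.9:g.45002867_45015056del"))
print(suggest_copy_change("NC_000009.12:g.75502958_76045032dup"))
```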

#### Step 3 - CopyNumberCount examples

This example depicts a *CopyNumberCount* with a copy number gain. For copy number count variation, the absolute number of copies is passed to *translate_from* as the "copies" keyword argument; no EFO code is needed. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/2579174/).

In [7]:
cnc = cnv_translator.translate_from("NC_000004.12:g.85624_57073230dup", "hgvs", copies="3")
cnc.model_dump(exclude_none=True)

{'id': 'ga4gh:CN.O_QHImmfErh9jDFkJaypPPvUmnj7EM70',
 'type': 'CopyNumberCount',
 'digest': 'O_QHImmfErh9jDFkJaypPPvUmnj7EM70',
 'location': {'id': 'ga4gh:SL.hBVWalem_rNclxjmUuT9CHbEGCdlqW9L',
  'type': 'SequenceLocation',
  'digest': 'hBVWalem_rNclxjmUuT9CHbEGCdlqW9L',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.HxuclGHh0XCDuF8x6yQrpHUBL7ZntAHc'},
  'start': 85623,
  'end': 57073230},
 'copies': 3}
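
In the example above, `copies` is passed as the string "3" and the resulting model reports the integer 3. If you prefer to validate the value yourself before calling *translate_from*, one possible approach is sketched below. `normalize_copies` is a hypothetical helper, not part of vrs-python.

```python
def normalize_copies(copies) -> int:
    """Coerce a copy count to a non-negative integer, rejecting bad input."""
    if isinstance(copies, bool):
        raise ValueError("copies must be a number, not a boolean")
    if isinstance(copies, float) and not copies.is_integer():
        raise ValueError(f"copies must be a whole number, got {copies}")
    value = int(copies)  # raises ValueError for strings such as "2.5" or "three"
    if value < 0:
        raise ValueError(f"copies must be non-negative, got {value}")
    return value


print(normalize_copies("3"))
```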

This example depicts a *CopyNumberCount* with a copy number loss. This variant can be viewed in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/variation/2579226/).

In [8]:
cnc = cnv_translator.translate_from("NC_000021.9:g.46111353_46119948del", "hgvs", copies="1")
cnc.model_dump(exclude_none=True)

{'id': 'ga4gh:CN.WDzlT9oUq4IcQrVRWGH0dZnARnFBotCS',
 'type': 'CopyNumberCount',
 'digest': 'WDzlT9oUq4IcQrVRWGH0dZnARnFBotCS',
 'location': {'id': 'ga4gh:SL.H1Zh5xdBqamBjwVE9orWdY_uBkpEMH1V',
  'type': 'SequenceLocation',
  'digest': 'H1Zh5xdBqamBjwVE9orWdY_uBkpEMH1V',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.5ZUqxCmDDgN4xTRbaSjN8LwgZironmB8'},
  'start': 46111352,
  'end': 46119948},
 'copies': 1}
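
In every output above, the `id` is the string `ga4gh:`, followed by a short type prefix, a dot, and the object's `digest`: `CX` for *CopyNumberChange*, `CN` for *CopyNumberCount*, and `SL` for *SequenceLocation*. The sketch below splits an identifier into those parts. `TYPE_PREFIXES` and `split_ga4gh_id` are hypothetical names, and the table covers only the prefixes that appear in this section.

```python
# Prefixes observed in the outputs of this section (not an exhaustive list).
TYPE_PREFIXES = {
    "CX": "CopyNumberChange",
    "CN": "CopyNumberCount",
    "SL": "SequenceLocation",
}


def split_ga4gh_id(identifier: str) -> tuple[str, str]:
    """Split 'ga4gh:<prefix>.<digest>' into (prefix, digest)."""
    namespace, sep, rest = identifier.partition(":")
    if namespace != "ga4gh" or not sep:
        raise ValueError(f"Not a ga4gh identifier: {identifier!r}")
    prefix, sep, digest = rest.partition(".")
    if not sep or not prefix or not digest:
        raise ValueError(f"Malformed ga4gh identifier: {identifier!r}")
    return prefix, digest


prefix, digest = split_ga4gh_id("ga4gh:CN.WDzlT9oUq4IcQrVRWGH0dZnARnFBotCS")
print(TYPE_PREFIXES.get(prefix, "unknown"), digest)
```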