### VrsTranslate Module Overview

The `VrsTranslate` module translates VRS expressions into SPDI and HGVS formats. It relies on the vrs-python translator module and on external APIs for translation and validation.

#### Features
- **Translation to HGVS**: Translates VRS expressions to HGVS using the vrs-python translator module.
- **Translation to SPDI**: Translates VRS expressions to SPDI using the vrs-python translator module.
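
To make the SPDI target format concrete, the sketch below assembles an SPDI string (`sequence:position:deletion:insertion`) from a VRS Allele dictionary shaped like the examples in this notebook. The helper `spdi_from_vrs_dict` and its `accession` parameter are hypothetical and are not part of `VrsTranslate` or vrs-python. Resolving the `ga4gh:SQ.` digest to a RefSeq accession normally requires SeqRepo, so the accession is simply passed in here. The sketch also assumes a `LiteralSequenceExpression` state and writes the deletion as a length.

```python
def spdi_from_vrs_dict(allele: dict, accession: str) -> str:
    """Hypothetical helper: format an SPDI string from a VRS Allele dict.

    Assumes Number start/end values and a LiteralSequenceExpression state.
    The deletion is written as a length (end - start), not as a sequence.
    """
    interval = allele["location"]["interval"]
    start = interval["start"]["value"]  # 0-based interbase start
    end = interval["end"]["value"]      # 0-based interbase end
    inserted = allele["state"]["sequence"]
    return f"{accession}:{start}:{end - start}:{inserted}"


# Trimmed copy of the first example allele used in this notebook
example = {
    "type": "Allele",
    "location": {
        "type": "SequenceLocation",
        "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
        "interval": {
            "type": "SequenceInterval",
            "start": {"type": "Number", "value": 1014263},
            "end": {"type": "Number", "value": 1014265},
        },
    },
    "state": {"type": "LiteralSequenceExpression", "sequence": "C"},
}

# The accession is supplied by hand; the real module looks it up.
print(spdi_from_vrs_dict(example, "NC_000001.11"))  # NC_000001.11:1014263:2:C
```

Because VRS and SPDI both use 0-based interbase coordinates, the start position carries over unchanged and the deletion length is just the width of the interval.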

#### Dependencies
- **External APIs**:
  - Biocommons SeqRepo API
- **Python Packages**:
  - vrs-python


In [29]:
import json

from ga4gh.vrs import models
from src.vrs.vrs_utils import VrsTranslate

# Translator used by the SPDI and HGVS examples below
vrs_translate = VrsTranslate()

In [30]:
vrs_example_data = [{
  "_id": "ga4gh:VA.BmF3zr2l6XLpLaK8GInM6Q3Emc3JyPD3",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.i6Of9s2jVDuJ4vwU6sCeG-jT7ygmlfx6",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 1014263
      },
      "end": {
        "type": "Number",
        "value": 1014265
      }
    }
  },
  "state": {
    "type": "LiteralSequenceExpression",
    "sequence": "C"
  }
},
{
  "_id": "ga4gh:VA.J9BMdktHGGjE843oD0T_bwUV6WxojkCW",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.TMxdXtmi4ctcTRipHMD6py1Nv1kLMyJd",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 113901365
      },
      "end": {
        "type": "Number",
        "value": 113901365
      }
    }
  },
  "state": {
    "type": "LiteralSequenceExpression",
    "sequence": "ATA"
  }
},
{
  "_id": "ga4gh:VA.OpO3jwlmnhvpmEs2v9orWvMIa7UPb1To",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.veKlh4sQPAIr1HNoqjmsm7qZa0FNfjI9",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 5880117
      },
      "end": {
        "type": "Number",
        "value": 5880127
      }
    }
  },
  "state": {
    "type": "LiteralSequenceExpression",
    "sequence": "TGAGCTTCCATGAGCTTCCA"
  }
}]

vrs_objects = [models.Allele(**data) for data in vrs_example_data]

In [34]:
# Translate each VRS Allele object to an SPDI string
for allele in vrs_objects:
    print(f'VRS Expression:\n{json.dumps(allele.as_dict(), indent=2)}')
    print(f'Translated to SPDI: {vrs_translate.from_vrs_to_spdi(allele)}\n')

VRS Expression:
{
  "_id": "ga4gh:VA.BmF3zr2l6XLpLaK8GInM6Q3Emc3JyPD3",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.i6Of9s2jVDuJ4vwU6sCeG-jT7ygmlfx6",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 1014263
      },
      "end": {
        "type": "Number",
        "value": 1014265
      }
    }
  },
  "state": {
    "type": "LiteralSequenceExpression",
    "sequence": "C"
  }
}
Translated to SPDI: NC_000001.11:1014263:2:C

VRS Expression:
{
  "_id": "ga4gh:VA.J9BMdktHGGjE843oD0T_bwUV6WxojkCW",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.TMxdXtmi4ctcTRipHMD6py1Nv1kLMyJd",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 113901365
      },
      "

In [36]:
# Translate each VRS Allele object to an HGVS expression string
for allele in vrs_objects:
    print(f'VRS Expression:\n{json.dumps(allele.as_dict(), indent=2)}')
    print(f'Translated to HGVS: {vrs_translate.from_vrs_to_hgvs(allele)}\n')

VRS Expression:
{
  "_id": "ga4gh:VA.BmF3zr2l6XLpLaK8GInM6Q3Emc3JyPD3",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.i6Of9s2jVDuJ4vwU6sCeG-jT7ygmlfx6",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 1014263
      },
      "end": {
        "type": "Number",
        "value": 1014265
      }
    }
  },
  "state": {
    "type": "LiteralSequenceExpression",
    "sequence": "C"
  }
}
Translated to HGVS: NC_000001.11:g.1014265del

VRS Expression:
{
  "_id": "ga4gh:VA.J9BMdktHGGjE843oD0T_bwUV6WxojkCW",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.TMxdXtmi4ctcTRipHMD6py1Nv1kLMyJd",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 113901365
      },
      

### CVCTranslator Module Overview

The `CVCTranslator` module translates variations from HGVS, SPDI, or VRS formats into a standardized representation, `CoreVariantClass`.

#### Features
- **SPDI to CoreVariantClass Translation**: Translates SPDI expressions into CoreVariantClass objects.

- **HGVS to CoreVariantClass Translation**: Translates HGVS expressions into CoreVariantClass objects.

- **VRS to CoreVariantClass Translation**: Translates VRS expressions into CoreVariantClass objects.
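
As a rough picture of the first step in an SPDI translation, the sketch below splits an SPDI expression into its four fields and derives 0-based interbase start and end coordinates. `SpdiParts` and `parse_spdi` are hypothetical names used only for illustration; they are not the real `CoreVariantClass` or any `CVCTranslator` method, whose fields and signatures are not shown in this notebook.

```python
from dataclasses import dataclass


@dataclass
class SpdiParts:
    """Hypothetical container for illustration; not the real CoreVariantClass."""
    sequence_id: str
    start: int      # 0-based interbase start
    end: int        # 0-based interbase end
    deletion: str   # deleted sequence, or "" when only a length was given
    insertion: str


def parse_spdi(expression: str) -> SpdiParts:
    """Split 'sequence:position:deletion:insertion' into typed fields."""
    sequence_id, position, deletion, insertion = expression.split(":")
    start = int(position)
    # SPDI allows the deletion to be either a length or an explicit sequence.
    if deletion.isdigit():
        length, deleted_seq = int(deletion), ""
    else:
        length, deleted_seq = len(deletion), deletion
    return SpdiParts(sequence_id, start, start + length, deleted_seq, insertion)


parts = parse_spdi("NC_000001.11:1014263:2:C")
print(parts.start, parts.end)  # 1014263 1014265
```

When the deletion is given only as a length, the reference bases are not known from the expression itself, which is one reason a sequence lookup service such as SeqRepo appears among the dependencies.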

#### Dependencies
- **External APIs**:
  - Biocommons SeqRepo API
  - NCBI Variation Services API

- **Python Packages**:
  - bioutils.normalize
  - hgvs

In [25]:
from src.variant_to_cvc_translate import CVCTranslator
cvc_translator = CVCTranslator()

In [38]:
for allele in vrs_objects:
    print(f'VRS Expression:\n{json.dumps(allele.as_dict(), indent=2)}')
    print(f'Translated to CVC:\n{cvc_translator.vrs_to_cvc(allele)}\n')

VRS Expression:
{
  "_id": "ga4gh:VA.BmF3zr2l6XLpLaK8GInM6Q3Emc3JyPD3",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.i6Of9s2jVDuJ4vwU6sCeG-jT7ygmlfx6",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "start": {
        "type": "Number",
        "value": 1014263
      },
      "end": {
        "type": "Number",
        "value": 1014265
      }
    }
  },
  "state": {
    "type": "LiteralSequenceExpression",
    "sequence": "C"
  }
}
Translated to CVC:
CoreVariantClass(0-based interbase,DNA,CC,C,1014263,1014265,None,None,None,None,None,NC_000001.11,{})

VRS Expression:
{
  "_id": "ga4gh:VA.J9BMdktHGGjE843oD0T_bwUV6WxojkCW",
  "type": "Allele",
  "location": {
    "_id": "ga4gh:VSL.TMxdXtmi4ctcTRipHMD6py1Nv1kLMyJd",
    "type": "SequenceLocation",
    "sequence_id": "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "interval": {
      "type": "SequenceInterval",
      "sta