### Overview of the Jupyter Notebook and Utils Module

We have developed several **Jupyter Notebooks** that demonstrate key functionality for working with the **FHIR and VRS** schemas, including:  

- Creating a **FHIR Allele**  
- Creating a **FHIR Sequence**  
- Constructing a **MolecularDefinition** resource  
- Performing **bidirectional translation** between **VRS and FHIR**  

However, those notebooks assume a **working knowledge of Jupyter Notebooks, Python, and the schemas of both standards (FHIR & VRS)**.  
This notebook reduces that requirement. It provides a structured, high-level workflow for generating Allele objects and translating them between FHIR and VRS.

### Introducing the Utils Module: `allele_factory.py`

To streamline the creation of Allele objects, we developed the **`allele_factory.py`** module, located in the **utils directory**. It lets users generate **FHIR Allele** and **VRS Allele** objects from at most **five attributes** each.  

### Functions in `allele_factory.py`  

#### **`create_fhir_allele()` – Generates a FHIR Allele**  

This function constructs a **FHIR Allele** using the following attributes:  

- `context_sequence_id` (**str**): Accession number of the reference sequence. Supported prefixes include `NC_`, `NG_`, `NM_`, `NR_`, and `NP_`
- `start` (**int**): Start position of the allele  
- `end` (**int**): End position of the allele  
- `allele_state` (**str**): Literal value of the allele sequence state (e.g., ACGT)  
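- `id_value` (**str**, optional): The unique identifier for the Allele instance. If not provided, a default ID is generated in the format `ref-to-{context_sequence_id}`. In this notebook's output the accession appears in a normalized form: `NC_000002.12` yields `ref-to-nc000002`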
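The two rules above (supported prefixes and the default ID) can be sketched in a few lines. This is only an illustration: `check_accession` and `default_id` are hypothetical names, not functions from `allele_factory.py`, and the ID rule is an assumption inferred from the `ref-to-nc000002` value that appears in this notebook's output.

```python
# Hypothetical helpers; not part of allele_factory.py.
SUPPORTED_PREFIXES = ("NC_", "NG_", "NM_", "NR_", "NP_")


def check_accession(context_sequence_id: str) -> str:
    """Return the accession unchanged, or raise if its prefix is unsupported."""
    if not context_sequence_id.startswith(SUPPORTED_PREFIXES):
        raise ValueError(
            f"Unsupported accession {context_sequence_id!r}; "
            f"expected a prefix in {SUPPORTED_PREFIXES}"
        )
    return context_sequence_id


def default_id(context_sequence_id: str) -> str:
    """Assumed rule: drop the version suffix and underscore, then lowercase."""
    stem = context_sequence_id.split(".", 1)[0]
    return "ref-to-" + stem.replace("_", "").lower()


print(default_id(check_accession("NC_000002.12")))  # ref-to-nc000002
```

The real module may derive the ID differently; the sketch only reproduces the one value visible in this notebook.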

#### **`create_vrs_allele()` – Generates a VRS Allele**  

This function constructs a **VRS Allele** using the following attributes:  

- `context_sequence_id` (**str**): Accession number of the reference sequence. Supported prefixes include `NC_`, `NG_`, `NM_`, `NR_`, and `NP_`
- `start` (**int**): Start position of the allele  
- `end` (**int**): End position of the allele  
- `allele_state` (**str**): Literal value of the allele sequence state (e.g., ACGT)  
- `normalize` (**bool**, default=`True`): Option to normalize the VRS object  
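The `start` and `end` values used later in this notebook (27453448 and 27453449) describe a single base: the FHIR output labels them as "0-based interval counting", and VRS uses inter-residue coordinates. If your source data uses 1-based positions, a conversion along these lines may help. `to_interval` is a hypothetical helper written for this illustration, not part of the utils module.

```python
def to_interval(position_1based: int, ref_length: int = 1) -> tuple[int, int]:
    """Convert a 1-based position and reference length to a 0-based (start, end) interval."""
    if position_1based < 1 or ref_length < 1:
        raise ValueError("position_1based and ref_length must both be >= 1")
    start = position_1based - 1
    return start, start + ref_length


start, end = to_interval(27453449)
print(start, end)   # 27453448 27453449
print(end - start)  # 1 (the interval spans one base)
```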

### What This Notebook Demonstrates

This notebook outlines a structured **workflow** to:

1. **Set Up & Import Modules**  
   - Import the `AlleleFactory` and `VrsFhirAlleleTranslator` classes and create an instance of each.

2. **Generate VRS and Translate to FHIR**  
   - Create a **VRS Allele object** and convert it from **VRS → FHIR**.

3. **Round-Trip Translation: VRS → FHIR → VRS**  
   - Perform a **round-trip translation** back to VRS (**VRS → FHIR → VRS**).
   
4. **Generate FHIR and Translate to VRS**  
   - Create a **FHIR Allele object** and convert it from **FHIR → VRS**.

5. **Round-Trip Translation: FHIR → VRS → FHIR**  
   - Perform a **round-trip translation** back to FHIR (**FHIR → VRS → FHIR**).

### Set up and import modules

In [1]:
# Importing the `AlleleFactory` class from the utils module
from utils.allele_factory import AlleleFactory
# Importing the `VrsFhirAlleleTranslator` class from the `translators` module
from translators.vrs_fhir_translator import VrsFhirAlleleTranslator

# Creating an instance of `AlleleFactory` to generate FHIR and VRS Allele objects
build_allele = AlleleFactory()

# Creating an instance of `VrsFhirAlleleTranslator` to enable bidirectional translation
# between GA4GH VRS and HL7 FHIR Allele representations
allele_translator = VrsFhirAlleleTranslator()

### Create VRS, translate to FHIR

In [2]:
# Creating a GA4GH VRS Allele (Version 2.0) using the `create_vrs_allele` function
example_vrs_allele = build_allele.create_vrs_allele(
    context_sequence_id="NC_000002.12",
    start=27453448,
    end=27453449,
    allele_state="T",
    normalize=True
)

# Converting the VRS Allele object into a dictionary representation for easy viewing
example_vrs_allele.model_dump(exclude_none=True)

{'id': 'ga4gh:VA.xfKU4c8mG_yegL5ZOL26JDiznySNkoMl',
 'type': 'Allele',
 'digest': 'xfKU4c8mG_yegL5ZOL26JDiznySNkoMl',
 'location': {'id': 'ga4gh:SL.y0ckc1_lhMYKnh0f6FAEoEpgHyfX13OW',
  'type': 'SequenceLocation',
  'digest': 'y0ckc1_lhMYKnh0f6FAEoEpgHyfX13OW',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.pnAqCRBrTsUoBghSD1yp_jXWSmlbdh4g'},
  'start': 27453448,
  'end': 27453449},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'T'}}
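In the output above, each `id` is the corresponding `digest` with a `ga4gh:` namespace and a type prefix (`VA` for the Allele, `SL` for the SequenceLocation) in front. The sketch below splits such an identifier into its parts. `split_ga4gh_id` is a hypothetical helper, and the 32-character digest length is taken from the values shown here.

```python
import re

# ga4gh:<type prefix>.<32-character base64url digest>
GA4GH_ID = re.compile(r"^ga4gh:(?P<prefix>[A-Z]+)\.(?P<digest>[A-Za-z0-9_-]{32})$")


def split_ga4gh_id(identifier: str) -> tuple[str, str]:
    """Return (type_prefix, digest) for a GA4GH computed identifier."""
    match = GA4GH_ID.match(identifier)
    if match is None:
        raise ValueError(f"Not a GA4GH computed identifier: {identifier!r}")
    return match["prefix"], match["digest"]


print(split_ga4gh_id("ga4gh:VA.xfKU4c8mG_yegL5ZOL26JDiznySNkoMl"))
# ('VA', 'xfKU4c8mG_yegL5ZOL26JDiznySNkoMl')
```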

In [3]:
# Translating a GA4GH VRS Allele into an HL7 FHIR Allele
# This function takes a VRS Allele object and converts it into its corresponding FHIR representation
vrs_to_fhir_translation_example = allele_translator.translate_allele_to_fhir(example_vrs_allele)

# Converting the translated Allele object into a dictionary representation for easy viewing
vrs_to_fhir_translation_example.model_dump()

{'resourceType': 'MolecularDefinition',
 'contained': [{'resourceType': 'MolecularDefinition',
   'id': 'ref-to-nc000002',
   'moleculeType': {'coding': [{'system': 'http://hl7.org/fhir/sequence-type',
      'code': 'dna',
      'display': 'DNA Sequence'}]},
   'representation': [{'code': [{'coding': [{'system': 'http://www.ncbi.nlm.nih.gov/refseq',
         'code': 'NC_000002.12'}]}]}]}],
 'moleculeType': {'coding': [{'system': 'http://hl7.org/fhir/sequence-type',
    'code': 'dna',
    'display': 'DNA Sequence'}]},
 'location': [{'sequenceLocation': {'sequenceContext': {'reference': '#ref-to-nc000002',
     'type': 'MolecularDefinition'},
    'coordinateInterval': {'coordinateSystem': {'system': {'coding': [{'system': 'http://loinc.org',
         'code': 'LA30100-4',
         'display': '0-based interval counting'}]}},
     'startQuantity': {'value': 27453448.0},
     'endQuantity': {'value': 27453449.0}}}}],
 'representation': [{'focus': {'coding': [{'system': 'http://hl7.org/fhir/m
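The FHIR dictionary nests the coordinates several levels deep and stores them as floats. As a rough sketch, they can be read back as integers like this. The dictionary below is a hand-trimmed copy of the output above (the `coordinateSystem` and other fields are omitted), and `interval_from_fhir` is a hypothetical helper, not a translator method.

```python
# Trimmed copy of the dumped FHIR Allele; only the fields needed here are kept.
fhir_allele = {
    "resourceType": "MolecularDefinition",
    "location": [{
        "sequenceLocation": {
            "sequenceContext": {"reference": "#ref-to-nc000002",
                                "type": "MolecularDefinition"},
            "coordinateInterval": {
                "startQuantity": {"value": 27453448.0},
                "endQuantity": {"value": 27453449.0},
            },
        }
    }],
}


def interval_from_fhir(resource: dict) -> tuple[int, int]:
    """Read (start, end) from the first location of a dumped FHIR Allele."""
    interval = resource["location"][0]["sequenceLocation"]["coordinateInterval"]
    start = interval["startQuantity"]["value"]
    end = interval["endQuantity"]["value"]
    if start != int(start) or end != int(end):
        raise ValueError("coordinates must be whole numbers")
    return int(start), int(end)


print(interval_from_fhir(fhir_allele))  # (27453448, 27453449)
```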

### Round-Trip Translation: VRS → FHIR → VRS

In [4]:
# Translate the FHIR Allele profile back to a VRS Allele object
back_to_vrs = allele_translator.translate_allele_to_vrs(vrs_to_fhir_translation_example)

print("Check if the original and round-tripped VRS Allele are identical.")
print(example_vrs_allele == back_to_vrs)

Check if the original and round-tripped VRS Allele are identical.
True
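If a round-trip comparison ever prints `False`, it helps to see which fields differ. One way is to compare the two `model_dump()` dictionaries recursively. The `diff_paths` helper below is a hypothetical utility written for this illustration, and the sample dictionaries are simplified stand-ins rather than real VRS output.

```python
def diff_paths(left, right, path=""):
    """Yield dotted paths at which two nested dict/list structures differ."""
    if isinstance(left, dict) and isinstance(right, dict):
        for key in sorted(set(left) | set(right)):
            child = f"{path}.{key}" if path else str(key)
            if key not in left or key not in right:
                yield child  # key present on one side only
            else:
                yield from diff_paths(left[key], right[key], child)
    elif isinstance(left, list) and isinstance(right, list):
        if len(left) != len(right):
            yield path
        else:
            for index, (a, b) in enumerate(zip(left, right)):
                yield from diff_paths(a, b, f"{path}[{index}]")
    elif left != right:
        yield path


original = {"type": "Allele",
            "location": {"start": 27453448, "end": 27453449},
            "state": {"sequence": "T"}}
changed = {"type": "Allele",
           "location": {"start": 27453448, "end": 27453450},
           "state": {"sequence": "T"}}

print(list(diff_paths(original, original)))  # []
print(list(diff_paths(original, changed)))   # ['location.end']
```

In this notebook the same helper could be applied to `example_vrs_allele.model_dump()` and `back_to_vrs.model_dump()`.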


### Create FHIR, translate to VRS

In [5]:
# Creating an HL7 FHIR Allele using the `create_fhir_allele` function
example_fhir_allele = build_allele.create_fhir_allele(
    context_sequence_id="NC_000002.12",
    start=27453448,
    end=27453449,
    allele_state="T",
)

# Converting the Allele object into a dictionary representation for easy viewing
example_fhir_allele.model_dump(exclude_none=True)

{'resourceType': 'MolecularDefinition',
 'contained': [{'resourceType': 'MolecularDefinition',
   'id': 'ref-to-nc000002',
   'moleculeType': {'coding': [{'system': 'http://hl7.org/fhir/sequence-type',
      'code': 'dna',
      'display': 'DNA Sequence'}]},
   'representation': [{'code': [{'coding': [{'system': 'http://www.ncbi.nlm.nih.gov/refseq',
         'code': 'NC_000002.12'}]}]}]}],
 'moleculeType': {'coding': [{'system': 'http://hl7.org/fhir/sequence-type',
    'code': 'dna',
    'display': 'DNA Sequence'}]},
 'location': [{'sequenceLocation': {'sequenceContext': {'reference': '#ref-to-nc000002',
     'type': 'MolecularDefinition'},
    'coordinateInterval': {'coordinateSystem': {'system': {'coding': [{'system': 'http://loinc.org',
         'code': 'LA30100-4',
         'display': '0-based interval counting'}]}},
     'startQuantity': {'value': 27453448.0},
     'endQuantity': {'value': 27453449.0}}}}],
 'representation': [{'focus': {'coding': [{'system': 'http://hl7.org/fhir/m

In [6]:
# Translating an HL7 FHIR Allele into a GA4GH VRS Allele
# This function converts a FHIR Allele object into its corresponding VRS representation
fhir_to_vrs_translation_example = allele_translator.translate_allele_to_vrs(example_fhir_allele)

# Printing the type of the translated object to confirm the output class
print(type(fhir_to_vrs_translation_example))

# Converting the translated VRS Allele object into a dictionary representation for easy viewing
fhir_to_vrs_translation_example.model_dump(exclude_none=True)

<class 'ga4gh.vrs.models.Allele'>


{'id': 'ga4gh:VA.xfKU4c8mG_yegL5ZOL26JDiznySNkoMl',
 'type': 'Allele',
 'digest': 'xfKU4c8mG_yegL5ZOL26JDiznySNkoMl',
 'location': {'id': 'ga4gh:SL.y0ckc1_lhMYKnh0f6FAEoEpgHyfX13OW',
  'type': 'SequenceLocation',
  'digest': 'y0ckc1_lhMYKnh0f6FAEoEpgHyfX13OW',
  'sequenceReference': {'type': 'SequenceReference',
   'refgetAccession': 'SQ.pnAqCRBrTsUoBghSD1yp_jXWSmlbdh4g'},
  'start': 27453448,
  'end': 27453449},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'T'}}

### Round-Trip Translation: FHIR → VRS → FHIR
- The `create_fhir_allele()` function supports round-trip compatibility between FHIR and VRS.
- For full round-trip compatibility, **do not provide a custom `id_value`** when constructing the FHIR Allele.
- When `id_value` is omitted, a default identifier is automatically generated in the format:  
  `ref-to-{context_sequence_id}`
- This approach ensures consistent and lossless translation from FHIR → VRS → FHIR.

In [7]:
# Translate the VRS Allele object back to a FHIR Allele
back_to_fhir = allele_translator.translate_allele_to_fhir(fhir_to_vrs_translation_example)

print("Check if the original and round-tripped FHIR Allele are identical.")
print(example_fhir_allele == back_to_fhir)

Check if the original and round-tripped FHIR Allele are identical.
True
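One plausible reason a custom `id_value` breaks the FHIR → VRS → FHIR round trip is that the VRS side has nowhere to carry it, so the ID is regenerated on the way back. The toy model below illustrates that idea only. All three functions are hypothetical stand-ins that use plain dictionaries, not the real `AlleleFactory` or `VrsFhirAlleleTranslator`, and the actual translator may behave differently.

```python
def toy_fhir(accession, start, end, state, id_value=None):
    """Toy FHIR allele: falls back to a default ID when none is supplied."""
    return {"id": id_value or f"ref-to-{accession}",
            "accession": accession, "start": start, "end": end, "state": state}


def toy_to_vrs(fhir):
    """Toy FHIR -> VRS: the toy VRS object has no field for the FHIR id."""
    return {key: fhir[key] for key in ("accession", "start", "end", "state")}


def toy_to_fhir(vrs):
    """Toy VRS -> FHIR: the id can only be regenerated as the default."""
    return toy_fhir(**vrs)


with_default = toy_fhir("NC_000002.12", 27453448, 27453449, "T")
with_custom = toy_fhir("NC_000002.12", 27453448, 27453449, "T", id_value="my-allele")

print(toy_to_fhir(toy_to_vrs(with_default)) == with_default)  # True
print(toy_to_fhir(toy_to_vrs(with_custom)) == with_custom)    # False
```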


### Conclusion

For a more detailed exploration of the implementation, refer to the other notebooks that provide an in-depth, step-by-step guide on creating these objects and performing translations between FHIR and VRS.  