### Overview of the Jupyter Notebook and Utils Module

We have developed several **Jupyter Notebooks** that demonstrate key interactions between the **FHIR and VRS** schemas, including:  

- Creating a **FHIR AlleleProfile**  
- Creating a **FHIR SequenceProfile**  
- Constructing a **MolecularDefinition** resource  
- Performing **bidirectional translation** between **VRS and FHIR**  

However, those notebooks require a **working knowledge of Jupyter Notebooks, Python, and the schemas of both standards (FHIR & VRS)**.  
This notebook reduces that requirement. It provides a structured approach for generating FHIR and VRS allele objects and translating between the two representations.

### Introducing the Utils Module: `allele_factory.py`

To streamline the creation of allele objects, we developed the **`allele_factory.py`** module, located in the **utils directory**. It lets users generate **FHIR AlleleProfile** and **VRS Allele** objects from only **five attributes** each.  

### Functions in `allele_factory.py`  

#### **`create_fhir_allele()` – Generates a FHIR AlleleProfile**  

This function constructs a **FHIR AlleleProfile** using the following attributes:  

- `context_sequence_id` (**str**) → Accession number of the reference sequence. Supported prefixes include: ("NC_", "NG_", "NM_", "NR_", "NP_")
- `start` (**int**) → Start position of the allele  
- `end` (**int**) → End position of the allele  
- `allele_state` (**str**) → Literal value of the allele sequence state (e.g., ACGT)  
- `id_value` (**str**, optional, default=`None`) → Unique identifier for the AlleleProfile instance

#### **`create_vrs_allele()` – Generates a VRS Allele**  

This function constructs a **VRS Allele** using the following attributes:  

- `context_sequence_id` (**str**) → Accession number of the reference sequence. Supported prefixes include: ("NC_", "NG_", "NM_", "NR_", "NP_")
- `start` (**int**) → Start position of the allele  
- `end` (**int**) → End position of the allele  
- `allele_state` (**str**) → Literal value of the allele sequence state (e.g., ACGT)  
- `normalize` (**bool**, default=`True`) → Whether to normalize the VRS Allele  
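
The prefix list that both functions share can be pictured as a simple string check. This is a minimal sketch only, not the module's actual validation logic. The helper name `has_supported_prefix` and the constant `SUPPORTED_PREFIXES` are hypothetical.

```python
# Hypothetical helper for illustration; not part of allele_factory.py.
SUPPORTED_PREFIXES = ("NC_", "NG_", "NM_", "NR_", "NP_")


def has_supported_prefix(context_sequence_id: str) -> bool:
    """Return True if the accession starts with one of the supported prefixes."""
    # str.startswith accepts a tuple and matches any of its members.
    return context_sequence_id.startswith(SUPPORTED_PREFIXES)


print(has_supported_prefix("NC_000002.12"))     # True
print(has_supported_prefix("ENST00000357654"))  # False
```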

### What This Notebook Demonstrates

This notebook provides a structured **workflow** to:  

1. **Set Up & Import Modules**  
   - Load `AlleleFactory` and `VrsFhirAlleleTranslation`.  

2. **Create Allele Objects**  
   - Generate **FHIR AlleleProfiles** and **VRS Alleles** using `AlleleFactory`.  

3. **Perform Bidirectional Translation**  
   - Convert **FHIR → VRS** and **VRS → FHIR** using `VrsFhirAlleleTranslation`.  

In [8]:
# Importing the `AlleleFactory` class from the utils module
from utils.allele_factory import AlleleFactory

# Creating an instance of `AlleleFactory` to generate FHIR AlleleProfile and VRS Allele objects
build_allele = AlleleFactory()

In [9]:
# Creating an HL7 FHIR AlleleProfile using the `create_fhir_allele` function
example_allele_profile = build_allele.create_fhir_allele(
    context_sequence_id="NC_000002.12",
    start=27453448,
    end=27453449,
    allele_state="T",
    id_value="example-allele-profile"
)

# Converting the AlleleProfile object into a dictionary representation for easy viewing
example_allele_profile.model_dump()

{'resourceType': 'MolecularDefinition',
 'identifier': [{'value': 'example-allele-profile'}],
 'moleculeType': {'coding': [{'system': 'http://hl7.org/fhir/sequence-type',
    'code': 'dna',
    'display': 'DNA Sequence'}]},
 'location': [{'sequenceLocation': {'sequenceContext': {'display': 'NC_000002.12'},
    'coordinateInterval': {'coordinateSystem': {'system': {'coding': [{'system': 'http://loinc.org',
         'code': 'LA30100-4',
         'display': '0-based interval counting'}]}},
     'startQuantity': {'value': 27453448.0},
     'endQuantity': {'value': 27453449.0}}}}],
 'representation': [{'focus': {'coding': [{'system': 'http://hl7.org/fhir/moleculardefinition-focus',
      'code': 'allele-state'}]},
   'literal': {'value': 'T'}}]}
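
The `coordinateSystem` coding in the output marks the interval as 0-based. The sketch below shows what that means for the values above. It assumes the usual half-open reading of a 0-based interval, where `start` is counted before the first residue and `end` after the last, and converts the interval to a 1-based inclusive range. The function name `to_one_based_inclusive` is hypothetical and is not part of `allele_factory.py`.

```python
def to_one_based_inclusive(start: int, end: int) -> tuple[int, int]:
    """Convert a 0-based half-open interval to 1-based inclusive positions."""
    if end <= start:
        # A zero-length interval sits between residues and has no 1-based range.
        raise ValueError("interval must span at least one residue")
    return start + 1, end


first, last = to_one_based_inclusive(27453448, 27453449)
print(first, last)       # 27453449 27453449
print(last - first + 1)  # 1
```

Under that assumption, the interval spans a single residue, which matches the one-base `allele_state` of `T`.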

In [10]:
# Creating a GA4GH VRS Allele (Version 1.3) using the `create_vrs_allele` function
example_vrs_allele = build_allele.create_vrs_allele(
    context_sequence_id="NC_000002.12",
    start=27453448,
    end=27453449,
    allele_state="T",
    normalize=True
)

# Converting the VRS Allele object into a dictionary representation for easy viewing
example_vrs_allele.as_dict()

{'_id': 'ga4gh:VA.fXvhngewkkyVwzEeSJRr5tro8Jcol6Q-',
 'type': 'Allele',
 'location': {'_id': 'ga4gh:VSL.nLMbYalHO4OEI2axqkyTMCQxrH98UpDN',
  'type': 'SequenceLocation',
  'sequence_id': 'ga4gh:SQ.pnAqCRBrTsUoBghSD1yp_jXWSmlbdh4g',
  'interval': {'type': 'SequenceInterval',
   'start': {'type': 'Number', 'value': 27453448},
   'end': {'type': 'Number', 'value': 27453449}}},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'T'}}
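
The `_id` values in this output (`ga4gh:VA...`, `ga4gh:VSL...`) are computed identifiers. The VRS specification describes their digest part as a SHA-512 hash truncated to 24 bytes and base64url-encoded. The sketch below covers that digest step only. It omits the canonical serialization of the Allele that feeds the digest, so it will not reproduce the identifiers shown above. The function name follows the specification's `sha512t24u` label, but the code is an illustration and not the library's implementation.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """Truncated digest: first 24 bytes of SHA-512, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    # 24 bytes encode to exactly 32 base64 characters, so no padding appears.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


example = sha512t24u(b"ACGT")
print(len(example))  # 32
```

The 32-character length matches the part after `ga4gh:VA.` in the identifier above.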

In [11]:
# Importing the `VrsFhirAlleleTranslation` class from the `moldeftranslator` module
from moldeftranslator.allele_translator import VrsFhirAlleleTranslation

# Creating an instance of `VrsFhirAlleleTranslation` to enable bidirectional translation
# between GA4GH VRS and HL7 FHIR AlleleProfile representations
alleleTranslator = VrsFhirAlleleTranslation()

In [12]:
# Translating a GA4GH VRS Allele into an HL7 FHIR AlleleProfile
# This function takes a VRS Allele object and converts it into its corresponding FHIR representation
vrs_to_fhir_translation_example = alleleTranslator.vrs_allele_to_allele_profile(example_vrs_allele)

# Printing the type of the translated object to confirm the output class
print(type(vrs_to_fhir_translation_example))

# Converting the translated AlleleProfile object into a dictionary representation for easy viewing
vrs_to_fhir_translation_example.model_dump()

<class 'profiles.alleleprofile.AlleleProfile'>


{'resourceType': 'MolecularDefinition',
 'identifier': [{'value': 'ga4gh:VA.fXvhngewkkyVwzEeSJRr5tro8Jcol6Q-',
   'assigner': {'display': 'Global Alliance for Genomics and Health'}}],
 'moleculeType': {'coding': [{'system': 'http://hl7.org/fhir/sequence-type',
    'code': 'dna',
    'display': 'DNA Sequence'}]},
 'location': [{'sequenceLocation': {'sequenceContext': {'display': 'NC_000002.12'},
    'coordinateInterval': {'coordinateSystem': {'system': {'coding': [{'system': 'http://loinc.org',
         'code': 'LA30100-4',
         'display': '0-based interval counting'}]}},
     'startQuantity': {'value': 27453448.0},
     'endQuantity': {'value': 27453449.0}}}}],
 'representation': [{'focus': {'coding': [{'system': 'http://hl7.org/fhir/moleculardefinition-focus',
      'code': 'allele-state'}]},
   'literal': {'value': 'T'}}]}

In [13]:
# Translating an HL7 FHIR AlleleProfile into a GA4GH VRS Allele
# This function converts a FHIR AlleleProfile object into its corresponding VRS representation
fhir_to_vrs_translation_example = alleleTranslator.translate_allele_profile_to_vrs_allele(example_allele_profile)

# Printing the type of the translated object to confirm the output class
print(type(fhir_to_vrs_translation_example))

# Converting the translated VRS Allele object into a dictionary representation for easy viewing
fhir_to_vrs_translation_example.as_dict()

<class 'abc.Allele'>


{'_id': 'ga4gh:VA.fXvhngewkkyVwzEeSJRr5tro8Jcol6Q-',
 'type': 'Allele',
 'location': {'_id': 'ga4gh:VSL.nLMbYalHO4OEI2axqkyTMCQxrH98UpDN',
  'type': 'SequenceLocation',
  'sequence_id': 'ga4gh:SQ.pnAqCRBrTsUoBghSD1yp_jXWSmlbdh4g',
  'interval': {'type': 'SequenceInterval',
   'start': {'type': 'Number', 'value': 27453448},
   'end': {'type': 'Number', 'value': 27453449}}},
 'state': {'type': 'LiteralSequenceExpression', 'sequence': 'T'}}

### **Disclaimer on FHIR ↔ VRS Round-Trip Conversions**

When converting a **FHIR MolecularDefinition** object into a **GA4GH VRS Allele** object and then converting it back into **FHIR**, the resulting FHIR object will not be **exactly identical** to the original.

#### **Key Differences**
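- The **identifier** field changes: the second FHIR object carries the **computed GA4GH VRS identifier** (for example `ga4gh:VA.fXvhngewkkyVwzEeSJRr5tro8Jcol6Q-`) instead of the original identifier (`example-allele-profile`).
- An **assigner** element is added to the identifier to record the **source of the VRS-computed identifier** (`Global Alliance for Genomics and Health` in the output above).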

This means that if you compare the **initial FHIR object** with the **final FHIR object** after a round-trip conversion through VRS, you will see differences in these fields.
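
One way to compare the two FHIR objects despite these differences is to drop the `identifier` field from both dictionaries first. This is a minimal sketch using trimmed copies of the `model_dump()` outputs shown earlier. The helper `without_identifier` is hypothetical and is not part of the translator.

```python
def without_identifier(resource: dict) -> dict:
    """Return a shallow copy of a resource dict with 'identifier' removed."""
    return {key: value for key, value in resource.items() if key != "identifier"}


# Trimmed copies of the dictionaries shown above (only a few fields kept).
original = {
    "resourceType": "MolecularDefinition",
    "identifier": [{"value": "example-allele-profile"}],
    "representation": [{"literal": {"value": "T"}}],
}
round_tripped = {
    "resourceType": "MolecularDefinition",
    "identifier": [
        {
            "value": "ga4gh:VA.fXvhngewkkyVwzEeSJRr5tro8Jcol6Q-",
            "assigner": {"display": "Global Alliance for Genomics and Health"},
        }
    ],
    "representation": [{"literal": {"value": "T"}}],
}

print(original == round_tripped)  # False
print(without_identifier(original) == without_identifier(round_tripped))  # True
```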

#### **Consistency in VRS ↔ FHIR ↔ VRS Conversions**
If you instead start with a **VRS Allele** object, convert it into a **FHIR MolecularDefinition**, and then convert it back into **VRS**, the two **VRS objects will be identical**.


### Conclusion

For a more detailed look at the implementation, refer to the other notebooks, which give a step-by-step guide to creating these objects and translating between FHIR and VRS.  