## Representing Cre-Lox Recombination with SBOL2

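Cre-Lox recombination is a site-specific recombination system in which Cre recombinase acts on a pair of loxP sites. It is widely used in synthetic biology to flip (invert) or excise segments of DNA, with the outcome depending on the relative orientation of the two sites. This makes it a convenient switch for activating or deactivating genes or other elements.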

In this example, we will represent a situation where an unknown coding sequence (CDS) is initially in the wrong orientation and flanked by two loxP recombination sites. Upon recombination, the CDS will be flipped into the correct orientation. We'll use SBOL2 concepts like `Component`, `SequenceAnnotation`, `Range`, `GenericLocation`, and `SequenceConstraint` to model this event.

In [6]:
import sbol2

# Create an SBOL document
doc = sbol2.Document()

# Set a namespace for the document
sbol2.setHomespace('https://github.com/SynBioDex/SBOL-Notebooks')

# Create a ComponentDefinition for the entire Cre-Lox recombination system
recombination_system = sbol2.ComponentDefinition('CreLox_Recombination_System', sbol2.BIOPAX_DNA)
doc.addComponentDefinition(recombination_system)

# Create the ComponentDefinitions for the Cre-Lox recombination sites
# ComponentDefinition for the first loxP site
loxP1 = sbol2.ComponentDefinition('loxP_site_1', sbol2.BIOPAX_DNA)
loxP1_seq = sbol2.Sequence('loxP_seq_1', 'ATAACTTCGTATAATGTATGCTATACGAAGTTAT', sbol2.SBOL_ENCODING_IUPAC)
doc.addSequence(loxP1_seq)  # add the Sequence so the reference below resolves within the document
loxP1.sequences = [loxP1_seq.identity]
doc.addComponentDefinition(loxP1)

# ComponentDefinition for the second loxP site
loxP2 = sbol2.ComponentDefinition('loxP_site_2', sbol2.BIOPAX_DNA)
loxP2_seq = sbol2.Sequence('loxP_seq_2', 'ATAACTTCGTATAATGTATGCTATACGAAGTTAT', sbol2.SBOL_ENCODING_IUPAC)
doc.addSequence(loxP2_seq)
loxP2.sequences = [loxP2_seq.identity]
doc.addComponentDefinition(loxP2)

# Create a ComponentDefinition for the CDS (Coding Sequence)
# The CDS is unknown or "not yet selected," so we use a placeholder sequence of N bases
cds = sbol2.ComponentDefinition('unknown_CDS', sbol2.BIOPAX_DNA)
cds_seq = sbol2.Sequence('unknown_cds_seq', 'NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN', sbol2.SBOL_ENCODING_IUPAC)
doc.addSequence(cds_seq)
cds.sequences = [cds_seq.identity]
doc.addComponentDefinition(cds)

# Add the loxP sites and CDS as components to the CreLox_Recombination_System ComponentDefinition
loxP1_comp = recombination_system.components.create('loxP1_component')
loxP1_comp.definition = loxP1.persistentIdentity

cds_comp = recombination_system.components.create('cds_component')
cds_comp.definition = cds.persistentIdentity

loxP2_comp = recombination_system.components.create('loxP2_component')
loxP2_comp.definition = loxP2.persistentIdentity

# Create SequenceAnnotations on the parent system to place each sub-component
# Annotation for the first loxP site: a Range covering its 34 bp
loxP1_annotation = sbol2.SequenceAnnotation('loxP1_annotation')
loxP1_range = sbol2.Range('loxP1_range', 1, len(loxP1_seq.elements))
loxP1_annotation.locations.add(loxP1_range)
recombination_system.sequenceAnnotations.add(loxP1_annotation)
loxP1_annotation.component = loxP1_comp.identity  # tie the annotation to its Component

# GenericLocation for the CDS (unknown or not yet selected, so we leave boundaries flexible)
cds_annotation = sbol2.SequenceAnnotation('cds_annotation')
cds_location = sbol2.GenericLocation('cds_generic_location')
cds_location.orientation = sbol2.SBOL_ORIENTATION_REVERSE_COMPLEMENT  # Initially reverse
cds_annotation.locations.add(cds_location)
recombination_system.sequenceAnnotations.add(cds_annotation)
cds_annotation.component = cds_comp.identity

# GenericLocation for the second loxP site (second recombination site)
# Inversion needs the two loxP sites in opposite orientations (same orientation gives excision)
loxP2_annotation = sbol2.SequenceAnnotation('loxP2_annotation')
loxP2_location = sbol2.GenericLocation('loxP2_generic_location')
loxP2_location.orientation = sbol2.SBOL_ORIENTATION_REVERSE_COMPLEMENT
loxP2_annotation.locations.add(loxP2_location)
recombination_system.sequenceAnnotations.add(loxP2_annotation)
loxP2_annotation.component = loxP2_comp.identity

# Create SequenceConstraints to describe the relative order of the components
# loxP1 precedes the CDS ('precedes' fixes order only; it does not require adjacency)
constraint1 = recombination_system.sequenceConstraints.create('loxP1_to_CDS')
constraint1.subject = loxP1_comp.identity
constraint1.object = cds_comp.identity
constraint1.restriction = sbol2.SBOL_RESTRICTION_PRECEDES

# The CDS precedes loxP2
constraint2 = recombination_system.sequenceConstraints.create('CDS_to_loxP2')
constraint2.subject = cds_comp.identity
constraint2.object = loxP2_comp.identity
constraint2.restriction = sbol2.SBOL_RESTRICTION_PRECEDES

In [7]:
# Validate the document to ensure compliance with SBOL standards
doc.validate()

'Valid.'

In [8]:
# Save the document to an SBOL file
doc.write('cre_lox_recombination.xml')

'Valid.'
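
As a quick sanity check on the saved file, the RDF/XML can be inspected with the Python standard library alone. The sketch below is only an illustration: `list_component_definitions` is a hypothetical helper, and `SAMPLE` is a hand-written fragment with made-up `example.org` URIs, not actual output of `doc.write()`. It relies on SBOL2 serializing each `ComponentDefinition` as an `sbol:ComponentDefinition` element with an `rdf:about` URI and an `sbol:displayId` child.

```python
import xml.etree.ElementTree as ET

SBOL_NS = "http://sbols.org/v2#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_component_definitions(xml_data):
    """Return (displayId, URI) pairs for every ComponentDefinition element."""
    root = ET.fromstring(xml_data)
    found = []
    for cd in root.iter(f"{{{SBOL_NS}}}ComponentDefinition"):
        display_id = cd.findtext(f"{{{SBOL_NS}}}displayId")
        found.append((display_id, cd.get(f"{{{RDF_NS}}}about")))
    return found


# Hand-written fragment for illustration only; the URIs are placeholders.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sbol="http://sbols.org/v2#">
  <sbol:ComponentDefinition rdf:about="https://example.org/loxP_site_1/1">
    <sbol:displayId>loxP_site_1</sbol:displayId>
  </sbol:ComponentDefinition>
  <sbol:ComponentDefinition rdf:about="https://example.org/unknown_CDS/1">
    <sbol:displayId>unknown_CDS</sbol:displayId>
  </sbol:ComponentDefinition>
</rdf:RDF>
"""

for display_id, uri in list_component_definitions(SAMPLE):
    print(display_id, uri)
```

To run the same check on the real file, pass its contents read in binary mode, for example `open('cre_lox_recombination.xml', 'rb').read()`.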

### Explanation

In this example, we modeled a Cre-Lox recombination system:

1. **Cre-Lox Recombination System**: We defined a `ComponentDefinition` to represent the entire recombination system. This component holds the loxP sites and the unknown CDS as its sub-components.
   
2. **loxP Sites**: We created a `ComponentDefinition` for each of the two loxP recombination sites. Each one is linked to the known 34 bp loxP sequence and instantiated as a `Component` of the system.

3. **Unknown CDS**: We defined a `ComponentDefinition` for the unknown or not-yet-selected coding sequence (CDS), with a run of `N` bases as its placeholder sequence. Its `GenericLocation` records the reverse-complement orientation without committing to start and end coordinates. The CDS is also added as a component to the system.

4. **SequenceAnnotations**: We created `SequenceAnnotations` for each element (loxP sites and CDS), using a `Range` for the first loxP site and `GenericLocation` for the CDS and the second loxP site, enabling flexible positioning and orientation.

5. **SequenceConstraints**: Two `SequenceConstraint` objects were added to ensure that the CDS is flanked by the loxP sites in the correct order. These constraints enforce that:
   - The first loxP site precedes the CDS.
   - The CDS precedes the second loxP site.
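
To see what the two `precedes` constraints do and do not pin down, here is a minimal sketch that avoids the SBOL library entirely. The tuples in `PRECEDES` are hypothetical stand-ins for the two `SequenceConstraint` objects, and `order_from_precedes` is an illustrative helper, not part of pySBOL2. It derives a left-to-right order from the pairs with a topological sort (`graphlib`, Python 3.9+); no coordinates or adjacency are implied.

```python
from graphlib import TopologicalSorter

# Hypothetical stand-ins for the notebook's constraints:
# each tuple reads "subject precedes object".
PRECEDES = [
    ("loxP1_component", "cds_component"),
    ("cds_component", "loxP2_component"),
]


def order_from_precedes(constraints):
    """Return one left-to-right order consistent with every 'precedes' pair."""
    sorter = TopologicalSorter()
    for subject, obj in constraints:
        # add(node, *predecessors): the object must come after the subject.
        sorter.add(obj, subject)
    # Raises graphlib.CycleError if the constraints contradict each other.
    return list(sorter.static_order())


print(order_from_precedes(PRECEDES))
# ['loxP1_component', 'cds_component', 'loxP2_component']
```

Because the constraints only fix relative order, other parts could later be placed between the loxP sites and the CDS without violating them.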

This setup describes the state of the construct before a Cre-Lox recombination event: recombination between the two loxP sites would flip the unknown CDS into the correct orientation, potentially activating it. The document does not encode the recombination event itself or the resulting design. By modeling the system this way, we maintain flexibility in defining the sequence, structure, and relationships between components.
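
The flip itself can be illustrated at the sequence level with plain Python. This is a simplified sketch under stated assumptions, not a model of the recombination mechanism: the helper names are hypothetical, the 9-base cargo is a toy stand-in for the unknown CDS, and the second loxP site is assumed to sit in the opposite orientation to the first, which is the arrangement that yields inversion rather than excision. The loxP sequence is the one used in the notebook.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # 34 bp loxP site from the notebook
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_between_sites(seq, left_site, right_site):
    """Reverse-complement the segment between left_site and right_site.

    Assumes each site occurs once in seq and that left_site comes first.
    """
    start = seq.index(left_site) + len(left_site)
    end = seq.index(right_site, start)
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


# Toy cargo (assumption): a short ORF stored in the reverse orientation.
forward_cds = "ATGAAACCC"
loxp_inverted = reverse_complement(LOXP)  # second site faces the first
before = LOXP + reverse_complement(forward_cds) + loxp_inverted
after = invert_between_sites(before, LOXP, loxp_inverted)

print(before[len(LOXP):-len(LOXP)])  # cargo before recombination
print(after[len(LOXP):-len(LOXP)])   # cargo after recombination
```

Applying `invert_between_sites` a second time restores the original string, which mirrors the reversibility of inversion between two intact loxP sites.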