# Basic Example

This example demonstrates key concepts of the mdmodels library:

- Loading and parsing data model definitions from markdown files
- Creating and working with model objects (ChemicalProject, Molecules, etc.)
- Validating data and references between objects
- Serializing data to different formats (JSON, XML)

We'll cover:
1. Loading a chemical project data model
2. Creating model instances and populating with data
3. Working with nested objects and collections
4. Data validation and reference checking
5. Converting objects to JSON and XML formats


In [1]:
import rich
import mdmodels

mdmodels.patch_nest_asyncio()

from mdmodels import DataModel

### Parsing the markdown file

The `DataModel.from_markdown` method parses the markdown file and returns a `Library` object. The `Library` contains a class for each object defined in the markdown file, such as `ChemicalProject`. These classes are not yet instantiated; you use them to create new objects and populate them with data.

In [2]:
dm = DataModel.from_markdown("model.md")

# Unsure which objects are available? Print the library.
rich.print(dm)

# Optionally, you can use the `info` method to get a list of all available objects and attributes.
# We will omit this for now, as it will print a lot of information.
# dm.info()

# Unsure which attributes are available for a given object? Use the `info` method.
dm.ChemicalProject.info()


### Creating objects

The goal of the mdmodels library is to generate a so-called object-oriented data model from the markdown file. This data model can then be used to create and manage complex datasets. What does this mean?

- Each object in the data model corresponds to a class in Python.
- Each attribute of an object corresponds to a field/property in the class.
- Each object can have multiple attributes, which can be used to group related data.
- Attributes can be either singular or collections.
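
As a rough mental model of the points above, here is a hand-written sketch using only the standard library. The `Project` and `Molecule` classes are hypothetical stand-ins written for illustration; they are not the classes mdmodels generates, and the generated classes may behave differently in detail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Molecule:
    # Each attribute of the object becomes a field of the class.
    id: str
    name: str
    formula: str


@dataclass
class Project:
    title: str  # singular attribute
    molecules: List[Molecule] = field(default_factory=list)  # collection attribute

    def add_to_molecules(self, **kwargs) -> Molecule:
        """Create a Molecule, append it to the collection, and return it."""
        molecule = Molecule(**kwargs)
        self.molecules.append(molecule)
        return molecule


project = Project(title="My Project")
ethanol = project.add_to_molecules(id="mol1", name="Molecule 1", formula="CCO")
```

Returning the newly created object from the `add_to_...` helper is what makes it convenient to build nested structures step by step.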

Let's create a `ChemicalProject` object!

In [3]:
# This instantiates a `ChemicalProject` object with the title 'My Project'.
project = dm.ChemicalProject(title="My Project")

# Attributes that are collections can be extended using the `add_to_<attribute>` method.
# Here we add two molecules to the `molecules` collection of the `project` object.
project.add_to_molecules(id="mol1", name="Molecule 1", formula="CCO")
project.add_to_molecules(id="mol2", name="Molecule 2", formula="CCN")

# Sometimes we are working with more nested objects and want to build them up step by step.
experiment = project.add_to_experiments(id="exp1")
experiment.add_to_initial_concentrations(molecule_id="mol1", value=1.0, unit="mmol/L")
experiment.add_to_initial_concentrations(molecule_id="mol2", value=2.0, unit="mmol/L")

# This is what the project looks like after adding the experiment.
rich.print(project)

### Validation

By default, the `DataModel` object validates the objects against the schema defined in the markdown file. This is useful to catch type errors and other issues early on. However, this is limited to very basic checks. For instance, we are not able to check that the molecules used in the experiment are indeed part of the project.

To solve this issue, you can use the attribute option `References`, which lets an attribute refer to an attribute of another object in the library. This works even if the two objects sit in very different branches of the object tree. Here is the `molecule_id` attribute of the `Concentration` object:

```markdown
- molecule_id
    - Type: string
    - Description: The identifier of the molecule.
    - References: ChemicalProject.molecules.id
```

In this case, we are telling the `DataModel` object that the `molecule_id` attribute of the `Concentration` object must reference the `id` attribute of the `Molecule` object in the `ChemicalProject` object. Otherwise, the dataset is not valid.

In our example, we can use this to check that the molecules used in the experiment are indeed part of the project. This ensures consistency of the dataset and gives us a way to detect errors early on. For the sake of this example, let's introduce an error by adding concentrations for molecules that are not part of the project.

In [4]:
from pydantic import ValidationError

# These molecules do not exist in the project, so the dataset is not valid.
experiment.add_to_initial_concentrations(molecule_id="mol3", value=3.0, unit="mmol/L")
experiment.add_to_initial_concentrations(molecule_id="mol4", value=4.0, unit="mmol/L")

# Let's validate the project again.
try:
    project.validate()
except ValidationError as e:
    rich.print(e)


In [5]:
# Let's fix the error by adding the missing molecules to the project.
project.add_to_molecules(id="mol3", name="Molecule 3", formula="CCC")
project.add_to_molecules(id="mol4", name="Molecule 4", formula="CCO")

# Now the dataset is valid again, so this should not raise any errors.
project.validate()
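
To make the reference check less of a black box, here is a minimal plain-Python sketch of the idea: collect every known molecule `id`, then flag each `molecule_id` that points to nothing. The function `find_dangling_references` and the dictionary layout are assumptions made for illustration; they are not part of mdmodels, whose own validation may work quite differently.

```python
def find_dangling_references(project: dict) -> list:
    """Return (experiment_id, molecule_id) pairs whose molecule is unknown."""
    known_ids = {molecule["id"] for molecule in project["molecules"]}
    dangling = []
    for experiment in project["experiments"]:
        for concentration in experiment["initial_concentrations"]:
            if concentration["molecule_id"] not in known_ids:
                dangling.append((experiment["id"], concentration["molecule_id"]))
    return dangling


# Hypothetical dataset mirroring the structure used in this example.
project_data = {
    "molecules": [{"id": "mol1"}, {"id": "mol2"}],
    "experiments": [
        {
            "id": "exp1",
            "initial_concentrations": [
                {"molecule_id": "mol1", "value": 1.0},
                {"molecule_id": "mol3", "value": 3.0},  # not defined above
            ],
        }
    ],
}

missing = find_dangling_references(project_data)

# Adding the missing molecule resolves the dangling reference.
project_data["molecules"].append({"id": "mol3"})
missing_after_fix = find_dangling_references(project_data)
```

The sketch hard-codes a single reference path; the `References` option expresses the same relationship declaratively in the markdown file.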


### Serialization

Now that our dataset is validated, it's the perfect time to immortalize it! With just a snap, the `model_dump_json` method transforms your dataset into a sleek JSON string. Want it in XML? No problem! The `xml` method has got you covered.

In [6]:
# To JSON
rich.print(project.model_dump_json(indent=2))

# To XML
rich.print(project.xml())
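
For readers unfamiliar with the two target formats, the following standard-library sketch shows the same small dataset written as JSON and as XML. The element and attribute layout is invented for illustration and is an assumption; the XML produced by the `xml` method may be structured differently.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, simplified dataset.
data = {
    "title": "My Project",
    "molecules": [{"id": "mol1", "name": "Molecule 1", "formula": "CCO"}],
}

# JSON: nested dictionaries and lists map directly onto objects and arrays.
json_text = json.dumps(data, indent=2)

# XML: here the title becomes a child element and each molecule
# becomes an element whose fields are stored as XML attributes.
root = ET.Element("ChemicalProject")
ET.SubElement(root, "title").text = data["title"]
for molecule in data["molecules"]:
    ET.SubElement(root, "molecule", attrib=molecule)
xml_text = ET.tostring(root, encoding="unicode")
```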

### Conversion

The `convert_to` method allows you to convert the generated `Library` object (your data model) to a different format or programming language. For instance, you can convert the data model to an XSD schema, a Mermaid class diagram, or even Rust or Julia code.

In [7]:
from mdmodels import Templates

# Convert to XSD schema
with open("model.xsd", "w") as f:
    f.write(dm.convert_to(Templates.XML_SCHEMA))

# Convert to Mermaid Class Diagram
with open("uml_diagram.md", "w") as f:
    f.write(dm.convert_to(Templates.MERMAID))

# Convert to Rust code
with open("model.rs", "w") as f:
    f.write(dm.convert_to(Templates.RUST))

# Convert to Julia code
with open("model.jl", "w") as f:
    f.write(dm.convert_to(Templates.JULIA))
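
To give a feel for what such a conversion involves, here is a small sketch that renders a schema as a Mermaid class diagram. The `to_mermaid` function and the dictionary-based schema are hypothetical and written for illustration only; the output of `Templates.MERMAID` will likely differ in layout and detail.

```python
def to_mermaid(schema: dict) -> str:
    """Render a {class_name: {attribute: type}} mapping as a Mermaid class diagram."""
    lines = ["classDiagram"]
    for class_name, attributes in schema.items():
        lines.append(f"    class {class_name} {{")
        for attribute, type_name in attributes.items():
            lines.append(f"        +{type_name} {attribute}")
        lines.append("    }")
    return "\n".join(lines)


# Hypothetical schema with a single object.
schema = {"Molecule": {"id": "string", "name": "string", "formula": "string"}}
diagram = to_mermaid(schema)
```

The same pattern, walking the data model and emitting text for each object and attribute, applies to any target format.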
