# Usage

In [1]:
# Setup colab
try:
    from calipytion import Calibrator

except ImportError:
    print("Installing CaliPytion...")
    !git clone --quiet https://github.com/FAIRChemistry/CaliPytion.git
    %cd CaliPytion
    !git checkout dev > /dev/null 2>&1
    !pip install . > /dev/null 2>&1
    %cd docs

## The `Calibrator`

The `Calibrator` class is the main entry point for creating and using calibration models for concentration calculation. It provides all functionality to load data, add calibration models, fit them, calculate concentrations, and serialize the results.  
The `Calibrator` contains a `Standard`, the data structure that describes all aspects of a fitted calibration model, the measured standards, and the measurement conditions.

### Initialization

The `Calibrator` can be initialized by providing the `concentrations` and the respective measured `signals` as lists. Additionally, the `molecule_id`, the `pubchem_cid`, and the `conc_unit` of the concentrations need to be provided. The `molecule_id` serves as an internal id to reference the molecule in equations (e.g. 's1'). The `pubchem_cid` serves as a global identifier that uniquely references the molecule in the PubChem database. Optionally, a `molecule_name` can be provided, as well as a `wavelength` if the signal data originates from a spectrophotometric measurement. A `cutoff` can be set to exclude signals above a certain value.
Units are handled as predefined objects, which can be imported from the `calipytion.units` module.  

In [2]:
from calipytion import Calibrator

calibrator = Calibrator(
    molecule_id="s0",
    pubchem_cid=439153,
    molecule_name="NADH",
    conc_unit="mmol / l",
    concentrations=[0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5],
    signals=[0.02, 0.509, 1.076, 1.534, 2.008, 2.482, 2.898, 3.176],
    wavelength=420,
)
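
The effect of a `cutoff` can be pictured as a filter on the calibration pairs. The following is a minimal sketch in plain Python, not CaliPytion's implementation: the helper `apply_cutoff` is hypothetical, and it assumes that a pair is dropped when its signal exceeds the cutoff.

```python
def apply_cutoff(concentrations, signals, cutoff):
    """Hypothetical helper: keep only pairs whose signal does not exceed the cutoff."""
    kept = [(c, s) for c, s in zip(concentrations, signals) if s <= cutoff]
    kept_concs = [c for c, _ in kept]
    kept_signals = [s for _, s in kept]
    return kept_concs, kept_signals


concentrations = [0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5]
signals = [0.02, 0.509, 1.076, 1.534, 2.008, 2.482, 2.898, 3.176]

# With a cutoff of 3.0, only the highest standard (signal 3.176) is excluded.
kept_concs, kept_signals = apply_cutoff(concentrations, signals, cutoff=3.0)
print(kept_concs)
print(kept_signals)
```

Concentrations and signals are filtered together so that the two lists stay aligned.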

#### From `.xlsx` file

Alternatively, the calibrator can be initialized from a sheet of an Excel file using the `from_excel` method. By default, the first sheet is loaded; a different sheet can be selected with `sheet_name`. Alongside the `path` of the file, the `molecule_id`, `molecule_name`, and `conc_unit` need to be provided. Optionally, the `wavelength` can be provided. The number of rows to skip can be set with `skip_rows`. By default, the first row is skipped, as it is assumed to contain the column names.


In [3]:
from calipytion import Calibrator

calibrator = Calibrator.from_excel(
    path="data/cal_data.xlsx",
    molecule_id="s0",
    pubchem_cid=439153,
    molecule_name="NADH",
    conc_unit="mmol/l",
    wavelength=420,
    skip_rows=1,
)

In [4]:
calibrator


Calibrator(
    molecule_id='s0',
    pubchem_cid=439153,
    molecule_name='NADH',
    concentrations=[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5],
    wavelength=420.0,
    conc_unit=UnitDefinition(
        id='mmol / l',
        name='mmol / l',
        base_units=[
            BaseUnit(kind=<UnitType.MOLE: 'mole'>, exponent=1, multiplier=1.0, scale=-3.0),
            BaseUnit(kind=<UnitType.LITRE: 'litre'>, exponen

### Adding calibration models
By default, the `Calibrator` is initialized with three polynomial calibration models: a linear, a quadratic, and a cubic model. These models do not include an intercept parameter, so the signals are assumed to be blanked, that is, the signal at zero concentration is assumed to be zero.  
Additional models can be added with the `add_model` method. The method requires a `name` and a `signal_law`, which describes the mathematical relationship between the signal and the concentration. Only the right-hand side of the equation needs to be provided; the left-hand side is assumed to be the signal. The variable for the concentration must match the `molecule_id` provided during initialization.
Furthermore, an `initial_value`, a `lower_bound`, and an `upper_bound` can be provided for all parameters.


In [5]:
# delete default models
calibrator.models = []

# add a linear, quadratic, cubic, and rational model with intercept
linear = calibrator.add_model(
    name="linear",
    signal_law="a * s0 + b",
    upper_bound=1e8,
    lower_bound=-1e8,
)
quadratic = calibrator.add_model(
    name="quadratic",
    signal_law="a * s0 ** 2 + b * s0 + c",
    upper_bound=1e8,
    lower_bound=-1e8,
)
cubic = calibrator.add_model(
    name="cubic",
    signal_law="a * s0 ** 3 + b * s0 ** 2 + c * s0 + d",
    upper_bound=1e8,
    lower_bound=-1e8,
)
rational = calibrator.add_model(
    name="rational",
    signal_law="a * s0 / (b + s0) + c",
    upper_bound=1e8,
    lower_bound=-1e8,
)
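
To make the structure of a `signal_law` concrete, the sketch below uses only the Python standard library to separate the concentration variable from the free parameters and to evaluate the right-hand side. It is an illustration under stated assumptions and does not show how CaliPytion parses equations; `parameters_of` and `evaluate_signal_law` are hypothetical helpers, and the numeric values are made up.

```python
import ast


def parameters_of(signal_law, molecule_id):
    """Hypothetical helper: return the parameter names used in a signal law."""
    tree = ast.parse(signal_law, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    if molecule_id not in names:
        # The concentration variable must match the molecule id.
        raise ValueError(f"'{molecule_id}' does not appear in '{signal_law}'")
    return sorted(names - {molecule_id})


def evaluate_signal_law(signal_law, values):
    """Hypothetical helper: evaluate the right-hand side for the given values."""
    code = compile(signal_law, "<signal_law>", "eval")
    # Builtins are disabled; still, eval should only see trusted expressions.
    return eval(code, {"__builtins__": {}}, dict(values))


params = parameters_of("a * s0 / (b + s0) + c", molecule_id="s0")
print(params)

# Made-up parameter values for the linear law; the result is the signal.
signal = evaluate_signal_law("a * s0 + b", {"a": 0.9, "b": 0.05, "s0": 2.0})
print(signal)
```

Every name in the expression other than the `molecule_id` is treated as a parameter to be fitted, which is why `s0` must appear in each `signal_law` above.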

### Fitting calibration models

After specifying the calibration models, the models can be fitted to the data using the `fit_models` method. This fits all models that were added to the calibrator. After fitting, a table with fit statistics such as Akaike Information Criterion (AIC), R<sup>2</sup>, and Root Mean Squared Deviation (RMSD) is printed alongside the equation and relative parameter standard errors of each model.


In [6]:
calibrator.fit_models()

✅ Models have been successfully fitted.
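
For orientation, the reported statistics can be reproduced by hand for a linear model. The sketch below fits `a * s0 + b` by ordinary least squares in plain Python and computes R<sup>2</sup>, RMSD, and AIC. The AIC shown is the common least-squares form `n * ln(RSS / n) + 2 * k`; whether CaliPytion uses exactly these definitions and the same fitting routine is an assumption.

```python
import math


def fit_linear(x, y):
    """Closed-form ordinary least squares for y = a * x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def fit_statistics(y, y_pred, n_params):
    """R^2, RMSD, and least-squares AIC (assumed definitions)."""
    n = len(y)
    rss = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return {
        "r2": 1 - rss / tss,
        "rmsd": math.sqrt(rss / n),
        "aic": n * math.log(rss / n) + 2 * n_params,
    }


concentrations = [0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5]
signals = [0.02, 0.509, 1.076, 1.534, 2.008, 2.482, 2.898, 3.176]

a, b = fit_linear(concentrations, signals)
predicted = [a * c + b for c in concentrations]
stats = fit_statistics(signals, predicted, n_params=2)
print(a, b, stats)
```

A lower AIC indicates a better trade-off between goodness of fit and the number of parameters, which is why it is useful for comparing the linear, quadratic, cubic, and rational models.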


### Visualizing calibration models

After fitting, the calibration models can be visualized using the `visualize` method. The method returns an interactive figure, allowing the user to inspect the calibration models and the residuals and to compare the models. 

In [7]:
calibrator.visualize()

### Concentration calculation

For concentration calculation, one of the fitted models is chosen and passed to the `calculate_concentrations` method, together with the unknown signals as a list. By default, concentrations are only calculated for signals within the calibration range, which is derived from the calibration data. Optionally, extrapolation can be enabled by setting `extrapolate=True`.


In [8]:
concentrations = calibrator.calculate_concentrations(
    model=cubic, signals=[0.509, 0.9, 3.54], extrapolate=True
)

print(concentrations)

2025-05-11 20:57:17.484 | DEBUG    | calipytion.tools.fitter:calculate_critical_points:201 - Critical points: [(-3.06895648248798, -1.81591111283094), (4.73655996101835, 3.54974739038730)]


[0.4857660054942056, 0.8658856333797266, 4.542875000854823]


The `calculate_concentrations` method returns a list of concentrations. If a signal is outside the calibration range, the concentration is set to `nan`, unless extrapolation is enabled with `extrapolate=True`, as in the example above.
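
Calculating a concentration amounts to inverting the fitted signal law. The sketch below illustrates the idea with a bisection search and a `nan` result for signals outside the calibrated range. It is not CaliPytion's solver: the helper `invert_by_bisection` and the model parameters are hypothetical, the model is assumed to be increasing over the range, and the range check is assumed to be based on the signals at the lowest and highest standard.

```python
import math


def invert_by_bisection(model, signal, lower, upper, tol=1e-10):
    """Find c in [lower, upper] with model(c) == signal.

    Assumes the model is increasing on [lower, upper]. Returns nan if the
    signal lies outside the signals spanned by that range.
    """
    if not (model(lower) <= signal <= model(upper)):
        return math.nan
    lo, hi = lower, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if model(mid) < signal:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def quadratic(c):
    """Made-up quadratic signal law; the parameters are not fitted values."""
    return -0.03 * c**2 + 1.0 * c + 0.02


inside = invert_by_bisection(quadratic, quadratic(1.2), lower=0.0, upper=3.5)
outside = invert_by_bisection(quadratic, 3.54, lower=0.0, upper=3.5)
print(inside, outside)
```

Restricting the search to the calibrated range also avoids picking a second, physically meaningless root of a nonlinear model.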

### Serialization

Finally, the data of the calibrator can be enriched with additional information on the calibrated molecule and the measurement conditions to form a valid `Standard`. This is done with the `create_standard` method, which takes the chosen `model` and expects a `ph`, a `temperature`, and a `temp_unit`. Furthermore, the `retention_time` can be provided to further characterize the context of the calibration.


In [9]:
standard = calibrator.create_standard(
    model=cubic,
    ph=7.4,
    temperature=25,
    temp_unit="C",
    retention_time=7.53,
)

# save the standard as a JSON file
with open("data/abts_standard.json", "w") as f:
    f.write(standard.model_dump_json(indent=4))

### Applying a calibrator to an EnzymeML Document

The `Calibrator` can be applied to an `EnzymeMLDocument` to calculate concentrations of signals. This is done by using the `apply_to_enzymeml` method. The method expects an `EnzymeMLDocument` as input. Optionally, the `extrapolate` parameter can be set to enable extrapolation.  
In order to apply a calibrator to an `EnzymeMLDocument`, the measurement data within the `EnzymeMLDocument` must reference the same molecule. Therefore, the `species_id` in the `EnzymeMLDocument` must match the `molecule_id` of the `Calibrator`. The `EnzymeMLDocument` must be an instance of `pyenzyme.EnzymeMLDocument`.


In [10]:
import json
from pyenzyme import EnzymeMLDocument

with open("data/enzymeml.json", "r") as f:
    enzmldoc = EnzymeMLDocument(**json.load(f))

calibrator.apply_to_enzymeml(enzmldoc, extrapolate=False)

✅ Applied calibration to 2 measurements

The calibrator can also be converted to an AnIML document with the `to_animl` method. The resulting document can then be written to an XML file.


In [11]:
animl_document = calibrator.to_animl()

with open("data/data.animl", "wb") as f:
    f.write(
        animl_document.to_xml(
            pretty_print=True,
            encoding="utf-8",
            standalone=True,
            xml_declaration=True,
            skip_empty=True,
            exclude_none=True,
        )
    )

✅ Created AnIML document



Only one type is supported for adder methods. IndividualValueSet.values has multiple types. Skipping.

