# Low level API

Most users will use `create_script_from_data()` to create the `INCAScript`, but if you want more control over the script it is useful to know how the underlying API works.

In [26]:
import pandas as pd
import pathlib
from incawrapper import (
    define_flux_measurements, 
    define_ms_data, 
    define_experiment, 
    define_model, 
    define_tracers, 
    define_options, 
    define_runner,
    define_reactions,
    INCAScript,
    run_inca
)
import ast
data_folder = pathlib.Path("./examples/Literature data/simple model")

To illustrate how to use the INCAWrapper, we will use a small toy model with 5 reactions, originally used as a test model in [1,2]. As described in the Input data notebook, the INCAWrapper takes inputs in the form of pandas dataframes, which must follow specific schemas. Let's have a look at the reactions data of our simple model.

In [27]:
reactions_data = pd.read_csv(data_folder / "reactions.csv")
reactions_data.head()

Unnamed: 0,model,rxn_id,rxn_eqn
0,simple_model,R1,A (abc) -> B (abc)
1,simple_model,R2,B (abc) <-> D (abc)
2,simple_model,R3,B (abc) -> C (bc) + E (a)
3,simple_model,R4,B (abc) + C (de) -> D (bcd) + E (a) + E (e)
4,simple_model,R5,D (abc) -> F (abc)


We see that the simple toy model consists of 5 reactions, each defined with an atom map and a unique identifier. Let's move on to the tracer data.

In [28]:
tracers_data = pd.read_csv(
    data_folder / "tracers.csv",
    converters={'atom_ids': ast.literal_eval, 'atom_mdv': ast.literal_eval},
)  # remove id, add purity
tracers_data.head()

Unnamed: 0,experiment_id,met_id,tracer_id,atom_ids,ratio,atom_mdv,enrichment
0,exp1,A,[2-13C]A,[2],1.0,"[0, 1]",1
1,exp2,A,"[1,2-13C]A","[1, 2]",0.5,"[0.05, 0.95]",1


Our data set contains two experiments, each carried out with a different labelled substrate of metabolite A. In this data set we have measurements of exchange fluxes and MS data for some metabolites. Notice that we use the `converters` argument to properly read some of the data (see XX for more information).

In [29]:
flux_data = pd.read_csv(data_folder / "flux_measurements.csv")
ms_data = pd.read_csv(data_folder / "ms_measurements.csv", 
   converters={'labelled_atom_ids': ast.literal_eval, 'idv': ast.literal_eval, 'idv_std_error': ast.literal_eval}
)
flux_data.head()

Unnamed: 0,experiment_id,rxn_id,flux,flux_std_error
0,exp1,R1,10.0,1e-05


In [30]:
ms_data.head()

Unnamed: 0,experiment_id,met_id,ms_id,measurement_replicate,labelled_atom_ids,unlabelled_atoms,mass_isotope,intensity,intensity_std_error,time
0,exp1,F,F1,1,"[1, 2, 3]",,0,0.0001,2e-06,0
1,exp1,F,F1,1,"[1, 2, 3]",,1,0.8008,0.016016,0
2,exp1,F,F1,1,"[1, 2, 3]",,2,0.1983,0.003966,0
3,exp1,F,F1,1,"[1, 2, 3]",,3,0.0009,1.8e-05,0
4,exp2,F,F1,1,"[1, 2, 3]",,0,0.0002,2e-06,0
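
The `converters` argument used in the `pd.read_csv` calls above is needed because list-valued cells are stored as plain text in a CSV file. `pd.read_csv` calls each converter function on the raw string of the named column, and `ast.literal_eval` turns a string such as `"[1, 2]"` into a real Python list. The following is a minimal sketch of that conversion using only the standard library; the inline CSV text is a hand-copied subset of the tracers table shown earlier and stands in for the real file.

```python
import ast
import csv
import io

# Two rows mimicking tracers.csv (values copied from the table shown earlier).
raw_csv = '''experiment_id,met_id,tracer_id,atom_ids,ratio,atom_mdv
exp1,A,[2-13C]A,[2],1.0,"[0, 1]"
exp2,A,"[1,2-13C]A","[1, 2]",0.5,"[0.05, 0.95]"
'''

rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Without a converter, a list-like cell is just a string.
raw_atom_ids = rows[1]["atom_ids"]

# ast.literal_eval safely parses the string into a Python literal (here a list).
parsed_atom_ids = ast.literal_eval(raw_atom_ids)
parsed_atom_mdv = ast.literal_eval(rows[1]["atom_mdv"])

print(type(raw_atom_ids).__name__, repr(raw_atom_ids))
print(type(parsed_atom_ids).__name__, parsed_atom_ids)
print(parsed_atom_mdv)
```

Unlike `eval`, `ast.literal_eval` only accepts Python literals, so it is a reasonable choice for parsing data files.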


## Generate the MATLAB script
To prepare a model and data for 13C-MFA in INCA, we need to write a MATLAB script that specifies the model, the tracers and the measured data. The script is constructed incrementally, line by line, within an `INCAScript` object. When the INCAWrapper executes INCA, it essentially runs this script within the MATLAB environment.

To ensure the orderly execution of the MATLAB code, the `INCAScript` object is organized into discrete code blocks. Users populate these blocks one at a time until the script is complete. The central mechanism is the `.add_to_block()` method of the `INCAScript` object, used in combination with a function that generates the corresponding MATLAB code string.
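
The block mechanism can be pictured with a toy class. The following is a minimal sketch and not the actual `INCAScript` implementation: the class name `MiniScript` and its internals are hypothetical, and only the block names are taken from this notebook. It shows the idea that code is appended to named blocks and that the blocks are always emitted in a fixed order, regardless of the order in which they were filled.

```python
class MiniScript:
    """Toy stand-in for a block-structured script (hypothetical, not INCAScript)."""

    # Fixed block order, so the generated code always runs in the same sequence.
    BLOCK_ORDER = [
        "reactions", "tracers", "fluxes", "ms_fragments",
        "experiments", "model", "options", "runner",
    ]

    def __init__(self):
        # Each block starts empty and collects code strings.
        self.blocks = {name: [] for name in self.BLOCK_ORDER}

    def add_to_block(self, block_name, code):
        """Append a code string to the named block."""
        if block_name not in self.blocks:
            raise KeyError(f"Unknown block: {block_name}")
        self.blocks[block_name].append(code)

    def __str__(self):
        # Concatenate the blocks in the fixed order, skipping empty ones.
        parts = []
        for name in self.BLOCK_ORDER:
            parts.extend(self.blocks[name])
        return "\n".join(parts)


toy = MiniScript()
toy.add_to_block("tracers", "% tracer code")      # added first
toy.add_to_block("reactions", "% reaction code")  # added second
toy_text = str(toy)
print(toy_text)
```

Even though the tracer code was added first, the reaction code appears first in the printed script, because the block order is fixed.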

Now, let's define the model within the `INCAScript`. To do this, we use a script-writing function named `define_reactions()`. This function takes a dataframe describing the reactions and returns a MATLAB code string that defines the reactions within an INCA model.

In [31]:
print(define_reactions(reactions_data))

% Create reactions
r = [...
reaction('A (abc) -> B (abc)', 'id', 'R1'),...
reaction('B (abc) <-> D (abc)', 'id', 'R2'),...
reaction('B (abc) -> C (bc) + E (a)', 'id', 'R3'),...
reaction('B (abc) + C (de) -> D (bcd) + E (a) + E (e)', 'id', 'R4'),...
reaction('D (abc) -> F (abc)', 'id', 'R5'),...
];
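
To make the idea of a script-writing function concrete, here is a minimal sketch of how a table of reactions could be turned into a MATLAB string with the same layout as the output above. The function `sketch_define_reactions` is hypothetical and is not the implementation used by `define_reactions()`; it takes a plain list of dictionaries instead of a dataframe so that it only needs the standard library.

```python
def sketch_define_reactions(reactions):
    """Build a MATLAB reaction array from rows with 'rxn_id' and 'rxn_eqn' keys.

    Illustrative only; the real define_reactions() may work differently.
    """
    lines = ["% Create reactions", "r = [..."]
    for rxn in reactions:
        # One MATLAB reaction(...) call per row, continued with '...'.
        lines.append(f"reaction('{rxn['rxn_eqn']}', 'id', '{rxn['rxn_id']}'),...")
    lines.append("];")
    return "\n".join(lines)


# Rows copied from the reactions table of the simple model.
toy_reactions = [
    {"rxn_id": "R1", "rxn_eqn": "A (abc) -> B (abc)"},
    {"rxn_id": "R2", "rxn_eqn": "B (abc) <-> D (abc)"},
    {"rxn_id": "R5", "rxn_eqn": "D (abc) -> F (abc)"},
]

matlab_code = sketch_define_reactions(toy_reactions)
print(matlab_code)
```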


To add the reaction definition to the `INCAScript` we use the `.add_to_block()` method.

In [32]:
script = INCAScript()
script.add_to_block('reactions', define_reactions(reactions_data))

We can view the reactions block in the `INCAScript`.

In [16]:
print(script.blocks['reactions'])

% REACTION BLOCK
% Create reactions
r = [...
reaction('A (abc) -> B (abc)', 'id', 'R1'),...
reaction('B (abc) <-> D (abc)', 'id', 'R2'),...
reaction('B (abc) -> C (bc) + E (a)', 'id', 'R3'),...
reaction('B (abc) + C (de) -> D (bcd) + E (a) + E (e)', 'id', 'R4'),...
reaction('D (abc) -> F (abc)', 'id', 'R5'),...
];


There are similar script-writing functions to generate the other parts of the MATLAB script. The experimental data is added to the `INCAScript` on a per-experiment basis. Thus, the functions that add experimental data also take an argument for the experiment id. One example is `define_tracers()`:

In [17]:
print(define_tracers(tracers_data, 'exp1'))

% define tracers used in exp1
t_exp1 = tracer({...
'[2-13C]A: A @ 2',...
});
t_exp1.frac = [1.0 ];
t_exp1.atoms.it(:,1) = [0,1];
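
The per-experiment behaviour can be sketched as a filter on the `experiment_id` column followed by string formatting. The helper `sketch_tracer_labels` below is hypothetical and not part of the INCAWrapper. It assumes that multiple labelled atoms are separated by spaces, by analogy with the MS fragment string `F1: F @ 1 2 3` that appears in the generated script; this is an assumption, not something confirmed for tracers in this notebook.

```python
def sketch_tracer_labels(tracers, experiment_id):
    """Return tracer label strings for one experiment (illustrative only)."""
    labels = []
    for row in tracers:
        if row["experiment_id"] != experiment_id:
            continue  # rows from other experiments are ignored
        # Assumption: labelled atom positions are space-separated.
        atoms = " ".join(str(atom) for atom in row["atom_ids"])
        labels.append(f"{row['tracer_id']}: {row['met_id']} @ {atoms}")
    return labels


# Rows copied from the tracers table, with atom_ids already parsed into lists.
toy_tracers = [
    {"experiment_id": "exp1", "met_id": "A", "tracer_id": "[2-13C]A", "atom_ids": [2]},
    {"experiment_id": "exp2", "met_id": "A", "tracer_id": "[1,2-13C]A", "atom_ids": [1, 2]},
]

exp1_labels = sketch_tracer_labels(toy_tracers, "exp1")
exp2_labels = sketch_tracer_labels(toy_tracers, "exp2")
print(exp1_labels)
print(exp2_labels)
```

For `exp1` the sketch reproduces the label `[2-13C]A: A @ 2` shown in the output above.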



Because `.add_to_block()` appends to the code block, it is best practice to include the `INCAScript` instantiation and the `.add_to_block()` calls within the same Jupyter notebook cell. This avoids adding the same code multiple times when a cell is re-run. In the following we add tracers, flux measurements, and MS measurements from the experiment with experiment id exp1.

In [18]:
script = INCAScript()
script.add_to_block('reactions', define_reactions(reactions_data))
script.add_to_block('tracers', define_tracers(tracers_data, 'exp1'))
script.add_to_block('fluxes', define_flux_measurements(flux_data, 'exp1'))
script.add_to_block('ms_fragments', define_ms_data(ms_data, 'exp1'))
script.add_to_block('experiments', define_experiment('exp1', measurement_types=['data_flx', 'data_ms']))

Notice that the addition of data is a two-step procedure. First, the data is added to the script, and second, the data is added to the experiment using the `define_experiment()` function. This function takes the experiment id and a list of the measurement types associated with this experiment.
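
The reason for creating the `INCAScript` in the same cell as the `.add_to_block()` calls can be illustrated with a toy example. The sketch below uses a plain dictionary as a hypothetical stand-in for the script object and a function `run_cell` that mimics executing a notebook cell which only appends; neither is part of the INCAWrapper.

```python
def run_cell(blocks):
    """Mimic a notebook cell that appends code to the 'reactions' block."""
    blocks["reactions"] = blocks.get("reactions", "") + "r = [...];\n"
    return blocks


# Case 1: the container was created in an earlier cell and the
# appending cell is executed twice.
shared = {}
run_cell(shared)
run_cell(shared)  # re-running the cell appends the same code again
duplicated = shared["reactions"].count("r = [...];")

# Case 2: the container is created in the same cell, so every
# execution starts from an empty script.
fresh = run_cell({})
single = fresh["reactions"].count("r = [...];")

print(duplicated, single)
```

In the first case the reaction definition ends up in the block twice, while in the second case it is present exactly once.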

We can view the script that we have built by simply printing the script object.

In [None]:
print(script)

clear functions

% REACTION BLOCK
% Create reactions
r = [...
reaction('A (abc) -> B (abc)', ['id'], ['R1']),...
reaction('B (abc) <-> D (abc)', ['id'], ['R2']),...
reaction('B (abc) -> C (bc) + E (a)', ['id'], ['R3']),...
reaction('B (abc) + C (de) -> D (bcd) + E (a) + E (e)', ['id'], ['R4']),...
reaction('D (abc) -> F (abc)', ['id'], ['R5']),...
];

% TRACERS BLOCK
% define tracers used in exp1
t_exp1 = tracer({...
'[1-13C]A: A @ 1',...
});
t_exp1.frac = [1.0 ];
t_exp1.atoms.it(:,1) = [0,1];


% FLUXES BLOCK

% define flux measurements for experiment exp1
f_exp1 = [...
data('R1', 'val', 10.0, 'std', 1e-05),...
];


% MS_FRAGMENTS BLOCK

% define mass spectrometry measurements for experiment exp1
ms_exp1 = [...
msdata('F1: F @ 1 2 3', 'more', 'C3H5O1'),...
];

% define mass spectrometry measurements for experiment exp1
ms_exp1{'F1'}.idvs = idv([[0.01;0.8;0.1;0.0009]], 'id', {'exp1_F1_0_0_1'}, 'std', [[0.0003;0.003;0.0008;0.001]], 'time', 0.0)


% EXPERIMENTAL_DATA BLOCK
e_exp1 = exper

We see that the last three blocks (model, options and runner) have not been populated yet. To populate the model block we need to define which experiments should be included in the model. In the options block, we can change the options/settings that influence how INCA is run; for example, we can increase the number of restarts during the flux estimation procedure and turn off the natural abundance correction. Finally, the runner defines which algorithms INCA should run on the model. In this example we will run the estimation and the simulation algorithms.

In [None]:
# same as previous code
script = INCAScript()
script.add_to_block('reactions', define_reactions(reactions_data))
script.add_to_block('tracers', define_tracers(tracers_data, 'exp1'))
script.add_to_block('fluxes', define_flux_measurements(flux_data, 'exp1'))
script.add_to_block('ms_fragments', define_ms_data(ms_data, 'exp1'))
script.add_to_block('experiments', define_experiment('exp1', ['data_flx', 'data_ms']))

# new code to define model, options and runner
script.add_to_block('model', define_model(['exp1']))
script.add_to_block('options', define_options(fit_starts=20, sim_na=False))
script.add_to_block('runner', define_runner("/path/to/output/file.mat", run_estimate=True, run_simulation=True))

In [None]:
print(script)

clear functions

% REACTION BLOCK
% Create reactions
r = [...
reaction('A (abc) -> B (abc)', ['id'], ['R1']),...
reaction('B (abc) <-> D (abc)', ['id'], ['R2']),...
reaction('B (abc) -> C (bc) + E (a)', ['id'], ['R3']),...
reaction('B (abc) + C (de) -> D (bcd) + E (a) + E (e)', ['id'], ['R4']),...
reaction('D (abc) -> F (abc)', ['id'], ['R5']),...
];

% TRACERS BLOCK
% define tracers used in exp1
t_exp1 = tracer({...
'[1-13C]A: A @ 1',...
});
t_exp1.frac = [1.0 ];
t_exp1.atoms.it(:,1) = [0,1];


% FLUXES BLOCK

% define flux measurements for experiment exp1
f_exp1 = [...
data('R1', 'val', 10.0, 'std', 1e-05),...
];


% MS_FRAGMENTS BLOCK

% define mass spectrometry measurements for experiment exp1
ms_exp1 = [...
msdata('F1: F @ 1 2 3', 'more', 'C3H5O1'),...
];

% define mass spectrometry measurements for experiment exp1
ms_exp1{'F1'}.idvs = idv([[0.01;0.8;0.1;0.0009]], 'id', {'exp1_F1_0_0_1'}, 'std', [[0.0003;0.003;0.0008;0.001]], 'time', 0.0)


% EXPERIMENTAL_DATA BLOCK
e_exp1 = exper

In [33]:
# Replace the placeholder path with the actual path. This is only done to keep my local path out of the docs;
# it is not necessary in an actual workflow.
output_file = (data_folder / 'simple_model_inca_from_low_level_api.mat').resolve()
script.blocks['runner'] = script.blocks['runner'].replace("'/path/to/output/file.mat'", f"'{output_file}'")

Now we are ready to run the flux estimation algorithm in INCA.

In [None]:
import dotenv
inca_directory = pathlib.Path(dotenv.get_key(dotenv.find_dotenv(), "INCA_base_directory")) # Simply replace this with the path to your INCA installation
run_inca(script, INCA_base_directory=inca_directory)

INCA script saved to /var/folders/z6/mxpxh4k56tv0h0ff41vmx7gdwtlpvp/T/tmpvymjjsts/inca_script.m.
Starting MATLAB engine...
 
ms_exp1 = 1x1 msdata object
 
fields: atoms  id  [idvs]  more  on  state  
 
F1
 
 
m = 1x1 model object
 
fields: [expts]  [mets]  notes  [options]  [rates]  [states]  
 
	5 reactions (6 fluxes)                                  
	6 states (3 balanced, 1 source, 2 sink and 0 unbalanced)
	6 metabolites                                           
	1 experiments                                           
 

                                         Directional 
 Iteration      Residual     Step-size    derivative        Lambda
     0       9.99461e+11
     1       9.97934e+11      0.000764     -9.99e+11       1.60177
     2       1.42897e+06             1     -3.06e+06       1.60177
     3           80454.8             1     -1.92e+05      0.533923
     4            8606.5             1     -4.85e+03      0.177974
     5           8153.72             1         -3.28  

INCA has now performed flux estimation on our model and saved the results to a .mat file, which was specified in the `INCAScript` (here simple_model_inca_from_low_level_api.mat). This file can be opened using the `INCAResults` workflow described in a later notebook or directly in the INCA GUI using "Open fluxmap".

### Note about opening models in INCA GUI
There are multiple ways to open models in the INCA GUI: "Open model" and "Open fluxmap". Both methods open the .mat file that is generated by `run_inca()`. "Open model" will open the model with any associated experiments and data, but it will not include the results of flux estimation. To get the results of the estimate, continuation or Monte Carlo algorithms, use "Open fluxmap" instead. This will load both the model and the results of the algorithms that were applied. Note that "Open fluxmap" in the INCA GUI fails if there is no simulation in the .mat file. Thus, to be able to open the results in the INCA GUI, it is required to set `run_simulation=True` in `define_runner()`.

## References
[1] M. R. Antoniewicz, J. K. Kelleher, and G. Stephanopoulos, “Determination of confidence intervals of metabolic fluxes estimated from stable isotope measurements,” Metabolic Engineering, vol. 8, no. 4, pp. 324–337, Jul. 2006, doi: 10.1016/j.ymben.2006.01.004.

[2] M. R. Antoniewicz, J. K. Kelleher, and G. Stephanopoulos, “Elementary metabolite units (EMU): A novel framework for modeling isotopic distributions,” Metabolic Engineering, vol. 9, no. 1, pp. 68–86, Jan. 2007, doi: 10.1016/j.ymben.2006.09.001.
