# MFA using INCA in MATLAB - now all in Python!

This example notebook uses functions that write a MATLAB script for running a metabolic flux analysis (MFA) with INCA, a MATLAB-based software package. For convenience, MATLAB is run from within the notebook through the MATLAB engine for Python. The functions are defined in the "[INCA_script_generator.py](INCA_script_generator.py)" file. See the instructions below on how to use this notebook.

# MATLAB

Get a free academic licence and install MATLAB from https://www.mathworks.com. Then install the MATLAB engine API for Python by following [this guide](https://www.mathworks.com/help/matlab/matlab_external/install-the-matlab-engine-for-python.html). In short, go to your MATLAB root folder (locate your installation and open that folder), change to "/extern/engines/python", and run "python setup.py install" from the command line.
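Before going further, it can help to confirm that the engine API is visible to the Python environment running this notebook. The following is a minimal sketch; the helper name `matlab_engine_available` is made up for illustration and is not part of this repository or of the MATLAB engine API.

```python
import importlib.util


def matlab_engine_available():
    """Return True if the `matlab.engine` module can be located.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.util.find_spec("matlab.engine") is not None
    except ModuleNotFoundError:
        # Raised when the parent package `matlab` is not installed at all.
        return False


print("MATLAB engine importable:", matlab_engine_available())
```

If this prints `False`, the engine was probably installed into a different Python environment than the one the notebook kernel uses.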

# INCA

"INCA (Isotopomer Network Compartmental Analysis) is a MATLAB-based software package for isotopomer network modeling and metabolic flux analysis." You can read more about it in [Young, 2014](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3998137/pdf/btu015.pdf).

You have to get a free academic licence for INCA from [the Vanderbilt University website](http://mfa.vueinnovations.com/licensing) (the second option is the relevant one) and install it. Note the path to the base directory of your INCA installation; you will need it later.

#### Import customized functions for INCA utilization

In [66]:
from AutoFlow_OmicsDataHandling.INCA_script_generator import *
import pandas as pd
import numpy as np
import time
import ast
import matlab.engine

ModuleNotFoundError: No module named 'AutoFlow_OmicsDataHandling'
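The `ModuleNotFoundError` above suggests that the `AutoFlow_OmicsDataHandling` package is not installed in the active environment. One possible workaround, sketched below, is to put the folder that contains the package on `sys.path` before importing. This assumes the notebook is run from somewhere inside the repository and that the package folder sits in one of the parent directories; the helper `add_repo_root_to_path` is hypothetical and not part of the package.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(start, package_name):
    """Walk upwards from `start` until a folder containing `package_name`
    is found, add that folder to sys.path, and return it.

    Returns None if no such folder exists. Hypothetical helper.
    """
    start = Path(start)
    for candidate in [start, *start.parents]:
        if (candidate / package_name).is_dir():
            if str(candidate) not in sys.path:
                sys.path.insert(0, str(candidate))
            return candidate
    return None


root = add_repo_root_to_path(Path.cwd(), "AutoFlow_OmicsDataHandling")
print("Package folder found in:", root)
```

If the helper reports `None`, the package is not reachable from the current working directory and installing it into the environment is likely the cleaner fix.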

#### Import the data

In [59]:
%pwd

'/Users/matmat/Documents/GitHub/AutoFlow-OmicsDataHandling/examples'

In [60]:
%cd ../../../examples/

[Errno 2] No such file or directory: '../../../examples/'
/Users/matmat/Documents/GitHub/AutoFlow-OmicsDataHandling/examples


In [61]:
# measured fragments/MS data, tracers and measured fluxes should be limited to one experiment

atomMappingReactions_data_I = pd.read_csv('data/MFA_modelInputsData/data_stage02_isotopomer_atomMappingReactions2.csv')
modelReaction_data_I = pd.read_csv('data/MFA_modelInputsData/data_stage02_isotopomer_modelReactions.csv')
atomMappingMetabolite_data_I = pd.read_csv('data/MFA_modelInputsData/data_stage02_isotopomer_atomMappingMetabolites.csv')
measuredFluxes_data_I = pd.read_csv('data/MFA_modelInputsData/data_stage02_isotopomer_measuredFluxes.csv')
experimentalMS_data_I = pd.read_csv('data/MFA_modelInputsData/data-1604345289079.csv')
tracer_I = pd.read_csv('data/MFA_modelInputsData/data_stage02_isotopomer_tracers.csv')

#### Exclude data for irrelevant experiments and models

In [62]:
# The files need to be limited by model id and mapping id, I picked "ecoli_RL2013_02" here
atomMappingReactions_data_I = limit_to_one_model(atomMappingReactions_data_I, 'mapping_id', 'ecoli_RL2013_02')
modelReaction_data_I = limit_to_one_model(modelReaction_data_I, 'model_id', 'ecoli_RL2013_02')
atomMappingMetabolite_data_I = limit_to_one_model(atomMappingMetabolite_data_I, 'mapping_id', 'ecoli_RL2013_02')
measuredFluxes_data_I = limit_to_one_model(measuredFluxes_data_I, 'model_id', 'ecoli_RL2013_02')

# Limiting fluxes, fragments and tracers to one experiment
measuredFluxes_data_I = limit_to_one_experiment(measuredFluxes_data_I, 'experiment_id', 'WTEColi_113C80_U13C20_01')
experimentalMS_data_I = limit_to_one_experiment(experimentalMS_data_I, 'experiment_id', 'WTEColi_113C80_U13C20_01')
tracer_I = limit_to_one_experiment(tracer_I, 'experiment_id', 'WTEColi_113C80_U13C20_01')
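The limiting functions above are called with a table, a column name and a value, so they presumably keep only the rows where that column matches the value. The sketch below illustrates that idea on a toy table. It is an assumption about the behaviour, not the actual implementation in `INCA_script_generator.py`; the function name `filter_rows_by_value` and the toy column `rxn_id` are made up for illustration.

```python
import pandas as pd


def filter_rows_by_value(df, column, value):
    """Hypothetical stand-in: keep only rows where `column` equals `value`."""
    return df[df[column] == value].reset_index(drop=True)


# Toy table with two models mixed together
toy = pd.DataFrame({
    "model_id": ["ecoli_RL2013_02", "other_model", "ecoli_RL2013_02"],
    "rxn_id": ["PGI", "PGI", "PFK"],
})

filtered = filter_rows_by_value(toy, "model_id", "ecoli_RL2013_02")
print(filtered)
```

Checking `len(filtered)` after each limiting step is a quick way to catch a misspelled model or experiment id, which would otherwise leave an empty table.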

In [63]:
biomass_function = "0.176*phe_DASH_L_c + 0.443*mlthf_c + 0.34*oaa_c + 0.326*lys_DASH_L_c + 33.247*atp_c + 0.205*ser_DASH_L_c + 0.129*g3p_c + 0.131*tyr_DASH_L_c + 0.051*pep_c + 0.146*met_DASH_L_c + 0.205*g6p_c + 0.087*akg_c + 0.25*glu_DASH_L_c + 0.25*gln_DASH_L_c + 0.754*r5p_c + 0.071*f6p_c + 0.083*pyr_c + 0.582*gly_c + 0.241*thr_DASH_L_c + 0.229*asp_DASH_L_c + 5.363*nadph_c + 0.087*cys_DASH_L_c + 0.619*3pg_c + 0.402*val_DASH_L_c + 0.488*ala_DASH_L_c + 0.276*ile_DASH_L_c + 0.229*asn_DASH_L_c + 0.09*his_DASH_L_c + 0.428*leu_DASH_L_c + 2.51*accoa_c + 0.281*arg_DASH_L_c + 0.21*pro_DASH_L_c + 0.054*trp_DASH_L_c -> 1.455*nadh_c + 39.68*Biomass_c"

#### Generate the MATLAB script and save it in your working directory. The last argument in the script_generator function will name your future .mat file

In [64]:
script = script_generator(modelReaction_data_I, atomMappingReactions_data_I, biomass_function,
                          atomMappingMetabolite_data_I,
                          measuredFluxes_data_I, experimentalMS_data_I, tracer_I)
save_INCA_script(script, "testscript")
runner = runner_script_generator('TestFile', 10)
print(runner)
save_runner_script(runner=runner, scriptname="testscript")

TypeError: script_generator() takes 6 positional arguments but 7 were given
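The `TypeError` above indicates that the installed version of `script_generator` accepts one argument fewer than the call passes. A quick way to see which parameters a function expects is `inspect.signature` from the standard library. The sketch below uses a stand-in function, `example_generator`, with invented parameter names, since the real signature is not shown in this notebook.

```python
import inspect


def describe_parameters(func):
    """Return the parameter names of `func` in declaration order."""
    return list(inspect.signature(func).parameters)


# Stand-in with six parameters; the names are illustrative only.
def example_generator(reactions, atom_mappings, biomass, metabolites, fluxes, ms_data):
    return "script"


print(describe_parameters(example_generator))
```

Calling `describe_parameters(script_generator)` in the notebook would list the arguments the installed version actually takes, which can then be matched against the seven passed in the cell above.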

#### Provide the path to your INCA installation, your working directory and the name of the previously generated MATLAB script

In [19]:
INCA_base_directory = ""  # ADD YOUR BASE DIRECTORY HERE, e.g. "/Users/Username/Documents/INCAv1.9"
script_folder = %pwd
matlab_script = "testscript"
runner_script = matlab_script + "_runner"

#### INCA will be started and your script run in MATLAB. This will produce the .mat file specified above

In [21]:
run_INCA_in_MATLAB(INCA_base_directory, script_folder, matlab_script, runner_script)

--- 165.43198895454407 seconds -
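For readers curious about what running a script through the MATLAB engine can look like, here is a rough sketch. It is not the implementation of `run_INCA_in_MATLAB`; the function name `run_script_in_matlab` is hypothetical, and it only assumes that INCA is made available by adding its base directory to the MATLAB path. The folder checks run before MATLAB is started, so a wrong path fails quickly.

```python
import time
from pathlib import Path


def run_script_in_matlab(inca_dir, script_dir, script_name):
    """Hypothetical sketch: start MATLAB, add INCA to the path, run a script.

    Returns the elapsed time in seconds.
    """
    # Fail early if a folder is missing, before starting MATLAB.
    for folder in (inca_dir, script_dir):
        if not Path(folder).is_dir():
            raise FileNotFoundError(f"Folder not found: {folder}")

    # Imported lazily so the checks above also work without MATLAB installed.
    import matlab.engine

    start = time.time()
    eng = matlab.engine.start_matlab()
    try:
        # Add the INCA base directory and its subfolders to the MATLAB path.
        eng.addpath(eng.genpath(str(inca_dir)), nargout=0)
        # Switch to the folder holding the generated script and run it.
        eng.cd(str(script_dir), nargout=0)
        eng.run(script_name, nargout=0)
    finally:
        eng.quit()
    return time.time() - start
```

The `try`/`finally` block makes sure the MATLAB session is closed even if the script raises an error partway through.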
