# Tutorial 1 – Import a GEM, Set Parameters and Run FBA

This short introduction shows how to load a genome-scale metabolic model (GEM), set reaction constraints and an objective function, and run an optimization with flux balance analysis (FBA). The resulting fluxes are then visualized and exported to a PDF file.
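
For orientation, FBA is commonly written as the linear program below. The symbols are introduced here only for illustration and are not used elsewhere in this tutorial: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the objective weights, and $lb$, $ub$ the lower and upper flux bounds.

$$
\max_{v}\; c^{\top} v
\quad \text{subject to} \quad
S\,v = 0, \qquad lb \le v \le ub
$$

Read this way, setting reaction constraints means changing entries of $lb$ and $ub$, and setting the objective function means choosing which entries of $c$ are non-zero.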

A GEM for the filamentous fungus *Penicillium chrysogenum* is used in this tutorial. The model is provided as a Microsoft Excel file, `iAL1006 v1.00.xlsx`, and as an SBML file, `iAL1006 v1.00.xml`.
Open the `tutorial1.m` file in MATLAB to begin this exercise. To run a section of code in MATLAB, highlight it, right-click it and choose "Evaluate Selection".

NOTE: you must be able to import GEMs in Excel format with the RAVEN function importExcelModel. Although this functionality is not needed for this exercise, users who cannot import Excel models will not be able to complete Tutorials 2-4, which work with GEMs in RAVEN-compatible Excel format.


In [40]:
setRavenSolver('gurobi')


- [importExcelModel](https://sysbiochalmers.github.io/RAVEN/doc/io/importExcelModel.html)

```MATLAB
function model=importExcelModel(fileName,removeExcMets,printWarnings,ignoreErrors)
```
Imports a constraint-based model from an Excel file

- [importModel](https://sysbiochalmers.github.io/RAVEN/doc/io/importModel.html)

```MATLAB
function model=importModel(fileName,removeExcMets,isSBML2COBRA,supressWarnings)
```
Imports a constraint-based model from an SBML file

---

> removeExcMets: true if exchange metabolites should be removed. This is needed to be able to run simulations, but it could also be done using simplifyModel at a later stage (opt, default true)

The tutorial file explains:<br>
*"The "false" flag imports a model with exchange reactions in their "closed" form. This makes the model unsuited for modelling, but it is useful for some quality control steps."*
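
Why the "closed" form cannot be used for simulation can be seen on a toy network. The sketch below is not taken from the tutorial or from RAVEN: the three-reaction chain, the names `S_closed` and `S_open`, and the row ordering are all invented for illustration, and only base MATLAB functions are used.

```MATLAB
% Hypothetical toy network (not part of iAL1006):
%   R1: A[b] -> A[c]   (uptake)
%   R2: A[c] -> B[c]   (conversion)
%   R3: B[c] -> B[b]   (excretion)
% Rows: A[b], A[c], B[c], B[b]; columns: R1, R2, R3.
S_closed = [-1  0  0;
             1 -1  0;
             0  1 -1;
             0  0  1];

% With the boundary metabolites mass-balanced, S*v = 0 only has the
% trivial solution v = 0, so no reaction can carry flux.
n_closed = size(null(S_closed), 2);

% Dropping the boundary rows (the effect described for removeExcMets = true)
% leaves one steady-state mode in which all three reactions carry equal flux.
S_open = S_closed(2:3, :);
n_open = size(null(S_open), 2);

fprintf('Steady-state modes: closed = %d, open = %d\n', n_closed, n_open);
% prints: Steady-state modes: closed = 0, open = 1
```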

In [41]:
% How to load a model
model=importModel('tutorial_data/iAL1006 v1.00.xml',false)

The model contains 47 errors.

Error encountered during read.


	LPE



In [42]:
%The following function prints some properties of the model. The two "true"
%flags say that it should also list potential problems such as dead-end
%reactions or unconnected metabolites.

- [printModelStats](https://sysbiochalmers.github.io/RAVEN/doc/core/printModelStats.html)

```MATLAB
function printModelStats(model, printModelIssues, printDetails)
```

Prints some statistics about a model to the screen

In [43]:
% Useful for model validation. Notice the two "true" flags.
printModelStats(model,true,true);

Network statistics for iAL1006: Penicillium chrysogenum genome-scale model
EC-numbers			626
Genes*				1006
	Peroxisome	38
	Mitochondria	219
	Cytosol	802
	Extracellular	95
	Boundary	0

Reactions*			1632
	Peroxisome	137
	Mitochondria	324
	Cytosol	1144
	Extracellular	336
	Boundary	161
Unique reactions**	1449

Metabolites			1395
	Peroxisome	105
	Mitochondria	242
	Cytosol	728
	Extracellular	160
	Boundary	160
Unique metabolites	849

* Genes and reactions are counted for each compartment if any of the corresponding metabolites are in that compartment. The sum may therefore not add up to the total number.
** Unique reactions are defined as being biochemically unique (no compartmentalization)

Short model quality summary for iAL1006: Penicillium chrysogenum genome-scale model
Dead-end reactions	66
	r0035
	r0036
	r0040
	r0061
	r0071
	r0108
	r0161
	r0162
	r0169
	r0170
	r0174
	r0213
	r0224
	r0235
	r0238
	r0240
	r0356
	r0405
	r0407
	r0424
	r0427
	r0429
	r0447
	r0454
	r0514
	r0515
	r0516
	r0533
	r05

From the tutorial:

```MATLAB
%Most modelling approaches using GEMs are based on the mass balancing
%around the internal metabolites in the system. However, in order for the
%system to uptake or excrete metabolites, some metabolites have been
%defined as "unconstrained". In order to simulate something, those
%metabolites have to be removed from the model. The function simplifyModel
%is a general-purpose function for making models smaller. This includes the
%options such as grouping linear reactions and deleting reactions which
%cannot carry flux. Here it is chosen to delete the exchange metabolites,
%all reactions that are constrained to zero (mainly uptake of non-standard
%carbon sources), and all reactions that cannot carry flux (mainly
%reactions that were dependent on any of those non-standard carbon
%sources).
```

- [simplifyModel](https://sysbiochalmers.github.io/RAVEN/doc/core/simplifyModel.html)

```MATLAB
function [reducedModel, deletedReactions, deletedMetabolites]=simplifyModel(model,deleteUnconstrained, ...
    deleteDuplicates, deleteZeroInterval, deleteInaccessible, deleteMinMax, groupLinear, constrainReversible, reservedRxns, suppressWarnings)
```

Simplifies a model by deleting reactions/metabolites

In [44]:
model=simplifyModel(model,true,false,true,true);
model

- [setParam](https://sysbiochalmers.github.io/RAVEN/doc/core/setParam.html)


```MATLAB
function model=setParam(model, paramType, rxnList, params, var)
```

Sets parameters for reactions

In [45]:
%%%% validating the model using the theoretical yield of carbon dioxide from glucose. %%%%
% First, set the uptake of carbon to glucose
model=setParam(model,'ub',{'glcIN' 'etohIN'},[1 0]); % remember, the units are mmol/gDW/h
% Second, set the objective for the simulation to maximize CO2 production
model=setParam(model,'obj',{'co2OUT'},1);

- [solveLP](https://sysbiochalmers.github.io/RAVEN/doc/solver/solveLP.html)

```MATLAB
function [solution, hsSolOut]=solveLP(model,minFlux,params,hsSol)
```

Solves a linear programming problem

In [46]:
% Third, find the maximum under the constraints that were set
sol=solveLP(model);
disp(sol);

         x: [1305x1 double]
         f: -6.0000
      stat: 1
       msg: 'Optimal solution found'
    sPrice: [1037x1 double]
     rCost: [1305x1 double]



- [printFluxes](https://sysbiochalmers.github.io/RAVEN/doc/core/printFluxes.html)

```MATLAB
function printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile,outputString,metaboliteList)
```

Prints reactions and fluxes to the screen or to a file.

> onlyExchange: only print exchange fluxes (opt, default true)

In [47]:
% Print the fluxes to the console. The true flag restricts the output to exchange fluxes
printFluxes(model, sol.x, true, 10^-7);

EXCHANGE FLUXES:
co2OUT	(production of CO2):	6
h2oOUT	(production of H2O):	6
glcIN	(uptake of alpha-D-glucose):	1
o2IN	(uptake of O2):	6
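
These exchange fluxes match the complete oxidation of glucose, C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O, which is the theoretical yield this validation step is looking for. A quick element balance in base MATLAB confirms it. The matrix `E` and the sign convention (uptake negative, production positive) are choices made for this sketch, not RAVEN conventions.

```MATLAB
% Element composition; columns: glucose (C6H12O6), O2, CO2, H2O.
E = [ 6  0  1  0;    % C
     12  0  0  2;    % H
      6  2  2  1];   % O

% Exchange fluxes printed above (mmol/gDW/h), with uptake negative and
% production positive (a convention chosen for this sketch).
v = [-1; -6; 6; 6];  % glcIN, o2IN, co2OUT, h2oOUT

% Net amount of C, H and O leaving the system; all zeros means that
% the exchange fluxes are elementally balanced.
imbalance = E * v;
disp(imbalance');
```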


In [48]:
% Exploring other fluxes. Notice that some are "infinite" (1000)
printFluxes(model, sol.x, false, 10^-7);

FLUXES:
r0001	(spontaneous conversion):	333.3333
r0002	(spontaneous conversion):	-333.3333
r0004	(ATP:alpha-D-glucose 6-phosphotransferase):	1
r0006	(alpha-D-glucose 6-phosphate ketol-isomerase):	-999
r0007	(beta-D-glucose 6-phosphate ketol-isomerase):	-1000
r0008	(alpha-D-glucose 6-phosphate ketol-isomerase):	1000
r0009	(phosphofructokinase):	1
r0010	(fructose-bisphosphate aldolase):	1
r0011	(D-glyceraldehyde-3-phosphate aldose-ketose-isomerase):	-1
r0012	(glyceraldehyde-3-phosphate dehydrogenase):	2
r0013	(phosphoglycerate kinase):	2
r0014	(phosphoglycerate mutase):	2
r0015	(2-phospho-D-glycerate hydro-lyase (phosphoenolpyruvate-forming)):	2
r0016	(pyruvate kinase):	2
r0019	(ethanol:NADP+ oxidoreductase):	8
r0020	(ethanol:NAD+ oxidoreductase):	-1000
r0021	(ethanol:NAD+ oxidoreductase):	992
r0022	(pyruvate:[dihydrolipoyllysine-residue acetyltransferase]-lipoyllysine 2-oxidoreductase (decarboxylating, acceptor-acetylating)):	2
r0023	(acetyl-CoA:enzyme N6-(dihydrolipoyl)lysine S-acetylt

"Infinite" (±1000) fluxes suggest non-physiological behaviour. The tutorial states:

```MATLAB
%The results show many reactions that have -1000 or 1000 flux. This is
%because there are loops in the solution. In order to clean up the solution
%one can minimize the sum of all the fluxes. This is done by setting the
%third flag to solveLP to true (take a look at solveLP, there are other
%options as well).
```
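
The effect of such loops can be reproduced on a toy network. This sketch is illustrative only: the network, `S`, `v_min` and `v_loop` are invented here, and the two flux vectors are written down by hand rather than computed with `solveLP`.

```MATLAB
% Hypothetical network with a futile cycle between R2 and R3:
%   R1: -> A     R2: A -> B     R3: B -> A     R4: B ->
% Rows: A, B; columns: R1..R4.
S = [1 -1  1  0;
     0  1 -1 -1];

v_min  = [1;    1;   0; 1];   % no flux through the cycle
v_loop = [1; 1000; 999; 1];   % same net conversion, cycle pushed to the bound

% Both vectors are steady states with the same output flux through R4 ...
assert(all(S * v_min == 0) && all(S * v_loop == 0));
assert(v_min(4) == v_loop(4));

% ... but they differ greatly in the sum of absolute fluxes, which is
% what minimizing the total flux penalizes.
fprintf('sum(abs(v)): minimal = %d, with loop = %d\n', ...
    sum(abs(v_min)), sum(abs(v_loop)));
% prints: sum(abs(v)): minimal = 3, with loop = 2001
```

When only the R4 flux is in the objective, an LP solver has no reason to prefer `v_min` over `v_loop`, which is consistent with the ±1000 values in the output above.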

In [49]:
% minFlux = 1 (the second argument of solveLP) runs a second optimization that removes loops
% from the flux distribution by minimizing the sum of abs(fluxes).
sol=solveLP(model,1);
printFluxes(model, sol.x, false, 10^-7);

FLUXES:
r0003	(D-glucose 1-epimerase):	1
r0006	(alpha-D-glucose 6-phosphate ketol-isomerase):	-2.5446
r0011	(D-glyceraldehyde-3-phosphate aldose-ketose-isomerase):	0.37625
r0012	(glyceraldehyde-3-phosphate dehydrogenase):	0.41584
r0013	(phosphoglycerate kinase):	0.41584
r0014	(phosphoglycerate mutase):	0.41584
r0015	(2-phospho-D-glycerate hydro-lyase (phosphoenolpyruvate-forming)):	0.41584
r0022	(pyruvate:[dihydrolipoyllysine-residue acetyltransferase]-lipoyllysine 2-oxidoreductase (decarboxylating, acceptor-acetylating)):	0.20792
r0023	(acetyl-CoA:enzyme N6-(dihydrolipoyl)lysine S-acetyltransferase):	0.20792
r0024	(dihydrolipoamide:NAD+ oxidoreductase):	0.83167
r0027	(glucose-6-phosphate 1-dehydrogenase):	2.5446
r0028	(6-phospho-D-glucono-1,5-lactone lactonohydrolase):	2.5446
r0029	(6-phospho-D-gluconate:NADP+ 2-oxidoreductase (decarboxylating)):	3.5446
r0030	(D-ribulose-5-phosphate 3-epimerase):	2.1683
r0031	(D-ribose-5-phosphate aldose-ketose-isomerase):	1.3762
r0032	(sedoheptulose-

In [50]:
% Let's repeat, but change the objective function to biomass production
% Notice the change in metabolic requirements (exchange IN reactions)
model=setParam(model,'obj',{'bmOUT'},1);
sol=solveLP(model,1);
printFluxes(model, sol.x, true, 10^-7);

EXCHANGE FLUXES:
c4odOUT	(production of 2-oxy-but-3-enoate):	8.4803e-06
bmOUT	(production of biomass):	0.084803
co2OUT	(production of CO2):	3.061
h2oOUT	(production of H2O):	5.586
glcIN	(uptake of alpha-D-glucose):	1
piIN	(uptake of phosphate):	0.027889
nh3IN	(uptake of NH3):	0.59384
o2IN	(uptake of O2):	2.937
slfIN	(uptake of sulfate):	0.022888
thmIN	(uptake of thiamin):	8.4803e-06
pimIN	(uptake of pimelate):	8.4803e-06


In [51]:
% To compare carbon fluxes when glucose is replaced by ethanol, remember that
% ethanol has 2 carbons and glucose has 6, so 3 mmol of ethanol supply as much carbon as 1 mmol of glucose
modelETH=setParam(model,'eq',{'glcIN' 'etohIN'},[0 3]);
solETH=solveLP(modelETH,1);
printFluxes(modelETH, solETH.x, true, 10^-7);

EXCHANGE FLUXES:
c4odOUT	(production of 2-oxy-but-3-enoate):	1.0816e-05
bmOUT	(production of biomass):	0.10816
co2OUT	(production of CO2):	2.2514
h2oOUT	(production of H2O):	8.4719
piIN	(uptake of phosphate):	0.035572
nh3IN	(uptake of NH3):	0.75743
o2IN	(uptake of O2):	5.0932
slfIN	(uptake of sulfate):	0.029193
thmIN	(uptake of thiamin):	1.0816e-05
pimIN	(uptake of pimelate):	1.0816e-05
etohIN	(uptake of ethanol):	3
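
With equal carbon supply, the two simulations can be compared directly. The snippet below only re-uses numbers printed in the two exchange-flux listings above; the variable names are made up for this sketch.

```MATLAB
% Carbon supplied in each condition (C-mmol/gDW/h).
cGlc  = 1 * 6;   % 1 mmol glucose x 6 carbons
cEtoh = 3 * 2;   % 3 mmol ethanol x 2 carbons
assert(cGlc == cEtoh);

% Fluxes copied from the printed exchange fluxes above.
bmGlc  = 0.084803;   co2Glc  = 3.061;
bmEtoh = 0.10816;    co2Etoh = 2.2514;

% Biomass flux ratio and fraction of supplied carbon released as CO2.
bmRatio     = bmEtoh / bmGlc;
co2FracGlc  = co2Glc  / cGlc;
co2FracEtoh = co2Etoh / cEtoh;

fprintf('Biomass ratio (ethanol/glucose): %.2f\n', bmRatio);
fprintf('Carbon released as CO2: glucose %.2f, ethanol %.2f\n', ...
    co2FracGlc, co2FracEtoh);
```

On these numbers the predicted biomass flux is higher on ethanol than on glucose for the same carbon input, while a smaller share of the carbon leaves as CO2.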


To make the comparison between the two conditions clearer:

- [followChanged](https://sysbiochalmers.github.io/RAVEN/doc/core/followChanged.html)

```MATLAB
function followChanged(model,fluxesA,fluxesB, cutOffChange, cutOffFlux, cutOffDiff, metaboliteList)
```

Prints fluxes and reactions for each of the reactions that result in different fluxes compared to the reference case (fluxesB).


In [52]:
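% fluxesA: flux vector for the test case. fluxesB: flux vector for the reference case.
% Here the glucose solution (sol.x) is the test case and the ethanol solution (solETH.x) is the reference.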
followChanged(modelETH,sol.x,solETH.x, 50, 0.5, 0.5);

These reactions have flux values that differ by more than 50 percent, absolute values above 0.5, and a total difference above 0.5 (64 reactions)

r0004: alpha-D-glucose[c] + ATP[c] => ADP[c] + alpha-D-glucose 6-phosphate[c]
	ATP:alpha-D-glucose 6-phosphotransferase
	Flux: 1 Reference flux: 0 Difference: 1

r0006: alpha-D-glucose 6-phosphate[c] <=> beta-D-fructofuranose 6-phosphate[c]
	alpha-D-glucose 6-phosphate ketol-isomerase
	Flux: 0.91584 Reference flux: -0.10723 Difference: 1.0231

r0009: ATP[c] + beta-D-fructofuranose 6-phosphate[c] => ADP[c] + beta-D-fructofuranose 1,6-bisphosphate[c]
	phosphofructokinase
	Flux: 0.87408 Reference flux: 0 Difference: 0.87408

r0010: beta-D-fructofuranose 1,6-bisphosphate[c] <=> D-glyceraldehyde 3-phosphate[c] + glycerone phosphate[c]
	fructose-bisphosphate aldolase
	Flux: 0.87408 Reference flux: -0.16057 Difference: 1.0347

r0011: D-glyceraldehyde 3-phosphate[c] <=> glycerone phosphate[c]
	D-glyceraldehyde-3-phosphate aldose-ketose-isomerase
	Flu

In [53]:
% You can filter using a metabolite list:
followChanged(modelETH,sol.x,solETH.x, 30, 0.4, 0.4,{'ATP'});

These reactions have flux values that differ by more than 30 percent, absolute values above 0.4, and a total difference above 0.4 (10 reactions)

Only prints reactions involving one or more of the following metabolites:
ATP 

r0004: alpha-D-glucose[c] + ATP[c] => ADP[c] + alpha-D-glucose 6-phosphate[c]
	ATP:alpha-D-glucose 6-phosphotransferase
	Flux: 1 Reference flux: 0 Difference: 1

r0009: ATP[c] + beta-D-fructofuranose 6-phosphate[c] => ADP[c] + beta-D-fructofuranose 1,6-bisphosphate[c]
	phosphofructokinase
	Flux: 0.87408 Reference flux: 0 Difference: 0.87408

r0013: 3-phospho-D-glyceroyl phosphate[c] + ADP[c] <=> 3-phospho-D-glycerate[c] + ATP[c]
	phosphoglycerate kinase
	Flux: 1.6507 Reference flux: -0.44547 Difference: 2.0962

r0016: ADP[c] + phosphoenolpyruvate[c] => ATP[c] + pyruvate[c]
	pyruvate kinase
	Flux: 1.567 Reference flux: 0 Difference: 1.567

r0018: ATP[c] + oxaloacetate[c] => ADP[c] + CO2[c] + phosphoenolpyruvate[c]
	phosphoenolpyruvate carboxykinase (ATP)
	Flux: 0 R

To better understand the underlying flux distributions, the fluxes can be visualized on a map:

- [load](https://www.mathworks.com/help/matlab/ref/load.html)<br>

```MATLAB
load(filename,variables)
```
Loads the specified variables from the MAT-file filename.

- [drawMap](https://sysbiochalmers.github.io/RAVEN/doc/plotting/drawMap.html)<br>

```MATLAB
function notMapped=drawMap(title,pathway,modelA,conditionA,conditionB,modelB,filename,cutOff,supressOpen)
```
Imports a previously drawn map of the metabolic network and plots the fluxes on that map. If the pathway contains expression data the log-fold changes are plotted as well.

In [54]:
load('./tutorial_data/pcPathway.mat', "pathway");
drawMap('Glucose vs ethanol',pathway,model,sol.x,solETH.x,modelETH,'./output/tutorial1_GLCvsETH.pdf',10^-5);

File saved as ./output/tutorial1_GLCvsETH.pdf
