# Tutorial 4 – Fix an Erroneous Model

The whole point of GEMs is that they are large. It is by incorporating the entire known metabolism of any given organism that complexity arises. However, this makes it almost certain that all models will contain errors. This is true regardless of whether one builds the model or if one uses a model from someone else. One of the issues is that if one group publishes a model for some specific purpose it is likely to function well in that specific part of metabolism, but it may not function at all for some problems. It is therefore a good idea to perform a round of error checking even if it is a published model one uses.

Model validation is an iterative process because some errors might not have an effect until some other errors have been fixed. It is not uncommon that the model “works” well in the beginning of the reconstruction process because there are errors that let it cheat on things like redox or energy balance. The model then works worse and worse as the errors are dealt with until all or most errors are fixed, after which it will start to work again. As RAVEN developers, we believe that it is much more important to try to make the model do something it should not be allowed to do rather than to test for the stuff it should do.

There is a version of the small yeast model with errors inserted (smallYeastBad.xlsx). The task for this exercise is to find and fix them. Some errors will be obvious (it is rather difficult to introduce errors in such a small model, because there is extraordinarily little redundancy in it), but it is strongly recommended not to fix them until they are “found” during the following steps, because otherwise one might get unpredictable results. Most of the stuff done here can be done with the gapReport function, but it is strongly recommended to do them step by step. 

In [212]:
setRavenSolver('gurobi')
% values lower than tolerance are considered zero
tolerance = 10^-7;

1.	The first thing to check for is that the model cannot make something from nothing, i.e. no metabolites should be produced if one does not give the model access to any carbon sources (this should be done for all elements, but carbon is the most important). A simple way to do this would be to optimize for the sum of all the producing exchange reactions, while keeping the consuming reactions closed. Any solution other than 0 would then be bad. Try that.

In [213]:
% import and basic inspection
model = importExcelModel("./tutorial_data/smallYeastBad.xlsx");
printModelStats(model);

NOTE: DEFAULT LOWER not supplied. Uses -1000
NOTE: DEFAULT UPPER not supplied. Uses 1000

	1S/C6H14O12P2/c7-4-3(1-16-19(10,11)12)18-6(9,5(4)8)2-17-20(13,14)15/h3-5,7-9H,1-2H2,(H2,10,11,12)(H2,13,14,15)/t3-,4-,5+,6-/m1/s1

Network statistics for smallYeastBad: Central carbon metabolism for yeast
Genes*				61
	cytosol	52
	mitochondria	17

Reactions*			54
	cytosol	46
	mitochondria	19
Unique reactions**	54

Metabolites			52
	cytosol	35
	mitochondria	17
Unique metabolites	45

* Genes and reactions are counted for each compartment if any of the corresponding metabolites are in that compartment. The sum may therefore not add up to the total number.
** Unique reactions are defined as being biochemically unique (no compartmentalization)


In [214]:
exchangeIndexes = getIndexes(model, getExchangeRxns(model), 'rxns');
disp(table(...
        model.rxnNames(exchangeIndexes), model.lb(exchangeIndexes), model.ub(exchangeIndexes), ...
        'VariableNames',  {'ExchangeReaction', 'LB', 'UB'}));

         ExchangeReaction         LB     UB 
    __________________________    __    ____

    {'Production of acetate' }    0     1000
    {'Production of biomass' }    0     1000
    {'Production of CO2'     }    0     1000
    {'Production of ethanol' }    0     1000
    {'Production of glycerol'}    0     1000
    {'Uptake of glucose'     }    0     1000
    {'Uptake of O2'          }    0     1000



In [215]:
% close uptake reactions
model = setParam(model, 'eq', getExchangeRxns(model, 'in'), 0);
% set optimization to max the sum of production reactions
model = setParam(model, 'obj', getExchangeRxns(model, 'out'), 1);
% Check if the flux is zero
solution = solveLP(model);
disp(solution);
printFluxes(model, solution.x, false, tolerance);

         x: [54x1 double]
         f: 0
      stat: 1
       msg: 'Optimal solution found'
    sPrice: [52x1 double]
     rCost: [54x1 double]

FLUXES:


2.	The previous step did not provide any non-zero solutions, right? That is good, but there could be other factors that prevent the error from showing its ugly face. Maybe it costs energy or redox power for example. Or maybe the necessary reactions are in different compartments. It is generally a good idea to relax as many constraints as possible when searching for errors. For instance, one can include a temporary reaction like “ATP + H2O ⇔ ADP + Pi” and similar reactions with NADH and NADPH. Remember that the aim here is to try to “provoke” the model to show the errors. Add these reactions and try again.

❗ **Personal note**: it seems that [getIndexes](https://sysbiochalmers.github.io/RAVEN/doc/core/getIndexes.html) does not support regular expressions (except for 'metcomps'), so you need to know the exact name or ID. This is a strong limitation when exploring a model one is not familiar with.

For example:

In [216]:
% Example of the commented limitation
disp(getIndexes(model, {'atp', 'AT'}, 'metnames'));
disp(getIndexes(model, {'atp', 'AT'}, 'mets'));

    {0x1 double}
    {0x1 double}



Error using getIndexes
Could not find object 'atp' in the model

In [217]:
% Workaround 🤓
% regex for looking for cytosolic ATP, ADP, NAD(P)(H), phosphate, H+ and water IDs
pattern = '^(a[dt]p|nadp?h?|pi|h|h2o).*_c$';
% Get cells that match the query, then filter them to get the cells that have matches
indexesToCheck = find(~cellfun(@isempty, regexpi(model.mets, pattern, 'match')'));
tableToCheck = table(model.mets(indexesToCheck), model.metNames(indexesToCheck), 'VariableNames', {'ID', 'metName'});
disp(tableToCheck);

        ID            metName   
    ___________    _____________

    {'ADP_c'  }    {'ADP'      }
    {'ATP_c'  }    {'ATP'      }
    {'NAD_c'  }    {'NAD(+)'   }
    {'NADH_c' }    {'NADH'     }
    {'NADP_c' }    {'NADP(+)'  }
    {'NADPH_c'}    {'NADPH'    }
    {'PI_c'   }    {'phosphate'}
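
One caveat of the pattern above is that the `.*` lets a short alternative such as `h` match any ID that merely starts with that letter. The following is a minimal sketch in base MATLAB, using hypothetical metabolite IDs (not taken from the model) and a stricter pattern that I am assuming would be preferable; it only illustrates the difference between the two patterns.

```MATLAB
% Hypothetical metabolite IDs, for illustration only (not from the model)
ids = {'ATP_c'; 'ADP_c'; 'NADH_c'; 'NADPH_m'; 'HXK_c'; 'PI_c'; 'PYR_c'};

% Pattern used above: '.*' allows anything between the prefix and '_c'
loosePattern  = '^(a[dt]p|nadp?h?|pi|h|h2o).*_c$';
% Stricter variant (an assumption): the whole ID must be prefix + '_c'
strictPattern = '^(a[dt]p|nadp?h?|pi|h|h2o)_c$';

% regexpi is case-insensitive; 'once' returns one start index (or []) per ID
looseHits  = ids(~cellfun(@isempty, regexpi(ids, loosePattern,  'once')));
strictHits = ids(~cellfun(@isempty, regexpi(ids, strictPattern, 'once')));

disp(looseHits);   % includes the unrelated 'HXK_c'
disp(strictHits);  % only the cofactor-like IDs in the cytosol
```

In this model the loose pattern happened to return only the intended metabolites, so either form works here.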



- [addRxns](https://sysbiochalmers.github.io/RAVEN/doc/core/addRxns.html)

```MATLAB
function newModel=addRxns(model,rxnsToAdd,eqnType,compartment,allowNewMets,allowNewGenes)
```

Adds reactions to a model

In [218]:
% cell array with unique strings that identifies each reaction
freeReactions.rxns = {
    'FREE_ATP';
    'FREE_NADH';
    'FREE_NADPH'
    };

% cell array with equation strings. Decimal coefficients are expressed as "1.2". Reversibility is indicated by "<=>" or "=>"
freeReactions.equations = {
    'ATP[c] <=> ADP[c] + phosphate[c]';
    'NAD(+)[c] <=> NADH[c]';
    'NADP(+)[c] <=> NADPH[c]'
    };

% eqnType 3: The metabolites are written as "metNames[comps]". Only compartments in model.comps are allowed
model = addRxns(model, freeReactions, 3);
printModel(model, freeReactions.rxns );

FLUXES:
FREE_ATP ()
	ATP[c] <=> ADP[c] + phosphate[c] [-1000 1000]
FREE_NADH ()
	NAD(+)[c] <=> NADH[c] [-1000 1000]
FREE_NADPH ()
	NADP(+)[c] <=> NADPH[c] [-1000 1000]


❗ Pay attention to the "minFlux" option:

In [219]:
% minFlux 1: the sum of abs(fluxes) is minimized. This is the fastest way of getting rid of loops
modelSolution = solveLP(model, 1);
disp(modelSolution);

       x: [57x1 double]
       f: -1000
    stat: 1
     msg: 'Optimal solution found'



3.	Did one get the production of ethanol? If so, print the resulting fluxes and see if it is possible to find the error. GEMs are normally very underdetermined, which means that there are infinite numbers of solutions to any given problem. When one solves using solveLP(model) one just gets a random solution which meets the objective and satisfies the constraints. These solutions often contain loops and are therefore difficult to interpret. One can read more about the solveLP function by typing “help solveLP” in MATLAB, but here it is chosen to solve using solveLP(model,1). This minimizes the sum of fluxes to have more easily interpreted results. Find and fix the error and rerun.
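
Why minimizing the sum of fluxes gives more readable results can be illustrated with a toy network. This is a minimal sketch in base MATLAB; the four-reaction network is hypothetical and no RAVEN function is called. Both flux vectors satisfy the steady state and give the same objective value, but one of them runs a futile loop.

```MATLAB
% Hypothetical toy network; rows are metabolites A and B
% R1: => A    R2: A => B    R3: B => A    R4: A =>
S = [1 -1  1 -1;
     0  1 -1  0];

vNoLoop = [1; 0; 0; 1];   % direct route from uptake to excretion
vLoop   = [1; 5; 5; 1];   % same route plus a futile R2/R3 loop

% Both vectors satisfy the steady-state condition S*v = 0
steadyNoLoop = all(S * vNoLoop == 0);
steadyLoop   = all(S * vLoop == 0);

% Same objective (flux through R4), very different total flux
objective = [vNoLoop(4), vLoop(4)];
totalFlux = [sum(abs(vNoLoop)), sum(abs(vLoop))];
fprintf('objective: %g vs %g, sum of |flux|: %g vs %g\n', ...
    objective(1), objective(2), totalFlux(1), totalFlux(2));
```

A solver that only maximizes the objective may return either vector; adding the minimization of the total flux selects the loop-free one.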

> Question 2: what modification is needed to prevent the production of ethanol from nothing?

In [220]:
% onlyExchange = false: print all fluxes, not only the exchange fluxes
printFluxes(model, modelSolution.x, false);

FLUXES:
ethOUT	(Production of ethanol):	999.999
ADH1	(Alcohol dehydrogenase):	999.999
ADH2	(Alcohol dehydrogenase rev):	999.999
FREE_NADH	():	999.999
FREE_NADPH	():	-999.999


In addition to ethOUT and the added test reactions, two reactions carry flux at (effectively) the upper bound of 1000: ADH1 and ADH2.

In [221]:
reactionsToEvalue = {'ADH1', 'ADH2'};
printModel(model, reactionsToEvalue);

FLUXES:
ADH1 (Alcohol dehydrogenase)
	acetaldehyde[c] + NADH[c] => 2 ethanol[c] + NAD(+)[c] [0 1000]
ADH2 (Alcohol dehydrogenase rev)
	ethanol[c] + NADP(+)[c] => acetaldehyde[c] + NADPH[c] [0 1000]


The stoichiometric coefficients of ethanol in the two reactions are inconsistent: ADH1 produces 2 ethanol per acetaldehyde, while ADH2 consumes only 1. At least one of the reactions must be unbalanced, which getElementalBalance can confirm.
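
Why a single mismatched coefficient is enough to create mass can be seen in a toy cycle. The sketch below uses base MATLAB only and a hypothetical two-metabolite network (it is an analogy to the ADH1/ADH2 pair, not the model itself): the steady-state flux directions are the null space of the stoichiometric matrix, and with the wrong coefficient that null space contains a net excretion flux even though nothing is taken up.

```MATLAB
% Hypothetical toy network; rows are metabolites A and B
% R1: A => 2 B (wrong coefficient)   R2: B => A   EX: B => (excretion)
S_bad = [-1  1  0;
          2 -1 -1];
% Same network with R1 corrected to A => B
S_ok  = [-1  1  0;
          1 -1 -1];

% All steady-state flux vectors (S*v = 0) lie in the null space of S
N_bad = null(S_bad);
N_ok  = null(S_ok);

% Third row is the excretion flux in each steady-state direction
exBad = N_bad(3, :);
exOk  = N_ok(3, :);
fprintf('excretion with error: %.3f, after fix: %.3f\n', abs(exBad), abs(exOk));
```

With the wrong coefficient every turn of the cycle leaves one extra B to excrete; after the fix the only steady state is a closed loop with zero excretion.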

```MATLAB
function balanceStructure=getElementalBalance(model,rxns,printUnbalanced,printUnparsable)
```
Checks a model to see if the reactions are elementally balanced.

In [222]:
getElementalBalance(model, reactionsToEvalue, true, true);






- [changeRxns](https://sysbiochalmers.github.io/RAVEN/doc/core/changeRxns.html)

```MATLAB
function model=changeRxns(model,rxns,equations,eqnType,compartment,allowNewMets)
```

Modifies the equations of reactions

In [223]:
% only ADH1 needs to change; ADH2 is left as it is
rxnsToChange.rxns = {'ADH1'};
rxnsToChange.equations = {
    % ADH1 produced 2 ethanol per acetaldehyde; the coefficient must be 1.
    % With protons it would read:
    % acetaldehyde[c] + NADH[c] + H+[c] => ethanol[c] + NAD(+)[c]
    'acetaldehyde[c] + NADH[c] => ethanol[c] + NAD(+)[c]'
    };
% update model
model = changeRxns(model, rxnsToChange.rxns, rxnsToChange.equations, 3);
% verify balance

disp(table(reactionsToEvalue', ...
        getElementalBalance(model, reactionsToEvalue, true, true).balanceStatus, ...
        'VariableNames', {'reaction', 'isBalanced'} ...
        ));
% optimize again
modelSolution = solveLP(model, 1);
disp(modelSolution);
printFluxes(model, modelSolution.x, false);

    reaction    isBalanced
    ________    __________

    {'ADH1'}        1     
    {'ADH2'}        1     

       x: [57x1 double]
       f: 0
    stat: 1
     msg: 'Optimal solution found'

FLUXES:


4.	In GEMs it is normal to have excretion of only a few metabolites while having very many internal metabolites. A common case is that one has an error that would like to produce something from nothing, but to do so it also must produce some other metabolite for which there is no exchange reaction. A convenient way to test this is to allow all metabolites to be excreted. One can do this by changing the model.b structure. Normally it is always a vector of zeros, but if one adds a second column RAVEN will interpret it as lower and upper bound on the equality constraints. So if one puts model.b=[model.b inf(numel(model.b),1)]; one can now excrete anything. Do this and see if the model can produce anything. For instance, one should get ethanol, glycerol, and CO2. Look at the fluxes and find the error. One can get a clue by looking at the warnings from SBMLFromExcel. Since this is a problem that comes from reactions being unbalanced, the problematic ones must be in one of the warnings. Which was the metabolite that had to be excreted for the error to appear? Do this step two times to find both errors.
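
The effect of the second column in model.b can be sketched on a toy network. This is a minimal base MATLAB illustration with a hypothetical three-metabolite network (no RAVEN call); it only checks whether one flux vector satisfies the strict and the relaxed mass-balance constraints.

```MATLAB
% Hypothetical toy network; rows are metabolites A, B and C
% R1: A => 2 B + C (erroneous)   R2: B => A   EX: B => (only B can be excreted)
S = [-1  1  0;
      2 -1 -1;
      1  0  0];
v = [1; 1; 1];            % flux vector that makes B from nothing

% Strict steady state: S*v = b with b = 0
b = zeros(3, 1);
netChange = S * v;        % C accumulates, so the error stays hidden
strictOk = all(netChange == b);

% Relaxed: b(:,1) <= S*v <= b(:,2), every metabolite may be excreted
bRelaxed = [b, inf(3, 1)];
relaxedOk = all(netChange >= bRelaxed(:, 1) & netChange <= bRelaxed(:, 2));

fprintf('feasible with S*v = 0: %d, feasible with free excretion: %d\n', ...
    strictOk, relaxedOk);
```

The byproduct C has no exchange reaction, so the faulty flux vector only becomes feasible once net accumulation (excretion) of every metabolite is allowed.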

> Question 3: what two modifications are needed to fix the warnings?

In [224]:
% In RAVEN: if size(b) = [n, 1], it means S*x = b. If size(b) = [n, 2], it means b(:,1) <= S*x <= b(:,2)
model.b = [model.b, 1000 * ones(length(model.b),1)];
modelSolution = solveLP(model, 1);
disp(modelSolution);
printFluxes(model, modelSolution.x, false, tolerance,[],'▪ %flux\n%rxnID (%rxnName):\n\t%eqn\n\n');

       x: [57x1 double]
       f: -2.2083e+03
    stat: 1
     msg: 'Optimal solution found'

FLUXES:
▪ 500
co2OUT (Production of CO2):
	CO2[c] => 

▪ 708.3311
ethOUT (Production of ethanol):
	ethanol[c] => 

▪ 1000
glyOUT (Production of glycerol):
	glycerol[c] => 

▪ -500
PGI (Glucose-6-phosphate isomerase):
	alpha-D-glucose 6-phosphate[c] <=> beta-D-fructofuranose 6-phosphate[c]

▪ 1000
PFK (Phosphofructokinase):
	ATP[c] + beta-D-fructofuranose 6-phosphate[c] => ADP[c] + 2 beta-D-fructofuranose 1,6-bisphosphate[c]

▪ 583.3378
FBP (Fructose-1,6-bisphosphatase):
	beta-D-fructofuranose 1,6-bisphosphate[c] => 2 beta-D-fructofuranose 6-phosphate[c] + phosphate[c]

▪ 1000
FBA (Fructose-bisphosphate aldolase):
	beta-D-fructofuranose 1,6-bisphosphate[c] <=> D-glyceraldehyde 3-phosphate[c] + glycerone phosphate[c]

▪ 1000
GLD (Triosephosphate dehydrogenase):
	D-glyceraldehyde 3-phosphate[c] + NAD(+)[c] + phosphate[c] <=> 3-phospho-D-glyceroyl phosphate[c] + NADH[c]

▪ 708.3311
PGK (Phosphogly

❗ The import function did not raise the warnings the tutorial refers to. Instead, I will check the elemental balance of each non-exchange reaction that carries flux (exchange reactions do not need to be mass balanced).

In [225]:
% get reactions with positive flux (note: x >= tolerance skips reverse fluxes;
% abs(modelSolution.x) >= tolerance would include them), then filter out the
% exchange reactions and the temporary free ATP/NADH/NADPH reactions
reactionsToEvalue = model.rxns(modelSolution.x >= tolerance);
reactionsToEvalue = setdiff(reactionsToEvalue, [getExchangeRxns(model); freeReactions.rxns]);
% check the elemental balance and print only the unbalanced reactions
balanceCheck = getElementalBalance(model, reactionsToEvalue, false, true);
printModel(model, reactionsToEvalue(balanceCheck.balanceStatus == 0));

FLUXES:
PFK (Phosphofructokinase)
	ATP[c] + beta-D-fructofuranose 6-phosphate[c] => ADP[c] + 2 beta-D-fructofuranose 1,6-bisphosphate[c] [0 1000]
FBP (Fructose-1,6-bisphosphatase)
	beta-D-fructofuranose 1,6-bisphosphate[c] => 2 beta-D-fructofuranose 6-phosphate[c] + phosphate[c] [0 1000]
ENO (Enolase)
	2-phospho-D-glycerate[c] <=> phosphoenolpyruvate[c] [-1000 1000]
PGL (6-phosphogluconolactonase)
	6-O-phosphono-D-glucono-1,5-lactone[c] => 6-phospho-D-gluconate[c] [0 1000]
TAL1 (Transaldolase)
	D-glyceraldehyde 3-phosphate[c] + sedoheptulose 7-phosphate[c] <=> beta-D-fructofuranose 6-phosphate[c] + D-erythrose 4-phosphate[c] [-1000 1000]
TKI1TKI2b (Transketolase)
	D-erythrose 4-phosphate[c] + D-xylulose 5-phosphate[c] <=> beta-D-fructofuranose 6-phosphate[c] + D-glyceraldehyde 3-phosphate[c] [-1000 1000]
GPP (sn-glycerol-3-phosphate phosphohydrolase)
	glycerol monophosphate[c] => glycerol[c] + phosphate[c] [0 1000]
PDC (Pyruvate decarboxylase)
	pyruvate[c] => acetaldehyde[c] [0 1000]
A

Let's organize a bit of the data.

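Each entry is the left-hand composition minus the right-hand composition, so the sign indicates which side has more of that element (a negative value means the product side has the excess). The unbalanced elements hint at the nature of the missing compound. For example, in ATPX, O: -1 and H: -2 clearly indicate water; in FBP, the imbalance suggests a carbohydrate and a phosphate compound, and the latter cannot be an adenosine compound since N is balanced.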
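
The same left-minus-right reasoning can be worked through by hand for one reaction. The sketch below is base MATLAB only and uses the standard formulas of pyruvic acid (C3H4O3), acetaldehyde (C2H4O) and CO2, which I am assuming match the formulas stored in the model; it shows how the residual of the erroneous PDC equation points directly at the missing compound.

```MATLAB
% Element order used for the composition vectors
elements = {'C', 'H', 'O'};

% Compositions [C; H; O], assumed to match the model formulas
pyruvate     = [3; 4; 3];   % C3H4O3
acetaldehyde = [2; 4; 1];   % C2H4O
co2          = [1; 0; 2];   % CO2

% Erroneous PDC: pyruvate => acetaldehyde, residual = left minus right
residual = pyruvate - acetaldehyde;

% A positive residual means the substrate side has the excess,
% so a product with exactly this composition is missing
missingIsCO2 = isequal(residual, co2);

for k = 1:numel(elements)
    fprintf('%s: %d\n', elements{k}, residual(k));
end
```

The residual of C: 1, H: 0, O: 2 agrees with the PDC row of the balance table and corresponds to one CO2 on the product side.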

In [226]:
balanceCheckTable = array2table(balanceCheck.leftComp - balanceCheck.rightComp, ...
        'VariableNames', balanceCheck.elements.abbrevs, ...
        'RowNames', reactionsToEvalue);
unbalancedTable = balanceCheckTable(reactionsToEvalue(balanceCheck.balanceStatus == 0), :);
fprintf('Table rows: %i\n', height(unbalancedTable));
disp(unbalancedTable);

Table rows: 9
                 C     N     O     P      H 
                 __    _    ___    __    ___

    ATPX          0    0     -1     0     -2
    ENO           0    0      1     0      2
    FBP          -6    0    -16    -3    -17
    GPP           0    0     -1     0     -2
    PDC           1    0      2     0      0
    PFK          -6    0     -9    -1    -13
    PGL           0    0     -1     0     -2
    TAL1          0    0     -3    -1     -1
    TKI1TKI2b     0    0     -3    -1     -1



The model is missing water:

In [227]:
pattern = 'h2o';
indexesToCheck = find(~cellfun(@isempty, regexpi(model.mets, pattern, 'match')'));
fprintf('Found indexes for %s: %i\n', pattern, length(indexesToCheck));
% add water
metsToAdd.mets = 'H2O_c';
metsToAdd.metNames = 'H2O';
metsToAdd.compartments = 'c';
metsToAdd.metFormulas = 'H2O';
model = addMets(model,metsToAdd);
% verify it is added
disp(model.mets(getIndexes(model, metsToAdd.mets, 'mets')));

Found indexes for h2o: 0
    {'H2O_c'}



The equations for TAL1 ([R08575](https://www.genome.jp/entry/R08575)) and TKI1TKI2b ([R01067](https://www.kegg.jp/entry/R01067)) are correct. Both show the same elemental imbalance (HPO3) and share one compound, beta-D-fructofuranose 6-phosphate, so this compound likely has an incorrect formula in the model.

beta-D-fructofuranose 6-phosphate has the formula C6H13O9P ([KEGG: C05345](https://www.genome.jp/entry/C05345))

In [228]:
compoundName = 'beta-D-fructofuranose 6-phosphate';
compoundFormula = 'C6H13O9P';
compoundIndex = getIndexes(model, compoundName, 'metnames');
formulaToCheck = model.metFormulas(compoundIndex);
boolStr = {'false', 'true'};
fprintf('The formula %s is correct? %s.', ...
    formulaToCheck{:}, ...
    boolStr{strcmp(formulaToCheck, compoundFormula) + 1});

The formula C6H14O12P2 is correct? false.

In [229]:
% replace and verify
model.metFormulas(compoundIndex) = {compoundFormula};
formulaToCheck = model.metFormulas(compoundIndex);
fprintf('The formula %s is correct? %s.', ...
    formulaToCheck{:}, ...
    boolStr{strcmp(formulaToCheck, compoundFormula) + 1});

The formula C6H13O9P is correct? true.

Now correct the equations:

In [230]:
rxnsToChange.rxns = {'ATPX'; 'ENO'; 'FBP'; 'GPP'; 'PDC'; 'PFK'; 'PGL'};

rxnsToChange.equations = {
    'ATP[c] + H2O[c] => ADP[c] + phosphate[c]'; % ATPX
    '2-phospho-D-glycerate[c] <=> phosphoenolpyruvate[c] + H2O[c]'; % ENO
    'beta-D-fructofuranose 1,6-bisphosphate[c] + H2O[c] => beta-D-fructofuranose 6-phosphate[c] + phosphate[c]'; % FBP
    'glycerol monophosphate[c] + H2O[c] => glycerol[c] + phosphate[c]'; % GPP
    'pyruvate[c] => acetaldehyde[c] + CO2[c]'; % PDC
    'ATP[c] + beta-D-fructofuranose 6-phosphate[c] => ADP[c] + beta-D-fructofuranose 1,6-bisphosphate[c]'; % PFK
    '6-O-phosphono-D-glucono-1,5-lactone[c] + H2O[c] => 6-phospho-D-gluconate[c]' % PGL
    };

model = changeRxns(model, rxnsToChange.rxns, rxnsToChange.equations, 3);

In [231]:
% verify the new formulas work OK
balanceCheck = getElementalBalance(model, reactionsToEvalue, false, true);
balanceCheckTable = array2table(balanceCheck.leftComp - balanceCheck.rightComp, ...
        'VariableNames', balanceCheck.elements.abbrevs, ...
        'RowNames', reactionsToEvalue);
unbalancedTable = balanceCheckTable(reactionsToEvalue(balanceCheck.balanceStatus == 0), :);
fprintf('Table rows: %i\n', height(unbalancedTable));
disp(unbalancedTable);

Table rows: 0

In [232]:
% Verify with solve if flux is zero
modelSolution = solveLP(model, 1);
disp(modelSolution);
printFluxes(model, modelSolution.x, false, tolerance,[],'▪ %flux\n%rxnID (%rxnName):\n\t%eqn\n\n');

       x: [57x1 double]
       f: 0
    stat: 1
     msg: 'Optimal solution found'

FLUXES:


5.	The same thing done in step 4 can be done with the function canProduce. There is a sister function called canConsume. It checks which metabolites can be consumed by the model. Change so that no production is allowed and run canConsume. One should see that 12 metabolites could be consumed even though the model is not allowed to produce anything. Pick one of them, force uptake of it by setting the lower bound to non-zero. If one does this one may not be able to get a feasible solution. That is because the problem solved by canConsume allows input of all metabolites, but the current model allows input for only O2 and glucose. Modify the model.b variable to allow for uptake of all metabolites.
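
How an unbalanced reaction lets a model consume a metabolite without producing anything can be sketched on a toy cycle. This is a base MATLAB illustration with a hypothetical network (X, P and the reactions R1 and R2 are made up; no RAVEN function is called), loosely analogous to a decarboxylation written without its CO2.

```MATLAB
% Hypothetical toy network; rows are metabolites X, P and CO2
% R1: X + CO2 => P   (carboxylation)
% R2: P => X         (decarboxylation written without CO2, the error)
S_bad = [-1  1;
          1 -1;
         -1  0];
% R2 corrected to P => X + CO2
S_ok = S_bad;
S_ok(3, 2) = 1;

v = [1; 1];                            % run both reactions once

% Allow uptake of every metabolite: -1000 <= S*v <= 0
b = [-1000 * ones(3, 1), zeros(3, 1)];

netBad = S_bad * v;                    % net change per metabolite
netOk  = S_ok * v;
feasibleBad = all(netBad >= b(:, 1) & netBad <= b(:, 2));
consumedBad = netBad(3) < 0;           % CO2 disappears in the faulty cycle
consumedOk  = netOk(3) < 0;            % CO2 is conserved after the fix
fprintf('CO2 consumed with error: %d, after fix: %d\n', consumedBad, consumedOk);
```

The faulty cycle is only feasible because the first column of b allows net uptake, which is why the same relaxation is needed to reproduce the canConsume result with solveLP.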

> Question 4: study the fluxes and try to find the wrong one. What fix should be applied to the corresponding reaction?

Even if one fixes the problem one will see that the model can still get rid of O2. This is because of the reactions that were included for testing (NAD ⇔ NADH is not elementally or redox balanced). This part of the exercise is done, so those reactions can now be deleted.


- [canConsume](https://sysbiochalmers.github.io/RAVEN/doc/core/canConsume.html)

```MATLAB
function consumed=canConsume(model,mets)
```

Checks which metabolites can be consumed by a model using the specified constraints.

In [234]:
freeReactions.rxns

In [235]:
% first, close production. This includes both the exchange reactions and the modification to the b vector
model.b = model.b(:, 1);
model = setParam(model, 'eq', getExchangeRxns(model, 'out'), 0);
% remove free ATP and electron carriers reactions
model = removeReactions(model, freeReactions.rxns);

In [238]:
% second, use canConsume
consumed = canConsume(model);
disp(model.mets(consumed));

NOTE: The exchange reactions are assigned to the first compartment


I do not get any of the 12 metabolites that the tutorial mentions. Checking the file 'tutorial4_solutions.m', I see that the expected result is the following:

```shell
    {'MAL_m'}
    {'AKG_m'}
    {'ACA_c'}
    {'GLC_c'}
    {'CO2_c'}
    {'CO2_m'}
    {'ETH_c'}
    {'FUM_m'}
    {'O2_c' }
    {'OAA_m'}
    {'PYR_c'}
    {'SUC_m'}
```



I noticed that in step 4 I made more changes to the model than those that appear in the solution file: adding water, balancing all problematic reactions, and fixing the chemical formula of a compound. Apparently I already curated the reaction that the tutorial expects me to fix here 😅.

There is no point in undoing my work. I will just copy that section of the solution file to study how it uses the canConsume function.

```MATLAB
I=canConsume(model);
disp(model.mets(I)); %These 12 metabolites can be consumed without any production

%Allow all uptake
model.b=[ones(numel(model.b),1)*-1000 model.b];

%Pick CO2 and force uptake of it
model=setParam(model,'eq',{'co2OUT'},-1); %Negative output means input
sol=solveLP(model);
printFluxes(model,sol.x,false,10^-5,[],'%rxnID (%rxnName):\n\t%eqn\n\t%flux\n'); %Now it works

%See that PDC converts pyruvate (3 carbons) to acetaldehyde (2 carbons)
%without any other products. If one googles, one may realize that CO2 is
%missing. This would be simpler to change in the Excel file (or using
%changeRxns), but one can change it here as an exercise. One therefore
%needs to find the index of the reactions and the index of cytosolic CO2 in
%order to change the reaction
Irxn=ismember(model.rxns,'PDC');
Imet=ismember(model.mets,'CO2_c');
model.S(Imet,Irxn)=1; %The coefficient is 1.0

%Display the new equation just to be sure
constructEquations(model,Irxn)

%The solution is now not feasible, meaning that it is no longer possible to
%force uptake of CO2 without any output
sol=solveLP(model);
```

In [250]:
% The reaction was already curated in step 4
[rxnsToChange.rxns(5), rxnsToChange.equations(5)]

6.	Unbalanced reactions are a relatively small problem, since they are so easy to find. A much bigger problem is when metabolites are named differently even though they are meant to be the same. Use smallYeastBad2.xlsx from here on. A first check is to see which reactions can carry flux when one allows for all uptakes and outputs of exchange metabolites. There are several ways to check this but use the function simplifyModel here. The primary purpose of this function is to remove unnecessary stuff from a model to make it smaller, but since it removes “bad” reactions one can use it for error identification as well. If one runs it like it is in tutorial4.m one will see that there are about 20 metabolites and reactions that are dead ends. That is quite a lot, so take a look at the warnings from importExcelModel and see if it is possible to catch the obvious spelling error.
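
How a misspelled metabolite name produces dead ends can be sketched on a toy stoichiometric matrix. The example below is base MATLAB only, with a hypothetical four-reaction network in which all reactions are assumed irreversible; it is a simplified structural check, not what simplifyModel does internally.

```MATLAB
% Hypothetical toy network; 'pyruvat' is a misspelling of 'pyruvate'
mets = {'glucose'; 'pyruvate'; 'pyruvat'; 'ethanol'};
% Columns: => glucose, glucose => 2 pyruvate, pyruvat => ethanol, ethanol =>
S = [ 1 -1  0  0;
      0  2  0  0;
      0  0 -1  0;
      0  0  1 -1];

% With irreversible reactions, a metabolite that is only produced or
% only consumed can never be balanced at steady state
onlyProduced = all(S >= 0, 2) & any(S > 0, 2);
onlyConsumed = all(S <= 0, 2) & any(S < 0, 2);
deadEnds = mets(onlyProduced | onlyConsumed);

disp(deadEnds);
```

The two spellings show up as one metabolite that is only produced and one that is only consumed, which is the typical signature of a naming error.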

> Question 5: what correction should be applied to fix the spelling error?


In [44]:
% here we use the other model
% smallYeastBad2.xlsx

7.	That did not help very much. Sometimes it is exceedingly difficult to find out where the root of the problem is. This is particularly true if it is in a region with many interconversions between metabolites and no clear input/output (Figure 3).

![figure1](./tutorial_data/tutorial_4_fig_1.png)

**Figure 3.** An example of a pathway featuring many interconversions between metabolites and unclear input/output. If one reaction is wrong here, it will be difficult to find, since everything looks connected: each metabolite is produced and consumed in many reactions.

A powerful but somewhat tricky function is checkProduction. It helps to identify which metabolites must be synthesized in order to have net synthesis of everything. Look at the suggestions from checkProduction when running it as in tutorial4.m. The minToConnect output indicates that 12 metabolites need to be synthesized to have net synthesis of everything. However, 8 of them are co-factors or contain co-factors. Since there is no net synthesis of co-factors in this small model, those are not interesting (coenzyme A or ATP are not synthesized from glucose). One should look at the top one that is not a co-factor. This one is a bit tricky, and one might want to look it up in KEGG.

> Question 6: what is the suspicious similarity between some metabolites?

In [None]:
model = importExcelModel("./tutorial_data/smallYeastBad.xlsx");
printModelStats(model);

8.	Still quite a bit of dead ends, and nothing that immediately looks like it would fix everything. It could be that some reactions are missing. One could try to include reactions from a set of other models to fill the gaps. This is a computationally expensive task for a large network, but for this small model it is easy. One could use any model structure, but here one can take the small yeast model from Tutorial 3. Run the code and include the suggested reaction(s). Run the previous tests to make sure that everything works. 

9.	Finished! And do not forget that the gapReport function does all these things.