In [67]:
import cobra
from cobra.flux_analysis import flux_variability_analysis
import pandas as pd
import csv


# Assignment 1: Metabolic Modelling

12/09/2025

## Task 1

- **1A. In the interactive session, we saw reaction fluxes in a linear pathway being equal to each other due to mass balance constraints. Do you observe the same thing now for the maximal reaction activities? Explain your observations.**

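    No. The maximal reaction activities along a linear pathway are not equal to each other. They are independent capacity estimates derived from gene expression, not steady-state fluxes, so mass balance does not tie them together; it only constrains the fluxes that actually flow within these limits. In the lower glycolysis pathway, for example, we get the following sequence of maximal reaction activities: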
    - GAPD: 26.1
    - PGK: 23.1
    - PGM: 20.0
    - ENO: 28.8

- **1B. Some reactions have no maximal reaction activity data. Identify at least two different kinds of such reactions, and explain why gene expression-derived data would not be applicable for these kinds of reactions.**

    The first kind are exchange reactions. EX_glc__D_e and EX_fru_e, for example, have no data. Exchange reactions are boundary pseudo-reactions that represent the supply or removal of a substance in the external environment. No enzyme or transport protein catalyzes them (the actual transport step is a separate reaction, such as GLCpts), so there is no gene whose expression could be mapped to them. Their rate depends on what the environment provides: if no glucose or fructose is available externally, the uptake rate is always zero.

    The second kind is the biomass reaction, BIOMASS_Ecoli_core_w_GAM, which has no data at all. It is a lumped pseudo-reaction that simulates cell growth by consuming precursor metabolites. Since no single gene encodes such a process, a maximal activity value cannot be derived for it from gene expression data.
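
The 1A observation can be illustrated with a small sketch. Assuming an idealized, strictly linear chain with no branches (the real core model drains some intermediates to biomass, so this is a simplification), mass balance forces one common flux through all steps, and that flux can be at most the smallest maximal activity. The rounded values are those from the 1A list; `pathway_capacity` is a hypothetical helper, not part of the notebook.

```python
# Rounded maximal activities from the 1A list (lower glycolysis).
max_activity = {"GAPD": 26.1, "PGK": 23.1, "PGM": 20.0, "ENO": 28.8}


def pathway_capacity(capacities):
    """Return (limiting step, largest common flux) for an idealized linear chain.

    Assumption: no branching, so every step carries the same steady-state flux.
    """
    limiting_step = min(capacities, key=capacities.get)
    return limiting_step, capacities[limiting_step]


limiting_step, capacity = pathway_capacity(max_activity)
print(f"Limiting step: {limiting_step}, common flux at most {capacity}")

# Unused capacity of each step if the chain ran at that common flux.
spare = {rxn: round(cap - capacity, 1) for rxn, cap in max_activity.items()}
print(spare)
```

The unequal capacities therefore do not contradict mass balance: the fluxes stay equal, and the steps with larger capacities simply have room to spare.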

## Task 2

We will now establish an enzyme activity-constrained metabolic model in COBRApy. Please implement the maximal reaction activity data in the E. coli core model from the practical per the below instructions.

- For reversible reactions, set the lower and upper flux bound to -value and +value, respectively (“value” as the maximal reaction activity).

- For irreversible reactions, only set the maximal flux bound to value and leave the minimal flux bound at zero.

- For reactions without data, leave their default constraints.

- For the glucose exchange reaction (see practical), please remove the maximal absolute flux bound and use the high absolute default bound instead. (We assume the subsequent transporter maximal activity accounts for maximal glucose uptake.)

- For the ATPM energy maintenance reaction (see practical), leave the lower flux as-is, since it describes cellular energy requirements separate from any maximal reaction activities.

Please print a table listing each reaction’s lower and upper flux bound after implementing the above (no formatting requirements, simple printout sufficient).

In [68]:
# Load the E. coli core model
model = cobra.io.load_json_model('e_coli_core.json')

# Read the CSV file and populate the dictionary
filename = 'e_coli_core_expression.csv'
estimated_maximal_activity = {}
with open(filename, mode='r', newline='') as file:
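    reader = csv.reader(file)
    for row in reader:
        # Skip rows without a numeric activity value (e.g. a header row,
        # a reaction with no data, or a blank line)
        try:
            num = float(row[1])
            estimated_maximal_activity[row[0]] = num
        except (ValueError, IndexError):
            continue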

# Set reaction bounds based on estimated maximal activity
high_absolute_default_bound = 1000
for reaction in model.reactions:

    # Skip the ATPM reaction
    if (reaction.id=="ATPM"):
        print("ATPM reaction unchanged")
        continue

    # use high absolute default bound for exchange reaction of glucose
    if (reaction.id=="EX_glc__D_e"):
        print("EX_glc__D_e bounds changed to high absolute default bound")
        reaction.lower_bound = -high_absolute_default_bound
        reaction.upper_bound = high_absolute_default_bound
        continue

    # Skip reactions that are not in the estimated maximal activity dictionary
    try:
        value = estimated_maximal_activity[reaction.id]
    except KeyError:
        continue

    # Reversible reaction: bounds are -value and +value
    if reaction.reversibility:
        reaction.bounds = (-value, value)

    # Irreversible reaction: upper bound is value, lower bound stays at 0
    else:
        reaction.bounds = (0.0, value)

ATPM reaction unchanged
EX_glc__D_e bounds changed to high absolute default bound


In [69]:
# Print the reaction bounds
for reaction in model.reactions:
    print(f'{reaction.id}: UB:{reaction.upper_bound}, LB:{reaction.lower_bound}')

PFK: UB:12.12, LB:0.0
PFL: UB:1.0, LB:0.0
PGI: UB:13.12, LB:-13.12
PGK: UB:23.13, LB:-23.13
PGL: UB:8.12, LB:0.0
ACALD: UB:1.16, LB:-1.16
AKGt2r: UB:3.1, LB:-3.1
PGM: UB:20.01, LB:-20.01
PIt2r: UB:6.03, LB:-6.03
ALCD2x: UB:9.01, LB:-9.01
ACALDt: UB:2.29, LB:-2.29
ACKr: UB:1.19, LB:-1.19
PPC: UB:2.56, LB:0.0
ACONTa: UB:25.35, LB:-25.35
ACONTb: UB:25.35, LB:-25.35
ATPM: UB:1000.0, LB:8.39
PPCK: UB:25.23, LB:0.0
ACt2r: UB:3.23, LB:-3.23
PPS: UB:2.5, LB:0.0
ADK1: UB:30.57, LB:-30.57
AKGDH: UB:24.35, LB:0.0
ATPS4r: UB:60.5, LB:-60.5
PTAr: UB:4.47, LB:-4.47
PYK: UB:26.78, LB:0.0
BIOMASS_Ecoli_core_w_GAM: UB:1000.0, LB:0.0
PYRt2: UB:1.26, LB:-1.26
CO2t: UB:30.45, LB:-30.45
RPE: UB:5.67, LB:-5.67
CS: UB:20.56, LB:0.0
RPI: UB:5.56, LB:-5.56
SUCCt2_2: UB:2.36, LB:0.0
CYTBD: UB:40.56, LB:0.0
D_LACt2: UB:4.56, LB:-4.56
ENO: UB:28.78, LB:-28.78
SUCCt3: UB:6.67, LB:0.0
ETOHt2r: UB:2.34, LB:-2.34
SUCDi: UB:24.34, LB:0.0
SUCOAS: UB:20.6, LB:-20.6
TALA: UB:4.45, LB:-4.45
THD2: UB:4.5, LB:0.0
TKT1: UB:3

## Task 3

In addition to maximizing reaction rate (“flux”) through some reaction of biological interest, like the biomass reaction, we can also use Flux Balance Analysis (FBA) optimizations to ask: What are the smallest and largest fluxes possible per reaction, given all present constraints in the network? This is known as “Flux Variability Analysis” or FVA. In FVA, we loop over every reaction of interest and once maximize and once minimize its flux via FBA, thus obtaining the permissible flux range for each reaction in the model.

- **3A. Carry out an FVA for all reactions in the model and print out the resulting minimal and maximal fluxes per reaction (simple printout sufficient).**

In [70]:
# 3.a
# Perform FVA optimization
pd.set_option('display.max_rows', None)
fva_results = flux_variability_analysis(model, model.reactions, fraction_of_optimum=1.0)

# Report the flux variability results
print("Flux Variability Results:")
print(fva_results)

Flux Variability Results:
                               minimum       maximum
PFK                       7.314456e+00  7.314456e+00
PFL                       3.897323e-14  0.000000e+00
PGI                       7.376438e+00  7.376438e+00
PGK                      -1.446649e+01 -1.446649e+01
PGL                       4.751035e-01  4.751035e-01
ACALD                    -1.160000e+00 -1.160000e+00
AKGt2r                    2.351598e-16  0.000000e+00
PGM                      -1.374906e+01 -1.374906e+01
PIt2r                     1.764199e+00  1.764199e+00
ALCD2x                   -1.160000e+00 -1.160000e+00
ACALDt                    3.023483e-16  2.392232e-15
ACKr                     -1.190000e+00 -1.190000e+00
PPC                       1.374260e+00  1.374260e+00
ACONTa                    3.289982e+00  3.289982e+00
ACONTb                    3.289982e+00  3.289982e+00
ATPM                      8.390000e+00  8.390000e+00
PPCK                      0.000000e+00 -1.966609e-14
ACt2r               

- **3B. Identify any reactions with gene expression-imposed maximal reaction activities, whose permissible flux range in forward direction is nonzero yet comes out less than its upper flux bound. State their number and explain why a set of reactions behaves this way.**

    **Reaction behaviour explanation:**
    A total of 30 reactions have a nonzero permissible flux range in the forward direction whose maximum nevertheless stays below the gene expression-imposed upper flux bound. The maximum rate of a reaction is limited by two things: its upper bound (its local capacity) and its FVA maximum (its systemic throughput), which is the highest rate the reaction can reach given all other limitations in the network. Because the FVA was run with `fraction_of_optimum=1.0`, these ranges are further restricted to flux distributions that reach maximal biomass production. For these 30 reactions the local capacity is therefore not the bottleneck; the rest of the network is. The real bottlenecks are likely upstream or downstream: for example, the rate of glucose intake (GLCpts at 7.95) limits every reaction downstream of it, and a reaction cannot run faster than the reactions consuming its products, some of which are at their own maximum capacity.

In [71]:
# 3.b
constrained_reactions = []
tolerance = 1e-9

for reaction in model.reactions:
    if reaction.id in estimated_maximal_activity:
        fva_max_flux = fva_results.loc[reaction.id, 'maximum']
        reaction_upper_bound = reaction.upper_bound
        
        if (reaction.id != 'FORt' and
            fva_max_flux > tolerance and  
            fva_max_flux < reaction_upper_bound - tolerance): 
            
            constrained_reactions.append(reaction.id)

print(f"Number of reactions limited by the network: {len(constrained_reactions)}")
print("\nList of constrained reaction IDs:")
print(constrained_reactions)

Number of reactions limited by the network: 30

List of constrained reaction IDs:
['PFK', 'PGI', 'PGL', 'PIt2r', 'PPC', 'ACONTa', 'ACONTb', 'AKGDH', 'ATPS4r', 'PTAr', 'PYK', 'CS', 'CYTBD', 'ENO', 'SUCDi', 'TALA', 'TKT1', 'TPI', 'FBA', 'FUM', 'G6PDH2r', 'GAPD', 'GLCpts', 'GLNS', 'GND', 'ICDHyr', 'MDH', 'NH4t', 'O2t', 'PDH']
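
The distinction between local capacity and systemic throughput can be made concrete by computing how much of each upper bound the network can actually use. A minimal sketch, using a few values copied from the bounds table and the FVA printout above; the variable names and the `utilisation` helper are illustrative and not part of the notebook.

```python
# (upper bound, FVA maximum) pairs copied from the printouts above.
excerpt = {
    "PFK": (12.12, 7.314456),
    "PGI": (13.12, 7.376438),
    "PGL": (8.12, 0.4751035),
    "PPC": (2.56, 1.374260),
}


def utilisation(upper_bound, fva_max):
    """Fraction of the expression-derived capacity that the network can use."""
    return fva_max / upper_bound


usage = {rxn: utilisation(ub, fva_max) for rxn, (ub, fva_max) in excerpt.items()}

for rxn, (ub, fva_max) in excerpt.items():
    slack = ub - fva_max  # capacity left unused at optimal growth
    print(f"{rxn}: uses {usage[rxn]:.0%} of its bound, slack {slack:.2f}")
```

A utilisation below 100% means the reaction's own bound is not what limits it; something elsewhere in the network is.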


- **3C. How many reactions have a positive minimal flux in the FVA? State their number and explain why a set of reactions behaves this way.**

    **Reaction behavior explanation:** Based on the flux variability results, a total of 45 reactions have a positive minimum flux. These reactions must carry forward flux to fulfill the specified cellular objective, in this case maximum biomass production; they cannot be turned off or reversed without lowering growth. For some of these reactions this is because they are the only link in a pathway that can produce a critical component, such as a specific amino acid. The biomass reaction itself must also carry a positive flux for the model to simulate and maintain growth, and ATPM has a positive lower bound (8.39) by construction. In this sense, these reactions are essential to the cell's metabolism under the given constraints.

In [72]:
# 3.c
pos_min_flux_reactions = fva_results[fva_results["minimum"] > 0].index # reactions with minimum flux > 0
pos_min_flux_reactions.size

45
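
One caveat about a strict `> 0` comparison: solver output contains numerical noise, such as the `3.897323e-14` minimum printed for PFL above, which a strict test counts as positive. A minimal sketch of a tolerance-aware count, using a handful of minima copied from the FVA printout and assuming the same `1e-9` tolerance as the 3.b cell. It only covers this excerpt; the corresponding count for the full model is not reproduced here.

```python
# FVA minima copied from the printout above (excerpt only).
fva_minimum = {
    "PFK": 7.314456e+00,
    "PFL": 3.897323e-14,     # numerical noise, effectively zero
    "AKGt2r": 2.351598e-16,  # numerical noise, effectively zero
    "PIt2r": 1.764199e+00,
    "PGK": -1.446649e+01,
}

tolerance = 1e-9  # assumption: same threshold as in the 3.b cell

strictly_positive = [rxn for rxn, v in fva_minimum.items() if v > 0]
robustly_positive = [rxn for rxn, v in fva_minimum.items() if v > tolerance]

print(f"Strict '> 0' test: {strictly_positive}")
print(f"With tolerance:    {robustly_positive}")
```

On the `fva_results` DataFrame, the equivalent filter would be `fva_results[fva_results["minimum"] > tolerance]`.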

## Task 4

- **4A. Carry out an FBA optimization of biomass and report the maximal biomass production rate in the presence of the implemented constraints.**

In [73]:
# 4.a
# Perform FBA optimization
fba_results = model.optimize()

# Report the maximal biomass production rate
print(f"Maximal biomass production rate: {fba_results.objective_value:.4f}")

Maximal biomass production rate: 0.4796


- **4B. Determine the “bottleneck(s)” – i. e. those reaction(s) whose flux reaches its maximum.**

In [74]:
# 4.b: determine reaction whose flux reaches its maximum
reactions_at_max_flux = []
for reaction in model.reactions:
    flux = fba_results.fluxes[reaction.id]
    if flux == reaction.upper_bound:
        reactions_at_max_flux.append(reaction)

print(f"Number of reactions whose flux reaches its maximum: {len(reactions_at_max_flux)}")
print("Reactions whose flux reaches its maximum:")
for reaction in reactions_at_max_flux:
    print(f"- {reaction.id}")   

Number of reactions whose flux reaches its maximum: 2
Reactions whose flux reaches its maximum:
- THD2
- NADH16
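
Two refinements may be worth considering for this check. Comparing floats with `==` can miss a flux that differs from its bound only by solver round-off, and a reversible reaction running backwards reaches its maximal activity at the lower bound rather than the upper one. In the FVA printout above, ACALD and ACKr are fixed at -1.16 and -1.19, which match their lower bounds in the bounds table to the printed precision. A minimal sketch that checks both bounds with a tolerance; `saturated_bound` is a hypothetical helper, and the inputs are values copied from the printouts rather than a fresh solver run.

```python
import math


def saturated_bound(flux, lower_bound, upper_bound, tol=1e-9):
    """Return 'upper', 'lower', or None depending on which bound the flux hits.

    Zero bounds are ignored: an irreversible reaction resting at its lower
    bound of 0 is unused, not saturated.
    """
    if upper_bound != 0 and math.isclose(flux, upper_bound, rel_tol=0.0, abs_tol=tol):
        return "upper"
    if lower_bound != 0 and math.isclose(flux, lower_bound, rel_tol=0.0, abs_tol=tol):
        return "lower"
    return None


# (flux at optimal growth, lower bound, upper bound) copied from the printouts
# above; the flux is the FVA value where minimum and maximum coincide.
excerpt = {
    "ACALD": (-1.16, -1.16, 1.16),
    "ACKr": (-1.19, -1.19, 1.19),
    "PFK": (7.314456, 0.0, 12.12),
    "PGK": (-14.46649, -23.13, 23.13),
}

hits = {rxn: saturated_bound(*values) for rxn, values in excerpt.items()}
print(hits)
```

Under these assumptions, ACALD and ACKr would also count as reactions running at their maximal activity, in the reverse direction.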


- **4C. The model does not use (have non-zero flux for) all reactions in the model constrained with maximal reaction activities. Pick a reaction with maximal reaction activity constraints and zero flux in the optimal solution, and explain from the network context why it does not carry flux.**

    **Analyzing an unused reaction: PFL**

    The reaction we analyze is 'PFL'. In the online demo we see that it connects only two nodes and runs in parallel with another reaction, 'PDH': both convert pyruvate to acetyl-CoA. This explains why 'PFL' carries no flux. All flux needed between the two nodes is provided by 'PDH', which still has spare capacity (it is among the reactions in 3B whose flux stays below its upper bound). This makes 'PFL' redundant in the optimal solution.

    ![image.png](Q4c.png)

In [75]:
#  4.c
print("Reactions with given maximal reaction activity and zero flux:")
list_4c = []
for reaction_id in estimated_maximal_activity.keys():
    reaction = model.reactions.get_by_id(reaction_id)
    if fba_results.fluxes[reaction.id] == 0:
        list_4c.append(reaction.id)
print(list_4c)


Reactions with given maximal reaction activity and zero flux:
['PFL', 'AKGt2r', 'ACALDt', 'PPCK', 'PPS', 'ADK1', 'SUCCt2_2', 'SUCCt3', 'FBP', 'FORt2', 'FORt', 'FRD7', 'FRUpts2', 'FUMt2_2', 'GLNabc', 'GLUN', 'GLUSy', 'GLUt2r', 'ICL', 'MALS', 'MALt2_2', 'ME1', 'ME2', 'NADTRHD']
