# Introduction to constraint-based modeling using COBREXA

## Part 2: Flux Balance Analysis

### Instructors:
* Miguel Ponce de León (Barcelona Supercomputing Center)
* Contact: miguel.ponce@bsc.es

* Miroslav 
* Pablo

Install the packages if they are not installed yet:

In [79]:
import Pkg
Pkg.add(["COBREXA", "GLPK"])

using COBREXA
using GLPK

[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `~/local/research/constraint-based-modeling/summerschool-cobrexa-material/Project.toml`
[32m[1m  No Changes[22m[39m to `~/local/research/constraint-based-modeling/summerschool-cobrexa-material/Manifest.toml`


# Part 2: Genome-scale modelling.

In this part we are going to use iJO1366, a genome-scale metabolic model of *Escherichia coli*.
The file is already stored in the data folder, at `data/E_coli_iJO1366.json`.

The model can also be downloaded from BiGG, together with its metadata (citation, description, etc.):
- [http://bigg.ucsd.edu/models/iJO1366](http://bigg.ucsd.edu/models/iJO1366)

## Part 2.1: Studying the model.

Load the *E. coli* iJO1366 model from the path:

`data/E_coli_iJO1366.json`

using:

`model = load_model(StandardModel, path_to_model)`

We open it as a `StandardModel` right away to allow easy manual
modifications.

In [114]:
## TODO
## Write your code below




## Exercise 2.1: Inspecting the model's numbers

How many metabolites, genes, and reactions does the model contain?

Tip 1: use `model.metabolites` or `metabolites(model)`

Tip 2: use `length` to count elements
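
One possible way to get these counts is sketched below. It assumes the model file sits at `data/E_coli_iJO1366.json`, as elsewhere in this notebook, and that the `metabolites`, `reactions` and `genes` accessors of COBREXA 1.x each return a vector of identifiers.

```julia
using COBREXA

# Assumption: the model file is at the path used elsewhere in this notebook.
model = load_model(StandardModel, "data/E_coli_iJO1366.json")

# Each accessor returns a vector of IDs, so `length` gives the count.
println("metabolites: ", length(metabolites(model)))
println("reactions:   ", length(reactions(model)))
println("genes:       ", length(genes(model)))
```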

In [66]:
## TODO
## Write your code below




## Inspecting the genes


You can access a gene directly with:

`gene = model.genes["b0720"]`

Inspect the gene by printing:
1. `gene.name`
2. `gene.annotations`
3. the reactions encoded by the gene (advanced)
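
A sketch of the three steps follows. The loop for step 3 relies on an assumption about COBREXA 1.x: that `reaction_gene_association(model, rid)` returns either `nothing` or a vector of gene-ID vectors, one inner vector per alternative enzyme complex. If your installed version behaves differently, adapt the loop accordingly.

```julia
using COBREXA

model = load_model(StandardModel, "data/E_coli_iJO1366.json")

gene_id = "b0720"
gene = model.genes[gene_id]
println(gene.name)
println(gene.annotations)

# Collect the reactions whose gene-reaction rule mentions the gene.
# Assumed return type of `reaction_gene_association`: nothing or
# a vector of vectors of gene IDs.
encoded = String[]
for rid in reactions(model)
    grr = reaction_gene_association(model, rid)
    isnothing(grr) && continue
    any(gene_id in clause for clause in grr) && push!(encoded, rid)
end
println(encoded)
```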

In [67]:
## TODO
## Write your code below
  


### Inspecting the system's boundaries


Use the following line of code to obtain the boundary reactions:

`boundary = [r for r in reactions(model) if length(reaction_stoichiometry(model,r)) == 1]`

This keeps only the reactions that involve a single metabolite.

You will find reactions with the prefixes
* `EX_`
* `DM_`

Can you find any differences between them? Print the reactions to find out.

How many boundary reactions does the model have?
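
As a starting point, the boundary reactions can be split by prefix and counted. This is only a sketch: it reuses the filter given above and plain Julia string functions, and it prints one reaction of each kind so you can compare them yourself.

```julia
using COBREXA

model = load_model(StandardModel, "data/E_coli_iJO1366.json")

# Boundary reactions: those that involve a single metabolite.
boundary = [r for r in reactions(model) if length(reaction_stoichiometry(model, r)) == 1]

# Split by ID prefix.
exchanges = filter(startswith("EX_"), boundary)
demands = filter(startswith("DM_"), boundary)

println(length(boundary), " boundary reactions: ",
        length(exchanges), " with EX_, ", length(demands), " with DM_")

# Print the stoichiometry of one reaction from each group, if the group is not empty.
for group in (exchanges, demands)
    isempty(group) || println(first(group), " => ", reaction_stoichiometry(model, first(group)))
end
```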

In [68]:
## TODO
## Write your code below




## Inspecting the objective function (the reaction set to be optimized)

You can find the objective function using the following filtering technique:

`objective_reactions = [r for r in keys(model.reactions) if model.reactions[r].objective_coefficient != 0]`

This returns the list of IDs of the reactions that have a nonzero objective coefficient.

In practice, this returns a list with a single element corresponding to the biomass reaction:
* `BIOMASS_Ec_iJO1366_core_53p95M`

You can also access the objective through:

`objective(model)`

This returns a vector of objective coefficients with one entry per reaction, so the positions of its nonzero entries are the indexes of the objective reactions. Using such an index you can then do:

`reactions(model)[idx]`

where `idx` is the index found in the previous step.
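
The index-based route could look like the sketch below. It assumes that `objective(model)` returns a coefficient vector aligned with `reactions(model)`, so the nonzero positions can be recovered with `findall`.

```julia
using COBREXA

model = load_model(StandardModel, "data/E_coli_iJO1366.json")

# Assumption: one objective coefficient per reaction, in the order of `reactions(model)`.
obj = objective(model)

# Positions of the nonzero coefficients are the indexes of the objective reactions.
idxs = findall(!iszero, obj)
println(reactions(model)[idxs])
```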

In [78]:
## TODO
## Write your code below




### Running a Flux Balance Analysis (FBA).

Documentation: https://lcsb-biocore.github.io/COBREXA.jl/stable/examples/05a_fba/

By default, the model boundary condition (growth medium) is aerobic M9 (glucose minimal medium).
Let's check the boundary conditions.

1.  Check the medium by inspecting the lower_bound of the following reactions:
  * `EX_glc__D_e`  (the ID of the glucose exchange reaction)
  * `EX_o2_e`      (the ID of the O2 exchange reaction)
  
Tip: use `model.reactions[exchange_id]` combined with the corresponding attribute

More generally, we can look for exchange reactions that have a negative lower bound (advanced).
Tip: combine things from the previous exercises

What do we have in the growth medium?
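
For the advanced variant, one possible sketch uses the generic `bounds` accessor instead of the reaction attributes. It assumes that `bounds(model)` returns a pair of vectors (lower, upper) aligned with `reactions(model)`. A negative lower bound on an `EX_` reaction means the metabolite can be taken up, so it is part of the medium.

```julia
using COBREXA

model = load_model(StandardModel, "data/E_coli_iJO1366.json")

# Assumption: both vectors follow the order of `reactions(model)`.
lbs, ubs = bounds(model)

# Exchange reactions that allow uptake (negative lower bound).
medium = [(rid, lb) for (rid, lb) in zip(reactions(model), lbs)
          if startswith(rid, "EX_") && lb < 0]

foreach(println, medium)
```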

In [91]:
## TODO
## Write your code below




### Running a Flux Balance Analysis (FBA).

FBA is solved using a linear programming (LP) solver, and there are several commercial and free packages available.
Open-source solvers include:
* GLPK
* SCIP

Some commercial packages include:
* Gurobi
* CPLEX
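
For reference, the linear program that any of these solvers handles can be sketched in the standard FBA form. The symbols are notation introduced here, not taken from the notebook: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the vector of objective coefficients, and $l$, $u$ the lower and upper flux bounds.

$$
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& l \le v \le u .
\end{aligned}
$$

In this notebook's terms, $c$ is nonzero only for the biomass reaction, and the growth medium is encoded in the lower bounds $l$ of the exchange reactions.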

Let's run FBA using the good old GLPK (GNU Linear Programming Kit).

For simplicity, let's ask for a dictionary right away. This returns a dictionary of the form:

`reaction_id => flux_value`



In [92]:
solution_dict = flux_balance_analysis_dict(model, GLPK.Optimizer)

-17.578933530254268

### Exploring the optimal flux distribution I

Using the solution dictionary, find the predicted:
* growth rate (biomass production)
* consumed oxygen
* consumed glucose

We can start by having a look at our objective reaction.
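
A minimal sketch of these lookups is shown below. It assumes the biomass reaction ID `BIOMASS_Ec_iJO1366_core_53p95M` (the ID used later in this notebook) and the usual sign convention in which uptake through an exchange reaction is a negative flux.

```julia
using COBREXA, GLPK

model = load_model(StandardModel, "data/E_coli_iJO1366.json")
solution_dict = flux_balance_analysis_dict(model, GLPK.Optimizer)

# Growth rate: the flux through the biomass (objective) reaction.
growth = solution_dict["BIOMASS_Ec_iJO1366_core_53p95M"]

# Uptake is a negative exchange flux, so flip the sign to report consumption.
o2_consumed = -solution_dict["EX_o2_e"]
glc_consumed = -solution_dict["EX_glc__D_e"]

println("growth rate:      ", growth)
println("oxygen consumed:  ", o2_consumed)
println("glucose consumed: ", glc_consumed)
```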

In [97]:
## TODO
## Write your code below




### Exploring the optimal flux distribution II

Now that you have done things the hard way, let's use this nice helper function:

`flux_summary(solution_dict)`


What can you tell from the output?

In [94]:
## TODO
## Write your code below




(Note that `flux_summary` guesses the exchange/biomass status from the reaction IDs.)

## Create a dataframe from the optimal flux distribution and save the result

We can write the result to a file for future use.

First, let's create a data frame.

Then write the solution into a CSV file.

Finally, load the solution in Escher:

https://escher.github.io/#/

First choose:
* Map: Central metabolism (iJO1366)
* Model: iJO1366

Then load your reaction data into the map.

In [99]:
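Pkg.add(["DataFrames", "CSV"])
using DataFrames, CSV

# One row per reaction: its ID and its flux in the optimal solution
df = DataFrame(reaction = collect(keys(solution_dict)), flux = collect(values(solution_dict)))

# Make sure the output directory exists before writing
mkpath("out")
CSV.write("out/aerobic_solution.csv", df)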

[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `~/local/research/constraint-based-modeling/summerschool-cobrexa-material/Project.toml`
[32m[1m  No Changes[22m[39m to `~/local/research/constraint-based-modeling/summerschool-cobrexa-material/Manifest.toml`


"aerobic_solution.csv"

## Exercise 2.2: 

1. Change the oxygen exchange lower bound to zero to simulate anaerobic growth.
2. Optimize the model.
3. What is the maximal growth rate in anaerobic conditions?
4. What are the three main secretion products?
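
For the last question, ranking the secreted exchange fluxes can be sketched in plain Julia. The helper `top_secretions` below is hypothetical (it is not part of COBREXA), and the toy fluxes are made up for illustration. Pass your own `solution_dict` to get the real answer.

```julia
# Hypothetical helper (not part of COBREXA): rank the secreted exchange fluxes.
function top_secretions(fluxes::AbstractDict{String,Float64}; n::Int = 3)
    # A positive flux through an `EX_` reaction means the metabolite leaves the cell.
    secreted = [(id, v) for (id, v) in fluxes if startswith(id, "EX_") && v > 0]
    sort!(secreted; by = last, rev = true)
    return secreted[1:min(n, length(secreted))]
end

# Made-up fluxes for illustration only.
toy_fluxes = Dict(
    "EX_for_e" => 17.0,
    "EX_ac_e" => 8.5,
    "EX_etoh_e" => 8.0,
    "EX_succ_e" => 0.1,
    "EX_glc__D_e" => -10.0,
    "PGI" => 9.9,
)

for (id, v) in top_secretions(toy_fluxes)
    println(id, " => ", v)
end
```

On a real solution, inorganic by-products such as water or protons may appear near the top of the list, so it can be worth looking at more than three entries.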

In [98]:
## TODO
## Write your code below





## Exercise 2.3: 

Let's change the carbon source while keeping oxygen available.

1. Set the oxygen exchange (`EX_o2_e`) lower bound to -20
2. Set the glucose exchange (`EX_glc__D_e`) lower bound to 0
3. Set the acetate exchange (`EX_ac_e`) lower bound to -10

What is the maximal growth rate using acetate as the sole carbon source?
What is the oxygen uptake rate?

Finally, what does the model tell us about E. coli growing in anaerobic conditions using acetate as the sole carbon source?
1. Set the oxygen exchange (`EX_o2_e`) lower bound to 0
2. Set the glucose exchange (`EX_glc__D_e`) lower bound to 0
3. Set the acetate exchange (`EX_ac_e`) lower bound to -10

In [112]:
## TODO
## Write your code below




Tip for understanding the previous exercise: when there is no feasible solution, the function returns `nothing`.
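
A sketch of how to guard against that case is shown below. Instead of editing the model by hand, it applies the three bounds of the last scenario through `change_constraint` modifications (the same mechanism this notebook uses further down), and it assumes the biomass reaction ID `BIOMASS_Ec_iJO1366_core_53p95M`.

```julia
using COBREXA, GLPK

model = load_model(StandardModel, "data/E_coli_iJO1366.json")

# No oxygen, no glucose, acetate as the only carbon source.
sol = flux_balance_analysis_dict(
    model,
    GLPK.Optimizer,
    modifications = [
        change_constraint("EX_o2_e", lb = 0.0),
        change_constraint("EX_glc__D_e", lb = 0.0),
        change_constraint("EX_ac_e", lb = -10.0),
    ],
)

# Check for `nothing` before indexing into the result.
if isnothing(sol)
    println("No feasible flux distribution under these bounds")
else
    println("growth rate: ", sol["BIOMASS_Ec_iJO1366_core_53p95M"])
end
```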

## Tasting the magic of COBREXA

At this point, the original model data has been overwritten and there's no
telling which bounds are still from the original model and which have been
modified. For many reasons it is better to do this without manually changing
the model internals, and COBREXA has a system of "analysis
modifications" for that purpose.

This approach also scales much better if you need to try many variants.

Other modifications include e.g. `change_objective`, `silence` for shutting down the output from
overly verbose solvers (such as OSQP), and `change_optimizer_attribute` for
tuning the optimizer behavior.

In [13]:
# Reload the model
model = load_model(StandardModel, "data/E_coli_iJO1366.json");
solution_dict = flux_balance_analysis_dict(
    model,
    GLPK.Optimizer,
    modifications = [
        change_constraint("EX_o2_e", lb = 0.0),
        change_constraint("EX_glc__D_e", lb = -5.0),
    ],
);
solution_dict["BIOMASS_Ec_iJO1366_core_53p95M"]

0.18602227486090275

There is a nice app at https://escher.github.io/ that allows us to visualize
and browse the solutions to metabolic models. You can load the visualization
of this model as Map: "Core metabolism (e_coli_core)" and Tool: "Viewer".

Let's produce a JSON file with our solution that we can upload:

In [14]:
Pkg.add("JSON")
using JSON

write(
    "out/anaerobic_solution.json",
    JSON.json(
        solution_dict, # the solution
        2, # enforce nice 2-space indentation instead of squashing the JSON to optimize for size
    )
)

   Resolving package versions...
  No Changes to `~/.julia/environments/v1.9/Project.toml`
  No Changes to `~/.julia/environments/v1.9/Manifest.toml`


55523

You can now upload the file `out/anaerobic_solution.json` to the Escher viewer via
`Data → Load reaction data`
to see the fluxes visualized.

More configurable Escher plotting directly from Julia to files (PDF, PNG) is
available via https://github.com/stelmo/Escher.jl

---

*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*