# MEWpy Optimization


Author: Vitor Pereira

License: [CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)

-------

In this tutorial:

- You will learn how to run combinatorial optimization on microbial communities using MEWpy

## Bacterial cross-feeding via extensive gene loss

Metabolic dependencies between microbial species have a significant impact on the
assembly and activity of microbial communities. However, the evolutionary origins of
such dependencies and the impact of metabolic and genomic architecture on their
emergence are not clear.

Recently, [McNally et al.](https://doi.org/10.1186/s12918-018-0588-4) proposed a method to evolve
cooperative interactions among microbial species by fostering cross-feeding of a
diverse set of metabolites. This was achieved by incrementally deleting genes in
two initially identical bacteria (E. coli) and progressively imposing constraints on the
community metabolic network, so that two new strains evolve and diverge
while the differences between their genotypes are maximized.

The aim here is to replicate these experiments with MEWpy,
maximizing the number of deleted genes in the two strains
while inducing cross-feeding.

### Run in Google Colab

If you are running this notebook in Colab, perform the following steps; otherwise, skip them.

In [1]:
%%bash
[[ ! -e /colabtools ]] && exit
pip install -U -q PyDrive

In [2]:
if 'google.colab' in str(get_ipython()):
    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive
    from google.colab import auth
    from oauth2client.client import GoogleCredentials

    auth.authenticate_user()
    gauth = GoogleAuth()
    gauth.credentials = GoogleCredentials.get_application_default()
    drive = GoogleDrive(gauth)

    model_file = drive.CreateFile({'id':'1o0XthuEOs28UJ4XTa9SfFSFofazV-2nN'})
    model_file.GetContentFile('e_coli_core.xml.gz')

## Step 1 - Load the model

In [3]:
from cobra.io import read_sbml_model
import warnings
warnings.filterwarnings('ignore')

model = read_sbml_model('e_coli_core.xml.gz')



We will make two copies of the model and rename them so that each copy represents one strain:

In [4]:
from mewpy import get_simulator
wildtype = get_simulator(model)

In [5]:
ec1 = wildtype.copy()
ec1.id = 'ec1'
ec2 = wildtype.copy()
ec2.id = 'ec2'



In the experiment, we will use the medium defined in the model. Each row of the output lists an exchange reaction with its lower and upper flux bounds:

In [6]:
from mewpy.simulation import Environment
medium = Environment.from_model(wildtype)
medium

EX_ac_e	0.0	1000.0
EX_acald_e	0.0	1000.0
EX_akg_e	0.0	1000.0
EX_co2_e	-1000.0	1000.0
EX_etoh_e	0.0	1000.0
EX_for_e	0.0	1000.0
EX_fru_e	0.0	1000.0
EX_fum_e	0.0	1000.0
EX_glc__D_e	-10.0	1000.0
EX_gln__L_e	0.0	1000.0
EX_glu__L_e	0.0	1000.0
EX_h_e	-1000.0	1000.0
EX_h2o_e	-1000.0	1000.0
EX_lac__D_e	0.0	1000.0
EX_mal__L_e	0.0	1000.0
EX_nh4_e	-1000.0	1000.0
EX_o2_e	-1000.0	1000.0
EX_pi_e	-1000.0	1000.0
EX_pyr_e	0.0	1000.0
EX_succ_e	0.0	1000.0
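
The medium printed above can be read with the usual constraint-based modelling convention: a negative lower bound on an exchange reaction allows uptake of that compound. The following is a minimal sketch in plain Python, not MEWpy code. The dictionary holds a subset of the bounds shown above, and `uptake_limits` is a hypothetical helper introduced only for illustration.

```python
# Exchange bounds (lower, upper) copied from a subset of the medium printed above.
medium_bounds = {
    "EX_ac_e": (0.0, 1000.0),
    "EX_co2_e": (-1000.0, 1000.0),
    "EX_glc__D_e": (-10.0, 1000.0),
    "EX_nh4_e": (-1000.0, 1000.0),
    "EX_o2_e": (-1000.0, 1000.0),
    "EX_succ_e": (0.0, 1000.0),
}


def uptake_limits(bounds):
    """Hypothetical helper: map each consumable compound to its maximum uptake rate.

    A negative lower bound means the compound may enter the system;
    the uptake limit is the absolute value of that bound.
    """
    return {rid: -lb for rid, (lb, _ub) in bounds.items() if lb < 0}


limits = uptake_limits(medium_bounds)
for rid, rate in sorted(limits.items()):
    print(rid, rate)
```

In this subset, glucose is the only carbon source available for uptake, limited to 10 units, while acetate and succinate can only be secreted.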

## Step 2 - Find single strain gene KOs

To guide the search for combinatorial gene deletions on the community model, we will first identify combinatorial gene deletions in a single strain and use these results to seed the community gene deletions.

We start by defining a gene deletion optimization problem (`GKOProblem`) with two objectives: maximizing biomass production (`f1`) and maximizing the number of deletions (`f2`).

In [7]:
from mewpy.problems import GKOProblem
from mewpy.optimization.evaluation import TargetFlux, CandidateSize

In [8]:
f1 = TargetFlux(wildtype.biomass_reaction,method='FBA')
f2 = CandidateSize(maximize=True)
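
Because the two objectives conflict (more deletions tend to reduce growth), the optimizer returns a set of trade-off solutions rather than a single best one. The sketch below illustrates the Pareto dominance test that multi-objective algorithms such as NSGA-II rely on. It is plain Python, independent of MEWpy, and the candidate values are invented for illustration only.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Illustrative (biomass, number of deletions) pairs; not results from this tutorial.
candidates = [(0.87, 2), (0.80, 10), (0.60, 25), (0.55, 20), (0.80, 8)]
front = pareto_front(candidates)
print(front)
```

Here `(0.55, 20)` is dominated by `(0.60, 25)` and `(0.80, 8)` by `(0.80, 10)`, so only three candidates remain on the front.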

To simplify the problem, we will allow a maximum of 30 gene deletions and run the optimization for 10 generations.

In [9]:
problem = GKOProblem(wildtype,
                     [f1,f2],
                     candidate_max_size = 30)


In [47]:
from mewpy.optimization import EA
ea = EA(problem, max_generations=10)
gkos = ea.run(simplify=False)

Running NSGAII

We can have a look at the solutions found by the evolutionary algorithm (EA):

In [48]:
ea.dataframe()

Unnamed: 0,Modification,Size,TargetFlux,TargetFlux.1,Size.1
0,"{'b3870_ec1': 0, 'b1611_ec1': 0, 'b0724_ec2': ...",19,0.112749,0.7184466,19.0
1,"{'b1818_ec2': 0, 'b4151_ec1': 0, 'b0351_ec2': ...",43,0.590726,0.1416675,43.0
2,"{'b3736_ec1': 0, 'b4152_ec2': 0, 'b1818_ec2': ...",52,0.0,0.0,52.0
3,"{'b4151_ec1': 0, 'b1676_ec1': 0, 'b1702_ec2': ...",30,0.563597,0.2029618,30.0
4,"{'b1676_ec1': 0, 'b3737_ec1': 0, 'b1773_ec1': ...",20,0.40434,0.4268554,20.0
5,"{'b1479_ec1': 0, 'b1761_ec2': 0, 'b1773_ec2': ...",8,0.112749,0.7184466,8.0
6,"{'b2097_ec2': 0, 'b4151_ec1': 0, 'b4015_ec2': ...",8,0.119112,0.712084,8.0
7,"{'b3870_ec2': 0, 'b4015_ec2': 0, 'b1818_ec1': ...",4,0.712084,0.1191116,4.0
8,"{'b2097_ec2': 0, 'b4151_ec1': 0, 'b3870_ec2': ...",5,0.119112,0.712084,5.0
9,"{'b1676_ec1': 0, 'b1702_ec2': 0, 'b3737_ec1': ...",19,0.430884,0.3607431,19.0


We can also run an FBA on the first solution (biomass is set as the objective by default):

In [49]:
problem.simulate(solution=gkos[0].values)

objective: 0.831195550185812
Status: OPTIMAL
Method:SimulationMethod.FBA

We may now generate solutions that will seed the EA to be used later in the community gene deletions.

In [50]:
import random

init_pop = []
for s in gkos:
    x=s.values
    init_pop.append([k+'_ec1' for k in x.keys()])
    init_pop.append([k+'_ec2' for k in x.keys()])

random.shuffle(init_pop)
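
The seeding step can be sketched without MEWpy as follows. The sketch assumes that single-strain solutions use plain gene identifiers and that the community model names each gene as `<gene>_<strain id>`. The helper `seed_for_community`, the toy solutions, and the target list are all hypothetical. The final check shows why a seed built from identifiers that already carry a strain suffix would not match any community gene, which is the situation reported by the `Skipping seed` messages in the run further down.

```python
import random


def seed_for_community(single_strain_solutions, strain_ids):
    """Hypothetical helper: copy each single-strain solution once per strain,
    appending the strain id to every gene identifier."""
    seeds = []
    for genes in single_strain_solutions:
        for sid in strain_ids:
            seeds.append([f"{gene}_{sid}" for gene in genes])
    return seeds


# Toy single-strain solutions (lists of deleted genes), for illustration only.
toy_solutions = [["b3870", "b1611"], ["b4151"]]
seeds = seed_for_community(toy_solutions, ["ec1", "ec2"])
random.Random(0).shuffle(seeds)  # fixed seed so the sketch is reproducible

# Assumed list of valid community gene identifiers.
targets = {f"{g}_{s}" for g in ("b3870", "b1611", "b4151") for s in ("ec1", "ec2")}
valid = [s for s in seeds if all(g in targets for g in s)]

# A doubly suffixed identifier is not a valid target and would be rejected.
double_suffixed = ["b4151_ec1_ec2"]
print(len(valid), all(g in targets for g in double_suffixed))
```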


## Step 3 - Community mutants

We can now address our main goal, starting by defining a community model:

In [51]:
from mewpy.model import CommunityModel

community = CommunityModel([ec1, ec2], flavor='cobra')
sim = community.get_community_model()
sim.set_environmental_conditions(medium)

We will consider three optimization objectives:

- Maximize `ec1` growth while ensuring that `ec2` growth is above 0.1/h;
- Maximize `ec2` growth while ensuring that `ec1` growth is above 0.1/h;
- Maximize the total number of gene deletions.

Note that pFBA is used here only to select one specific flux distribution for each candidate; the evaluation is not a pFBA in the strict sense.

In [52]:
f1 = TargetFlux(community.organisms_biomass['ec1'],
                community.organisms_biomass['ec2'],
                min_biomass_value=0.1,method='pFBA')

f2 = TargetFlux(community.organisms_biomass['ec2'],
                community.organisms_biomass['ec1'],
                min_biomass_value=0.1,method='pFBA')

f3 = CandidateSize(maximize=True)
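
The idea behind the first two objectives can be sketched as a thresholded score: the growth of one strain counts only if the partner strain grows at or above the minimum rate. This is a simplified illustration under that assumption, not MEWpy's `TargetFlux` implementation. The reaction names `BIOMASS_ec1` and `BIOMASS_ec2` and the flux values are hypothetical.

```python
def thresholded_growth(fluxes, target, partner, min_partner_growth=0.1):
    """Simplified, assumed scoring rule: return the target flux, or 0.0 when
    the partner strain grows below the required minimum."""
    if fluxes.get(partner, 0.0) < min_partner_growth:
        return 0.0
    return fluxes.get(target, 0.0)


# Hypothetical simulation results for two candidate solutions.
cooperative = {"BIOMASS_ec1": 0.40, "BIOMASS_ec2": 0.43}
one_sided = {"BIOMASS_ec1": 0.80, "BIOMASS_ec2": 0.02}

for fluxes in (cooperative, one_sided):
    score_ec1 = thresholded_growth(fluxes, "BIOMASS_ec1", "BIOMASS_ec2")
    score_ec2 = thresholded_growth(fluxes, "BIOMASS_ec2", "BIOMASS_ec1")
    print(score_ec1, score_ec2)
```

Under this rule a candidate in which one strain outgrows and starves the other scores zero on the dominant strain's objective, so the search is steered toward candidates where both strains grow.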

In [53]:
problem = GKOProblem(sim,
                     [f1,f2,f3],
                     candidate_max_size = 60)

Now that the optimization problem is defined, we can run it for 10 generations, allowing a maximum of 60 gene deletions in total:

In [None]:
ea = EA(problem,
        max_generations=10,
        initial_population=init_pop[:100])

solutions = ea.run(simplify=False)

Building modification target list.


100%|███████████████████████████████████████| 274/274 [00:00<00:00, 2182.16it/s]

Running NSGAII
Skipping seed: ['b4014_ec1_ec2']   'b4014_ec1_ec2' is not in list
Skipping seed: ['b1676_ec1_ec2', 'b3737_ec1_ec2', 'b1773_ec1_ec2', 'b0809_ec1_ec2', 'b4151_ec2_ec2', 'b0351_ec1_ec2', 'b1602_ec1_ec2', 'b4014_ec2_ec2', 'b0755_ec2_ec2', 'b1761_ec1_ec2', 'b3739_ec1_ec2', 'b2283_ec1_ec2', 'b0810_ec2_ec2', 'b3731_ec1_ec2', 'b1818_ec1_ec2', 'b3734_ec1_ec2', 'b4154_ec2_ec2', 'b1603_ec1_ec2', 'b0728_ec2_ec2', 'b3951_ec2_ec2']   'b1676_ec1_ec2' is not in list
Skipping seed: ['b4151_ec1_ec2', 'b3870_ec2_ec2', 'b4154_ec1_ec2', 'b3403_ec1_ec2']   'b4151_ec1_ec2' is not in list
Skipping seed: ['b2097_ec2_ec1', 'b4015_ec2_ec1', 'b3870_ec2_ec1', 'b1818_ec1_ec1', 'b4014_ec2_ec1']   'b2097_ec2_ec1' is not in list
Skipping seed: ['b3736_ec1_ec1', 'b4152_ec2_ec1', 'b1818_ec2_ec1', 'b3870_ec1_ec1', 'b0979_ec2_ec1', 'b2277_ec1_ec1', 'b0726_ec1_ec1', 'b0351_ec2_ec1', 'b2279_ec2_ec1', 'b1812_ec1_ec1', 'b1612_ec1_ec1', 'b2278_ec2_ec1', 'b1702_ec2_ec1', 'b3731_ec2_ec1', 'b3737_ec1_ec1', 'b1479_e






We may now have a look at the solutions, either as a dataframe or as a plot:

In [None]:
df = ea.dataframe()
df

In [None]:
ea.plot()
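
Once the results are tabulated, a single candidate can be picked according to what matters most. The sketch below uses the first five rows of the results table printed earlier in this notebook, reduced to `(size, flux_a, flux_b)` tuples. It assumes that the two `TargetFlux` columns hold the two growth objectives. The selection criteria are illustrative choices, not part of MEWpy.

```python
# (Size, TargetFlux, TargetFlux.1) copied from the first five rows of the table above.
rows = [
    (19, 0.112749, 0.7184466),
    (43, 0.590726, 0.1416675),
    (52, 0.0, 0.0),
    (30, 0.563597, 0.2029618),
    (20, 0.40434, 0.4268554),
]

MIN_GROWTH = 0.1  # same threshold as in the objectives

# Keep only candidates in which both objectives reach the threshold.
feasible = [r for r in rows if r[1] >= MIN_GROWTH and r[2] >= MIN_GROWTH]

# Criterion 1: as many deletions as possible.
most_deletions = max(feasible, key=lambda r: r[0])

# Criterion 2: growth split as evenly as possible between the two objectives.
most_balanced = min(feasible, key=lambda r: abs(r[1] - r[2]))

print(most_deletions)
print(most_balanced)
```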

We may even simulate one of the solutions:

In [None]:
solution = solutions[8]

In [None]:
problem.simulate(solution=solution.values,method='pFBA').find('BIOMASS',show_nulls=True)

We can also look at the reactions that were 'deleted':

In [None]:
problem.solution_to_constraints(solution.values)
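
Gene deletions translate into reaction knockouts through gene-protein-reaction (GPR) rules: a reaction is lost only when no enzyme able to catalyse it can still be assembled. The following sketch shows that logic in plain Python. The reactions `R1` to `R3`, the genes `gA` to `gE`, and the helper `reaction_is_active` are hypothetical, and the final dictionary is only an illustration of a knockout constraint, not the exact structure MEWpy returns.

```python
def reaction_is_active(gpr, deleted_genes):
    """`gpr` is a list of alternative enzyme complexes (logical OR); each
    complex is a set of genes that are all required (logical AND)."""
    return any(not (enzyme & deleted_genes) for enzyme in gpr)


# Hypothetical GPR rules.
gprs = {
    "R1": [{"gA"}, {"gB"}],        # gA or gB (isozymes)
    "R2": [{"gC", "gD"}],          # gC and gD (one complex)
    "R3": [{"gA", "gC"}, {"gE"}],  # (gA and gC) or gE
}

deleted = {"gA", "gC"}
knocked_out = sorted(r for r, gpr in gprs.items() if not reaction_is_active(gpr, deleted))

# Illustrative constraint: force the flux of each lost reaction to zero.
constraints = {r: 0.0 for r in knocked_out}
print(constraints)
```

Only `R2` is lost in this example: `R1` survives through the isozyme `gB` and `R3` through `gE`, which is why the number of deleted genes can be much larger than the number of reactions actually removed.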