# **GAPFILLING A GENOME SCALE METABOLIC MODEL USING DNNGIOR**

Short tutorial (adapted from Boer et al. 2024, see below) by Maria Carolina Sisco. 

Reference: Boer et al. 2024. Improving genome-scale metabolic models of incomplete genomes with deep learning. DOI: 10.1016/j.isci.2024.111349

- DNNgior: Deep Neural Network Guided Imputation of Reactomes
- GSMM: Genome Scale Metabolic Model 
- DNNgior uses AI to improve gap-filling by learning from the presence and absence of metabolic reactions across diverse bacterial genomes

### **INSTALLATION** 

<div class="alert alert-block alert-info">
First, we need a license for Gurobi, a linear programming solver.
1. Register at: https://www.gurobi.com/downloads/gurobi-software/
2. Get a free academic named-user license here: https://www.gurobi.com/features/academic-named-user-license/ (you will need an institutional email).
</div>

In [None]:
#!pip install -i https://pypi.gurobi.com gurobipy

In [None]:
#!pip install dnngior

## **GAPFILLING USING A COMPLETE MEDIUM**
In this exercise we will gapfill (that is, add missing reactions to) a GSMM of Blautia, a genus of anaerobic bacteria with probiotic characteristics.

Let's explore the GSMM with some basic COBRApy commands!

In [None]:
import cobra
from cobra.io import read_sbml_model
draft_reconstruction = read_sbml_model('bh_ungapfilled_model.sbml')

In [None]:
draft_reconstruction.summary()

Notice the objective function: the biomass reaction, whose flux represents the growth rate. Also note that no reaction carries any flux.

In [None]:
draft_reconstruction.optimize()

In [None]:
draft_reconstruction.medium

The exchange reactions (uptake reactions) are set to an unlimited value, so nothing constrains what the bacteria can take up from the medium. Even so, our model cannot simulate growth.
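To see why unlimited uptake alone is not enough, consider a toy network in which one intermediate reaction is missing. The sketch below is a simplified reachability check in plain Python, not FBA and not DNNgior's algorithm; every metabolite and reaction name in it is hypothetical.

```python
# Toy illustration of a "gap". Each reaction maps to a pair
# (set of substrates, set of products). All names are hypothetical.

def producible(reactions, seeds):
    """Return every metabolite reachable from `seeds` by repeatedly
    firing reactions whose substrates are all available."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available

# Everything in the medium can be taken up without limit.
medium = {"glucose", "ammonia"}

# Draft network: the step from pyruvate to amino_acid is missing.
draft = {
    "R1": ({"glucose"}, {"pyruvate"}),
    "R3": ({"amino_acid"}, {"biomass"}),
}

# Candidate reaction that a gapfilling step could add.
candidate = {"R2": ({"pyruvate", "ammonia"}, {"amino_acid"})}

can_grow_before = "biomass" in producible(draft, medium)
can_grow_after = "biomass" in producible({**draft, **candidate}, medium)
print("Biomass producible before gapfilling:", can_grow_before)
print("Biomass producible after gapfilling: ", can_grow_after)
```

In the draft network the biomass precursor can never be made, however much glucose and ammonia are available. Adding the single missing reaction restores the route to biomass.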

Let's start gapfilling our model!<br>
Import the dnngior library and use its Gapfill class to gapfill the reconstruction.

In [None]:
import os, sys
# Path to the draft (ungapfilled) model file
path_to_blautia_model = "bh_ungapfilled_model.sbml"

In [None]:
import dnngior
# medium = None: no medium is supplied, so gapfilling uses a complete medium
gapfilled_model_complete = dnngior.Gapfill(draftModel = path_to_blautia_model, 
                                          medium = None, 
                                          objectiveName = 'bio1')

Copy the gapfilled model into a new object.

In [None]:
gf_model_compl_med = gapfilled_model_complete.gapfilledModel.copy()

In [None]:
gf_model_compl_med.optimize()

### It's growing! The growth rate after optimization is 146.138 mmol/gDW/h (millimoles per gram dry cell weight per hour), the default flux unit used in FBA.

Now let's see how many and which reactions DNNgior added in order to simulate growth

In [None]:
print("Number of reactions added:", len(gapfilled_model_complete.added_reactions))
print("~~")
for reaction in gapfilled_model_complete.added_reactions:
    print(gf_model_compl_med.reactions.get_by_id(reaction).name)

In [None]:
gf_model_compl_med.reactions.get_by_id('EX_cpd15432_e0')

In [None]:
gf_model_compl_med.reactions.get_by_id('EX_cpd15511_e0')
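The two reactions looked up above are exchange reactions, recognizable by the `EX_` prefix in their IDs. As a small sketch, the helper below separates exchange reactions from the remaining entries in a list of reaction IDs. `split_exchanges` is a hypothetical helper, and apart from the two exchange IDs taken from this tutorial, the IDs are placeholders. It assumes the list holds reaction ID strings, as the `get_by_id` loop above suggests for `added_reactions`.

```python
def split_exchanges(reaction_ids, prefix="EX_"):
    """Split reaction IDs into (exchange, internal) lists by ID prefix."""
    exchange = [rid for rid in reaction_ids if rid.startswith(prefix)]
    internal = [rid for rid in reaction_ids if not rid.startswith(prefix)]
    return exchange, internal

# Placeholder list standing in for gapfilled_model_complete.added_reactions
example_ids = [
    "EX_cpd15432_e0",
    "rxn_placeholder_1",
    "EX_cpd15511_e0",
    "rxn_placeholder_2",
]

exchange, internal = split_exchanges(example_ids)
print("Exchange reactions:", exchange)
print("Internal reactions:", internal)
```

Listing added exchange reactions separately is useful because each one changes what the model can take up or secrete, not just its internal metabolism.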

## **GAPFILLING USING A DEFINED MEDIUM**

First, load the medium file, which contains the composition of the medium.

In [None]:
medium_file_path = 'Nitrogen-Nitrite_media.tsv'

In [None]:
import pandas as pd
new_medium = pd.read_csv(medium_file_path, sep="\t")
new_medium.head()
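If pandas is not available, a tab-separated medium table can also be read with the standard library. The sketch below parses a small in-memory example instead of the real file. The column names (`id`, `name`, `minflux`, `maxflux`) and the compound IDs are assumptions made for illustration, so check them against the header printed by `new_medium.head()` before reusing the code.

```python
import csv
import io

# Hypothetical medium table; the column names and IDs are assumptions.
example_tsv = (
    "id\tname\tminflux\tmaxflux\n"
    "cpd_A\tcompound A\t-100\t5\n"
    "cpd_B\tcompound B\t-100\t100\n"
)

def read_medium(handle):
    """Map each compound ID to its (minflux, maxflux) bounds."""
    rows = csv.DictReader(handle, delimiter="\t")
    return {
        row["id"]: (float(row["minflux"]), float(row["maxflux"]))
        for row in rows
    }

medium_bounds = read_medium(io.StringIO(example_tsv))
for compound_id, (low, high) in medium_bounds.items():
    print(compound_id, low, high)
```

To read an actual file, pass an open file handle (for example `open(medium_file_path, newline="")`) in place of the `io.StringIO` object.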

Let's gapfill our GSMM so it can grow on this medium.

In [None]:
gapfill_nitr = dnngior.Gapfill(path_to_blautia_model, medium_file = medium_file_path, objectiveName = 'bio1')

Again, make a new object of the gapfilled model and check if it's growing.

In [None]:
gf_model_Nit_med = gapfill_nitr.gapfilledModel.copy()
gf_model_Nit_med.optimize()

Let's see how many and which reactions DNNgior added in order to simulate growth on the nitrite medium.

In [None]:
print("Number of reactions added:", len(gapfill_nitr.added_reactions))
print("~~")
#for reaction in gapfill_nitr.added_reactions[:5]:
for reaction in gapfill_nitr.added_reactions:
    print(gf_model_Nit_med.reactions.get_by_id(reaction).name)
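A natural follow-up is to compare the reactions added for the complete medium with those added for the defined medium. The sketch below does this with plain set operations. `compare_added` is a hypothetical helper and the reaction IDs are placeholders; in the notebook you would pass `gapfilled_model_complete.added_reactions` and `gapfill_nitr.added_reactions`, assuming both are iterables of reaction ID strings.

```python
def compare_added(first, second):
    """Compare two collections of reaction IDs with set operations."""
    first, second = set(first), set(second)
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }

# Placeholder IDs standing in for the two added_reactions collections
complete_added = ["rxn_A", "rxn_B", "rxn_C"]
nitrite_added = ["rxn_B", "rxn_C", "rxn_D", "rxn_E"]

result = compare_added(complete_added, nitrite_added)
print("Added in both runs:          ", result["shared"])
print("Only on the complete medium: ", result["only_first"])
print("Only on the defined medium:  ", result["only_second"])
```

Reactions that appear only in the defined-medium run are the ones needed to compensate for nutrients that the complete medium supplied directly.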