# **GAPFILLING A GENOME SCALE METABOLIC MODEL USING DNNGIOR**

Short tutorial (adapted from Boer et al. 2024, see below) by Maria Carolina Sisco. 

Improving genome-scale metabolic models of incomplete genomes with deep learning
Boer et al. 2024. DOI: 10.1016/j.isci.2024.111349

- DNNgior: Deep Neural Network Guided Imputation of Reactomes
- GSMM: Genome Scale Metabolic Model 
- DNNgior uses AI to improve gap-filling by learning from the presence and absence of metabolic reactions across diverse bacterial genomes

### **INSTALLATION** 

Open a terminal or command prompt and run the following command to create the environment:


conda create --name dnngior python=3.10.16

When prompted, confirm the installation by typing y and pressing Enter. After that, activate the environment:

conda activate dnngior

First, we must install the Gurobi Optimizer (for further information, see
https://support.gurobi.com/hc/en-us/articles/14799677517585-Getting-Started-with-Gurobi-Optimizer).
Visit the Download Gurobi Optimizer page (https://www.gurobi.com/downloads/gurobi-software/)
and download it. Next, you need to edit the bashrc file. Follow the steps below:

1. open a terminal

2. go to your home directory using cd

3. open the bashrc file. We suggest using gedit, nano or vim

4. add the following line in the bashrc:

export PATH="path-to-gurobi-bin-folder/bin/:$PATH"

To locate the path-to-gurobi-bin-folder, go to the directory where
you extracted the files from the downloaded tar.gz archive.
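As a quick sanity check after editing the bashrc and opening a new terminal, the sketch below uses only the Python standard library to see whether the Gurobi command-line tools can be found through PATH. The helper name `find_gurobi_tools` is made up for this tutorial; `gurobi_cl` and `grbgetkey` are the executables expected in Gurobi's bin folder.

```python
import shutil


def find_gurobi_tools(tools=("gurobi_cl", "grbgetkey")):
    """Return a dict mapping each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


# Report what was found; a missing tool usually means the PATH line needs another look.
for tool, location in find_gurobi_tools().items():
    print(f"{tool}: {location or 'NOT FOUND, check the export PATH line in your bashrc'}")
```

If a tool is reported as not found, check that the path in the export line really ends in the folder containing the executables.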

Second, we need a license for Gurobi, a linear programming solver. You can obtain a free academic named-user license with your institutional email here: https://www.gurobi.com/features/academic-named-user-license/

Click on named-user-license. This generates a grbgetkey command for your machine. Type it in the terminal, replacing the X characters with your own key:

grbgetkey 9f4XXXX-XXXX-XXXX-XXXX-XXXXXXXXXX

Now, you will install DNNgior (inside your conda environment) with the following command:

pip install dnngior

In order to run the dnngior pipeline on a jupyter notebook, you need to install jupyter inside your conda environment with the following command:

conda install -c conda-forge notebook -y<br>
or<br>
pip install notebook

Open a new notebook by typing jupyter-notebook in the terminal and test your dnngior installation by typing:

import dnngior

In some instances, you can find a version inconsistency with numpy, one of the dependencies of dnngior. To fix this, go to the terminal and type:

pip install numpy==1.23.5
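To see which versions are actually installed in the active environment, a small standard-library sketch like the one below can help. The helper name `installed_versions` and the list of packages checked are choices made for this example, not part of dnngior.

```python
from importlib import metadata


def installed_versions(packages=("dnngior", "numpy", "cobra", "gurobipy")):
    """Return {package: version string, or None when the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Run it inside the dnngior conda environment (for example in a notebook cell) so that it reports the packages that environment really uses.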

Then, test your dnngior installation again. You should see this:

Set parameter Username<br>
Set parameter LicenseID to value 2671523<br>
Academic license - for non-commercial use only - expires 2026-05-27<br>
WARNING: To enable the NN_Trainer script, you need to install tensorflow <https://www.tensorflow.org/install><br>
The rest of dnngior features can be used without it.<br>

You should be ready to go!

## **GAPFILLING USING A COMPLETE MEDIUM**
In this exercise we will gapfill (add missing reactions to) a GSMM of Blautia, a genus of anaerobic bacteria with probiotic characteristics.

Let's explore the GSMM with some basic Cobrapy commands!

In [None]:
import cobra
from cobra.io import read_sbml_model
draft_reconstruction = read_sbml_model('bh_ungapfilled_model.sbml')

In [None]:
draft_reconstruction.summary()

Notice the objective function (biomass=growth rate). Also, there are no fluxes.

In [None]:
draft_reconstruction.optimize()

In [None]:
draft_reconstruction.medium

The exchange reactions (uptake reactions) are set to an unlimited value: there is no constraint on what the bacteria can take up from the medium. Even so, our model can't simulate growth.
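Why can a model fail to grow even with unlimited uptake? A toy illustration may help. The sketch below is not FBA and not the DNNgior algorithm; it only tracks which metabolites can be reached from the medium in a made-up three-step pathway, and shows that one missing reaction is enough to make biomass unreachable. All reaction and metabolite names are hypothetical.

```python
def producible(reactions, medium):
    """Expand the set of metabolites reachable from the medium.

    reactions: dict of reaction id -> (set of substrates, set of products)
    """
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # A reaction can fire once all of its substrates are available.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


# Toy pathway glucose -> A -> B -> biomass, with the A -> B step missing from the draft.
draft = {
    "R1": ({"glucose"}, {"A"}),
    "R3": ({"B"}, {"biomass"}),
}
medium = {"glucose"}
print("biomass" in producible(draft, medium))      # False: B can never be made

gapfilled = dict(draft, R2=({"A"}, {"B"}))          # add the missing reaction
print("biomass" in producible(gapfilled, medium))  # True
```

Gap-filling tools look for this kind of missing step, but in a full stoichiometric model and with an optimization solver rather than a simple reachability search.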

Let's start gapfilling our model!<br>
Import the dnngior library and use the Gapfill class to gapfill the reconstruction

In [None]:
import os, sys
# Path to the draft (ungapfilled) Blautia model in SBML format
path_to_blautia_model = "bh_ungapfilled_model.sbml"

In [None]:
import dnngior
gapfilled_model_complete = dnngior.Gapfill(draftModel = path_to_blautia_model, 
                                          medium = None, 
                                          objectiveName = 'bio1')

Make a new object of the gapfilled model

In [None]:
gf_model_compl_med = gapfilled_model_complete.gapfilledModel.copy()

In [None]:
gf_model_compl_med.optimize()

### It's growing! The growth rate after optimization is 146.138 mmol/gDW/h (millimoles per gram dry cell weight per hour), the default flux unit used in FBA

Now let's see how many and which reactions DNNgior added in order to simulate growth

In [None]:
print("Number of reactions added:", len(gapfilled_model_complete.added_reactions))
print("~~")
for reaction in gapfilled_model_complete.added_reactions:
    print(gf_model_compl_med.reactions.get_by_id(reaction).name)
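When the list of added reactions is long, it can help to separate exchange reactions from the rest. The sketch below assumes that exchange reaction IDs start with `EX_`, as the two IDs inspected in the next cells do; the helper name and the `rxn...` IDs are placeholders made up for illustration.

```python
def split_added_reactions(reaction_ids):
    """Separate exchange reactions (IDs starting with 'EX_') from all other reactions."""
    exchanges = [rid for rid in reaction_ids if rid.startswith("EX_")]
    internal = [rid for rid in reaction_ids if not rid.startswith("EX_")]
    return exchanges, internal


# The two EX_ IDs come from this tutorial; the rxn IDs are made-up placeholders.
example_ids = ["EX_cpd15432_e0", "rxn00001_c0", "EX_cpd15511_e0", "rxn00002_c0"]
exchanges, internal = split_added_reactions(example_ids)
print(len(exchanges), "exchange reactions;", len(internal), "other reactions")
```

In the notebook, the same function could be called on `gapfilled_model_complete.added_reactions`, provided that list contains reaction IDs as strings.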

In [None]:
gf_model_compl_med.reactions.get_by_id('EX_cpd15432_e0')

In [None]:
gf_model_compl_med.reactions.get_by_id('EX_cpd15511_e0')

## **GAPFILLING USING A DEFINED MEDIUM**

First, load the media file containing the composition of the medium

In [None]:
medium_file_path = 'Nitrogen-Nitrite_media.tsv'

In [None]:
import pandas as pd
new_medium = pd.read_csv(medium_file_path, sep="\t")
new_medium.head()

Let's gapfill our GSMM so it can grow on this medium

In [None]:
gapfill_nitr = dnngior.Gapfill(path_to_blautia_model, medium_file = medium_file_path, objectiveName = 'bio1')

Again, make a new object of the gapfilled model and check if it's growing.

In [None]:
gf_model_Nit_med = gapfill_nitr.gapfilledModel.copy()
gf_model_Nit_med.optimize()
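To turn "is it growing?" into an explicit check, a small helper such as the one below can be used. It is only a sketch: the name `growth_report` and the tolerance are arbitrary choices, and it relies solely on the `status` and `objective_value` attributes that a COBRApy solution exposes. Stand-in objects are used here so that the example runs without a model.

```python
from types import SimpleNamespace


def growth_report(solution, tolerance=1e-6):
    """Summarise an FBA result that exposes `status` and `objective_value`."""
    if solution.status != "optimal":
        return f"no optimal solution (status: {solution.status})"
    if solution.objective_value > tolerance:
        return f"growing, objective value {solution.objective_value:.3f}"
    return "optimal, but the objective value is zero: no growth"


# Stand-ins for solutions; in the notebook, pass gf_model_Nit_med.optimize() instead.
print(growth_report(SimpleNamespace(status="optimal", objective_value=146.138)))
print(growth_report(SimpleNamespace(status="optimal", objective_value=0.0)))
```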

Let's see how many and which reactions DNNgior added in order to simulate growth on the nitrite medium

In [None]:
print("Number of reactions added:", len(gapfill_nitr.added_reactions))
print("~~")
# To print only the first five, loop over gapfill_nitr.added_reactions[:5] instead
for reaction in gapfill_nitr.added_reactions:
    print(gf_model_Nit_med.reactions.get_by_id(reaction).name)
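A natural follow-up is to compare what was added under the two conditions. The sketch below uses plain Python sets; the function name and the example IDs are placeholders, and it assumes both `added_reactions` lists hold reaction IDs as strings.

```python
def compare_added(complete_ids, defined_ids):
    """Return reactions shared by both gap-filling runs and those unique to each."""
    complete, defined = set(complete_ids), set(defined_ids)
    return {
        "shared": sorted(complete & defined),
        "only_complete_medium": sorted(complete - defined),
        "only_defined_medium": sorted(defined - complete),
    }


# Placeholder IDs for illustration; in the notebook use
# gapfilled_model_complete.added_reactions and gapfill_nitr.added_reactions.
result = compare_added(["rxnA", "rxnB", "EX_x"], ["rxnB", "rxnC"])
for group, ids in result.items():
    print(group, ids)
```

Reactions that appear only in the defined-medium run are good candidates for closer inspection, since they are the ones needed specifically to grow on that medium.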