# Ersilia Model Hub
The Ersilia Model Hub is a repository of pre-trained, ready-to-use AI models for drug discovery. A list of models and their applications is available [here](https://ersilia.io/model-hub).

You can run the Ersilia Model Hub on your computer by installing the [Ersilia Python Package](https://github.com/ersilia-os/ersilia). In this session, we will use the Google Colab implementation of the Ersilia Model Hub to ensure compatibility with all systems.

In [None]:
#@markdown Click on the play button to install Ersilia in this Colab notebook.

%%capture
%env MINICONDA_INSTALLER_SCRIPT=Miniconda3-py37_4.12.0-Linux-x86_64.sh
%env MINICONDA_PREFIX=/usr/local
%env PYTHONPATH={PYTHONPATH}:/usr/local/lib/python3.7/site-packages
%env CONDA_PREFIX=/usr/local
%env CONDA_PREFIX_1=/usr/local
%env CONDA_DIR=/usr/local
%env CONDA_DEFAULT_ENV=base
!wget https://repo.anaconda.com/miniconda/$MINICONDA_INSTALLER_SCRIPT
!chmod +x $MINICONDA_INSTALLER_SCRIPT
!./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX
!python -m pip install git+https://github.com/ersilia-os/ersilia.git
!python -m pip install requests --upgrade
import sys
_ = (sys.path.append("/usr/local/lib/python3.7/site-packages"))

# MMV Malaria Dataset
We will use the list of 400 compounds from the MMV Malaria Box for this exercise. The list of molecules is already prepared in the /data folder of the h3d_ersilia_ai_workshop Google Drive folder we created during Session 1.
First, we will mount Google Drive on the notebook to access the data.

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

#import necessary packages
import matplotlib.pyplot as plt
import pandas as pd

In [None]:
#"smiles" holds the path to the CSV file with the SMILES list; we can open it as a pandas dataframe
smiles = "drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/mmv_malariabox.csv"
df = pd.read_csv(smiles)
df.head()

# Example Model Prediction
We will use one model as a step-by-step guide to using the Ersilia Model Hub and analysing the results. Each Ersilia model is identified by a code (eosxxxx) and a slug (a one- or two-word identifier). We will always refer to the models by either the code or the slug. More details are available in the Ersilia Model Hub [documentation](https://ersilia.gitbook.io/ersilia-book/).



## Antimalarial Activity
The Ersilia Model Hub contains a surrogate version of MAIP, a web-based model for predicting blood-stage malaria inhibitors, published in [Bosc et al, 2021](https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2).

### Steps
1. Fetch the model from the online repository using a bash command (!)
2. Import the ErsiliaModel class from the ersilia Python package
3. Load the selected model, "eos2gth"
4. Run predictions for the input of interest (the MMV Malaria Box SMILES list). The output will be loaded in a pandas dataframe
5. Close the model

In [None]:
!ersilia fetch eos2gth

In [None]:
from ersilia import ErsiliaModel

model = ErsiliaModel("eos2gth")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

In [None]:
#once the model has run the predictions, let's save the output in our Google Drive
output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos2gth.csv", index=False)

### Analysing the model output

In [None]:
#First, let's load the predictions we just stored in drive

maip = pd.read_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos2gth.csv")
maip.head()

We observe three columns:

*   key: InChIKey representation of the molecules
*   input: SMILES
*   score: model prediction

We can read more about the output of the model in its associated [documentation](https://chembl.gitbook.io/malaria-project/output-file). As we can see, the output is a score, and "The higher the score is the more likely the compound is predicted to be active. Because there is no normalised score, the user defines a selection threshold."
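
Since the threshold is left to the user, the sketch below shows two simple ways of choosing one: a fixed score cut-off and a fixed fraction of the ranked list. It uses a small made-up dataframe with the same three columns (key, input, score). The keys, scores, the cut-off of 30.0 and the fraction of 0.5 are all illustrative assumptions, not recommended values.

```python
import pandas as pd

# Toy predictions in the same layout as the model output (key, input, score).
# Keys and scores are made up for illustration only.
toy = pd.DataFrame(
    {
        "key": ["KEY-A", "KEY-B", "KEY-C", "KEY-D"],
        "input": ["CCO", "c1ccccc1", "CC(=O)O", "CCN"],
        "score": [12.5, -3.0, 48.1, 30.2],
    }
)

# Option 1: fixed score cut-off (hypothetical value, the score is not normalised)
threshold = 30.0
selected = toy[toy["score"] >= threshold].sort_values("score", ascending=False)
print(selected)

# Option 2: keep a fixed fraction of the molecules, ranked by score
top_fraction = 0.5
cutoff = toy["score"].quantile(1 - top_fraction)
top_half = toy[toy["score"] >= cutoff]
print(top_half)
```

The same two lines of filtering can be applied to the real predictions once they are loaded. Which option is more suitable depends on how many molecules can be followed up.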



In [None]:
#we can sort the molecules based on their score
maip.sort_values("score", ascending=False).head()

In [None]:
#we can plot the distribution of the scores

plt.hist(maip["score"], bins=50, color="#50285a")
plt.xlabel("MAIP Score")
plt.ylabel("Number of molecules")
plt.show()

# Breakout session
Here is a list of models that can be used for this exercise. Please refer to the [Ersilia Model Hub](https://www.ersilia.io/model-hub) to read more about each of them, the source of the data they use and how they can be applied to our problem.

*   Malaria Activity: eos2gth / maip-malaria-surrogate
*   Tuberculosis Activity: eos46ev / chemtb
*   Antibiotic Activity: eos4e41 / chemprop-antibiotic-lite
*   Cardiotoxicity (hERG): eos43at / molgrad-herg
*   Retrosynthetic Accessibility: eos2r5a / retrosynthetic-accessibility
*   Aqueous Solubility: eos6oli / soltrannet-aqueous-solubility
*   Natural Product Likeness: eos9yui / natural-product-likeness
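
Each model cell below saves its predictions to a separate CSV file. To compare models, the outputs can be joined on the key column, as in the following minimal sketch. It uses two made-up dataframes in place of the saved files, and it assumes that both outputs contain key, input and score columns. The output columns of other models may be named differently, so it is worth checking them with head() first.

```python
import pandas as pd

# Toy stand-ins for two saved output files. In the notebook these would be
# loaded with pd.read_csv(...). Keys and values are made up for illustration.
malaria = pd.DataFrame(
    {
        "key": ["K1", "K2", "K3"],
        "input": ["CCO", "CCN", "CCC"],
        "score": [40.0, 5.0, 22.0],
    }
)
herg = pd.DataFrame(
    {
        "key": ["K3", "K1", "K2"],
        "input": ["CCC", "CCO", "CCN"],
        "score": [0.4, 0.9, 0.1],
    }
)

# Rename each score column after its model code so they stay distinguishable
# (the column name "score" is an assumption for the second model)
malaria = malaria.rename(columns={"score": "eos2gth"})
herg = herg.rename(columns={"score": "eos43at"})

# Join on the molecule key; "input" is kept from the first table only
combined = malaria.merge(herg[["key", "eos43at"]], on="key", how="inner")
print(combined)
```

An inner join keeps only molecules present in both tables, which makes missing predictions easy to spot by comparing row counts.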


In [None]:
#@title ChemTB
#@markdown Click on the play button to run predictions using ChemTB (eos46ev)
!ersilia fetch eos46ev
from ersilia import ErsiliaModel

model = ErsiliaModel("eos46ev")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos46ev.csv", index=False)

In [None]:
#@title Chemprop Antibiotic
#@markdown Click on the play button to run predictions using Chemprop Antibiotic (eos4e41)
from ersilia import ErsiliaModel
!ersilia fetch eos4e41

model = ErsiliaModel("eos4e41")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos4e41.csv", index=False)

In [None]:
#@title Cardiotoxicity
#@markdown Click on the play button to run predictions using Cardiotoxicity (eos43at)
!ersilia fetch eos43at
from ersilia import ErsiliaModel

model = ErsiliaModel("eos43at")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos43at.csv", index=False)

In [None]:
#@title Retrosynthetic Accessibility
#@markdown Click on the play button to run predictions using RA (eos2r5a)
!ersilia fetch eos2r5a
from ersilia import ErsiliaModel

model = ErsiliaModel("eos2r5a")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos2r5a.csv", index=False)

In [None]:
#@title Aqueous Solubility
#@markdown Click on the play button to run predictions using Solubility (eos6oli)
!ersilia fetch eos6oli
from ersilia import ErsiliaModel

model = ErsiliaModel("eos6oli")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos6oli.csv", index=False)

In [None]:
#@title Natural Product Likeness
#@markdown Click on the play button to run predictions using NP Likeness (eos9yui)
!ersilia fetch eos9yui
from ersilia import ErsiliaModel

model = ErsiliaModel("eos9yui")
model.serve()
output = model.predict(input=smiles, output="pandas")
model.close()

output.to_csv("drive/MyDrive/h3d_ersilia_ai_workshop/data/session2/eos9yui.csv", index=False)