# Serotonin 3D GNN Project

This project builds upon research done by Łapińska et al. (2024): https://doi.org/10.3390/pharmaceutics16030349

Data used: https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_35/

Download chembl_35_sqlite.tar.gz from the link above, unpack it, and move the unpacked contents into the data/ directory.
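
The unpacking step can also be scripted. The following is a minimal sketch, assuming the archive has already been downloaded. The helper name unpack_chembl is hypothetical, and nothing is assumed about the archive's internal folder layout, so the function simply reports whichever .db files it finds after unpacking.

```python
from pathlib import Path
import shutil


def unpack_chembl(archive: Path, data_dir: Path) -> list[Path]:
    """Unpack a .tar.gz archive into data_dir and return any .db files found.

    Hypothetical helper: the archive's internal layout is not assumed,
    so SQLite files are located with a recursive search.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension (.tar.gz).
    shutil.unpack_archive(str(archive), str(data_dir))
    return sorted(data_dir.rglob("*.db"))
```

The returned paths can then be compared against the database path used later in this notebook.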

The study linked above presents two Quantitative Structure-Activity Relationship (QSAR) models, both built on Mordred 2D molecular descriptors: one predicts serotonergic binding affinity, the other selectivity. The first is a binary classifier that labels compounds as "active" or "inactive" using a cutoff of pKi = 7. The second is a multiclass classifier that predicts the serotonergic selectivity of compounds already classified as "active".
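
To make the binary labelling concrete, here is a minimal sketch of applying the pKi = 7 cutoff. It assumes Ki values are given in nanomolar and that the cutoff is inclusive (pKi >= 7 counts as "active"); the text above does not state either detail, and the function names are hypothetical.

```python
import math

PKI_CUTOFF = 7.0  # cutoff stated in the text above


def pki_from_ki_nm(ki_nm: float) -> float:
    """Convert Ki in nanomolar to pKi = -log10(Ki in mol/L)."""
    if ki_nm <= 0:
        raise ValueError("Ki must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(ki_nm * 1e-9) = 9 - log10(ki_nm).
    return 9.0 - math.log10(ki_nm)


def activity_label(pki: float) -> str:
    # Assumption: the cutoff is inclusive.
    return "active" if pki >= PKI_CUTOFF else "inactive"
```

Under these assumptions, a Ki of 100 nM corresponds to pKi = 7 and sits exactly on the cutoff, while weaker binders (larger Ki) fall below it.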

I follow a similar approach, with two differences: the input modality is a 3D graph representation of each molecule instead of 2D molecular descriptors, and the data come only from the ChEMBL database, not ZINC.
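
As a rough illustration of what a 3D graph representation can look like, the sketch below turns atoms, 3D coordinates, and bonds into nodes and distance-annotated edges. This is an assumption for illustration only, not the featurization used in this project: the toy molecule (approximate water geometry), the function name build_3d_graph, and the choice of bond length as the only edge feature are all hypothetical.

```python
import math

# Toy molecule: approximate water geometry, coordinates in angstroms.
atoms = ["O", "H", "H"]
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
bonds = [(0, 1), (0, 2)]  # pairs of atom indices


def build_3d_graph(atoms, coords, bonds):
    """Return node dicts and (source, target, distance) edges for one molecule."""
    nodes = [{"element": el, "xyz": xyz} for el, xyz in zip(atoms, coords)]
    edges = []
    for i, j in bonds:
        dist = math.dist(coords[i], coords[j])
        # Store each bond in both directions, a common convention for
        # undirected graphs in message-passing libraries.
        edges.append((i, j, dist))
        edges.append((j, i, dist))
    return nodes, edges


nodes, edges = build_3d_graph(atoms, coords, bonds)
```

Unlike a 2D descriptor vector, this structure keeps the spatial arrangement of the atoms, which is the information a 3D GNN is meant to exploit.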

## Imports

In [14]:
from pathlib import Path
import shutil
import sqlite3
import pandas as pd

## Configuration

In [22]:
IN_COLAB = True

PATH_NOTEBOOK = Path("/content/drive/MyDrive/Colab Notebooks/serotonin-3d-gnn.ipynb") if IN_COLAB else Path.cwd() / "serotonin-3d-gnn.ipynb"
PATH_REPO = Path("/content/drive/MyDrive/Repositories/serotonin-3d-gnn") if IN_COLAB else Path.cwd()
PATH_DB = PATH_REPO / "data" / "chembl_35_preprocessed.db"

## Google Colab Setup

In [8]:
if IN_COLAB:
  from google.colab import drive

  drive.mount('/content/drive')

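  # A shell `!cp` would not expand the Python variable PATH_REPO,
  # so copy the file from Python instead (requires `import shutil`).
  shutil.copy(PATH_REPO / "requirements.txt", "/content/")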

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Utils

### Syncing this file between Colab and local Git repo

Make sure that both PATH_NOTEBOOK and PATH_REPO exist before running this cell.

In [11]:
shutil.copyfile(PATH_NOTEBOOK, PATH_REPO / "serotonin-3d-gnn.ipynb")

PosixPath('/content/drive/My Drive/Repositories/serotonin-3d-gnn/serotonin-3d-gnn.ipynb')
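
The path check mentioned above can be automated. A minimal sketch, assuming the notebook path should be an existing file and the repository path an existing directory; the helper name check_sync_paths is hypothetical.

```python
from pathlib import Path


def check_sync_paths(notebook: Path, repo: Path) -> None:
    """Raise FileNotFoundError if the notebook file or repo directory is missing."""
    if not notebook.is_file():
        raise FileNotFoundError(f"Notebook not found: {notebook}")
    if not repo.is_dir():
        raise FileNotFoundError(f"Repository directory not found: {repo}")
```

Calling check_sync_paths(PATH_NOTEBOOK, PATH_REPO) before shutil.copyfile would report a wrong path with a clear message rather than failing partway through the copy.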

## Data

In [25]:
conn = sqlite3.connect(PATH_DB)
cursor = conn.cursor()

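# sqlite_master rows are (type, name, tbl_name, rootpage, sql),
# so select the name column and keep only tables.
cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table';")

tables = cursor.fetchall()

# Each row is a 1-tuple holding the table name.
table_names = [table[0] for table in tables]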

tables

[]
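
An empty result here can simply mean that no file exists at PATH_DB: sqlite3.connect creates a new, empty database when the file is missing. The sketch below is one way to guard against that, by opening the database read-only so that a missing file raises an error instead. The helper name list_tables is hypothetical.

```python
from pathlib import Path
import sqlite3


def list_tables(db_path: Path) -> list[str]:
    """Return table names from an existing SQLite file, opened read-only."""
    # mode=ro makes SQLite raise sqlite3.OperationalError if the file is
    # missing, instead of silently creating an empty database.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

With this helper, list_tables(PATH_DB) either returns the table names or fails loudly when the path is wrong.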