zaira-chem-tdc

This repository contains the benchmark of ZairaChem v1 on the Therapeutics Data Commons (TDC) datasets.

ZairaChem

ZairaChem is an automated pipeline for ML-based (Q)SAR models. Detailed installation instructions can be found in Ersilia's GitBook.

In short, to install ZairaChem:

git clone https://github.com/ersilia-os/zaira-chem.git
cd zaira-chem
bash install_script.sh

Model training and prediction:

conda activate zairachem
zairachem fit -i <train_data.csv> -m <model_folder>
zairachem predict -i <test_data.csv> -m <model_folder> -o <pred_folder>
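
To repeat the fit and predict steps above over several train/test splits, the two commands can be generated programmatically. The following is a minimal sketch, not the code used in this repository: it uses only the `-i`, `-m` and `-o` flags shown above, and the directory layout (`runs/<dataset>/fold_<k>/...`), the file names and the helper names `fit_command`, `predict_command` and `fold_commands` are hypothetical.

```python
import shlex
from pathlib import PurePosixPath


def fit_command(train_csv, model_dir):
    """Build the argv list for `zairachem fit` with the flags shown in the README."""
    return ["zairachem", "fit", "-i", str(train_csv), "-m", str(model_dir)]


def predict_command(test_csv, model_dir, pred_dir):
    """Build the argv list for `zairachem predict` with the flags shown in the README."""
    return [
        "zairachem", "predict",
        "-i", str(test_csv),
        "-m", str(model_dir),
        "-o", str(pred_dir),
    ]


def fold_commands(dataset, n_folds=5, root="runs"):
    """Return fit/predict commands for each fold.

    Assumed (hypothetical) layout: <root>/<dataset>/fold_<k>/ holding
    train.csv and test.csv; model/ and pred/ are created by ZairaChem.
    """
    commands = []
    for k in range(n_folds):
        fold = PurePosixPath(root) / dataset / f"fold_{k}"
        commands.append(fit_command(fold / "train.csv", fold / "model"))
        commands.append(predict_command(fold / "test.csv", fold / "model", fold / "pred"))
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them.
    for cmd in fold_commands("HIA_Hou"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from a shell where the `zairachem` conda environment is already active.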

Classification tasks

We have benchmarked ZairaChem on the TDC ADMET leaderboard. At this stage, we have focused only on classification tasks.

The admet_classifications notebook contains the code to reproduce model training and evaluation. For convenience, the automated reports and raw data of the 5-fold evaluations are provided in the /predictions folder.
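
As an illustration of how per-fold scores can be condensed into a single "mean ± spread" figure, here is a minimal standard-library sketch. It is not taken from the notebook: the function names `auroc` and `summarize` are hypothetical, the AUROC is computed with the rank-sum (Mann-Whitney) identity rather than with any particular library, and the use of the population standard deviation is an assumption. The notebook remains the reference for the exact procedure.

```python
from statistics import mean, pstdev


def auroc(y_true, y_score):
    """AUROC via the rank-sum identity; tied scores receive their average rank."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of equal scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def summarize(scores, digits=3):
    """Format per-fold scores as 'mean ± std' (population std: an assumption)."""
    return f"{mean(scores):.{digits}f} ± {pstdev(scores):.{digits}f}"
```

Calling `auroc` once per fold on the held-out labels and predicted scores, then passing the resulting list to `summarize`, yields a string in the same shape as the entries reported under Results.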

Results

| Dataset | Metric | Score |
| --- | --- | --- |
| Bioavailability_Ma | AUROC | 0.74 ± 0.017 |
| HIA_Hou | AUROC | 0.957 ± 0.014 |
| Pgp_Broccatelli | AUROC | 0.944 ± 0.002 |
| BBB_Martins | AUROC | 0.93 ± 0.003 |
| CYP2C9_Veith | AUPRC | 0.792 ± 0.003 |
| CYP2D6_Veith | AUPRC | 0.65 ± 0.09 |
| CYP3A4_Veith | AUPRC | 0.872 ± 0.002 |
| CYP2C9_Substrate_CarbonMangels | AUPRC | 0.421 ± 0.038 |
| CYP2D6_Substrate_CarbonMangels | AUPRC | 0.725 ± 0.006 |
| CYP3A4_Substrate_CarbonMangels | AUPRC | 0.656 ± 0.005 |
| hERG | AUROC | 0.861 ± 0.012 |
| AMES | AUROC | 0.863 ± 0.003 |
| DILI | AUROC | 0.936 ± 0.008 |

Cite us

If you use our work, please cite us:

ZairaChem Software

About us

The Ersilia Open Source Initiative is a non-profit organization whose mission is to equip labs, universities and clinics in low- and middle-income countries (LMIC) with AI/ML tools for infectious disease research.

Help us achieve our mission!
