My notes on a series of publications with examples of machine learning and deep learning applied to problems in toxicology.

nicospinu/papers_comptox

Publications on Computational Toxicology

About this repository

I created this repository to understand the current trends in the development of computational models in toxicology, especially QSARs/QSPRs. It contains personal notes that summarise the main ideas of recently published articles in peer-reviewed journals that can serve as promising examples.

QSARs/QSPRs definition

Quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models are mathematical models developed to assess chemical-induced toxicity. They make continuous (regression, e.g., LD50) or discrete (classification, i.e., binary or multi-class) predictions from molecular descriptors. These descriptors characterise the structure of the chemical and are either calculated computationally using a range of software or determined experimentally from the molecules themselves (e.g., physico-chemical or biochemical properties such as LogP). The model is built on the correlation between the variable of interest (the target toxicity) and the chemical structure and its associated properties.

[Figure: QSAR flow-chart]
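To make the definition concrete, here is a minimal sketch of the regression case: a one-descriptor linear model fitted by ordinary least squares. The descriptor values (LogP), the endpoint values, and the function names are all invented for illustration; the models in the publications below use many descriptors and curated datasets.

```python
def fit_linear_qsar(descriptor, endpoint):
    """Fit endpoint ~ slope * descriptor + intercept by ordinary least squares."""
    if len(descriptor) != len(endpoint) or len(descriptor) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(descriptor)
    mean_x = sum(descriptor) / n
    mean_y = sum(endpoint) / n
    # Sum of squares of the descriptor and cross-product with the endpoint.
    sxx = sum((x - mean_x) ** 2 for x in descriptor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(descriptor, endpoint))
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, descriptor_value):
    """Predict the endpoint for a new compound from its descriptor value."""
    slope, intercept = model
    return slope * descriptor_value + intercept


# Hypothetical training data: LogP as the single descriptor and an
# arbitrary continuous toxicity endpoint (illustrative values only).
logp = [0.5, 1.0, 2.0, 3.0, 4.0]
toxicity = [1.4, 1.8, 2.6, 3.4, 4.2]

model = fit_linear_qsar(logp, toxicity)
predicted = predict(model, 2.5)
```

The same structure carries over to classification: the fitted function then returns a class label (binary or multi-class) instead of a continuous value.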

General workflow

[Figure: QSAR workflow]

Overview of the publications studied herein

Reference LR KNN SVM RF NB XGB ANN DNN GCN Endpoint(s)
Mansouri et al (2019) Y Y Y pKa
Zaslavskyi et al (2019) Y Y Y Bioactivity
Li et al (2020) Y Y Y Y Acute toxicity
Xu et al (2020) Y Y Y Y Y 14 organs
Garcia de Lomana et al (2021) Y Y Y Y Y MIEs thyroid
Jain et al (2021) Y Y Y Acute toxicity
Li et al (2021) Y Y Y Y Y Y DILI
Liu et al (2021) Y Y Y BBB
Rathman et al (2021) Y Y Y Y Y DILI
Ulrich et al (2021) Y LogP
Zhoue et al (2021) Y Y Y Y DIR

LR: Linear Regression; KNN: k-Nearest Neighbors; SVM: Support Vector Machine; RF: Random Forest; NB: Naïve Bayes; XGB: eXtreme Gradient Boosting; ANN: Artificial Neural Network; DNN: Deep Neural Network; GCN: Graph Convolutional Network. Y: Yes. pKa = −log10(Ka), where Ka is the acid dissociation constant (also called the protonation or ionization constant). MIEs: Molecular Initiating Events. DILI: Drug-Induced Liver Injury. BBB: Blood-Brain Barrier. LogP: log Kow, where Kow is the octanol–water partition coefficient. DIR: Drug-Induced Rhabdomyolysis.
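The pKa definition in the legend is a simple logarithmic transform. A small sketch of the conversion in both directions follows; the function names are hypothetical.

```python
import math


def pka_from_ka(ka):
    """Convert an acid dissociation constant Ka into pKa = -log10(Ka)."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return -math.log10(ka)


def ka_from_pka(pka):
    """Invert the definition: Ka = 10 ** (-pKa)."""
    return 10.0 ** (-pka)
```

A smaller Ka (weaker acid) therefore corresponds to a larger pKa.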

Current trends

  • Multi-task modelling
  • Batteries of in silico models and combinations of ML algorithms
  • Ensemble learning (e.g., models trained on different fingerprints/descriptors)
  • Combination of continuous regression with classification modeling approaches
  • Consensus model (the predicted toxicity is estimated by taking an average of the predicted toxicities from each single model)
  • Model interpretation and explainability, including model benchmarking (comparison with other models and datasets)
  • The impact of the choice of molecular descriptors on model performance
  • Sequential feature selection strategy
  • Uncertainty quantification (e.g., using Dempster-Shafer decision theory)
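
The consensus model mentioned in the list above can be sketched as follows: each single model predicts a toxicity value per compound, and the consensus is the plain average across models. The model names and predicted values are invented for illustration and do not come from any of the publications; a weighted average would be an equally plausible variant.

```python
from statistics import mean


def consensus_prediction(predictions):
    """Average the predictions of several models, compound by compound.

    `predictions` maps a model name to its list of predicted toxicities,
    with every list following the same compound order.
    """
    per_model = list(predictions.values())
    if not per_model:
        raise ValueError("at least one model is required")
    n_compounds = len(per_model[0])
    if any(len(values) != n_compounds for values in per_model):
        raise ValueError("all models must predict the same compounds")
    # zip(*per_model) groups the predictions for each compound together.
    return [mean(values) for values in zip(*per_model)]


# Hypothetical outputs of three single models for three compounds.
model_outputs = {
    "rf": [2.0, 3.5, 1.0],
    "svm": [2.4, 3.1, 1.2],
    "xgb": [2.2, 3.3, 0.8],
}

consensus = consensus_prediction(model_outputs)
```

Averaging tends to smooth out the errors of individual models, which is one reason consensus and ensemble approaches recur across the papers summarised here.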

Recommended references

Books:

Other links:
