RunshengSong/QSAR_SSD_Toolbox

QSAR-based SSD ToolBox

Runsheng Song
runsheng@umail.ucsb.edu


A framework to create Species Sensitivity Distributions (SSD) using pre-trained QSAR models

The QSAR models were developed as neural networks in TensorFlow + Keras. Descriptors were calculated with RDKit and Mordred, then reduced using tree-based feature selection.

All QSAR models have been cross-validated.

The current toxicity endpoint is LC50.

Prerequisites:

  • Anaconda Python 2.7
  • Linux or macOS is recommended.

Install

  • Install RDKit with conda first (this saves a lot of trouble):
conda install -c rdkit rdkit=2017.03.1
  • Install QSAR_SSD_Toolbox via pip:
pip install QSAR_SSD_Toolbox
  • If some packages are missing, install the dependencies listed in requirements.txt:
pip install -r requirements.txt

Basic Usage

Single QSAR Model on One Species:

from QSAR_SSD_Toolbox.src.qsar import qsar

SMILEs = 'CCCC' # the input SMILES string

this_q = qsar("Lepomis Macrochirus") # the name of the species; see below for available species
print(this_q.predict(SMILEs)) # prints a list of predicted LC50 values for the given species

Run model on all species and prepare for SSD development:

from QSAR_SSD_Toolbox.src.qsar import run_all

SMILEs = ['CCCC'] # The input SMILEs must be a list

this_q = run_all.run(SMILEs) # returns a pandas DataFrame of predictions for the input chemicals on the corresponding species

Plot SSD Curves

from QSAR_SSD_Toolbox.src.ssd import ssd_generator
from scipy.stats import lognorm

this_ssd = ssd_generator()
# Returns a plot with the bootstrap and baseline SSD curves.
# For more information about bootstrapping in SSDs, see this blog: https://edild.github.io/ssd/
this_ssd.generate(this_q, dist=lognorm, run_bootstrap=True, bootstrap_time=1000, display_range=[0.8,100])
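
To make the idea behind the plot concrete, here is a minimal sketch of a log-normal SSD with a bootstrap interval. It is not the toolbox's `ssd_generator` implementation, whose internals are not shown here. It assumes Python 3.8+ and uses only the standard library, so it runs outside the toolbox's Python 2.7 environment. The function names and the LC50 values are hypothetical, chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit_lognormal_ssd(lc50_values):
    """Fit a log-normal SSD: a normal distribution over log10(LC50)."""
    logs = [math.log10(v) for v in lc50_values]
    return NormalDist(mu=mean(logs), sigma=stdev(logs))


def hazardous_concentration(ssd, fraction=0.05):
    """Concentration at which `fraction` of species is expected to be affected."""
    return 10 ** ssd.inv_cdf(fraction)


def bootstrap_hc(lc50_values, fraction=0.05, n_boot=1000, seed=0):
    """Percentile bootstrap (95%) interval for the hazardous concentration."""
    if len(set(lc50_values)) < 2:
        raise ValueError("need at least two distinct LC50 values")
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample the species with replacement.
        sample = [rng.choice(lc50_values) for _ in lc50_values]
        if len(set(sample)) < 2:
            continue  # zero spread: the distribution cannot be fitted
        ssd = fit_lognormal_ssd(sample)
        estimates.append(hazardous_concentration(ssd, fraction))
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical LC50 values (one per species), for illustration only.
    lc50 = [0.5, 1.2, 3.4, 8.0, 15.0, 40.0, 90.0]
    ssd = fit_lognormal_ssd(lc50)
    print("HC5:", hazardous_concentration(ssd))
    print("95% bootstrap interval:", bootstrap_hc(lc50))
```

The HC5 (the concentration expected to affect 5% of species) is the quantity usually read off an SSD curve; the bootstrap interval indicates how sensitive that estimate is to the particular set of species included.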

Available QSAR Models:

  • Lepomis Macrochirus: R^2 on testing chemicals: 0.51

  • Oncorhynchus Mykiss: R^2 on testing chemicals: 0.72

  • Americamysis bahia: R^2 on testing chemicals: 0.45

  • Oncorhynchus Mykiss: R^2 on testing chemicals: 0.75

  • Oryzias latipes: R^2 on testing chemicals: 0.56

  • Pimephales promelas: R^2 on testing chemicals: 0.72

  • Daphnia magna: R^2 on testing chemicals: 0.77

  • Other water fleas model: This model includes experimental LC50 data for different kinds of water fleas, excluding Daphnia magna. R^2 on testing chemicals: 0.61
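
For readers unfamiliar with the metric, the sketch below shows one common definition of R^2, the coefficient of determination, computed on held-out data. This README does not state which definition was used for the values above, so treat this as an assumption; the function name and the numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical log-scale LC50 values for a handful of test chemicals.
    observed = [0.2, 1.1, 1.9, 3.0]
    predicted = [0.4, 0.9, 2.2, 2.7]
    print(r_squared(observed, predicted))
```

Under this definition, a value of 1 means perfect prediction and a value of 0 means the model does no better than predicting the mean of the test data.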
