QSAR-Bioacumulation

This project is an R implementation of a scheme to predict whether a compound:

is mainly stored within lipid tissues
has additional storage sites (e.g., proteins)
is metabolized or eliminated, with a reduced bioconcentration

The approach is based on two validated QSAR (Quantitative Structure–Activity Relationship) trees, whose salient features are descriptor interpretability and simplicity.

The scheme is based on the following paper: "Investigating the mechanisms of bioconcentration through QSAR classification trees".
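
To illustrate why a classification tree is easy to interpret, here is a minimal sketch of how one could be walked: each node tests a single descriptor against a threshold until a leaf assigns a class. This is only an illustration written in Python; the project itself is in R. The tree shape, the choice of MLOGP and nHM, the thresholds, and the leaf assignments are all hypothetical placeholders, not the values from the published trees.

```python
# Hypothetical tree: descriptors, thresholds, and leaf assignments are
# placeholders for illustration, NOT the values from the published trees.
LIPID = "mainly stored within lipid tissues"
EXTRA = "has additional storage sites"
REDUCED = "metabolized/eliminated with reduced bioconcentration"

TREE = {
    "descriptor": "MLOGP", "threshold": 3.0,
    "low": {"leaf": LIPID},
    "high": {
        "descriptor": "nHM", "threshold": 1.0,
        "low": {"leaf": REDUCED},
        "high": {"leaf": EXTRA},
    },
}


def classify(node, descriptors):
    """Follow threshold tests from the root until a leaf is reached."""
    while "leaf" not in node:
        value = descriptors[node["descriptor"]]
        # Values below the threshold go to the "low" branch, others to "high".
        node = node["low"] if value < node["threshold"] else node["high"]
    return node["leaf"]


print(classify(TREE, {"MLOGP": 4.2, "nHM": 2}))
```

Every prediction can be read back as a short chain of descriptor tests, which is what makes this kind of model simple to inspect.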

Dataset

The dataset has the following fields:

3 Compound identifiers:

CAS number
Molecular SMILES
Train/test split

9 molecular descriptors (independent variables):

nHM
piPC09
PCD
X2Av
MLOGP
ON1V
N-072
B02[C-N]
F04[C-O]

2 experimental responses:

Bioconcentration Factor (BCF) in log units (regression)
Bioaccumulation class (three classes)
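
As a rough sketch of how a file with these fields could be read and divided into training and test sets, the Python snippet below uses only the standard library. The identifier and response column names (`CAS`, `SMILES`, `Set`, `Class`, `logBCF`) and the `Train` label are assumptions; check them against the actual header of dataset.csv. The two sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# The nine descriptor names come from the field list above.
DESCRIPTORS = ["nHM", "piPC09", "PCD", "X2Av", "MLOGP",
               "ON1V", "N-072", "B02[C-N]", "F04[C-O]"]


def load_rows(handle, class_col="Class", logbcf_col="logBCF"):
    """Parse CSV rows, converting descriptors and responses to numbers.

    The response column names are assumptions, not confirmed by the README.
    """
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for name in DESCRIPTORS:
            row[name] = float(raw[name])
        row[logbcf_col] = float(raw[logbcf_col])
        row[class_col] = int(raw[class_col])
        rows.append(row)
    return rows


def split_by_set(rows, set_col="Set", train_label="Train"):
    """Split rows on the (assumed) train/test column."""
    train = [r for r in rows if r[set_col] == train_label]
    test = [r for r in rows if r[set_col] != train_label]
    return train, test


# Two made-up rows, for illustration only (not real dataset entries).
SAMPLE = io.StringIO(
    "CAS,SMILES,Set,nHM,piPC09,PCD,X2Av,MLOGP,ON1V,N-072,"
    "B02[C-N],F04[C-O],Class,logBCF\n"
    "000-00-0,CCO,Train,0,0.0,0.0,0.5,1.2,0.8,0,0,0,1,0.5\n"
    "000-00-1,c1ccccc1,Test,0,0.0,1.0,0.3,2.1,0.6,0,0,0,3,1.1\n"
)

rows = load_rows(SAMPLE)
train, test = split_by_set(rows)
print(len(train), len(test))
```

To use it on the real file, replace `SAMPLE` with `open("dataset.csv", newline="")` and adjust the column names as needed.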
