DrugDiscoveryML

Messing around with drug discovery and machine learning.

Purpose

This repository explores pharmacokinetics problems in drug discovery and whether, and how, machine learning can help with them. It is also a way for me to learn along the way. Any input is welcome.

How to

The import step cleans the data and imports a subset of chembl_23; each of the assays works from that subset. The dataset used is available here.

Assays

There are testing issues for logP, protein binding and aqueous solubility.

Results so far

Solubility

The mean is -2.145955
The median is -2.25

| Model | RMSE | MAE | Info |
|---|---|---|---|
| glm stack | 0.959 | 0.524 | kknn + cubist + bam + gam |
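
For reference, the RMSE and MAE columns can be computed as in the minimal Python sketch below. The function names `rmse` and `mae` and the toy values are illustrative assumptions, not taken from this repository's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: the average size of the errors."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values, made up for illustration (not from the dataset).
y_true = [-2.0, -3.0, -1.0, -2.0]
y_pred = [-2.5, -2.0, -1.0, -2.5]

# Baseline: predict the mean of the observed values for every compound.
baseline = [sum(y_true) / len(y_true)] * len(y_true)

print(f"model    RMSE={rmse(y_true, y_pred):.3f} MAE={mae(y_true, y_pred):.3f}")
print(f"baseline RMSE={rmse(y_true, baseline):.3f} MAE={mae(y_true, baseline):.3f}")
```

Comparing a model against a constant mean (or median) prediction gives a rough sense of how much it improves over knowing only the summary statistics listed in this section.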

LogP

The mean is 5.503537
The median is 5

Binding energy

The mean is 10.11311
The median is 11.2

TODO

Bioavailability
Clean code and readability
Improve results
- Scale and normalize
- Better feature selection
- More methods
~~Introduce DrugBank Information~~
Finishing test in python with scikit-learn
Introduce KNIME, deepchem and RDkit and CNN
...
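
The "Scale and normalize" item could start from something like the sketch below: z-score standardization of one feature column using only the Python standard library. The helper name `standardize` and the sample values are hypothetical. A scikit-learn version would more likely use `StandardScaler`, fitted on the training split only.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical descriptor column (made-up values).
mol_weight = [180.2, 250.3, 310.4, 151.2]
scaled = standardize(mol_weight)
print([round(z, 3) for z in scaled])
```

After standardization the column has mean 0 and unit variance, which matters for distance-based learners such as k-nearest neighbours, where features on larger scales would otherwise dominate.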
