Retrosynthesis has been used for a long time, but it's slower and prone to bias. Machine learning aided techniques like this can help speed up the process.

aasimayazwani/Machine-Learning-Aided-Retrosynthesis

Biosynthetic Pathway Generation Repository

Introduction

This repository contains code for effective synthetic circuit models written in the Python programming language. The algorithm and examples are described in the publication:

Installation and Requirements

There are several required Python packages that must be installed to use the biochemical pathway generation code.

| Package | Version | Download | Documentation |
|---|---|---|---|
| RDKit | 2019.09.3.0 | Installation Guide | Documentation |
| NumPy | 1.17.2 | Installation Guide | Documentation |
| Pandas | 0.25.1 | Installation Guide | Documentation |
| Keras | 2.3.1 | Installation Guide | Documentation |
| urllib3 | 1.25.8 | Installation Guide | Documentation |
| scikit-learn | 0.22.2.post1 | Installation Guide | Documentation |
| scipy | 1.4.1 | Installation Guide | Documentation |

After the above packages are installed, the notebook has to be opened in the rdkit environment. The terminal commands for finding and activating the rdkit environment, and then launching the notebook, are:

> conda info --envs    (or, equivalently: conda info -e)
> conda activate my-rdkit-env
> jupyter notebook 
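Before launching the notebook, it can help to confirm that the active environment matches the versions listed in the table above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_versions` are hypothetical, the distribution names (for example `rdkit` and `Keras`) are assumptions about how the packages are registered in your environment, and it assumes Python 3.8 or newer for `importlib.metadata`. A conda-installed RDKit may not expose package metadata at all, in which case it will show up as "missing" even though it imports fine.

```python
from importlib import metadata

# Versions taken from the requirements table; the keys are assumed
# distribution names and may differ in your environment.
REQUIRED = {
    "rdkit": "2019.09.3.0",
    "numpy": "1.17.2",
    "pandas": "0.25.1",
    "Keras": "2.3.1",
    "urllib3": "1.25.8",
    "scikit-learn": "0.22.2.post1",
    "scipy": "1.4.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if no metadata is found."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(required, lookup=installed_version):
    """Map each package to (wanted, found, status) using exact string comparison."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "ok"
        else:
            status = "mismatch"
        report[name] = (wanted, found, status)
    return report


# Print one line per package for the current environment.
for name, (wanted, found, status) in check_versions(REQUIRED).items():
    print(f"{name:<14} wanted {wanted:<14} found {found or '-':<14} {status}")
```

The comparison is a plain string match, so a version that the packaging tools normalize differently (for instance dropping a leading zero) would be reported as a mismatch; treat the output as a hint rather than a verdict.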

Lastly, some files required to run the notebook are too big to be maintained in this repository, so they are stored externally. Please download the folder. The link leads to two sub-folders: "Required to run code" and "reaction rules database (2.36 GB)". The former is required to run the notebook with the example pathways. If a transformation is not found, the latter should also be downloaded and its rules added.

How do I execute the biochemical route planning job?

The function required to run a pathway generation job is final_main:

result = final_main(initialCompound, finalCompound, threshold, hydrogenAdd, types)
| Argument | Type | Description |
|---|---|---|
| initialCompound, finalCompound | SMILES string | The start and end points of the backward search. The initial compound is the product from which you iterate back; the final compound is the target of that search, i.e. the pathway's precursor. For example, when creating a pathway from Tyrosine to Morphine, the initial compound is Morphine and the final target compound is Tyrosine. |
| threshold | float | The Tanimoto similarity cutoff applied to any two successive compounds in the pathway. Tanimoto similarity lies in the range [0, 1]. Default value: 0.1 |
| hydrogenAdd | Boolean | Controls whether hydrogens are added, depending on whether the reaction rules have hydrogens implicitly or explicitly included. |
| types | string | Selects which deep learning model is used for candidate ranking. There are four accepted values: "reaction", "molecular", "atomic_level", "atomic_spectator_model" |
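To make the argument table concrete, here is a small illustrative sketch. It is not code from the repository: `validate_job_args` and `tanimoto` are hypothetical helpers, the SMILES strings are toy inputs (ethanol and acetaldehyde) rather than the Morphine/Tyrosine example, and the set-based Tanimoto function only illustrates the coefficient itself (shared features divided by total distinct features), not how the repository computes fingerprints or applies the cutoff. The call to `final_main` is left commented out because it requires the notebook's code and data files to be loaded.

```python
# Accepted values for `types`, as listed in the argument table.
ACCEPTED_TYPES = ("reaction", "molecular", "atomic_level", "atomic_spectator_model")


def validate_job_args(initial_compound, final_compound, threshold, hydrogen_add, types):
    """Check arguments against the table and return them as a tuple (hypothetical helper)."""
    for name, smiles in (("initialCompound", initial_compound),
                         ("finalCompound", final_compound)):
        if not isinstance(smiles, str) or not smiles.strip():
            raise ValueError(f"{name} must be a non-empty SMILES string")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        raise TypeError("threshold must be a number")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    if not isinstance(hydrogen_add, bool):
        raise TypeError("hydrogenAdd must be a Boolean")
    if types not in ACCEPTED_TYPES:
        raise ValueError(f"types must be one of {ACCEPTED_TYPES}")
    return (initial_compound, final_compound, float(threshold), hydrogen_add, types)


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two collections of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(union)


# Toy inputs, chosen only for illustration.
args = validate_job_args("CCO", "CC=O", 0.1, True, "reaction")
# result = final_main(*args)  # needs the repository's notebook code in scope

# Two made-up fingerprints sharing 2 of 4 distinct bits.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

Validating up front is a design convenience: a misspelled `types` value or an out-of-range threshold fails immediately instead of partway through a long pathway search.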

Funding

The work described was supported by the Center on the Physics of Cancer Metabolism at Cornell University through Award Number 1U54CA210184-01 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
