A Molecular Stereostructure Descriptor based on Spherical Projection

licheng-xu-echo/SPMS

SPMS

This repository accompanies the paper "A Molecular Stereostructure Descriptor based on Spherical Projection".

Introduction

Describing molecular stereostructure is critical for machine learning predictions in asymmetric catalysis. We developed SPMS, a spherical projection descriptor of molecular stereostructure that allows precise representation of the molecular van der Waals (vdW) surface.

This project provides the key script for generating SPMS from MDL SDF files (.sdf, V2000 version). In addition, the Example folder contains two Jupyter Notebooks: one demonstrates how to generate SPMS from SDF files, and the other shows how to use SPMS in a machine learning application on the dataset of asymmetric thiol addition to N-acylimines from Denmark's recent work (Science 2019, 363, eaau5631).
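Because only V2000 files are supported, it can help to check an input file before passing it to SPMS. The sketch below reads the atom block of a single V2000 record using the fixed-width column positions of the MDL molfile layout. The function name read_v2000_atoms is hypothetical and is not part of this package, and the sketch is independent of how desc.py actually reads its input.

```python
def read_v2000_atoms(molblock):
    """Return [(symbol, x, y, z), ...] from one V2000 mol block (hypothetical helper)."""
    lines = molblock.splitlines()
    # Line 4 of a mol block is the counts line; it carries the version tag.
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("not a V2000 mol block")
    n_atoms = int(lines[3][0:3])  # first 3 columns: number of atoms
    atoms = []
    for line in lines[4:4 + n_atoms]:
        # Coordinates occupy three 10-character fields, then a space and the symbol.
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        symbol = line[31:34].strip()
        atoms.append((symbol, x, y, z))
    return atoms
```

A V3000 file fails the version check and raises ValueError, which signals that the file has to be converted to V2000 first.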

This work is published in Synlett. If you use this project in your work, please cite the paper.

Dependencies

Generating SPMS requires only three third-party Python packages: numpy, ase and rdkit.

Running the Jupyter Notebook that demonstrates the machine learning application additionally requires several third-party Python packages for machine learning, deep learning and visualization.

python>=3.6
numpy>=1.17.4
rdkit>=2021.9.4
pandas>=1.1.1
tensorflow-gpu=1.14.0
scikit-learn>=0.22
seaborn>=0.9.0
ase>=3.19.1

Installation

Run the following commands within the project directory (the 'SPMS' folder) to install the dependencies and install spms as a package:

conda create -n spms python=3.6
conda activate spms
conda install tensorflow-gpu=1.14.0
pip install numpy rdkit-pypi pandas scikit-learn seaborn ase
pip install .

The package versions listed above are recommended. The TensorFlow version must be 1.x. We suggest using Anaconda to install Python 3.6 or higher, since conda and pip together make installing these dependencies much easier.
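After installation, a quick check of the environment can catch a missing package before the notebooks are opened. The snippet below is a minimal sketch that uses only the standard library. The helper name missing_modules and the two lists are illustrative, not part of this project. The lists use import names, which differ from the pip names in two cases (scikit-learn is imported as sklearn, tensorflow-gpu as tensorflow).

```python
import importlib.util

# Import names (not pip names) of the packages listed above.
CORE = ["numpy", "ase", "rdkit", "spms"]
NOTEBOOK_EXTRAS = ["pandas", "sklearn", "seaborn", "tensorflow"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing core packages:", missing_modules(CORE))
    print("missing notebook packages:", missing_modules(NOTEBOOK_EXTRAS))
```

An empty list for the core packages means that descriptor generation should be importable. The notebook packages are only needed for the machine learning demonstration.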

Usage

The core script for generating SPMS is desc.py, which is short and easy to read.

An example file, L-proline.sdf, is provided for demonstration. The chiral carbon is selected as the key atom and placed at the origin of the Cartesian coordinate system. Several key atoms can also be specified. If no key atom is defined, the center of mass is placed at the origin instead. The resolution of SPMS is controlled by desc_n and desc_m. If the sphere_radius parameter is not set, the smallest radius that encloses the whole molecule is calculated and used. The simplest usage looks like this:

from spms.desc import SPMS
## Initialize the SPMS
spms_calc = SPMS('./L-proline.sdf',key_atom_num=[3],desc_n=40,desc_m=40,sphere_radius=8)

## Calculate the SPMS
spms_desc = spms_calc.GetSphereDescriptors()
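To make the roles of desc_n, desc_m and sphere_radius more concrete, here is a small standard-library sketch of one plausible reading of a spherical projection. It is an illustration only and not the algorithm in desc.py. The function name radial_projection, the angular grid layout, and the fill value used when no atom is hit are all assumptions. For each grid direction, a ray starts on the enclosing sphere and travels toward the origin, and the recorded value is the distance to the first atomic vdW sphere it meets.

```python
import math


def radial_projection(centers, radii, sphere_radius, desc_n=40, desc_m=40):
    """Return a desc_n x desc_m grid of sphere-to-vdW-surface distances (illustrative)."""
    grid = []
    for i in range(desc_n):
        theta = math.pi * (i + 0.5) / desc_n      # polar angle at cell midpoints (assumed)
        row = []
        for j in range(desc_m):
            phi = 2.0 * math.pi * j / desc_m      # azimuthal angle (assumed layout)
            u = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            best = sphere_radius                  # assumed fill value when no atom is hit
            for c, r in zip(centers, radii):
                # Ray: starts at sphere_radius * u and moves inward along -u.
                oc = [sphere_radius * u[k] - c[k] for k in range(3)]
                b = -sum(u[k] * oc[k] for k in range(3))
                disc = b * b - (sum(x * x for x in oc) - r * r)
                if disc < 0.0:
                    continue                      # the ray misses this atom
                t = -b - math.sqrt(disc)          # distance to the first intersection
                if 0.0 <= t < best:
                    best = t
            row.append(best)
        grid.append(row)
    return grid


# A single atom of radius 1.5 at the origin inside a sphere of radius 8.
demo = radial_projection([(0.0, 0.0, 0.0)], [1.5], 8.0, desc_n=4, desc_m=6)
```

In this sketch, larger desc_n and desc_m sample the surface more finely, and sphere_radius has to be large enough to enclose every atom, which matches the description of these parameters above.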

For more details on using SPMS and applying it to machine learning, see the two Jupyter Notebooks in the Example folder.

We also provide a website for drawing SPMS figures for chemical interpretation.

How to cite

If the method is used in your paper, please cite as: Synlett 2021, 32, 1837.

Contact us

Email: hxchem@zju.edu.cn; licheng_xu@zju.edu.cn
