
A Python script that constructs a sorted Coulomb matrix from the SMILES strings of molecules (CSV input). Optionally, scikit-learn is invoked at the end of the script to classify the molecules with an SVM.


pythonpanda/coulomb_matrix


The Coulomb matrix was developed as a descriptor for molecules, in order to learn and predict their properties using machine learning.

This Python script constructs a sorted Coulomb matrix from the SMILES strings of molecules. Internally, the code uses Open Babel to process the chemical input supplied as SMILES. By default, the sorted Coulomb matrix is saved to a CSV output file of LabeledPoint vectors formatted to be read by Apache Spark. Apache Spark is well suited to handling big data and ships with a built-in machine learning library (MLlib).
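For readers unfamiliar with the descriptor, the sketch below builds a Coulomb matrix from atomic numbers and 3D coordinates using the commonly cited definition (diagonal 0.5 * Z_i^2.4, off-diagonal Z_i * Z_j / |R_i - R_j|, in atomic units) and then reorders rows and columns by descending row norm. It is not taken from this repository: the function names and the water-like geometry are hypothetical, it uses only the standard library, and it skips the Open Babel step that would turn a SMILES string into coordinates.

```python
import math


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix as a list of rows.

    charges: atomic numbers Z_i, one per atom
    coords:  (x, y, z) positions, one per atom (bohr in the usual definition)
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: nuclear repulsion Z_i * Z_j / distance
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


def sort_by_row_norm(matrix):
    """Permute rows and columns together so row norms are descending."""
    n = len(matrix)
    norms = [math.sqrt(sum(value * value for value in row)) for row in matrix]
    order = sorted(range(n), key=lambda i: norms[i], reverse=True)
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical water-like geometry, for illustration only.
charges = [1, 8, 1]
coords = [(1.81, 0.0, 0.0), (0.0, 0.0, 0.0), (-0.45, 1.75, 0.0)]
sorted_cm = sort_by_row_norm(coulomb_matrix(charges, coords))
for row in sorted_cm:
    print(" ".join(f"{value:8.3f}" for value in row))
```

Sorting by row norm makes the result independent of the order in which the atoms are listed, which is why the sorted form is convenient as a fixed feature layout for learning.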

Optionally, scikit-learn is invoked at the end of the script to classify the molecules with an SVM.
