CheTo - Chemical Topic Modeling

CheTo - RC(=O)R

CheTo (ChemicalTopic) lets you apply topic modeling, a method developed in the text-mining field, to chemical data. Please see our recent publication for detailed information:

Schneider, N.; Fechner, N.; Landrum, G. A.; Stiefl, N. Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach. J. Chem. Inf. Model. 2017, http://pubs.acs.org/doi/10.1021/acs.jcim.7b00249

The supplementary material of the paper contains example data sets extracted from the ChEMBL database, along with Jupyter notebooks to run the experiments described in the paper.

An interactive web page showing an example topic model of data set A from our paper is available at http://www.t5informatics.com/Papers/InteractiveTopicModelDatasetA.html
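
Topic models for text start from a table that counts how often each word occurs in each document. The sketch below shows what the analogous table could look like for chemical data, assuming each molecule plays the role of a document and its substructure fragments play the role of words. It does not use the CheTo API: the fragment labels are hand-written placeholders and `build_count_matrix` is a hypothetical helper, included only to illustrate the shape of the input.

```python
from collections import Counter

# Hypothetical, hand-written fragment labels per molecule (illustration only).
# A real workflow would derive fragments from the structures with a
# cheminformatics toolkit instead of typing them in.
molecule_fragments = {
    "mol_1": ["c1ccccc1", "C(=O)N", "C(=O)N", "CO"],
    "mol_2": ["c1ccccc1", "CO"],
    "mol_3": ["C(=O)N", "CN", "CN"],
}


def build_count_matrix(fragments_by_molecule):
    """Return (molecule names, fragment vocabulary, count matrix).

    Rows correspond to molecules ("documents") and columns to fragments
    ("words"); each cell holds how often the fragment occurs in the molecule.
    """
    # The vocabulary is the sorted set of all fragments seen in any molecule.
    vocabulary = sorted(
        {frag for frags in fragments_by_molecule.values() for frag in frags}
    )
    column = {frag: i for i, frag in enumerate(vocabulary)}

    names = list(fragments_by_molecule)
    matrix = []
    for name in names:
        counts = Counter(fragments_by_molecule[name])
        row = [0] * len(vocabulary)
        for frag, n in counts.items():
            row[column[frag]] = n
        matrix.append(row)
    return names, vocabulary, matrix


names, vocabulary, matrix = build_count_matrix(molecule_fragments)
print("fragments:", vocabulary)
for name, row in zip(names, matrix):
    print(name, row)
```

A count matrix of this kind is the usual input to a topic-modeling algorithm, which would then assign each molecule a mixture of topics and each topic a distribution over fragments.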

Installation

To install CheTo using Conda, simply run:

conda install -c rdkit cheto

Further reading

Using CheTo in KNIME: http://rdkit.blogspot.ch/2017/08/chemical-topic-modeling-with-rdkit-and.html

After our article was published, we were made aware that Rajarshi Guha had also suggested applying topic modeling to chemical data in a 2012 post on his blog (http://blog.rguha.net/?p=997).