Datasets :: InformationIsPower

Datasets provided on the exeResearch LLC website.

Validating newly designed methods and protocols depends on being able to apply them to known datasets. No single dataset, or even several, can cover every case, but seeing how a new idea compares with previous results is a first step. Some of the datasets provided here are well known; others come from research conducted at exeResearch LLC.

CoMFA Steroid Benchmark Dataset

This steroid dataset was made famous by the original CoMFA article (J Am Chem Soc, 1988, 110(18), pp5959-5967. DOI:10.1021/ja00226a005). The provided steroid dataset is the corrected version by Eugene A Coats (Perspect Drug Discovery Des, 1998, 12/13/14, pp199-213. DOI:10.1023/A:1017050508855).

Selwood Dataset

The Selwood dataset is well known to those interested in genetic algorithms and QSAR modeling. This is the version of the dataset used in the Rogers and Hopfinger Genetic Function Approximation (GFA) study (J Chem Inf Comput Sci, 1994, 34(4), pp854-866. DOI:10.1021/ci00020a020). It was originally curated by Selwood et al. (J Med Chem, 1990, 33(1), pp136-142. DOI:10.1021/jm00163a023).

Oxime Dataset

The oxime dataset contains 17 oximes with percent reactivation values for cyclosarin, sarin, tabun, and VX. The conformations and AM1-BCC atomic charges are provided as a stacked MOL2 file, and the conformations together with the percent reactivation values are provided in an SDFile. This dataset is from Esposito et al. (Chem Res Toxicol, 2014, 27(1), pp99-110. DOI:10.1021/tx400350b).
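Because the reactivation values are stored as data fields in an SDFile, they can be read without a cheminformatics toolkit. The following is a minimal sketch in plain Python. It relies only on the general SD file layout (records terminated by `$$$$`, data items introduced by a `> <field>` header and closed by a blank line). The field names (`sarin_pct`, `VX_pct`), molecule names, and numeric values in the sample string are hypothetical placeholders, not taken from the actual dataset, so check the real field names in the provided file before using it.

```python
from typing import Dict, List


def _parse_record(lines: List[str]) -> Dict[str, str]:
    """Collect the molecule name (first header line) and all data fields."""
    record = {"_name": lines[0].strip()}
    i = 0
    while i < len(lines):
        line = lines[i]
        # A data header looks like "> <FIELD_NAME>" (extra tokens may follow).
        if line.startswith(">") and "<" in line and line.rindex(">") > line.index("<"):
            field = line[line.index("<") + 1 : line.rindex(">")]
            values = []
            i += 1
            # Value lines run until the blank line that closes the data item.
            while i < len(lines) and lines[i].strip():
                values.append(lines[i].strip())
                i += 1
            record[field] = "\n".join(values)
        i += 1
    return record


def read_sdf_records(text: str) -> List[Dict[str, str]]:
    """Split SD file text on '$$$$' delimiters and parse each record."""
    records = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if current:
                records.append(_parse_record(current))
            current = []
        else:
            current.append(line)
    return records


# Hypothetical two-record sample; names, fields, and values are placeholders.
SAMPLE_SDF = """example_1
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <sarin_pct>
12.5

> <VX_pct>
30.0

$$$$
example_2
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <sarin_pct>
4.0

> <VX_pct>
8.5

$$$$
"""

if __name__ == "__main__":
    for rec in read_sdf_records(SAMPLE_SDF):
        print(rec["_name"], float(rec["sarin_pct"]), float(rec["VX_pct"]))
```

To use it on the real file, pass the file contents, for example `read_sdf_records(open(path).read())` with `path` pointing at the downloaded SDFile. Values come back as strings so that non-numeric fields survive unchanged; convert with `float` where appropriate.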
