The Chemistry Development Kit (CDK)

Copyright © 1997-2021 The CDK Development Team

License: LGPL v2, see LICENSE.txt

The CDK is an open-source Java library for cheminformatics and bioinformatics.

Key Features:

  • Molecule and reaction valence bond representation.
  • Read and write file formats: SMILES, SDF, InChI, Mol2, CML, and others.
  • Efficient molecule processing algorithms: Ring Finding, Kekulisation, Aromaticity.
  • Coordinate generation and rendering.
  • Canonical identifiers for fast exact searching.
  • Substructure and SMARTS pattern searching.
  • ECFP, Daylight, MACCS, and other fingerprint methods for similarity searching.
  • QSAR descriptor calculations.


The CDK is a class library intended to be used by other programs; it will not run as a stand-alone program.

The library is built with Apache Maven and currently requires Java 1.7 or later. From the root of the project, run the command below to build the JAR files for each module. Once the build finishes, the bundle/target/ directory contains the main JAR with all dependencies included:

$ mvn install

You can also download a pre-built library JAR from the project's releases page.

Include the main JAR on the Java classpath when compiling and running your code:

$ javac -cp cdk-2.5.jar MyClass.java
$ java -cp cdk-2.5.jar:. MyClass
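
For illustration, a MyClass that could be compiled and run this way might look like the sketch below. It assumes the CDK 2.5 API (SmilesParser and SilentChemObjectBuilder); the explicitAtomCount helper is a hypothetical name introduced only for this example, not part of the CDK.

```java
import org.openscience.cdk.exception.InvalidSmilesException;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smiles.SmilesParser;

public class MyClass {

    // Hypothetical helper: parse a SMILES string and count its explicit atoms.
    // Hydrogens implied by the SMILES are stored as counts on each atom,
    // so they are not included in getAtomCount().
    static int explicitAtomCount(String smiles) throws InvalidSmilesException {
        SmilesParser parser = new SmilesParser(SilentChemObjectBuilder.getInstance());
        IAtomContainer molecule = parser.parseSmiles(smiles);
        return molecule.getAtomCount();
    }

    public static void main(String[] args) throws InvalidSmilesException {
        // Use the first command-line argument if given, otherwise ethanol.
        String smiles = args.length > 0 ? args[0] : "CCO";
        System.out.println(smiles + " has " + explicitAtomCount(smiles) + " explicit atoms");
    }
}
```

Run without arguments, this parses ethanol (CCO) and reports three explicit atoms: two carbons and one oxygen.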

If you are using Maven, you can depend on the uber cdk-bundle artifact. Note that it is much more efficient to include only the modules you need.
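
A minimal sketch of a dependency declaration for the bundle follows. It assumes the org.openscience.cdk group ID used on Maven Central and a version matching the cdk-2.5.jar shown earlier; adjust the version to the release you are targeting.

```xml
<!-- Assumed coordinates: org.openscience.cdk:cdk-bundle:2.5 -->
<dependency>
  <groupId>org.openscience.cdk</groupId>
  <artifactId>cdk-bundle</artifactId>
  <version>2.5</version>
</dependency>
```

To depend on individual modules instead, the same declaration would be repeated with each module's artifact ID in place of cdk-bundle.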


If you are a Python user, Noel O'Boyle's Cinfony project provides access via Jython. Cinfony wraps the CDK and other toolkits, exposing their core functionality through a consistent API.

Further details on building the project in integrated development environments (IDEs) are available on the wiki.

Getting Help

The Toolkit-Rosetta Wiki Page provides some examples for common tasks. If you need help using the CDK or have questions, please use the user mailing list (you must subscribe before you can post).


