Awesome Python Chemistry

License: CC BY 4.0

A curated list of awesome Python frameworks, libraries, software and resources related to Chemistry.

Inspired by awesome-python.

General Chemistry

Packages and tools for general chemistry.

  • aizynthfinder - A tool for retrosynthetic planning.
  • batchcalculator - A GUI app based on wxPython for calculating the correct amount of reactants (batch) for a particular composition given by the molar ratio of its components.
  • cctbx - The Computational Crystallography Toolbox.
  • ChemFormula - ChemFormula provides a class for working with chemical formulas. It allows parsing chemical formulas, calculating formula weights, and generating formatted output strings (e.g. in HTML, LaTeX, or Unicode).
  • chemlib - A robust and easy-to-use package that solves a variety of chemistry problems.
  • chempy - ChemPy is a package useful for chemistry (mainly physical/inorganic/analytical chemistry).
  • datamol - Molecular Manipulation Made Easy. A light wrapper built on top of RDKit.
  • GoodVibes - A Python program to compute quasi-harmonic thermochemical data from Gaussian frequency calculations.
  • hgraph2graph - Hierarchical Generation of Molecular Graphs using Structural Motifs.
  • ionize - Calculates the properties of individual ionic species in aqueous solution, as well as aqueous solutions containing arbitrary sets of ions.
  • LModeA-nano - Calculates the intrinsic chemical bond strength based on local vibrational mode theory in solids and molecules.
  • mendeleev - A package that provides a Python API for accessing various properties of elements from the periodic table of elements.
  • nmrglue - A package for working with nuclear magnetic resonance (NMR) data including functions for reading common binary file formats and processing NMR data.
  • Open Babel - A chemical toolbox designed to speak the many languages of chemical data.
  • periodictable - This package provides a periodic table of the elements with support for mass, density and xray/neutron scattering information.
  • propka - Predicts the pKa values of ionizable groups in proteins and protein-ligand complexes based on the 3D structure.
  • pybel - Pybel provides convenience functions and classes that make it simpler to use the Open Babel libraries from Python.
  • pycroscopy - Scientific analysis of nanoscale materials imaging data.
  • pyEQL - A set of tools for conventional calculations involving solutions (mixtures) and electrolytes.
  • pyiron - An integrated development environment (IDE) for computational materials science.
  • pymatgen - Python Materials Genomics is a robust, open-source library for materials analysis.
  • symfit - A curve-fitting library ideally suited to chemistry problems, including fitting experimental kinetics data.
  • symmetry - Symmetry is a library for materials symmetry analysis.
  • stk - A library for building, manipulating, analyzing and automatic design of molecules, including a genetic algorithm.
  • spectrochempy - A library for processing, analyzing and modeling spectroscopic data.
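
Several entries above (for example ChemFormula, chempy and periodictable) deal with parsing chemical formulas and computing formula weights. The sketch below illustrates the underlying idea using only the standard library. It is not the API of any listed package: the names `ATOMIC_MASS`, `parse_formula` and `formula_weight` are hypothetical, the mass table is deliberately tiny, and parentheses, hydrates and charges are not handled.

```python
import re
from collections import Counter

# Hypothetical, deliberately small table of standard atomic weights (g/mol).
ATOMIC_MASS = {
    "H": 1.008, "C": 12.011, "N": 14.007,
    "O": 15.999, "Na": 22.990, "Cl": 35.45,
}

# One element symbol (capital letter, optional lowercase letter) plus an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Return element counts for a flat formula such as 'C6H12O6'."""
    counts = Counter()
    pos = 0
    for match in _TOKEN.finditer(formula):
        if match.start() != pos:  # unparsable characters between tokens
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        symbol, number = match.groups()
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"unknown element {symbol!r} in this sketch")
        counts[symbol] += int(number) if number else 1
        pos = match.end()
    if pos != len(formula) or not counts:
        raise ValueError(f"cannot parse {formula!r}")
    return counts


def formula_weight(formula: str) -> float:
    """Sum atomic masses weighted by the parsed element counts."""
    return sum(ATOMIC_MASS[s] * n for s, n in parse_formula(formula).items())


if __name__ == "__main__":
    print(f"{formula_weight('H2O'):.3f}")  # 18.015
```

For real work, the listed packages cover the full periodic table, nested groups and formatted output, which this sketch intentionally leaves out.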

Machine Learning

Packages and tools for employing machine learning and data science in chemistry.

  • amp - An open-source package designed to easily bring machine learning to atomistic calculations.
  • atom3d - Enables machine learning on three-dimensional molecular structure.
  • chainer-chemistry - A deep learning framework (based on Chainer) with applications in Biology and Chemistry.
  • chemml - A machine learning and informatics program suite for the analysis, mining, and modeling of chemical and materials data.
  • chemprop - Message Passing Neural Networks for Molecule Property Prediction.
  • cgnn - Crystal graph convolutional neural networks for predicting material properties.
  • deepchem - Deep-learning models for Drug Discovery and Quantum Chemistry.
  • DeepPurpose - A Deep Learning Library for Compound and Protein Modeling DTI, Drug Property, PPI, DDI, Protein Function Prediction.
  • DescriptaStorus - Descriptor computation (chemistry) and (optional) storage for machine learning.
  • DScribe - Descriptor library containing a variety of fingerprinting techniques, including the Smooth Overlap of Atomic Positions (SOAP).
  • graphein - Provides functionality for producing geometric representations of protein and RNA structures, and biological interaction networks.
  • Matminer - Library of descriptors to aid in the data-mining of materials properties, created by the Lawrence Berkeley National Laboratory.
  • MoleOOD - A robust molecular representation learning framework against distribution shifts.
  • megnet - Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals.
  • MAML - Aims to provide useful high-level interfaces that make ML for materials science as easy as possible.
  • MORFEUS - Library for fast calculations of molecular features from 3D structures for machine learning with a focus on steric descriptors.
  • olorenchemengine - Molecular property prediction with unified API for diverse models and representations, with integrated uncertainty quantification, interpretability, and hyperparameter/architecture tuning.
  • schnetpack - Deep Neural Networks for Atomistic Systems.
  • selfies - Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation.
  • Summit - Package for optimizing chemical reactions using machine learning (contains 10 algorithms + several benchmarks).
  • TDC - Therapeutics Data Commons (TDC) is the first unifying framework to systematically access and evaluate machine learning across the entire range of therapeutics.
  • XenonPy - Library with several compositional and structural material descriptors, along with a few pre-trained neural network models of material properties.

Generative Molecular Design

Packages and tools for generating molecular species.

  • GraphINVENT - A platform for graph-based molecular generation using graph neural networks.
  • GuacaMol - A package for benchmarking of models for de novo molecular design.
  • moses - A benchmarking platform for molecular generation models.
  • perses - Experiments with expanded ensembles to explore chemical space.


Simulations

Packages for atomistic simulations and computational chemistry.

  • alchemlyb - Makes alchemical free energy calculations easier by leveraging the full power and flexibility of the PyData stack.
  • Atomic Simulation Environment (ASE) - A set of tools and modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.
  • basis_set_exchange - A library containing basis sets for use in quantum chemistry calculations. In addition, this library has functionality for manipulation of basis set data.
  • CACTVS - Cactvs is a universal, scriptable cheminformatics toolkit, with a large collection of modules for property computation, chemistry data file I/O and other tasks.
  • CalcUS - Quantum chemistry web platform that brings all the necessary tools to perform quantum chemistry in a user-friendly web interface.
  • cantera - A collection of object-oriented software tools for problems involving chemical kinetics, thermodynamics, and transport processes.
  • CatKit - General purpose tools for high-throughput catalysis.
  • ccinput - A tool and library for creating quantum chemistry input files.
  • cclib - A library for parsing output files from various quantum chemical programs.
  • cinfony - A common API to several cheminformatics toolkits (Open Babel, RDKit, the CDK, Indigo, JChem, OPSIN and cheminformatics webservices).
  • chemlab - A library that can help the user with chemistry-relevant calculations.
  • emmet - A package to 'build' collections of materials properties from the output of computational materials calculations.
  • fromage - The "FRamewOrk for Molecular AGgregate Excitations" enables localised QM/QM' excited state calculations in a solid state environment.
  • GPAW - A density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE).
  • horton - Helpful Open-source Research TOol for N-fermion system, a quantum-chemistry program that can perform computations involving model Hamiltonians.
  • HTMD - High-Throughput Molecular Dynamics: Programming Environment for Molecular Discovery.
  • Indigo - Universal cheminformatics libraries, utilities and database search tools.
  • Jarvis-tools - An open-access software package for atomistic data-driven materials design.
  • mathchem - A free open source package for calculating topological indices and other invariants of molecular graphs.
  • MDAnalysis - An object-oriented library to analyze trajectories from molecular dynamics (MD) simulations in many popular formats.
  • MDTraj - Package for manipulating molecular dynamics trajectories with support for multiple formats.
  • MMTK - The Molecular Modeling Toolkit is an Open Source program library for molecular simulation applications.
  • MolMod - A library with many components that are useful to write molecular modeling programs.
  • oddt - Open Drug Discovery Toolkit, a modular and comprehensive toolkit for use in cheminformatics, molecular modeling etc.
  • OPEM - Open source PEM (Proton Exchange Membrane) fuel cell simulation tool.
  • openmmtools - A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
  • overreact - A library and command-line tool for building and analyzing complex homogeneous microkinetic models from quantum chemistry calculations, with support for quasi-harmonic thermochemistry, quantum tunnelling corrections, molecular symmetries and more.
  • ParmEd - Parameter/topology editor and molecular simulator with visualization capability.
  • pGrAdd - A library for estimating thermochemical properties of molecules and adsorbates using group additivity.
  • phonopy - An open source package for phonon calculations at harmonic and quasi-harmonic levels.
  • PLAMS - Python Library for Automating Molecular Simulation: input preparation, job execution, file management, output processing and building data workflows.
  • pMuTT - A library for ab-initio thermodynamic and kinetic parameter estimation.
  • PorePy - A Simulation Tool for Fractured and Deformable Porous Media.
  • ProDy - An open source package for protein structural dynamics analysis with a flexible and responsive API.
  • ProLIF - Interaction Fingerprints for protein-ligand complexes and more.
  • Psi4 - A hybrid Python/C++ open-source package for quantum chemistry.
  • Psi4NumPy - Psi4-based reference implementations and Jupyter notebook-based tutorials for foundational quantum chemistry methods.
  • pyEMMA - Library for the estimation, validation and analysis of Markov models of molecular kinetics and other kinetic and thermodynamic models from molecular dynamics data.
  • pygauss - An interactive tool for supporting the life cycle of a computational molecular chemistry investigation.
  • PyQuante - An open-source suite of programs for developing quantum chemistry methods.
  • pysic - A calculator incorporating various empirical pair and many-body potentials.
  • Pyscf - A quantum chemistry package written in Python.
  • pyvib2 - A program for analyzing vibrational motion and vibrational spectra.
  • RDKit - Open-Source Cheminformatics Software.
  • ReNView - A program to visualize reaction networks.
  • stk - A library for building, manipulating, analyzing and automatic design of molecules.
  • QMsolve - A module for solving and visualizing the Schrödinger equation.
  • QUIP - A collection of software tools to carry out molecular dynamics simulations.
  • torchmd - End-To-End Molecular Dynamics (MD) Engine using PyTorch.
  • tsase - A library that builds on ASE to tackle transition state calculations.
  • yank - An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.

Force Fields

Packages related to force fields.

  • FitSNAP - A Package For Training SNAP Interatomic Potentials for use in the LAMMPS molecular dynamics package.
  • FLARE - A package for creating fast and accurate interatomic potentials.
  • global-chem - A Chemical Knowledge Graph and Toolkit, written in IUPAC/SMILES/SMARTS, for common small molecules from diverse communities to aid users in selecting compounds for forcefield parameterization.
  • NeuralForceField - Neural Network Force Field based on PyTorch.
  • openff-toolkit - The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools.

Molecular Visualization

Packages for viewing molecular structures.

  • ase-gui - The graphical user-interface allows users to visualize, manipulate, and render molecular systems and atoms objects.
  • chemiscope - An interactive structure/property explorer for materials and molecules.
  • chemview - An interactive molecular viewer designed for the IPython notebook.
  • imolecule - An embeddable webGL molecule viewer and file format converter.
  • moleculekit - A molecule manipulation library.
  • nglview - A Jupyter widget to interactively view molecular structures and trajectories.
  • PyMOL - A user-sponsored molecular visualization system on an open-source foundation, maintained and distributed by Schrödinger.
  • pymoldyn - A viewer for atomic clusters, crystalline and amorphous materials in a unit cell corresponding to one of the seven 3D Bravais lattices.
  • sumo - A toolkit for plotting and analysis of ab initio solid-state calculation data.
  • surfinpy - A library for the analysis, plotting and visualisation of ab initio surface calculation data.
  • trident-chemwidgets - Jupyter Widgets to interact with molecular datasets.

Database Wrappers

Packages providing a Python layer for accessing chemical databases.

  • ccdc - An API for the Cambridge Structural Database System.
  • ChemSpiPy - ChemSpider wrapper, that allows chemical searches, chemical file downloads, depiction and retrieval of chemical properties.
  • CIRpy - An interface for the Chemical Identifier Resolver (CIR) by the CADD Group at the NCI/NIH.
  • pubchempy - PubChemPy provides a way to interact with PubChem in Python.
  • chembl-downloader - Automate downloading and querying the latest (or a given) version of ChEMBL.
  • drugbank-downloader - Automate downloading, opening, and parsing DrugBank.
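
Wrappers such as pubchempy hide the details of a database's web interface behind Python calls. As a rough, standard-library-only illustration of what such a wrapper does under the hood, the sketch below builds a request URL for PubChem's PUG REST property endpoint. The function name `pubchem_property_url` and its defaults are hypothetical and are not part of pubchempy; no request is sent, so the code runs offline.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL asking for selected properties of a compound by name.

    Hypothetical helper for illustration only; it constructs the URL and
    leaves fetching and JSON decoding to the caller.
    """
    if not name or not properties:
        raise ValueError("a compound name and at least one property are required")
    encoded_name = quote(name, safe="")  # percent-encode spaces and slashes
    return (
        f"{PUG_REST_BASE}/compound/name/{encoded_name}"
        f"/property/{','.join(properties)}/JSON"
    )


if __name__ == "__main__":
    print(pubchem_property_url("aspirin"))
```

A dedicated wrapper adds what this sketch omits: sending the request, error handling, rate limiting and converting the response into Python objects.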

Learning Resources

Resources for learning to apply Python to chemistry.

  • An Introduction to Applied Bioinformatics - A Jupyter book demonstrating working with biochemical data using the scikit-bio library for tasks such as sequence alignment and calculating Hamming distances.
  • Computational Thermodynamics - This collection of Jupyter notebooks demonstrates solutions to a range of thermodynamic problems including solving chemical equilibria, comparing real versus ideal gas behavior, and calculating the temperature and composition of a combustion reaction.
  • SciCompforChemists - Scientific Computing for Chemists with Python is a Jupyter book that teaches basic Python skills for chemistry, including relevant libraries, and applies them to solving chemical problems.
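
The first resource above mentions calculating Hamming distances between sequences. For readers new to the term, here is a minimal plain-Python sketch of that calculation. It does not use scikit-bio, and the function name `hamming_distance` is chosen here for illustration.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    # The two sequences differ at the third and sixth positions.
    print(hamming_distance("GATTACA", "GACTATA"))  # 2
```

Because the measure is only defined for sequences of equal length, the sketch raises an error instead of silently truncating the longer input.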

Miscellaneous Awesome

  • Colorful Nuclide Chart - A beautiful, interactive visualization of nuclides that gives access to a variety of nuclear properties and allows saving high-quality images for publications, presentations and outreach.

See Also

  • awesome-cheminformatics - Another list focusing on cheminformatics, including tools not only in Python.
  • awesome-small-molecule-ml - A collection of papers, datasets, and packages for small-molecule drug discovery. Most links to code are in Python.
  • awesome-molecular-docking - A curated list of molecular docking software, datasets, and papers.
  • jarvis - Joint Automated Repository for Various Integrated Simulations is a repository designed to automate materials discovery and optimization using classical force-field, density functional theory, machine learning calculations and experiments.
  • polypharmacy-ddi-synergy-survey - A collection of research papers (with Python implementations) focusing on drug-drug interactions, synergy and polypharmacy.