initial cli
hgarrereyn committed Jun 14, 2021
1 parent a72d50e commit 670da60
Showing 2 changed files with 510 additions and 3 deletions.
137 changes: 134 additions & 3 deletions
@@ -1,11 +1,142 @@

# DeepFrag

This repository contains code for machine learning based lead optimization.
DeepFrag is a machine learning model for fragment-based lead optimization. In this repository, you will find code to train the model and code to run inference using a pre-trained model.

# Examples
# Citation

See [this Colab]( for an interactive example of how to use a pre-trained DeepFrag model to generate predictions.
If you use DeepFrag in your research, please cite as:

Green, H., Koes, D. R., & Durrant, J. D. (2021). DeepFrag: a deep convolutional neural network for fragment-based lead optimization. Chemical Science.

title={DeepFrag: a deep convolutional neural network for fragment-based lead optimization},
author={Green, Harrison and Koes, David Ryan and Durrant, Jacob D},
journal={Chemical Science},
publisher={Royal Society of Chemistry}

# Usage

There are three ways to use DeepFrag:

1. **DeepFrag Browser App**: We have released a free, open-source browser app for DeepFrag that requires no setup and does not transmit any structures to a remote server.
- View the online version at [](
- See the code at [](
2. **DeepFrag CLI**: In this repository we have included a `` script that can perform common prediction tasks using the API.
- See the `DeepFrag CLI` section below
3. **DeepFrag API**: For custom tasks or fine-grained control over predictions, you can invoke the DeepFrag API directly and interface with the raw data structures and the PyTorch model. We have created an example Google Colab (Jupyter notebook) that demonstrates how to perform manual predictions.
- See the interactive [Colab](

# DeepFrag CLI

The DeepFrag CLI is invoked by running `python3` in this repository. The CLI requires a pre-trained model and the fragment library to run. The first time you run the CLI, you will be prompted to download both; they are saved in the `./.store` directory.

## Structure (specify exactly one)
The input structures are specified either as separate receptor and ligand PDB files or as a PDB ID together with the ligand residue number.
- `--receptor <rec.pdb> --ligand <lig.pdb>`
- `--pdb <pdbid> --resnum <resnum>`
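The "exactly one" rule above can be checked mechanically. The following is a minimal sketch of such a check using `argparse`; only the flag names are taken from this README, and the functions `build_parser` and `structure_mode` are hypothetical helpers written for illustration, not the CLI's actual implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser; only the flag names come from the README."""
    parser = argparse.ArgumentParser(description="Structure-input sketch")
    parser.add_argument("--receptor", help="path to a receptor PDB file")
    parser.add_argument("--ligand", help="path to a ligand PDB file")
    parser.add_argument("--pdb", help="PDB ID, e.g. 2xp9")
    parser.add_argument("--resnum", type=int, help="ligand residue number")
    return parser


def structure_mode(args: argparse.Namespace) -> str:
    """Return 'files' or 'pdb'; raise ValueError for any other combination."""
    # A pair is "partial" when exactly one of its two flags was given.
    partial_files = (args.receptor is None) != (args.ligand is None)
    partial_pdb = (args.pdb is None) != (args.resnum is None)
    if partial_files or partial_pdb:
        raise ValueError("each structure option needs both of its flags")

    has_files = args.receptor is not None
    has_pdb = args.pdb is not None
    if has_files == has_pdb:  # both given, or neither given
        raise ValueError("specify exactly one structure option")
    return "files" if has_files else "pdb"


if __name__ == "__main__":
    parsed = build_parser().parse_args(["--pdb", "2xp9", "--resnum", "1165"])
    print(structure_mode(parsed))
```

Validating the pairs by hand, rather than with `add_mutually_exclusive_group`, makes it possible to also require that both flags of a pair appear together.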

## Connection Point (specify exactly one)

DeepFrag will predict new fragments that connect to the _connection point_ via a single bond. You must specify the connection point atom using one of the following:
- `--cname <name>`: Specify the connection point by atom name (e.g. `C3`, `N5`, `O2`, ...).
- `--cx <x> --cy <y> --cz <z>`: Specify the connection point by atomic coordinate. DeepFrag will find the closest atom to this point.
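To make the coordinate option concrete, here is a small sketch of a nearest-atom lookup. It assumes the ligand is available as a mapping from atom name to Cartesian coordinates; the function `closest_atom` and the example atoms are made up for illustration and may differ from how DeepFrag resolves the point internally.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def closest_atom(atoms: Dict[str, Coord], point: Coord) -> str:
    """Return the name of the atom whose coordinates are nearest to point."""
    if not atoms:
        raise ValueError("no atoms supplied")
    # math.dist computes the Euclidean distance (Python 3.8+).
    return min(atoms, key=lambda name: math.dist(atoms[name], point))


# Hypothetical ligand atoms (names and coordinates are invented).
ligand = {
    "C1": (0.0, 0.0, 0.0),
    "C2": (1.5, 0.0, 0.0),
    "N5": (1.5, 1.4, 0.0),
}

print(closest_atom(ligand, (1.4, 1.2, 0.1)))  # prints N5
```

Because the nearest atom is always returned, a badly chosen point still selects some atom, so it is worth confirming the selected atom name when using coordinates.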

## Fragment Removal (optional; if used, specify exactly one)

If you are using DeepFrag for fragment _replacement_, you must first remove the original fragment from the ligand structure. You can do this by hand (e.g. by editing the PDB file), or you can have DeepFrag do it by specifying _which_ fragment should be removed.

_Note: predicting fragments in place of hydrogen atoms (e.g. protons) does not require any fragment removal since hydrogen atoms are ignored by the model._

To remove a fragment, specify a second atom that is contained in the fragment. As with the connection point, you can use either the atom name or the atom coordinate.

- `--rname <name>`: Specify the fragment atom by atom name (e.g. `C3`, `N5`, `O2`, ...).
- `--rx <x> --ry <y> --rz <z>`: Specify the fragment atom by atomic coordinate. DeepFrag will find the closest atom to this point.
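One way to picture how a connection point plus a second atom can identify a whole fragment is as a graph traversal: starting from the second atom, collect every atom reachable without crossing the connection point. The sketch below assumes the ligand is given as a bond adjacency mapping and that the fragment is attached to the rest of the ligand only through the connection point. The function names and the toy molecule are hypothetical; this is not necessarily how DeepFrag performs the removal.

```python
from collections import deque
from typing import Dict, Set

Bonds = Dict[str, Set[str]]


def fragment_atoms(bonds: Bonds, conn: str, frag_atom: str) -> Set[str]:
    """Collect atoms reachable from frag_atom without passing through conn."""
    if conn not in bonds or frag_atom not in bonds:
        raise KeyError("unknown atom name")
    if conn == frag_atom:
        raise ValueError("fragment atom must differ from the connection point")
    seen = {frag_atom}
    queue = deque([frag_atom])
    while queue:  # breadth-first search that never steps onto conn
        atom = queue.popleft()
        for neighbor in bonds[atom]:
            if neighbor != conn and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def remove_fragment(bonds: Bonds, conn: str, frag_atom: str) -> Bonds:
    """Return a copy of the bond graph with the fragment's atoms deleted."""
    doomed = fragment_atoms(bonds, conn, frag_atom)
    return {a: nbrs - doomed for a, nbrs in bonds.items() if a not in doomed}


# Toy ligand: N1-C1-C2-C3(=O1)-O2, i.e. a carboxylate hanging off C2.
toy = {
    "N1": {"C1"},
    "C1": {"N1", "C2"},
    "C2": {"C1", "C3"},
    "C3": {"C2", "O1", "O2"},
    "O1": {"C3"},
    "O2": {"C3"},
}

print(sorted(fragment_atoms(toy, conn="C2", frag_atom="O1")))
print(sorted(remove_fragment(toy, conn="C2", frag_atom="O1")))
```

In this sketch, any atom inside the fragment selects the same set of atoms, which matches the description above that the second atom only needs to be contained in the fragment.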

## Output (optional)

By default, DeepFrag will print a list of fragment predictions to stdout similar to the [Browser App](

- `--out <out.csv>`: Save predictions in CSV format to `out.csv`.

## Miscellaneous (optional)

- `--cpu/--gpu`: By default, DeepFrag checks whether a CUDA GPU is available and falls back to the CPU if it is not. Set either the `--cpu` or the `--gpu` flag to choose the target device explicitly.
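The device fallback described above can be sketched as follows. `torch.cuda.is_available()` is a real PyTorch call, but `pick_device`, its parameters, and the returned device strings are assumptions made for this example and are not taken from the DeepFrag code.

```python
from typing import Optional


def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device (False if torch is absent)."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_device(force_cpu: bool = False, force_gpu: bool = False,
                has_cuda: Optional[bool] = None) -> str:
    """Choose 'cpu' or 'cuda' from explicit flags, else from availability."""
    if force_cpu and force_gpu:
        raise ValueError("choose at most one of --cpu and --gpu")
    if force_cpu:
        return "cpu"
    if force_gpu:
        return "cuda"
    # No explicit flag: probe for CUDA unless the caller supplied the answer.
    if has_cuda is None:
        has_cuda = cuda_available()
    return "cuda" if has_cuda else "cpu"


if __name__ == "__main__":
    print(pick_device())
```

Passing `has_cuda` explicitly keeps the selection logic testable on machines without PyTorch or a GPU.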

# Reproduce Results

You can use the DeepFrag CLI to reproduce the highlighted results from the main manuscript:

## 1. Fragment replacement

To replace fragments, specify the connection point (`cname` or `cx/cy/cz`) and specify a second atom that is contained in the fragment (`rname` or `rx/ry/rz`).

# Fig. 3: (2XP9) H. sapiens peptidyl-prolyl cis–trans isomerase NIMA-interacting 1 (HsPin1p)

# Carboxylate A
$ python3 --pdb 2xp9 --resnum 1165 --cname C10 --rname C12

# Phenyl B
$ python3 --pdb 2xp9 --resnum 1165 --cname C1 --rname C2

# Phenyl C
$ python3 --pdb 2xp9 --resnum 1165 --cname C18 --rname C19

# Fig. 4A: (6QZ8) Protein myeloid cell leukemia 1 (Mcl-1)

# Carboxylate group interacting with R263
$ python3 --pdb 6qz8 --resnum 401 --cname C12 --rname C14

# Ethyl group
$ python3 --pdb 6qz8 --resnum 401 --cname C6 --rname C10

# Methyl group
$ python3 --pdb 6qz8 --resnum 401 --cname C25 --rname C30

# Chlorine atom
$ python3 --pdb 6qz8 --resnum 401 --cname C28 --rname CL

# Fig. 4B: (1X38) Family GH3 b-D-glucan glucohydrolase (barley)

# Hydroxyl group interacting with R158 and D285
$ python3 --pdb 1x38 --resnum 1001 --cname C2B --rname O2B

# Phenyl group interacting with W286 and W434
$ python3 --pdb 1x38 --resnum 1001 --cname C7B --rname C1

# Fig. 4C: (4FOW) NanB sialidase (Streptococcus pneumoniae)

# Amino group
$ python3 --pdb 4fow --resnum 701 --cname CAE --rname NAA

## 2. Fragment addition

For fragment addition, you only need to specify the connection point atom (`cname` or `cx/cy/cz`). In this case, DeepFrag will implicitly replace a valent hydrogen.

# Fig. 5: Ligands targeting the SARS-CoV-2 main protease (MPro)

# 5A: (5RGH) Extension on Z1619978933
$ python3 --pdb 5rgh --resnum 404 --cname C09

# 5B: (5R81) Extension on Z1367324110
$ python3 --pdb 5r81 --resnum 1001 --cname C07
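If you want to run several of these predictions in sequence, the argument lists can be generated from a small table. The sketch below only builds and prints the commands; `CLI_SCRIPT` is a placeholder that you must replace with the path to the CLI script, and `build_args` is a hypothetical helper. The flags and the example values are the ones shown in the commands above.

```python
from typing import List, Optional

# Placeholder: replace with the path to the DeepFrag CLI script.
CLI_SCRIPT = "PATH_TO_DEEPFRAG_CLI"


def build_args(pdb: str, resnum: int, cname: str,
               rname: Optional[str] = None) -> List[str]:
    """Build an argument list; omit --rname for fragment addition."""
    args = ["python3", CLI_SCRIPT, "--pdb", pdb,
            "--resnum", str(resnum), "--cname", cname]
    if rname is not None:
        args += ["--rname", rname]
    return args


# (pdb, resnum, cname, rname) taken from the examples in this section.
RUNS = [
    ("2xp9", 1165, "C10", "C12"),  # Fig. 3, carboxylate A
    ("6qz8", 401, "C28", "CL"),    # Fig. 4A, chlorine atom
    ("5rgh", 404, "C09", None),    # Fig. 5A, fragment addition
]

for run in RUNS:
    print(" ".join(build_args(*run)))
```

Each list can be passed to `subprocess.run` once `CLI_SCRIPT` points at the real script.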

# Overview
