`BpForms`: toolkit for concretely describing non-canonical DNA, RNA, and proteins

BpForms is a set of tools for concretely representing the primary structures of non-canonical forms of biopolymers, such as oxidized DNA, methylated RNA, and acetylated proteins, and calculating properties of non-canonical biopolymers.

BpForms encompasses the following tools:

- A grammar for concretely describing the primary structures of non-canonical biopolymers. See the documentation for more information. For example, the following text represents a modified DNA molecule that contains a deoxyinosine monomeric form at the fourth position.
  ```
  ACG[id: "dI"
       | structure: "[H][C@]1(O)C[C@@]([H])(O[C@]1([H])CO)N1C=NC2=C1N=CN=C2O"]T
  ```
  This concrete representation enables the BpForms software tools to calculate properties of non-canonical biopolymers.
- Tools for calculating properties of non-canonical biopolymers, including their chemical formulae, molecular weights, charges, and major protonation and tautomerization states.
- A web app: https://bpforms.org
- A JSON REST API: https://bpforms.org/api
- A command line interface. See the documentation for more information.
- A Python API. See the documentation for more information.
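
To illustrate how the notation separates canonical monomeric forms from inline definitions, the following minimal Python sketch splits a sequence such as the deoxyinosine example into its monomeric forms. It is not the BpForms parser. The function name `split_monomeric_forms` is hypothetical, and the sketch assumes only single-character codes and bracketed definitions whose quoted strings contain no escaped quotes.

```python
def split_monomeric_forms(seq):
    """Split a sequence into single-character codes and bracketed inline definitions.

    Illustrative only (hypothetical helper, not part of the BpForms API).
    """
    forms = []
    i, n = 0, len(seq)
    while i < n:
        ch = seq[i]
        if ch.isspace():
            # Whitespace between monomeric forms is ignored.
            i += 1
        elif ch == '[':
            # Scan to the matching ']' while skipping brackets inside quoted
            # strings, since SMILES structures contain '[' and ']' themselves.
            j = i + 1
            in_quotes = False
            while j < n:
                c = seq[j]
                if c == '"':
                    in_quotes = not in_quotes
                elif c == ']' and not in_quotes:
                    break
                j += 1
            else:
                raise ValueError("unterminated '[' at position %d" % i)
            forms.append(seq[i:j + 1])
            i = j + 1
        else:
            # Any other character is treated as a one-letter monomer code.
            forms.append(ch)
            i += 1
    return forms


example = (
    'ACG[id: "dI"'
    ' | structure: "[H][C@]1(O)C[C@@]([H])(O[C@]1([H])CO)N1C=NC2=C1N=CN=C2O"]T'
)
forms = split_monomeric_forms(example)
print(len(forms))    # 5
print(forms[3][:9])  # [id: "dI"
```

The inline definition is the fourth element, which matches the description of deoxyinosine at the fourth position.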

BpForms was motivated by the need to concretely represent the biochemistry of DNA modification, DNA repair, post-transcriptional processing, and post-translational processing in whole-cell computational models. BpForms is also a valuable tool for experimental proteomics and synthetic biology. In particular, we developed BpForms because there were no notations, schemas, data models, or file formats for concretely representing non-canonical forms of biopolymers, despite the existence of several databases and ontologies of DNA, RNA, and protein modifications, the ProForma Proteoform Notation, and the MODOMICS codes for modified RNA bases.

BpForms can be combined with BcForms to concretely describe the primary structure of complexes.

Installation

The following is a brief guide to installing BpForms. The Dockerfile in the repository contains detailed instructions for how to install BpForms in Ubuntu Linux.

Install the third-party dependencies listed below.
- ChemAxon Marvin (optional): required to calculate major protonation and tautomerization states and to draw molecules
  - Java >= 1.8
- Open Babel
- Pip >= 19.0
- Python >= 3.6
To use Marvin to calculate major protonation and tautomerization states, set `JAVA_HOME` to the path to your Java virtual machine (JVM):
```
export JAVA_HOME=/usr/lib/jvm/default-java
```
To use Marvin to calculate major protonation and tautomerization states, add Marvin to the Java class path:
```
export CLASSPATH=$CLASSPATH:/opt/chemaxon/marvinsuite/lib/MarvinBeans.jar
```
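
Before using the Marvin-dependent features, it can help to confirm that both variables are visible to Python. The following is a minimal sketch using only the standard library. The helper name `check_marvin_env` is hypothetical and not part of BpForms, and the check only inspects the variables. It does not verify that Java or Marvin actually runs.

```python
import os


def check_marvin_env(env=None):
    """Return a list of problems with the Marvin-related environment variables.

    Hypothetical helper for illustration; an empty list means both variables look set.
    """
    env = os.environ if env is None else env
    problems = []

    # JAVA_HOME should point to the Java virtual machine.
    if not env.get("JAVA_HOME", ""):
        problems.append("JAVA_HOME is not set")

    # CLASSPATH should contain an entry ending in MarvinBeans.jar.
    entries = [e for e in env.get("CLASSPATH", "").split(os.pathsep) if e]
    if not any(os.path.basename(e) == "MarvinBeans.jar" for e in entries):
        problems.append("CLASSPATH does not include MarvinBeans.jar")

    return problems


# Report any problems found in the current environment.
for problem in check_marvin_env():
    print(problem)
```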

Install this package

Install the latest release from PyPI:
```
pip install bpforms
```

Install the latest revision from GitHub:

```
pip install git+https://github.com/KarrLab/pkg_utils.git#egg=pkg_utils
pip install git+https://github.com/KarrLab/wc_utils.git#egg=wc_utils[chem]
pip install git+https://github.com/KarrLab/bpforms.git#egg=bpforms
```

To calculate major protonation and tautomerization states, BpForms must be installed with the [protontation] option:

```
pip install bpforms[protontation]
pip install git+https://github.com/KarrLab/bpforms.git#egg=bpforms[protontation]
```

To draw molecules, BpForms must be installed with the [draw] option:

```
pip install bpforms[draw]
pip install git+https://github.com/KarrLab/bpforms.git#egg=bpforms[draw]
```

To export the alphabets in OBO format, BpForms must be installed with the [onto_export] option:

```
pip install bpforms[onto_export]
pip install git+https://github.com/KarrLab/bpforms.git#egg=bpforms[onto_export]
```

To install the REST API, BpForms must be installed with the [rest_api] option:

```
pip install bpforms[rest_api]
pip install git+https://github.com/KarrLab/bpforms.git#egg=bpforms[rest_api]
```

Examples, tutorial, and documentation

Please see the documentation. An interactive tutorial is also available in the whole-cell modeling sandbox.

License

The package is released under the MIT license.

Citing `BpForms`

Lang PF, Chebaro Y, Zheng X, Sekar JAP, Shaikh B, Natale DA & Karr JR. BpForms and BcForms: a toolkit for concretely describing non-canonical polymers and complexes to facilitate global biochemical networks. Genome Biology.

Development team

This package was developed by the Karr Lab at the Icahn School of Medicine at Mount Sinai in New York, USA.

Jonathan Karr
Yassmine Chebaro
Paul Lang
John Sekar
Bilal Shaikh
Darren Natale

Questions and comments

Please contact the Karr Lab with any questions or comments.
