`BcForms` is a toolkit for concretely describing the primary structure of macromolecular complexes, including non-canonical monomeric forms and intra- and inter-subunit crosslinks. `BcForms` includes a textual grammar for describing complexes, along with a Python library, a command-line program, and a REST API for validating and manipulating complexes described in this grammar. `BcForms` represents a complex as a set of subunits, each with a stoichiometry, plus the covalent crosslinks that connect those subunits. DNA, RNA, and protein subunits can be represented using `BpForms`. Small molecule subunits can be represented using `openbabel.OBMol`, typically imported from SMILES or InChI.

This notebook illustrates how to use the `BcForms` Python library via some simple examples. Please see the second tutorial for more details and more examples. Please also see the [documentation](https://docs.karrlab.org/bcforms/) for more information about the `BcForms` grammar and for instructions on using the `BcForms` website, JSON REST API, and command-line interface.

# Import BpForms and BcForms libraries

In [1]:
import bcforms
import bpforms

# Create complexes from their string representations

In [2]:
form_1 = bcforms.BcForm().from_str('2 * subunit_a + 3 * subunit_b')
form_1.set_subunit_attribute('subunit_a', 'structure',
    bpforms.ProteinForm().from_str('CAAAAAAAA'))
form_1.set_subunit_attribute('subunit_b', 'structure',
    bpforms.ProteinForm().from_str('AAAAAAAAC'))

In [3]:
form_2 = bcforms.BcForm().from_str(
    '2 * subunit_a'
    '| x-link: [type: disulfide | l: subunit_a(1)-1 | r: subunit_a(2)-1]')
form_2.set_subunit_attribute('subunit_a', 'structure',
    bpforms.ProteinForm().from_str('CAAAAAAAA'))
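
To make the shape of these strings concrete, here is a minimal plain-Python sketch that splits them into a stoichiometry table and a list of crosslink attributes. It is not the `BcForms` parser and covers only the two patterns used above. The function name `parse_complex` is hypothetical, and treating a missing coefficient as 1 is an assumption of this sketch.

```python
import re


def parse_complex(text):
    """Split a complex string into (stoichiometry dict, crosslink attribute dicts).

    Illustrative only: handles 'N * name + ...' terms followed by optional
    '| x-link: [key: value | ...]' blocks, nothing else.
    """
    # Everything before the first '|' lists the subunits.
    subunit_part = text.split('|', 1)[0]
    stoichiometry = {}
    for term in subunit_part.split('+'):
        match = re.fullmatch(r'(?:(\d+)\s*\*\s*)?(\w+)', term.strip())
        if match is None:
            raise ValueError('cannot parse subunit term: {!r}'.format(term))
        count, name = match.groups()
        # Assumption of this sketch: a term without a coefficient counts once.
        stoichiometry[name] = stoichiometry.get(name, 0) + int(count or 1)

    # Each crosslink is a bracketed list of 'key: value' fields.
    crosslinks = []
    for body in re.findall(r'x-link:\s*\[([^\]]*)\]', text):
        attrs = {}
        for field in body.split('|'):
            key, value = field.split(':', 1)
            attrs[key.strip()] = value.strip()
        crosslinks.append(attrs)
    return stoichiometry, crosslinks


print(parse_complex('2 * subunit_a + 3 * subunit_b'))
print(parse_complex(
    '2 * subunit_a'
    '| x-link: [type: disulfide | l: subunit_a(1)-1 | r: subunit_a(2)-1]'))
```

In the crosslink fields, `subunit_a(2)-1` reads as "monomer 1 of the second copy of `subunit_a`", which is how the two ends of the disulfide bond are addressed.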

# Create complexes programmatically

In [4]:
form_1_b = bcforms.BcForm()
form_1_b.subunits.append(bcforms.core.Subunit('subunit_a', 2,
    bpforms.ProteinForm().from_str('CAAAAAAAA')))
form_1_b.subunits.append(bcforms.core.Subunit('subunit_b', 3,
    bpforms.ProteinForm().from_str('AAAAAAAAC')))

In [5]:
form_2_b = bcforms.BcForm()
subunit = bcforms.core.Subunit('subunit_a', 2,
    bpforms.ProteinForm().from_str('CAAAAAAAA'))
form_2_b.subunits.append(subunit)
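# Disulfide bond between monomer 1 of the first copy and monomer 1 of the
# second copy of subunit_a, mirroring the x-link in form_2
form_2_b.crosslinks.append(bcforms.core.OntologyCrosslink(
    'disulfide', 'subunit_a', 1, 'subunit_a', 1, 1, 2))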
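
The data model behind these calls (subunits with stoichiometries, plus crosslinks that address one monomer in one copy of a subunit) can be sketched with plain dataclasses. The classes below are hypothetical stand-ins, not the `bcforms.core` classes. Their field names and the `check` method are assumptions made for illustration. The point is only that each crosslink end must refer to an existing subunit, copy, and monomer.

```python
from dataclasses import dataclass, field


@dataclass
class SubunitSketch:
    id: str
    stoichiometry: int
    sequence: str  # one-letter residues, standing in for a BpForms structure


@dataclass
class CrosslinkSketch:
    type: str
    l_subunit: str
    l_subunit_idx: int  # which copy of the subunit (1-based)
    l_monomer: int      # which monomer within that copy (1-based)
    r_subunit: str
    r_subunit_idx: int
    r_monomer: int


@dataclass
class ComplexSketch:
    subunits: list = field(default_factory=list)
    crosslinks: list = field(default_factory=list)

    def check(self):
        """Return a list of problems; empty if every crosslink end exists."""
        by_id = {s.id: s for s in self.subunits}
        errors = []
        for x in self.crosslinks:
            ends = (('l', x.l_subunit, x.l_subunit_idx, x.l_monomer),
                    ('r', x.r_subunit, x.r_subunit_idx, x.r_monomer))
            for side, name, copy, monomer in ends:
                subunit = by_id.get(name)
                if subunit is None:
                    errors.append('{}: unknown subunit {!r}'.format(side, name))
                elif not 1 <= copy <= subunit.stoichiometry:
                    errors.append('{}: {} has no copy {}'.format(side, name, copy))
                elif not 1 <= monomer <= len(subunit.sequence):
                    errors.append('{}: {} has no monomer {}'.format(side, name, monomer))
        return errors


sketch = ComplexSketch(
    subunits=[SubunitSketch('subunit_a', 2, 'CAAAAAAAA')],
    crosslinks=[CrosslinkSketch(
        type='disulfide',
        l_subunit='subunit_a', l_subunit_idx=1, l_monomer=1,
        r_subunit='subunit_a', r_subunit_idx=2, r_monomer=1)])
print(sketch.check())
```

Keyword arguments are used here on purpose: with two subunit names and four small integers, positional calls are easy to misread.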

# Get properties of complexes

## Subunits

In [6]:
form_1.subunits

[<bcforms.core.Subunit at 0x7f736d607ed0>,
 <bcforms.core.Subunit at 0x7f736d025210>]

## Crosslinks

In [7]:
form_2.crosslinks

[<bcforms.core.OntologyCrosslink at 0x7f736d753d10>]

# Get the string representation of a complex

In [8]:
str(form_1_b)

'2 * subunit_a + 3 * subunit_b'

# Check equality of complexes

In [9]:
form_1_b.is_equal(form_1)

True

# Calculate properties of a complex

## Molecular structure

In [10]:
form_1.get_structure()[0]

<openbabel.OBMol; proxy of <Swig Object of type 'OpenBabel::OBMol *' at 0x7f737038ede0> >

## SMILES representation

In [11]:
form_1.export('smiles')

'C(=O)([C@@H]([NH3+])CS)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O.C(=O)([C@@H]([NH3+])CS)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O.C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)O)CS.C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)O)CS.C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)O)CS'
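
The exported string is easier to read once you notice that `.` separates disconnected components in SMILES, so an uncrosslinked complex with 2 + 3 subunit copies yields five components. The sketch below uses only the standard library. It rebuilds a string of this shape from its repeating pieces, counts the components, and sums the formal charges written in bracket atoms. The helper `formal_charge` is hypothetical and uses a simple regular expression, not a full SMILES parser.

```python
import re

# Repeating pieces of the two peptides, copied from the output above.
ala_mid = 'N[C@@H](C)C(=O)'
subunit_a = 'C(=O)([C@@H]([NH3+])CS)' + ala_mid * 8 + 'O'
subunit_b = 'C[C@H]([NH3+])C(=O)' + ala_mid * 7 + 'N[C@H](C(=O)O)CS'

# Two copies of subunit_a and three of subunit_b, joined as separate components.
complex_smiles = '.'.join([subunit_a] * 2 + [subunit_b] * 3)


def formal_charge(smiles):
    """Sum the charges written in bracket atoms such as [NH3+], [O-], [Ca+2]."""
    total = 0
    for charge in re.findall(r'\[[^\]]*?(\++|-+|[+-]\d+)\]', smiles):
        if charge[-1].isdigit():
            total += int(charge)  # forms like '+2' or '-1'
        else:
            sign = 1 if charge[0] == '+' else -1
            total += sign * len(charge)  # forms like '+', '++', '-'
    return total


print(len(complex_smiles.split('.')))
print(formal_charge(complex_smiles))
```

Each peptide carries one protonated N-terminus (`[NH3+]`) and no other charged atom, which is consistent with the net charge reported by `get_charge()` further down.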

## Formula

In [12]:
form_1.get_formula()

AttrDefault(<class 'float'>, False, {'C': 135.0, 'H': 240.0, 'N': 45.0, 'O': 50.0, 'S': 5.0})
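
As a cross-check on this result, the formula can be reproduced by hand: add up the free amino acid formulas, remove one water per peptide bond, add one hydrogen for the protonated N-terminus seen in the SMILES output, and scale by stoichiometry. The following is a hedged sketch of that bookkeeping, not the method `BcForms` uses internally. The names `RESIDUE`, `peptide_formula`, and `complex_formula` are hypothetical.

```python
from collections import Counter

# Formulas of the free amino acids used in this tutorial.
RESIDUE = {
    'A': Counter(C=3, H=7, N=1, O=2),        # alanine
    'C': Counter(C=3, H=7, N=1, O=2, S=1),   # cysteine
}
WATER = Counter(H=2, O=1)


def peptide_formula(sequence):
    """Formula of a linear peptide with a protonated N-terminus."""
    total = Counter()
    for residue in sequence:
        total.update(RESIDUE[residue])
    # Each of the len(sequence) - 1 peptide bonds releases one water.
    for _ in range(len(sequence) - 1):
        total.subtract(WATER)
    total['H'] += 1  # extra proton on the N-terminal amine ([NH3+])
    return total


def complex_formula(subunits):
    """Sum subunit formulas weighted by stoichiometry (no crosslinks)."""
    total = Counter()
    for sequence, count in subunits:
        for element, n in peptide_formula(sequence).items():
            total[element] += n * count
    return total


print(dict(complex_formula([('CAAAAAAAA', 2), ('AAAAAAAAC', 3)])))
```

Both subunits have the same composition (one cysteine, eight alanines), so the complex formula is five times C27H48N9O10S. A crosslinked complex such as `form_2` would need a further correction for the atoms lost when the bond forms, which this sketch does not attempt.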

## Charge

In [13]:
form_1.get_charge()

5

## Molecular weight

In [14]:
form_1.get_mol_wt()

3453.9699999999993
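
The molecular weight follows from the formula by weighting each element count with its atomic weight. The sketch below assumes abridged standard atomic weights. These values are an assumption and may differ slightly from the table `BcForms` uses, although they reproduce the number above to two decimal places.

```python
# Abridged standard atomic weights (assumed values for this sketch).
ATOMIC_WEIGHT = {'C': 12.011, 'H': 1.008, 'N': 14.007, 'O': 15.999, 'S': 32.06}


def mol_wt(formula):
    """Molecular weight of a formula given as {element: count}."""
    return sum(ATOMIC_WEIGHT[element] * count
               for element, count in formula.items())


# Element counts reported by form_1.get_formula() above.
formula = {'C': 135, 'H': 240, 'N': 45, 'O': 50, 'S': 5}
print(round(mol_wt(formula), 2))  # 3453.97
```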