> ### In this tutorial we will cover:
> - the different syntax flavours of building molecules

If you have checked out some of the other tutorials, you will have come across the `bb.connect` function, which connects two molecules together.
However, biobuild offers three different syntaxes for building molecules: a _functional_, a _method-based_, and an _operator-based_ syntax.

### Functional API
Biobuild contains many functions spread over a number of modules; the `structural` module in particular holds a great number of them. Most biobuild functions are intended for end users, so it is completely fine to import whatever functions you need and use them directly. This gives a more R-like user experience. Many functions are also imported automatically when loading biobuild.

#### Examples
- `bb.connect`
- `bb.read_pdb`
- `bb.structural.autolabel`
- `bb.get_compound`

### Method API
For convenience, most functions that a user is likely to need on a regular basis are also available as methods of the `Molecule` class or of other classes. For most users the method-based API is therefore probably the most convenient, since it saves you from importing the necessary modules. Most methods support the full range of parameters of the functions they are linked to, but this is not always the case! There are also many methods that exist only as methods and are not available as stand-alone functions. Additionally, some methods have synonyms, such as `bb.Molecule.get_residue_graph` and `bb.Molecule.make_residue_graph`, which exist for historical reasons and for compatibility within the code base.

#### Examples
- `bb.Molecule.attach`
- `bb.Molecule.from_pdb`
- `bb.Molecule.autolabel`
- `bb.PDBECompounds.get`

### Operator API
Operators are a great way of writing very short code. Biobuild implements a syntax called "molecular arithmetics" that supports the basic operations for connecting molecules. For instance, it allows us to connect two molecules with `mol_c = mol_a + mol_b`. Naturally, this syntax is the most constrained of the three, but it offers a wonderfully short way of creating larger structures.

#### Available Operators
| Function                                       | Method                                             | Attribute                      | Operator                |
|------------------------------------------------|----------------------------------------------------|--------------------------------|-------------------------|
| -                                              | `mol_a.set_linkage(link)`                          | `mol_a.linkage = link`         | `mol_a % link`          |
| -                                              | `mol_a.set_attach_residue(res)`                    | `mol_a.attach_residue = res`   | `mol_a @ res`           |
| -                                              | `mol_a.set_root(root_atom)`                        | `mol_a.root_atom = root_atom`  | `mol_a ^ root_atom`     |
| `mol_c = bb.connect(mol_a, mol_b, link)`       | `mol_c = mol_a.attach(mol_b, link, inplace=False)` | -                              | `mol_c = mol_a + mol_b` |
| `bb.connect(mol_a, mol_b, link, copy_a=False)` | `mol_a.attach(mol_b, link)`                        | -                              | `mol_a += mol_b`        |
| `mol_c = bb.polymerize(mol_a, n, link)`        | `mol_c = mol_a.repeat(n, link, inplace=False)`     | -                              | `mol_c = mol_a * n`     |
| `bb.polymerize(mol_a, n, link, inplace=True)`  | `mol_a.repeat(n, link)`                            | -                              | `mol_a *= n`            |
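
To make the relationship between the method and operator columns more tangible, here is a minimal sketch of how operators can forward to methods in plain Python. `ToyMolecule`, its `parts` list, and the way it records linkages are hypothetical illustrations and not biobuild's actual implementation; only the operator-to-method mapping follows the table above.

```python
import copy


class ToyMolecule:
    """Hypothetical stand-in for a molecule (not biobuild's Molecule class)."""

    def __init__(self, name):
        self.parts = [name]   # residue names interleaved with linkage ids
        self.linkage = None   # set via set_linkage() or the % operator

    # --- method layer ---
    def set_linkage(self, link):
        self.linkage = link
        return self  # returning self allows chaining

    def attach(self, other, link=None, inplace=True):
        # fall back to the linkage that was set beforehand
        link = link if link is not None else self.linkage
        if link is None:
            raise ValueError("no linkage given or set beforehand")
        target = self if inplace else copy.deepcopy(self)
        target.parts += [link] + list(other.parts)
        return target

    # --- operator layer: each operator only forwards to a method ---
    def __mod__(self, link):    # mol_a % link
        return self.set_linkage(link)

    def __add__(self, other):   # mol_c = mol_a + mol_b (works on a copy)
        return self.attach(other, inplace=False)

    def __iadd__(self, other):  # mol_a += mol_b (in-place)
        return self.attach(other)


glc, gal = ToyMolecule("GLC"), ToyMolecule("GAL")

dimer = glc % "14bb" + gal
print(dimer.parts)  # ['GLC', '14bb', 'GAL']
print(glc.parts)    # ['GLC'] because + worked on a copy

dimer += gal        # the copy inherited the "14bb" linkage
print(dimer.parts)  # ['GLC', '14bb', 'GAL', '14bb', 'GAL']
```

The remaining operators (`@`, `^`, `*`, `*=`) would follow the same pattern through `__matmul__`, `__xor__`, `__mul__`, and `__imul__`.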



> ### A general note on functions versus methods
> The functional API is usually tailored toward **non-inplace** operations, while the method-based API is tailored toward **inplace** operations. If you look closely at the table above, you will notice that the defaults of the `copy` and `inplace` arguments are reversed between the two. So `Molecule.attach` operates in-place by default, while `connect` returns a copy by default.

## Examples

Let's look at an example. Because we like sugars so much, we'll build a glycan structure (yep, not very creative, but it does the trick). If you are unfamiliar with glycans, just tag along for the ride and don't think too much about it; it's just an example.

```mermaid
flowchart TB
  node_1["Glucose"]
  node_2["Glucose"]
  node_3["Galactose"]
  node_4["Galactose"]
  node_5["Galactose"]
  node_6["Mannose"]
  node_7["Mannose"]
  node_8["Glucose"]
  node_1 --"beta 1-4"--> node_2
  node_2 --"alpha 1-3"--> node_3
  node_3 --"alpha 1-4"--> node_4
  node_4 --"alpha 1-3"--> node_5
  node_6 --"beta 1-4"--> node_7
  node_7 --"alpha 1-4"--> node_8
  node_2 --"beta 1-2"--> node_6
```

```python
import biobuild as bb

# first get some compounds
bb.load_sugars()

glc = bb.molecule("GLC")
gal = bb.molecule("GAL")
man = bb.molecule("MAN")
fuc = bb.molecule("FUC")
```



#### Using functional syntax

Now we will build the glycan only using the functional syntax of biobuild:

```python
# start with the glucose-glucose at the top
# remember that biobuild has pre-available linkages that
# can be referenced by their string id directly;
# glycosidic linkages are among them.
glycan = bb.connect(glc, glc, link="14bb")

# now build the galactose branch
gal_branch = bb.connect(gal, gal, link="14aa")
gal_branch = bb.connect(gal_branch, gal, link="13aa")

# now build the mannose branch
man_branch = bb.connect(man, man, "14bb")
man_branch = bb.connect(man_branch, glc, "14ab")

# now attach both branches to the glucose-glucose
# this time we need to specify at which residues to connect
# since we don't want to use the default last residue
glycan = bb.connect(glycan, gal_branch, link="13ab", at_residue_b=1)
glycan = bb.connect(glycan, man_branch, link="12bb", at_residue_a=2, at_residue_b=1)

# that's it! now we can visualize the glycan
glycan.show()
```

#### Using method syntax

Now let's repeat the same build using the method syntax. Here we need to add `inplace=False` in several places, since we want to reuse the individual molecules multiple times. We did not have to bother with this in the functional syntax, since copies were generated automatically there. On the other hand, we can now work a little more efficiently because we avoid creating unnecessary copies of objects (of course, we could have achieved the same with the functional syntax by adjusting the `copy` arguments).

```python
glycan2 = glc.attach(glc, link="14bb", inplace=False)

# now build the galactose branch
gal_branch2 = gal.attach(gal, link="14aa", inplace=False)
gal_branch2.attach(gal, link="13aa")

# now build the mannose branch
man_branch2 = man.attach(man, "14bb", inplace=False)
man_branch2.attach(glc, "14ab")

# now attach both branches to the glucose-glucose
glycan2.attach(gal_branch2, link="13ab", other_residue=1)
glycan2.attach(man_branch2, link="12bb", at_residue=2, other_residue=1)

# that's it! now we can visualize the glycan
glycan2.show()
```
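
Why the first `attach` call on each building block needs `inplace=False` can be illustrated without biobuild at all. The sketch below uses a hypothetical `ToyMolecule` class whose `attach` method merely mimics the in-place default described in this tutorial (an assumption made for illustration, not biobuild's code), and shows what happens to a reused building block with and without the argument.

```python
import copy


class ToyMolecule:
    """Hypothetical stand-in for a molecule: just a list of residue names."""

    def __init__(self, *residues):
        self.residues = list(residues)

    def attach(self, other, inplace=True):
        # in-place by default, mirroring the behaviour described in the text
        target = self if inplace else copy.deepcopy(self)
        target.residues.extend(list(other.residues))
        return target


glc = ToyMolecule("GLC")

# with inplace=False the building block stays untouched and can be reused
dimer = glc.attach(glc, inplace=False)
print(dimer.residues)  # ['GLC', 'GLC']
print(glc.residues)    # ['GLC']

# without it, the building block itself is modified
careless = glc.attach(glc)
print(careless is glc)  # True
print(glc.residues)     # ['GLC', 'GLC']
```

After the careless call, every later use of `glc` in this toy example would add two residues instead of one, which is the kind of surprise the `inplace=False` arguments above are meant to avoid.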

#### Using operator syntax 

Now for one final version using the operator-based syntax. This one is the shortest but also the most cryptic. It requires us to set the linkage and residues first, before we can use the `+` or `*` operators.

```python
# start again with the glucose-glucose at the top
# first set the linkage using %, then add the next molecule
glycan3 = glc % "14bb" + glc

# now build the galactose branch
# we can separate the statements into multiple lines
gal % "14aa"
gal_branch3 = gal + gal
gal_branch3 % "13aa"
gal_branch3 += gal

# now build the mannose branch
# or we can "chain" the statements together using ()
man_branch3 = (man % "14bb" + man) % "14ab" + glc

# now attach both branches to the glucose-glucose
# now we also need to set the attach residues using @
glycan3 = glycan3 % "13ab" + gal_branch3 @ 1
glycan3 = glycan3 % "12bb" @ 2 + man_branch3 @ 1

# that's it! now we can visualize the glycan
glycan3.show()
```
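
Chained operator statements like the ones above rely on Python's standard operator precedence, which biobuild cannot change. The small helper below (`show_grouping` is a hypothetical name introduced here, built only on the standard-library `ast` module) adds explicit parentheses to an expression so you can check how Python will group it before any molecule is touched.

```python
import ast

# symbols for the operators used in this tutorial
SYMBOLS = {ast.Mod: "%", ast.MatMult: "@", ast.BitXor: "^", ast.Add: "+", ast.Mult: "*"}


def show_grouping(expr):
    """Return expr with explicit parentheses around every binary operation."""

    def fmt(node):
        if isinstance(node, ast.BinOp):
            return f"({fmt(node.left)} {SYMBOLS[type(node.op)]} {fmt(node.right)})"
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        raise ValueError(f"unsupported expression part: {ast.dump(node)}")

    # the expression is only parsed, never evaluated
    return fmt(ast.parse(expr, mode="eval").body)


print(show_grouping('glycan3 % "12bb" @ 2 + man_branch3 @ 1'))
# (((glycan3 % '12bb') @ 2) + (man_branch3 @ 1))

print(show_grouping("mol_a ^ root_atom + mol_b"))
# (mol_a ^ (root_atom + mol_b))
```

In Python, `%`, `@`, and `*` bind more tightly than `+`, which is why the linkage and attach residue are applied before the addition. The `^` operator binds more loosely than `+`, so a root atom set inside a longer expression needs its own parentheses, as in `(mol_a ^ root_atom) + mol_b`.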

Naturally, we can combine all three syntaxes if we like, picking whichever bits and pieces we like best. So, if you don't like the `mol_a % "14bb"` syntax, just use `mol_a.set_linkage("14bb")` instead; it will not affect your ability to use `mol_a + mol_b`, `mol_a.attach(mol_b)`, or `bb.connect(mol_a, mol_b)` later on. By the way, if you set the linkage beforehand, there is no need to specify it again when calling the `attach` method or the `connect` function. The same goes for the attach residue.

With that we are at the end of this little tour through the three different ways to construct molecules in biobuild. Of course, there are many more functions and methods than there are operators, and relying on the operators too much may hurt readability, especially when chaining many statements together. Nevertheless, they are a handy way to construct molecules very concisely. Please feel free to use whichever syntax you prefer. Good luck in your project using biobuild!