Welcome to the first `biobuild` tutorial, where we will learn the basics of working with `biobuild`!

> ### In this tutorial we will cover:
> - which (main) classes and modules exist and when to use them
> - how to read input to make `Molecule`s
> - how to connect two molecules together

## Main Classes in `biobuild`

The most important class is `Molecule`, which houses roughly 90% of the functionality that the average user is likely to need.
Next comes the `Linkage` class, which defines how molecules are connected so that the user can build larger structures.
Third are the `ResidueGraph` and the `Rotatron` environments of the `optimizers` module, which are used to optimize conformations. Finally, the `MoleculeViewer3D` may come in handy. It is essentially a plotly 3D plot that can be selectively colored to highlight specific parts of your molecule, which makes it a great tool for checking whether your building process is working (or for debugging why it is not).
That is about it. There is a lot more, but everything else sits mostly beneath the surface and will most likely not concern the average user directly.

```mermaid

flowchart TB
  node_0["core"]
  node_1("Molecule")
  node_2("ResidueGraph")
  node_3("MoleculeViewer3D")
  node_4["Optimizers"]
  node_6("Rotatron")
  node_7{"{}_optimize"}
  node_5["Linkage"]
  node_1 -.-> node_2
  node_1 -.-> node_3
  node_0 --- node_1
  node_4 --- node_6
  node_4 --- node_7
  node_6 -.-> node_7
  node_2 -.-> node_7
  node_0 --- node_5

```

In [1]:
import biobuild as bb



### Molecules

Molecules are the essential data unit in `biobuild`. Each Molecule houses atoms that form a molecular structure. We can generate Molecules from:
- a PDB or CIF file (e.g. "my_structure.pdb")
- a PDB ID (e.g. "GLC", the PDB ID for alpha-D-glucose)
- a trivial name (e.g. "triacetamide")
- a chemical formula (e.g. "C2H6O" - caution, may produce ambiguities!)
- SMILES and InChI/InChIKey
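
The caution about chemical formulas comes down to isomers: a formula only counts atoms, so different structures can share it. The following library-free sketch illustrates this for "C2H6O". The helper `formula` and the `VALENCE` table are hypothetical and written only for this illustration; they are not part of `biobuild`, and the sketch makes no claim about which structure `biobuild` returns for an ambiguous formula.

```python
from collections import Counter

# Typical valences, enough for this toy example (an assumption of the sketch).
VALENCE = {"C": 4, "N": 3, "O": 2}


def formula(atoms, bonds):
    """Return a Hill-order formula for a heavy-atom graph with single bonds.

    atoms: {label: element}, bonds: list of (label, label) pairs.
    Hydrogens are implicit and filled in from the valence table.
    """
    counts = Counter(atoms.values())
    degree = Counter(label for bond in bonds for label in bond)
    counts["H"] = sum(VALENCE[el] - degree[label] for label, el in atoms.items())
    # Hill order: carbon first, hydrogen second, everything else alphabetically.
    order = [e for e in ("C", "H") if counts[e]]
    order += sorted(e for e in counts if e not in ("C", "H") and counts[e])
    return "".join(f"{e}{counts[e] if counts[e] > 1 else ''}" for e in order)


# Two different connectivities ...
ethanol = formula({"C1": "C", "C2": "C", "O1": "O"}, [("C1", "C2"), ("C2", "O1")])
dimethyl_ether = formula({"C1": "C", "O1": "O", "C2": "C"}, [("C1", "O1"), ("O1", "C2")])

# ... that end up with the same formula.
print(ethanol, dimethyl_ether)
```

When the exact structure matters, a SMILES string, an InChI, or a structure file is therefore the safer input.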

The `Molecule` class has classmethods for all of these, named `from_pdb`, `from_compound` (for PDB ID, name, etc.), `from_pubchem`, and so forth. There is also the top-level `molecule` function, which tries to figure out the kind of input automatically and generates a molecule for you.

In [2]:
# get a serine
ser = bb.molecule("SER") # (using the PDB id)

# check what it looks like
ser.show()

In [3]:
# get a diacetamide (using its name)
tri = bb.molecule("diacetamide")

# check what it looks like
tri.show()

### Linkage 

Now assume we want to take _diacetamide_ and connect the _serine_ to it. We can define a `Linkage` to specify which atoms to link between the two. To do so, we need to know how the atoms are labelled (another reason to always visualize the structures). In the Linkage definition we specify, for instance, that the diacetamide N1 should be connected to the serine C. In the process, we remove H1 from the diacetamide, and OXT and HXT from the serine.

> #### Note
> Linkages do *not* need to follow chemical reasoning; they only connect structures by adding and removing connections. Hence, we could just as well connect the molecules via one of the methyl groups, making one of its hydrogens the leaving group (even though such a reaction would probably never occur chemically). `biobuild` builds structures, but it does not imitate chemical reactions!
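
The point of the note, that a linkage is purely an edit of the connectivity, can be pictured with a small library-free sketch. The function `apply_linkage` and the two fragments are hypothetical and heavily simplified (only a few atoms around the linkage site, with partly made-up labels); this is not how `biobuild` implements linkages internally.

```python
def apply_linkage(target, source, atom1, atom2,
                  delete_in_target=(), delete_in_source=()):
    """Merge two bond graphs ({atom label: set of neighbour labels}).

    Atoms listed for deletion are dropped together with their bonds, then a
    single new bond atom1 (target) -- atom2 (source) is added. Labels are
    prefixed with "T:" / "S:" so the two molecules cannot collide.
    """
    def pruned(graph, tag, removed):
        removed = set(removed)
        return {
            f"{tag}:{atom}": {f"{tag}:{n}" for n in neighbours if n not in removed}
            for atom, neighbours in graph.items()
            if atom not in removed
        }

    merged = {**pruned(target, "T", delete_in_target),
              **pruned(source, "S", delete_in_source)}
    a, b = f"T:{atom1}", f"S:{atom2}"
    merged[a].add(b)
    merged[b].add(a)
    return merged


# Simplified fragments around the linkage site (illustrative labels).
target = {"N1": {"H1", "C2"}, "H1": {"N1"}, "C2": {"N1"}}
source = {"C": {"CA", "O", "OXT"}, "CA": {"C"}, "O": {"C"},
          "OXT": {"C", "HXT"}, "HXT": {"OXT"}}

product = apply_linkage(target, source, "N1", "C",
                        delete_in_target=["H1"],
                        delete_in_source=["OXT", "HXT"])
print(sorted(product["T:N1"]))
```

Nothing in the sketch checks valences or reactivity, which is exactly why any pair of atoms can be named as the connection point.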

In [4]:
# define the linkage between the two molecules
# remember to check the atom labelling as these atom labels must match
# the existing atom labels in both molecules
link = bb.linkage(atom1="N1", atom2="C", delete_in_target=["H1"], delete_in_source=["OXT", "HXT"])

Now we can connect the two molecules, for example using the `connect` function:

In [5]:
new = bb.connect(tri, ser, link)

# and check what it looks like
new.show()

If we observe a clash in the resulting conformation, we can apply an optimization. There is a quick way to do so by simply calling the `optimize` method of any `Molecule`:

In [6]:
# optimize the geometry (at least a bit)
new.optimize()

# and check what it looks like
new.show()

The implementation of the `optimize` method is quite rudimentary and does not consider any particulars of the existing conformation. Hence, it may not produce satisfactory results, especially for smaller structures such as our toy example. However, we can set up a more intricate optimization ourselves without much extra effort.

The steps are always the same:
- generate a `ResidueGraph` from your molecule to optimize
- make the graph "detailed" 
- choose which bonds within your molecule to rotate around
- generate an environment of your choice from your graph and rotatable bonds
- choose an optimization function and call it on your environment
- apply the solution back to your molecule

In [20]:
from biobuild import optimizers

# generate a residue graph
graph = new.make_residue_graph()
graph.make_detailed(True, True, 0.1)

# choose only the bond between N1 and C (in serine) to be rotatable
bonds = new.get_bonds("N1", "C")

# let's use the standard "MultiBondRotatron" environment
env = optimizers.MultiBondRotatron(graph, bonds)

# and optimize the geometry using particle-swarm optimization
# we immediately apply the solution to a copy of the molecule
out = optimizers.swarm_optimize(env, molecule=new.copy())

# and check what it looks like
out.show()
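
To make the idea of rotating around a bond to relieve a clash more concrete, here is a minimal sketch in plain Python. It is not how the `optimizers` module works internally: the coordinates are made up, the helper `rotate_about_axis` is hypothetical, and a brute-force angle scan stands in for the particle-swarm optimization used above.

```python
import math


def rotate_about_axis(point, origin, axis_end, angle):
    """Rotate `point` by `angle` (radians) around the axis origin -> axis_end.

    Uses Rodrigues' rotation formula.
    """
    k = [e - o for e, o in zip(axis_end, origin)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]  # unit vector along the bond
    v = [p - o for p, o in zip(point, origin)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    rotated = [
        v[i] * cos_a + k_cross_v[i] * sin_a + k[i] * k_dot_v * (1 - cos_a)
        for i in range(3)
    ]
    return [r + o for r, o in zip(rotated, origin)]


# Made-up geometry: a rotatable bond along the z-axis, one atom that moves
# with the rotation, and one fixed atom that it currently sits too close to.
bond_start, bond_end = (0.0, 0.0, 0.0), (0.0, 0.0, 1.5)
moving_atom = (1.0, 0.0, 2.0)
fixed_atom = (1.0, 0.0, 1.0)


def distance_at(angle_deg):
    """Distance between the rotated moving atom and the fixed atom."""
    moved = rotate_about_axis(moving_atom, bond_start, bond_end,
                              math.radians(angle_deg))
    return math.dist(moved, fixed_atom)


# Brute-force scan in 10 degree steps; keep the angle with the largest distance.
best_angle = max(range(0, 360, 10), key=distance_at)
best_distance = distance_at(best_angle)
print(best_angle, round(best_distance, 3))
```

A real environment has to consider many atoms and several bonds at once, which is why a search strategy such as particle-swarm optimization is used instead of an exhaustive scan.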