# How to use Project RACCOON in a notebook

<img src="screenshots/asset1.png"
     style="float: center; margin-right: 10px;" />
     <p></p>
<b> Project RACCOON by Obenauer, Spauszus, et al. @ JGU Mainz 2023 </b>
<p></p>
Please cite the following references when using Project RACCOON:  

[![status](https://joss.theoj.org/papers/a72d0ea4ef2c43b6384a5fff784aa1ba/status.svg)](https://joss.theoj.org/papers/a72d0ea4ef2c43b6384a5fff784aa1ba) 




We can easily create a starting geometry using Project RACCOON in 4 steps. 

1. Import the `project_raccoon` module.
2. Instantiate the `monomers`.
3. Write a sequence and generate a `sequence` object.
4. Generate a starting geometry and write it to a `pdb` file.

Below, I show how to execute these steps in Python.

In [1]:
import project_raccoon as rc 

First, we need to create the monomer building blocks. This can be done either via a custom JSON file

```python
monomers = rc.Monomers.from_json(filepath)
```

or using the default file, which is available under `raccoon/src/data/monomers.json`.

```python
monomers = rc.Monomers.from_json()
```

Here we will use the standard monomers.

In [2]:
monomers = rc.Monomers.from_json()

We can use the notebook cell magic `%%writefile` to write the sequence directly to a file. Each line of the sequence describes one monomer in the following format:

<p align="left">(abbreviation of the monomer) : (resolution of the monomer) : (0, or 1 if inverted) : (number of repetitions)</p>
<p align="left">...</p>
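
As a minimal sketch of how one such line breaks down into its four fields, the snippet below parses a single entry. The helper `parse_sequence_line` and the `SequenceEntry` tuple are hypothetical names used only for illustration; they are not part of Project RACCOON, whose own parser may behave differently.

```python
from typing import NamedTuple


class SequenceEntry(NamedTuple):
    """One line of a sequence file (hypothetical container, not RACCOON API)."""
    monomer: str      # abbreviation of the monomer, e.g. "PEO"
    resolution: str   # resolution of the monomer, e.g. "AA" or "UA"
    inverted: bool    # True if the inversion flag is 1
    reps: int         # number of repetitions


def parse_sequence_line(line: str) -> SequenceEntry:
    """Split 'ABBR:RES:INV:REPS' into typed fields."""
    fields = line.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(fields)}: {line!r}")
    monomer, resolution, inverted, reps = fields
    if inverted not in ("0", "1"):
        raise ValueError(f"inversion flag must be 0 or 1, got {inverted!r}")
    return SequenceEntry(monomer, resolution, inverted == "1", int(reps))


entry = parse_sequence_line("PEO:UA:0:50")
print(entry)
```

Rejecting malformed lines early, as this sketch does, makes typos in a hand-written sequence file easier to spot before any geometry is generated.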




In [3]:
%%writefile seq.txt
# Sequence generation for a C2-symmetric linear polymer-peptide conjugate according to Otter et al. 2018
# R. Otter, N. A. Henke, C. Berac, T. Bauer, M. Barz, S. Seiffert, P. Besenius, Macromol. Rapid Commun. 2018, 39, 1800459.
# https://doi.org/10.1002/marc.201800459
ACE:AA:0:1
PHE:AA:0:1
HIS:AA:0:1
PHE:AA:0:1
HIS:AA:0:1
PHE:AA:0:1
CSX:UA:0:1
LNK:AA:0:1
PEO:UA:0:50
LNK:AA:1:1
CSX:UA:1:1
PHE:AA:1:1
HIS:AA:1:1
PHE:AA:1:1
HIS:AA:1:1
PHE:AA:1:1
ACE:AA:0:1

Writing seq.txt


Sequences can be generated from a file as shown below. The function returns a `NamedTuple`

```python
from typing import List, NamedTuple

Sequence = NamedTuple('Sequence', [('index', List[int]), ('inverted', List[bool]), ('reps', List[int])])
```

which is used as a container for the sequence.
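
To make the layout of this container concrete, the sketch below rebuilds the same `NamedTuple` definition with the standard library and fills it by hand. The index values are made up for illustration; in practice the object is produced by `rc.generate_sequence`, and the indices presumably refer to entries in the monomer collection.

```python
from typing import List, NamedTuple

# Same definition as above, reproduced so the example is self-contained.
Sequence = NamedTuple(
    "Sequence",
    [("index", List[int]), ("inverted", List[bool]), ("reps", List[int])],
)

# Hypothetical values: three sequence entries, the last one inverted.
seq = Sequence(index=[0, 3, 0], inverted=[False, False, True], reps=[1, 50, 1])

# The three lists are parallel: position i describes the i-th sequence entry.
for idx, inv, n in zip(seq.index, seq.inverted, seq.reps):
    print(f"monomer index {idx}, inverted={inv}, repeated {n}x")

# Total number of monomer units once repetitions are expanded.
total_units = sum(seq.reps)
print(total_units)
```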

In [4]:
seq = rc.generate_sequence(monomers, "seq.txt")

PDB files can be generated from a sequence as shown below. If the geometry generation takes too long, changing the self-avoiding random walk threshold `trr` can help.

In [13]:
trr = 1
rc.generate_file(monomers, seq, False, "out.pdb", trr=trr)
rc.visualize_pdb_file("out.pdb")
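
To illustrate why the threshold affects run time, here is a minimal self-avoiding random walk based on rejection sampling. It is a generic sketch, not the algorithm implemented in Project RACCOON: the function `self_avoiding_walk`, the fixed step length, and the `max_tries` limit are all assumptions made for this example. Each new point is redrawn until it lies at least `trr` away from every earlier point, so in this sketch a larger `trr` leads to more rejected candidates.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float, float]


def random_unit_vector(rng: random.Random) -> Point:
    """Draw a direction uniformly from the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def self_avoiding_walk(n_points: int, step: float, trr: float,
                       seed: int = 0, max_tries: int = 1000) -> Tuple[List[Point], int]:
    """Return the walk and the number of rejected candidate points."""
    rng = random.Random(seed)
    points: List[Point] = [(0.0, 0.0, 0.0)]
    rejected = 0
    while len(points) < n_points:
        for _ in range(max_tries):
            d = random_unit_vector(rng)
            last = points[-1]
            candidate = (last[0] + step * d[0], last[1] + step * d[1], last[2] + step * d[2])
            # Accept only if the candidate keeps the minimum distance to all earlier points.
            if all(math.dist(candidate, p) >= trr for p in points):
                points.append(candidate)
                break
            rejected += 1
        else:
            raise RuntimeError("no valid position found; try a smaller trr")
    return points, rejected


walk, n_rejected = self_avoiding_walk(n_points=30, step=1.5, trr=1.0)
print(len(walk), "points placed,", n_rejected, "candidates rejected")
```

Running the function with different `trr` values and comparing the rejection counts gives a rough feel for how the threshold influences the cost of the walk.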

Whether the output is a valid PDB file can be checked using biopandas:

In [6]:
rc.check_pdb_file("out.pdb")
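
For readers who want to see what such a check can involve, the sketch below performs a rough sanity check using only the standard library: it reads the fixed-width coordinate columns of every `ATOM` and `HETATM` record and reports the lines that fail to parse. This is not what `rc.check_pdb_file` does internally; the function name `find_bad_atom_records` is hypothetical, and the check covers only a small part of the PDB format.

```python
from typing import Iterable, List


def find_bad_atom_records(lines: Iterable[str]) -> List[int]:
    """Return 1-based line numbers of ATOM/HETATM records with unreadable coordinates."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith(("ATOM", "HETATM")):
            continue  # other record types (REMARK, TER, END, ...) are ignored here
        try:
            # PDB columns 31-38, 39-46 and 47-54 hold x, y and z in angstroms.
            float(line[30:38])
            float(line[38:46])
            float(line[46:54])
        except ValueError:
            bad.append(number)
    return bad


# Build a demo record with format specifiers so the columns are aligned correctly.
demo_ok = f"{'ATOM':<6}{1:>5} {'CA':^4} {'ALA':>3} {'A'}{1:>4}    {11.104:8.3f}{6.134:8.3f}{-6.504:8.3f}"
# Corrupt the x coordinate (columns 31-38) to simulate a damaged line.
demo_bad = demo_ok[:30] + "  ??.???" + demo_ok[38:]

print(find_bad_atom_records([demo_ok, demo_bad, "END"]))
```

The same function accepts an open file handle, for example `find_bad_atom_records(open("out.pdb"))`, because it only iterates over lines.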

To check whether it is a valid starting geometry, verify that the minimum distance between atoms in the molecule is at least the threshold `trr`.

In [9]:
elements, coords = rc.get_elements_and_coords_from_pdb("out.pdb")
rc.calc_minimal_distance(coords, coords) >= trr

True
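
The idea behind this check can be sketched in plain Python: compute the smallest distance over all pairs of distinct atoms and compare it with the threshold. The helper `min_pairwise_distance` below is a hypothetical stand-in for illustration, not the implementation of `rc.calc_minimal_distance`, and the coordinates are made up. It uses a simple O(n²) loop over pairs.

```python
import itertools
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def min_pairwise_distance(coords: Sequence[Point]) -> float:
    """Smallest Euclidean distance between any two distinct points."""
    if len(coords) < 2:
        raise ValueError("need at least two points")
    # combinations() yields each unordered pair once and never pairs a point with itself.
    return min(math.dist(a, b) for a, b in itertools.combinations(coords, 2))


# Made-up coordinates for three atoms.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 2.0, 0.0)]
trr = 1.0
print(min_pairwise_distance(coords) >= trr)
```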