# Setting the environment

## How to run the scripts from your computer

- First, install the libraries:

> **conda env create -f bioinfo-env.yml**

- Then, you need to activate the Conda environment: 
    
> **source activate bioinfo-env**

- Finally, in the **scripts** directory, open a terminal and type: 
    
> **python `<SCRIPT.py>`**

## How to import napoli.py from anywhere

The best way to make my scripts importable from anywhere is probably to add the project directory to the Python environment. However, I will not cover that here.

**Note:** by anywhere, I mean opening a terminal in any directory on your PC and importing the library without getting import errors. Usually, I just open the terminal in the main directory, as explained in the previous section.

Another simple way to do it is:

> **import sys**
>
> **scripts = `<YOUR PATH TO THE SCRIPTS>`**
>
> **sys.path.append(scripts)**

### Test...

In [2]:
import sys

scripts = '/media/data/Workspace/nAPOLI_v2/scripts'
sys.path.append(scripts)

import napoli

if 'napoli' in sys.modules:
    print("Module successfully imported!!!")
else:
    print("Module not imported!!!")

Module successfully imported!!!
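
If you use the `sys.path.append()` approach often, a small helper can catch a mistyped path early and avoid adding the same directory twice. This is only a sketch: the function name `add_to_sys_path` is hypothetical and is not part of napoli, and `"."` stands in for your own scripts directory.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Append `directory` to sys.path once; raise if it does not exist."""
    path = Path(directory).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError("Scripts directory not found: %s" % path)
    path_str = str(path)
    # Avoid duplicates when the cell is executed more than once.
    if path_str not in sys.path:
        sys.path.append(path_str)
    return path_str


# Example call; replace "." with the path to your scripts directory.
added = add_to_sys_path(".")
print("On sys.path:", added)
```

After this call, `import napoli` should behave as in the test above, provided the directory really contains the scripts.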


# How to discover which ligands to use

In [3]:
from mol.entry import recover_entries_from_entity
from MyBio.util import download_pdb
from MyBio.PDB.PDBParser import PDBParser

# Define a PDB id you want to analyze
pdb_id = '3QQK'
output_path = "."
# It will download a PDB, but you can also use one already in your PC.
download_pdb(pdb_id=pdb_id, output_path=output_path)

pdb_file = '%s/%s.pdb' % (output_path, pdb_id)

# BioPython: parse a pdb file into a structure object.
PDB_PARSER = PDBParser(PERMISSIVE=True, QUIET=True, FIX_ATOM_NAME_CONFLICT=False, FIX_OBABEL_FLAGS=False)
structure = PDB_PARSER.get_structure(pdb_id, pdb_file)

# The function recover_entries_from_entity() allows you to recover ligands from a PDB file and return a list
# of strings. So, you'll be able to choose which ligands you want to analyze and then create the Entry objects
# as I will show later.
print("Ligands found in the PDB %s:\n" % pdb_id)
for entry in recover_entries_from_entity(structure):
    print("\t", entry)

Downloading PDB structure '3QQK'...
Ligands found in the PDB 3QQK:

	 3QQK:A:EDO:490 
	 3QQK:A:EDO:491 
	 3QQK:A:EDO:492 
	 3QQK:A:EDO:493 
	 3QQK:A:EDO:496 
	 3QQK:A:X02:497 
	 3QQK:A
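
The list above mixes the ligand of interest (X02) with several copies of EDO (1,2-ethanediol, a common crystallization additive) and one chain entry. The sketch below shows one way to narrow it down using plain string handling. The helper `select_ligands` and the set `IGNORED_COMPOUNDS` are hypothetical names chosen for this example, and which compounds to ignore is an assumption you should adapt to your own system.

```python
# Entry strings as printed above (copied by hand for this sketch).
found = [
    "3QQK:A:EDO:490",
    "3QQK:A:EDO:491",
    "3QQK:A:EDO:492",
    "3QQK:A:EDO:493",
    "3QQK:A:EDO:496",
    "3QQK:A:X02:497",
    "3QQK:A",
]

# Assumption: compound names treated as additives or solvent.
IGNORED_COMPOUNDS = {"EDO", "HOH"}


def select_ligands(entry_strings, ignored=IGNORED_COMPOUNDS, sep=":"):
    """Keep PDB:CHAIN:LIG-NAME:LIG-NUM strings whose ligand name is not ignored."""
    selected = []
    for raw in entry_strings:
        fields = raw.strip().split(sep)
        # Chain entries (PDB:CHAIN) have only two fields, so skip them.
        if len(fields) != 4:
            continue
        if fields[2] in ignored:
            continue
        selected.append(sep.join(fields))
    return selected


ligands = select_ligands(found)
print(ligands)
```

The surviving strings can then be turned into Entry objects as shown in the next section.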


# Defining your own list of Entry objects

## Protein-small molecule entries

- Each Protein-Small molecule entry is in the pattern: **`PDB:CHAIN:LIG-NAME:LIG-NUM[+ICODE]`**. The ICODE is only used when this information exists in the PDB file.


> Therefore, a small molecule Entry should always have the parameters: **`pdb_id, chain_id, comp_name, comp_num`**. There is also an optional parameter `comp_icode` for setting Icode.

**IMPORTANT:** the `pdb_id` parameter must coincide with the PDB filename.
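
Because the `pdb_id` must match the PDB filename, it can help to check up front which entries have a matching file. The sketch below assumes the `<path>/<pdb_id>.pdb` naming used elsewhere in this tutorial; the helper `missing_pdb_files` is a hypothetical name, not a napoli function, and the demonstration runs in a throwaway directory.

```python
import tempfile
from pathlib import Path


def missing_pdb_files(entry_strings, pdb_path, sep=":"):
    """Return the PDB ids that have no <pdb_path>/<pdb_id>.pdb file."""
    missing = []
    for raw in entry_strings:
        # The PDB id is the first field of an entry string.
        pdb_id = raw.strip().split(sep)[0]
        pdb_file = Path(pdb_path) / ("%s.pdb" % pdb_id)
        if not pdb_file.is_file() and pdb_id not in missing:
            missing.append(pdb_id)
    return missing


# Demonstration: a temporary directory that contains only 3QQK.pdb.
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "3QQK.pdb").write_text("")
    result = missing_pdb_files(["3QQK:A:X02:497", "3QL8:A:X01:300"], tmp_dir)

print("PDB ids without a local file:", result)
```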

### Examples

- **First, import the Entry objects.**

In [4]:
from mol.entry import *

- **Then you can define a new small molecule entry either by passing parameters or a string.**

In [5]:
# Parameters: 
entry = CompoundEntry("3QQK", "A", "X02", 497)
print(entry)
print("Entry as string:", entry.to_string())
print()

# String...
entry = CompoundEntry.from_string("3QQK:A:X02:497", sep=":")
print(entry)
print("Entry as string:", entry.to_string())

<CompoundEntry: 3QQK:A:X02:497>
Entry as string: 3QQK:A:X02:497

<CompoundEntry: 3QQK:A:X02:497>
Entry as string: 3QQK:A:X02:497


- **The previous example only covers cases in which the ligand and the protein structure are in the same PDB file. If they are in two different files, for example a PDB file and a molecular file (Mol, Mol2, SDF, PDB, etc.), you can use the following entry object.**

In [6]:
# Use PDB + molecule files (Mol, Mol2, SDF, etc):
# This option allows you to define a separate ligand file as input. 
# This is useful when you want to keep the ligand charge information, bond order, valence, etc.
entry = MolEntry("MY-PDB-ID", "MY-LIGAND-ID-IN-THE-FILE", mol_file="/my/dir/file.mol", mol_obj_type='rdkit')
print(entry)

<MolEntry: MY-PDB-ID:MY-LIGAND-ID-IN-THE-FILE>


- **If you have a file containing strings, you can also create entries automatically as follows:**

Let's create a new file for testing...

In [7]:
# I imported this module only to check if the file was created.
from os.path import exists

output_file = "input_example"
with open(output_file, "w") as OUT:
    OUT.write("3QQK:A:X02:497\n")
    OUT.write("3QL8:A:X01:300\n")

if exists(output_file):
    print("File '%s' created successfully!!!" % output_file)

File 'input_example' created successfully!!!


Now, we read the strings from the created file...

In [8]:
# Entry strings are in the file
input_file = "input_example"

entries = []
# In this example, each line in the input file has an entry.
with open(input_file, "r") as IN:
    for entry in IN.readlines():
        entries.append(CompoundEntry.from_string(entry.strip()))

print("List of entries:\n")
print(entries)

List of entries:

[<CompoundEntry: 3QQK:A:X02:497>, <CompoundEntry: 3QL8:A:X01:300>]
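
Hand-edited input files often contain blank lines or notes. The sketch below reads entry strings while skipping both. Treating lines that start with `#` as comments is my own convention for this example, not something napoli defines, and `read_entry_strings` is a hypothetical helper.

```python
import os
import tempfile


def read_entry_strings(path):
    """Return stripped, non-empty lines, ignoring lines that start with '#'."""
    strings = []
    with open(path, "r") as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                strings.append(line)
    return strings


# Small demonstration with a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("# my ligands\n3QQK:A:X02:497\n\n3QL8:A:X01:300\n")

entry_strings = read_entry_strings(tmp.name)
os.remove(tmp.name)
print(entry_strings)
```

Each returned string can then be passed to `CompoundEntry.from_string()` as in the cell above.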


- **Another automatic way to initialize a list of entries is to use the function `recover_entries_from_entity()`, as explained in the section "How to discover which ligands to use".**

Note that I set the parameter `get_chains` to False when calling `recover_entries_from_entity()`. This parameter defines whether Chain entries should be recovered. Since I am working with small molecules in these examples, I ignored all Chain entries.

In [9]:
from mol.entry import recover_entries_from_entity
from MyBio.util import download_pdb
from MyBio.PDB.PDBParser import PDBParser

# Define a PDB id you want to analyze
pdb_id = '3QQK'
output_path = "."
# It will download a PDB, but you can also use one already in your PC.
download_pdb(pdb_id=pdb_id, output_path=output_path)

pdb_file = '%s/%s.pdb' % (output_path, pdb_id)

# BioPython: parse a pdb file into a structure object.
PDB_PARSER = PDBParser(PERMISSIVE=True, QUIET=True, FIX_ATOM_NAME_CONFLICT=False, FIX_OBABEL_FLAGS=False)
structure = PDB_PARSER.get_structure(pdb_id, pdb_file)

entries = []
for entry in recover_entries_from_entity(structure, get_chains=False):
    entries.append(CompoundEntry.from_string(entry.strip()))
    
print("List of entries found in the PDB %s:\n" % pdb_id)
print(entries)

List of entries found in the PDB 3QQK:

[<CompoundEntry: 3QQK:A:EDO:490>, <CompoundEntry: 3QQK:A:EDO:491>, <CompoundEntry: 3QQK:A:EDO:492>, <CompoundEntry: 3QQK:A:EDO:493>, <CompoundEntry: 3QQK:A:EDO:496>, <CompoundEntry: 3QQK:A:X02:497>]


## Protein-chain entries

- Each Protein-Chain entry is in the pattern: **`PDB:CHAIN`**.


> Therefore, a chain Entry should always have the parameters: **`pdb_id, chain_id`**. 

**IMPORTANT:** the `pdb_id` parameter must coincide with the PDB filename.

**These entries are useful when you want to calculate interactions that a specific protein/dna/rna chain establishes with other molecules.** 

### Examples

- **First, import the Entry objects.**

In [10]:
from mol.entry import *

- **Then you can define a new chain entry either by passing parameters or a string.**

In [11]:
# Parameters: 
entry = ChainEntry("3QQK", "A")
print(entry)
print("Entry as string:", entry.to_string())
print()

# String...
entry = ChainEntry.from_string("3QQK:A", sep=":")
print(entry)
print("Entry as string:", entry.to_string())

<ChainEntry: 3QQK:A>
Entry as string: 3QQK:A

<ChainEntry: 3QQK:A>
Entry as string: 3QQK:A


- **As I showed for small molecule entries, it is also possible to automatically read Chain entries from files or by recovering entries from a PDB file.**

> **Just remember that you should use ChainEntry instead of CompoundEntry.**

Note that I set the parameter `get_small_molecules` to False when calling `recover_entries_from_entity()`. This parameter defines whether small molecule entries should be recovered. Since I am working with Chains in these examples, I ignored all small molecule entries.

In [12]:
from mol.entry import recover_entries_from_entity
from MyBio.util import download_pdb
from MyBio.PDB.PDBParser import PDBParser

# Define a PDB id you want to analyze
pdb_id = '3QQK'
output_path = "."
# It will download a PDB, but you can also use one already in your PC.
download_pdb(pdb_id=pdb_id, output_path=output_path)

pdb_file = '%s/%s.pdb' % (output_path, pdb_id)

# BioPython: parse a pdb file into a structure object.
PDB_PARSER = PDBParser(PERMISSIVE=True, QUIET=True, FIX_ATOM_NAME_CONFLICT=False, FIX_OBABEL_FLAGS=False)
structure = PDB_PARSER.get_structure(pdb_id, pdb_file)

entries = []
for entry in recover_entries_from_entity(structure, get_small_molecules=False):
    entries.append(ChainEntry.from_string(entry.strip()))
    
print("List of entries found in the PDB %s:\n" % pdb_id)
print(entries)

List of entries found in the PDB 3QQK:

[<ChainEntry: 3QQK:A>]


# Calculate interactions

## Set up the parameters to calculate interactions

In the cell below, I describe a list of parameters that may be useful for calculating interactions. There are other parameters, but they will not be discussed here.

In [13]:
import napoli

opt = {}

entries = [CompoundEntry.from_string("3QQK:A:X02:497", sep=":")]

# The list of entries you defined previously.
opt["entries"] = entries

# Where do you want to save your project.
opt["working_path"] = "tmp/example_project"

# In case the working path you defined above already exists, it allows the script to overwrite the directory.
# Just be aware that it can remove files from a previous project.
opt["overwrite_path"] = True

# Set where your PDB files are located if they are already available. If you don't have any PDBs, the tool
# will try to download them from the PDB site (https://rcsb.org).
opt["pdb_path"] = "tmp"

# This option is useful only when you work with MolEntry objects and have only one molecular file with multiple 
# ligands. So, this option allows the program to load all the molecules at once instead of performing an I/O 
# operation for each ligand. This will save you a lot of processing time.
opt["preload_mol_files"] = False

# Define if you want to add hydrogens. If the file already has hydrogens, as in NMR structures,
# no hydrogens will be added.
opt["try_h_addition"] = True
# Controls the pH and how the hydrogens are going to be added.
opt["ph"] = 7.4

# This option is useful when you are working with PDB files, as PDB files do not have charge, bond order,
# or valence information. So, basically, I implemented a sanitization function to correct charges
# and valences for some atoms.
opt["amend_mol"] = True

# Define if you want to work with RDKit or OpenBabel objects. The default option is RDKit, so you do not
# need to define it here.
opt["mol_obj_type"] = 'rdkit'

# You can define your own interaction configuration.
from mol.interaction.conf import InteractionConf
opt["inter_conf"] = InteractionConf({"max_da_dist_hb_inter": 3.9})

# You can define different interaction calculator objects. It is used to control how interactions are 
# calculated.
#
# The below example uses a Protein-Ligand interaction filter, but other filters are also available. 
from mol.interaction.filter import InteractionFilter
from mol.interaction.calc import InteractionCalculator
inter_calc = InteractionCalculator(inter_filter=InteractionFilter.new_pli_filter())
#
# It is also possible to define a geometrical configuration by setting inter_conf:
# inter_calc = InteractionCalculator(inter_conf=InteractionConf({"max_da_dist_hb_inter": 3.9}), 
#                                    inter_filter=InteractionFilter.new_pli_filter())
opt["inter_calc"] = inter_calc

# This flag defines if you want to compute Molecular fingerprints or not.
opt["calc_mfp"] = False

# This flag defines if you want to compute Interaction fingerprints (IFP) or not.
# If your goal is to test different fingerprint parameters, you can set it off and manually create the IFPs 
# using different parameter combinations.
opt["calc_ifp"] = False

# IFP parameter: number of levels
opt["ifp_num_levels"] = 5
# IFP parameter: radius growth
opt["ifp_radius_step"] = 1
# IFP parameter: size of the fingerprint.
opt["ifp_length"] = 2048
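
Since `overwrite_path` can remove files from a previous project, you may prefer to enable it only when the working path is missing or empty. The check below is a sketch of that idea; `is_safe_to_overwrite` is a hypothetical helper and not part of napoli, and the demonstration runs in a temporary directory rather than in `tmp/example_project`.

```python
import tempfile
from pathlib import Path


def is_safe_to_overwrite(working_path):
    """Return True if the path does not exist or is an empty directory."""
    path = Path(working_path)
    if not path.exists():
        return True
    return path.is_dir() and not any(path.iterdir())


with tempfile.TemporaryDirectory() as tmp_dir:
    project_dir = Path(tmp_dir) / "example_project"
    # The directory does not exist yet, so nothing can be lost.
    fresh_ok = is_safe_to_overwrite(project_dir)

    # Simulate results left behind by a previous run.
    project_dir.mkdir()
    (project_dir / "old_result.txt").write_text("previous run")
    used_ok = is_safe_to_overwrite(project_dir)

print("Fresh path safe:", fresh_ok)
print("Used path safe:", used_ok)
```

One possible use is `opt["overwrite_path"] = is_safe_to_overwrite(opt["working_path"])`, so an existing project is never overwritten by accident.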


## Then, execute the project defined previously

In [14]:
from napoli import LocalProject, FingerprintProject

# LocalProject: useful if you want to save results locally and work with the interactions and other
# results after the processing.
#
# At first, you can use this option because it is faster to generate different fingerprints from
# pre-computed interactions. So, the idea is: run this cell, and then use the interactions as
# I'm going to show in the next examples.
proj_obj = LocalProject(**opt)

# FingerprintProject: useful if you want to write a lot of fingerprint data to a file.
# It uses multiprocessing, so it will be faster than LocalProject, but you will not be able to interact
# with the results after the files are created. I still need to find a way to solve this problem.
# proj_obj = FingerprintProject(**opt)

# Run it like this
proj_obj.run()

print()
print("DONE!!!")

Downloading PDB structure '3QQK'...

DONE!!!


## How to interact with the results

- **If you need to check which properties you can access inside objects in Python, use `dir()`.**

With this built-in function you can analyze all objects and see which functions and properties they have.

In [15]:
print(dir(proj_obj))

['__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'add_mol_obj_to_entries', 'amend_mol', 'atom_prop_file', 'butina_cutoff', 'calc_ifp', 'calc_mfp', 'clusterize_ligands', 'current_entry', 'db_conf_file', 'decide_hydrogen_addition', 'default_properties', 'entries', 'feature_extractor', 'generate_ligand_figure', 'get_fingerprint', 'get_or_create_task', 'get_pdb_file', 'get_rdkit_mol', 'get_status_id', 'has_local_files', 'ifp_length', 'ifp_num_levels', 'ifp_radius_step', 'ifps', 'init_common_tables', 'init_db_connection', 'init_logging_file', 'inter_calc', 'inter_conf', 'interactions', 'log_preferences', 'mfp_opts', 'mfps', 'mol_obj_type', 'neighborhoods', 'overwrite_path', 'pdb_path', 'p
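
The raw `dir()` output is dominated by double-underscore names. A short filter makes the useful attributes easier to spot. The sketch below uses a stand-in class so it runs on its own; in practice you would pass `proj_obj` or one of the manager objects instead. The helper name `public_attributes` is my own.

```python
def public_attributes(obj):
    """Return the attribute names of `obj` that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class Example:
    """Stand-in object; replace it with proj_obj or a manager object."""

    def __init__(self):
        self.entries = []
        self._cache = {}

    def run(self):
        pass


print(public_attributes(Example()))
```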

## Atom group analysis

- To analyze the atom groups present in each binding site derived from the entry objects, you can use the property **`neighborhoods`**, which can be accessed through a project object.

> This property stores a list of tuples, in which each tuple contains the entry information and an atom group manager.

In [16]:
proj_obj.neighborhoods

[(<CompoundEntry: 3QQK:A:X02:497>,
  <mol.groups.AtomGroupsManager at 0x7f2a9046cef0>)]

- An atom group manager is an object that provides several functions for working with the atom groups identified for a particular entry. This manager allows one to loop through the atom groups, filter by feature, and generate statistics for them.

In [17]:
# Access the atom group manager in the first tuple in the neighborhoods property.
dir(proj_obj.neighborhoods[0][1])

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_atm_grps',
 'add_atm_grps',
 'atm_grps',
 'filter_by_types',
 'merge_hydrophobic_atoms',
 'remove_atm_grps',
 'size',
 'summary']

### Loop through a list of atom groups

In [33]:
for atm_grp in proj_obj.neighborhoods[0][1].atm_grps:
    print(atm_grp)

<AtomGroup: [<ExtendedAtom: 3QQK/0/A/ILE-10/C>, <ExtendedAtom: 3QQK/0/A/ILE-10/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/THR-14/CG2>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/VAL-30/C>, <ExtendedAtom: 3QQK/0/A/VAL-30/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/TYR-15/N>, <ExtendedAtom: 3QQK/0/A/THR-14/C>, <ExtendedAtom: 3QQK/0/A/THR-14/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-321/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLN-85/N>, <ExtendedAtom: 3QQK/0/A/HIS-84/C>, <ExtendedAtom: 3QQK/0/A/HIS-84/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-419/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/VAL-30/N>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLN-85/N>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/VAL-18/CG1>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/PHE-80/CG>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLU-81/C>, <ExtendedAtom: 3QQK/0/A/GLU-81/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/C5>, <ExtendedAtom: 3QQK/0/A/X02-497/S9>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLN-85/CD>]>
<AtomGroup: [<Extended

### Filtering by features

In [32]:
for atm_grp in proj_obj.neighborhoods[0][1].filter_by_types(["Acceptor", "Donor"]):
    print(atm_grp)

<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-321/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-419/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-363/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-380/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-372/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-443/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-435/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-385/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/THR-14/OG1>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-315/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-418/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HIS-84/NE2>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HIS-84/ND1>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-368/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-344/O>]>
<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-442/O>]>
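
To get a quick overview of a filtered list like the one above, you can tally the residues that contribute atom groups. The sketch below works on identifier strings copied from the output and assumes they follow the layout PDB/model/chain/residue-number/atom, which is my reading of the printed values rather than a documented format. The helper `residue_name` is hypothetical.

```python
from collections import Counter

# A few atom identifiers copied from the output above.
atom_ids = [
    "3QQK/0/A/HOH-321/O",
    "3QQK/0/A/HOH-419/O",
    "3QQK/0/A/THR-14/OG1",
    "3QQK/0/A/HIS-84/NE2",
    "3QQK/0/A/HIS-84/ND1",
]


def residue_name(atom_id):
    """Extract the residue name, e.g. 'HIS' from '3QQK/0/A/HIS-84/NE2'."""
    residue = atom_id.split("/")[3]
    # Split on the first hyphen only, so negative residue numbers still work.
    return residue.split("-", 1)[0]


counts = Counter(residue_name(atom_id) for atom_id in atom_ids)
for name, n in sorted(counts.items()):
    print(name, n)
```

In the full output above, most Acceptor/Donor groups come from water molecules (HOH), which is the kind of pattern such a tally makes visible.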


### Loop through each entry and atom group

In [27]:
for entry, atm_grp_mngr in proj_obj.neighborhoods:
    for atm_grp in atm_grp_mngr.atm_grps:
        print(entry, atm_grp, atm_grp.features, atm_grp.atoms)

<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/HIS-84/CG>]> [<Feature=Atom>] [<ExtendedAtom: 3QQK/0/A/HIS-84/CG>]
<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLY-13/C>]> [<Feature=Atom>] [<ExtendedAtom: 3QQK/0/A/GLY-13/C>]
<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/LYS-65/N>]> [<Feature=Donor>, <Feature=Atom>] [<ExtendedAtom: 3QQK/0/A/LYS-65/N>]
<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/PHE-146/N>, <ExtendedAtom: 3QQK/0/A/ASP-145/C>, <ExtendedAtom: 3QQK/0/A/ASP-145/O>]> [<Feature=Amide>] [<ExtendedAtom: 3QQK/0/A/PHE-146/N>, <ExtendedAtom: 3QQK/0/A/ASP-145/C>, <ExtendedAtom: 3QQK/0/A/ASP-145/O>]
<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLN-131/N>]> [<Feature=Donor>, <Feature=Atom>] [<ExtendedAtom: 3QQK/0/A/GLN-131/N>]
<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/ALA-31/CB>]> [<Feature=Hydrophobe>] [<ExtendedAtom: 3QQK/0/A/ALA-31/CB>]
<Co

## Interaction analysis

- To analyze the interactions present in each binding site derived from the entry objects, you can use the property **`interactions`**, which can be accessed through a project object.

> This property stores a list of tuples, in which each tuple contains the entry information and an interaction manager.

In [19]:
# Access the interaction manager in the first tuple in the interactions property.
dir(proj_obj.interactions[0][1])

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_interactions',
 'add_interactions',
 'filter_by_types',
 'find',
 'interactions',
 'remove_interactions',
 'size',
 'summary']

### Loop through a list of interactions

In [21]:
for inter in proj_obj.interactions[0][1].interactions:
    print(inter)

<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLN-85/CA>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-344/O>]>) type=Weak hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/LEU-134/CD2>, <ExtendedAtom: 3QQK/0/A/LEU-134/CD1>, <ExtendedAtom: 3QQK/0/A/LEU-134/CG>, <ExtendedAtom: 3QQK/0/A/LEU-134/CB>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/S9>, <ExtendedAtom: 3QQK/0/A/X02-497/C8>]>) type=Hydrophobic>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/ALA-31/CB>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/C5>, <ExtendedAtom: 3QQK/0/A/X02-497/N6>, <ExtendedAtom: 3QQK/0/A/X02-497/C7>, <ExtendedAtom: 3QQK/0/A/X02-497/C8>, <ExtendedAtom: 3QQK/0/A/X02-497/S9>]>) type=Weak hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/PHE-82/CD1>, <ExtendedAtom: 3QQK/0/A/PHE-82/CD2>, <ExtendedAtom: 3QQK/0/A/PHE-82/CE1>, <ExtendedAtom: 3QQK/0/A/PHE-82/CE2>, <ExtendedAtom: 3QQK/0/A/PHE-82/CG>, <ExtendedAto

### Filtering by interactions

In [22]:
for inter in proj_obj.interactions[0][1].filter_by_types(["Hydrogen bond"]):
    print(inter)

<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/N4>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/LEU-83/O>]>) type=Hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/LEU-83/N>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/N6>]>) type=Hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/N10>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/O18>]>) type=Hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-344/O>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/ASP-86/OD2>]>) type=Hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/N10>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLU-81/O>]>) type=Hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/ASP-86/N>]>, <AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-344/O>]>) type=Hydrogen bond>
<InteractionType: compounds=(<AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-321/O>]>, <AtomGrou

### Loop through each entry and interaction

In [23]:
# The property proj_obj.interactions returns tuples: entry and its interactions
for entry, inter_mngr in proj_obj.interactions:
    # inter_mngr.interactions contains all the interactions found for an entry, so you can loop over it.
    for i in inter_mngr.interactions:
        print(entry, i.src_grp, i.trgt_grp, i.type)
        print()

<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/GLN-85/CA>]> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/HOH-344/O>]> Weak hydrogen bond

<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/LEU-134/CD2>, <ExtendedAtom: 3QQK/0/A/LEU-134/CD1>, <ExtendedAtom: 3QQK/0/A/LEU-134/CG>, <ExtendedAtom: 3QQK/0/A/LEU-134/CB>]> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/S9>, <ExtendedAtom: 3QQK/0/A/X02-497/C8>]> Hydrophobic

<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/ALA-31/CB>]> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/X02-497/C5>, <ExtendedAtom: 3QQK/0/A/X02-497/N6>, <ExtendedAtom: 3QQK/0/A/X02-497/C7>, <ExtendedAtom: 3QQK/0/A/X02-497/C8>, <ExtendedAtom: 3QQK/0/A/X02-497/S9>]> Weak hydrogen bond

<CompoundEntry: 3QQK:A:X02:497> <AtomGroup: [<ExtendedAtom: 3QQK/0/A/PHE-82/CD1>, <ExtendedAtom: 3QQK/0/A/PHE-82/CD2>, <ExtendedAtom: 3QQK/0/A/PHE-82/CE1>, <ExtendedAtom: 3QQK/0/A/PHE-82/CE2>, <ExtendedAtom: 3QQK/0/A/PHE-82/CG>, <ExtendedAtom: 3QQK/0
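
Printing every interaction gets long quickly, so a count per interaction type is often a better first look. The sketch below uses a stand-in `Interaction` tuple that mirrors only the attribute names used in the loop above (`src_grp`, `trgt_grp`, `type`); it is not the library's class, and the group labels are shortened by hand.

```python
from collections import Counter, namedtuple

# Stand-in for the interaction objects; only the attribute names are mirrored.
Interaction = namedtuple("Interaction", ["src_grp", "trgt_grp", "type"])

interactions = [
    Interaction("GLN-85/CA", "HOH-344/O", "Weak hydrogen bond"),
    Interaction("LEU-134", "X02-497", "Hydrophobic"),
    Interaction("ALA-31/CB", "X02-497", "Weak hydrogen bond"),
    Interaction("X02-497/N4", "LEU-83/O", "Hydrogen bond"),
]


def count_by_type(inters):
    """Count how many interactions of each type are in the list."""
    return Counter(inter.type for inter in inters)


type_counts = count_by_type(interactions)
for inter_type, n in sorted(type_counts.items()):
    print("%-20s %d" % (inter_type, n))
```

With real results, the same function should work on `inter_mngr.interactions`, assuming `type` is hashable (a string, as the printed output suggests).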

## How to plot interactions as a Pymol session

- **First, create an `InteractionViewer` object.**

> This class has some parameters you can set, but in general I don't change them. You can try different things if you want. For example, you can set `add_directional_arrows` to False if you don't want to show arrows pointing in the interaction direction, or set `show_cartoon` to True if you want to see the cartoon as soon as the session is created.

In [24]:
from mol.interaction.view import InteractionViewer
piv = InteractionViewer()

- **Then, call `new_session()`, passing a list of tuples in which each tuple contains the entry information and an interaction manager. You also need to define a name for the Pymol session output.**

> **Important**: In the example below, all interaction tuples in the project object are used, so if the project has several entries, multiple binding sites will be plotted in the same Pymol session. However, as different PDB files may have different coordinate systems, each structure may be located in a different part of the space. That's why you may need to align the structures before plotting them in the Pymol session. If you need to do this, let me know and I can show you how.

In [25]:
pse_file = "tmp/output.pse"
print("Number of interactions to plot: %d" % len(proj_obj.interactions))
piv.new_session(proj_obj.interactions, pse_file)

Number of interactions to plot: 1
 PyMOL not running, entering library mode (experimental)
 Applying pse_export_version=1.800 compatibility


- **If you want, you can also plot only specific binding sites. Just pass a list containing only the tuples you want to analyze.**

In [26]:
# For example: let's say we want just the first entry.

interactions = proj_obj.interactions[0:1]
print("Number of interactions to plot: %d" % len(interactions))

pse_file = "tmp/output.pse"
piv.new_session(interactions, pse_file)

print("DONE!!!!")

Number of interactions to plot: 1
 Applying pse_export_version=1.800 compatibility
DONE!!!!
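
When a project has several entries, one option is to write one session file per binding site, which sidesteps the alignment issue mentioned earlier. The sketch below only builds the output filenames from entry strings; `session_path` is a hypothetical helper, and the naming scheme is an assumption of this example.

```python
from pathlib import Path


def session_path(entry_string, output_dir="tmp", sep=":"):
    """Build a .pse filename such as 'tmp/3QQK_A_X02_497.pse' from an entry string."""
    stem = "_".join(entry_string.strip().split(sep))
    return (Path(output_dir) / (stem + ".pse")).as_posix()


for entry_string in ["3QQK:A:X02:497", "3QL8:A:X01:300"]:
    print(session_path(entry_string))
```

In practice, you could loop over `proj_obj.interactions`, use `entry.to_string()` to build the filename, and pass a one-element list with the current tuple to `piv.new_session()`.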
