## VOXELIZER TUTORIAL

This notebook is a tutorial for getting started with `CNNs4QSPR`. It contains examples showing how each function runs and what output to expect. Work through it to get familiar with the repo so you can then run the package smoothly. Have fun!

Okay, let's start by loading a file and getting some cool protein stuff out of it.

### Modules you will need to import before you get started

In [2]:
import torch
import os
import sys
sys.path.append('..')
import numpy as np
import pandas as pd
import plotly.express as px
from biopandas.mol2 import PandasMol2
from biopandas.pdb import PandasPdb
from scipy.stats import special_ortho_group
from cnns4qspr import loader
from cnns4qspr import visualizer
from cnns4qspr import featurizer

In [64]:
myprotein_dict = loader.load_pdb('sample_pdbs/1a0iA01')
myprotein_dict.keys()

dict_keys(['x_coords', 'y_coords', 'z_coords', 'positions', 'atom_types', 'num_atoms', 'atom_type_set', 'num_atom_types', 'residues', 'residue_set', 'shifted_positions'])

In [55]:
# myprotein_dict

So we get the protein data in the form of a dictionary, and you can use this data to create the channels that are fed into the neural network. The dictionary contains protein data such as `positions`, `atom_types`, `residues`, etc.

In [56]:
shift_coordinates = loader.shift_coords(myprotein_dict)

We use the `shift_coords` function to place the protein so that its coordinates are centered in the field tensor. Sometimes the plot of the protein fed into the neural net does not come out, because the protein is too large to fit completely inside the field tensor. `shift_coords` shifts the coordinates by the mean of the extreme coordinates, which centers the protein and gives us the beautiful plots.
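To make the centering idea concrete, here is a minimal NumPy sketch of shifting coordinates by the midpoint of their extremes. It is only an illustration of the description above, not the package's code: the helper name `center_on_extremes` and the toy coordinates are hypothetical, and the actual `shift_coords` implementation may differ.

```python
import numpy as np

def center_on_extremes(positions):
    """Shift an (N, 3) array so the midpoint of its bounding box is at the origin.

    Hypothetical helper for illustration; not part of cnns4qspr.
    """
    positions = np.asarray(positions, dtype=float)
    # Midpoint between the largest and smallest coordinate on each axis
    midpoint = (positions.max(axis=0) + positions.min(axis=0)) / 2.0
    return positions - midpoint

# Toy coordinates standing in for atom positions
coords = np.array([[10.0, 2.0, -4.0],
                   [14.0, 6.0, 0.0],
                   [12.0, 3.0, -1.0]])
shifted = center_on_extremes(coords)
```

After the shift, the extremes on each axis are symmetric about zero, so the structure sits in the middle of any grid that is itself centered on the origin.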

In [67]:
field_dict = loader.make_fields(myprotein_dict)

In [58]:
make_ = loader.voxelize('sample_pdbs/1a0rP02', channels=['charged'])

Both `make_fields` and `voxelize` return the field tensor of the protein. `make_fields` takes the protein dictionary as input, while `voxelize` takes the protein file as input, but both of them output the field tensor of the protein.
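If the term "field tensor" is new, the toy example below may help. It places a Gaussian blob at each atom position on a regular 3D grid, which is one common way to turn point coordinates into a voxel field. This is an assumption for illustration only: the function `toy_field`, its parameters, and the Gaussian form are hypothetical and are not taken from `cnns4qspr`, whose fields may be built differently.

```python
import numpy as np

def toy_field(positions, grid_size=11, extent=5.0, sigma=1.0):
    """Build a (G, G, G) density grid from (N, 3) positions.

    Hypothetical helper for illustration; not part of cnns4qspr.
    """
    positions = np.asarray(positions, dtype=float)
    # Grid point coordinates along one axis, centered on the origin
    axis = np.linspace(-extent, extent, grid_size)
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1)  # shape (G, G, G, 3)

    field = np.zeros((grid_size,) * 3)
    for atom in positions:
        # Squared distance from this atom to every grid point
        sq_dist = np.sum((grid - atom) ** 2, axis=-1)
        field += np.exp(-sq_dist / (2.0 * sigma ** 2))
    return field

# Two toy atoms, 2 units apart along x
field = toy_field([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
```

Each selected channel (for example `'CA'` or `'charged'`) can be thought of as one such grid built from only the atoms belonging to that channel.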

In [69]:
afp = loader.atoms_from_residues(myprotein_dict, ['GLU'])

The `atoms_from_residues` function gives you the positions of the atoms in the given residues.
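Conceptually, this kind of selection is a boolean mask over per-atom residue labels. The sketch below shows the idea with plain NumPy; the helper `positions_for_residues` and the toy arrays are hypothetical, and it assumes one residue label per atom, which may not match how `atoms_from_residues` stores or returns its data.

```python
import numpy as np

def positions_for_residues(positions, residues, wanted):
    """Return the rows of `positions` whose residue label is in `wanted`.

    Hypothetical helper for illustration; not part of cnns4qspr.
    """
    positions = np.asarray(positions, dtype=float)
    # True for every atom whose residue name is one of the requested ones
    mask = np.isin(np.asarray(residues), wanted)
    return positions[mask]

# Toy data: three atoms with one residue label each
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
residues = ["GLU", "ALA", "GLU"]
glu_positions = positions_for_residues(positions, residues, ["GLU"])
```

Passing several residue names in the list selects atoms from all of them at once.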

In [60]:
plottable = visualizer.plot_field(field_dict['CA'], show=False)
plottable.show()

In [61]:
sys.path.append('../checkpoints')

In [62]:
cnn = featurizer.load_cnn('cnn_no_vae.ckpt')

In [63]:
visualizer.plot_internals(cnn, field_dict['CA'])