### SPACE2

This notebook shows how to cluster antibodies with SPACE2 and how to customise the clustering.

**Default run**

In [None]:
import glob
import SPACE2

antibody_models = glob.glob("path/to/antibody/models/*.pdb")

clustered_dataframe = SPACE2.agglomerative_clustering(antibody_models, cutoff=1.25)
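
Once the clustering has run, the resulting dataframe can be summarised with ordinary pandas operations. The sketch below works on a small mock dataframe rather than on real SPACE2 output. The column names `ID` and `structural_cluster` and the file names are assumptions made for illustration, so check them against the dataframe you actually get back.

```python
import pandas as pd

# Mock stand-in for the clustering output.
# ASSUMPTION: column names "ID" and "structural_cluster" are illustrative only.
clustered_dataframe = pd.DataFrame({
    "ID": ["ab1.pdb", "ab2.pdb", "ab3.pdb", "ab4.pdb"],
    "structural_cluster": ["ab1.pdb", "ab1.pdb", "ab3.pdb", "ab1.pdb"],
})

# number of antibodies per structural cluster, largest cluster first
cluster_sizes = clustered_dataframe["structural_cluster"].value_counts()

# members of each cluster, keyed by cluster label
cluster_members = (
    clustered_dataframe.groupby("structural_cluster")["ID"].apply(list).to_dict()
)
```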

**Specify CDRs for structural comparison and framework for structural alignment**

In [None]:
from SPACE2 import reg_def

# residues for structural comparison
cdr_selection = [reg_def['CDRH1'], reg_def['CDRH2'], reg_def['CDRH3']]
# residues for structural alignment
fw_selection = [reg_def['fwH']]

# any list of np.arrays of integers can be used for cdr_selection and fw_selection;
# the integers are the IMGT residue numbers of the residues to select

clustered_dataframe = SPACE2.agglomerative_clustering(
    antibody_models, selection=cdr_selection, anchors=fw_selection, cutoff=1.25
    )
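
Since the selections are plain lists of integer arrays, a custom selection can be built directly with numpy. The following is a minimal sketch for the heavy chain only, using the IMGT convention that CDR3 spans positions 105 to 117. The names `custom_cdrh3` and `custom_anchor` and the choice of flanking framework positions are hypothetical. How light-chain positions are encoded is not covered here and should be checked against `reg_def`.

```python
import numpy as np

# IMGT positions 105..117 (CDR3); np.arange excludes the stop value
custom_cdrh3 = np.arange(105, 118)

# HYPOTHETICAL anchor: framework positions directly flanking the CDR3 loop
custom_anchor = np.concatenate([np.arange(95, 105), np.arange(118, 128)])

# wrap each array in a list, matching the list-of-arrays form used above
cdr_selection = [custom_cdrh3]
fw_selection = [custom_anchor]
```

These lists could then be passed as `selection=cdr_selection` and `anchors=fw_selection`, in the same way as the `reg_def` entries in the cell above.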

**Distance calculation with dynamic time warping (DTW)**

DTW removes the requirement that all antibodies in a cluster have identical CDR lengths.
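
To illustrate why DTW can compare loops of different length, here is a minimal textbook DTW sketch on two arrays of 3D coordinates. It is not SPACE2's implementation: the function name `dtw_distance`, the toy coordinates, and the use of the raw accumulated cost (without any normalisation) are assumptions made for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Accumulated cost of the optimal DTW alignment between two
    coordinate arrays of shape (n, 3) and (m, 3)."""
    n, m = len(a), len(b)
    # Euclidean distance between every residue in a and every residue in b
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # extend the cheapest of the three allowed predecessor alignments
            acc[i, j] = pairwise[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m]

# toy "loops": loop_b repeats one position, so the lengths differ (3 vs 4)
loop_a = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=float)
loop_b = np.array([[0, 0, 0], [1, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=float)

distance = dtw_distance(loop_a, loop_b)
```

Because one position in `loop_a` may be matched to several positions in `loop_b`, the two arrays do not need to have the same length, which a plain position-by-position RMSD would require.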

In [None]:
import numpy as np

# specify the length tolerance for each CDR; antibodies can only share a cluster
# if their CDR lengths are within these tolerances
# here CDRH1 tolerance is 2, CDRH2: 2, CDRH3: 5, CDRL1: 2, CDRL2: 2, CDRL3: 2
length_tolerance = np.array([2,2,5,2,2,2])
clustered_dataframe = SPACE2.agglomerative_clustering(
    antibody_models, d_metric='dtw', length_tolerance=length_tolerance,
    )

# dtw can also be run with a customised CDR selection
cdr_selection = [reg_def['CDRH1'], reg_def['CDRH2'], reg_def['CDRH3']]
# here length_tolerance gives one tolerance per CDR,
# in the same order as the CDRs in cdr_selection
length_tolerance = np.array([2,2,5])
clustered_dataframe = SPACE2.agglomerative_clustering(
    antibody_models, selection=cdr_selection, d_metric='dtw', length_tolerance=length_tolerance,
    )
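
The effect of a per-CDR length tolerance can be sketched as a simple pairwise check. This is only one plausible reading of the tolerance and not necessarily how SPACE2 applies it internally. The helper `within_length_tolerance` and the example CDR lengths are hypothetical.

```python
import numpy as np

def within_length_tolerance(lengths_a, lengths_b, length_tolerance):
    """True if every CDR length difference is within its tolerance.
    HYPOTHETICAL helper, not part of SPACE2."""
    diff = np.abs(np.asarray(lengths_a) - np.asarray(lengths_b))
    return bool(np.all(diff <= np.asarray(length_tolerance)))

# tolerances for CDRH1, CDRH2, CDRH3, in that order
length_tolerance = np.array([2, 2, 5])

# made-up CDR lengths (CDRH1, CDRH2, CDRH3) for three antibodies
ab1 = [8, 8, 12]
ab2 = [8, 10, 16]   # differences to ab1: 0, 2, 4
ab3 = [8, 8, 18]    # differences to ab1: 0, 0, 6

ab1_ab2_compatible = within_length_tolerance(ab1, ab2, length_tolerance)
ab1_ab3_compatible = within_length_tolerance(ab1, ab3, length_tolerance)
```

Under this reading, `ab1` and `ab2` could end up in the same cluster, while `ab1` and `ab3` could not, because their CDRH3 lengths differ by more than 5.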

**Customise clustering algorithm**

In [None]:
# greedy clustering
clustered_dataframe = SPACE2.greedy_clustering(antibody_models, cutoff=1.25)

# k-means clustering, or any other clustering algorithm from scikit-learn
# (KMeans() with no arguments uses scikit-learn's defaults, e.g. n_clusters=8)
from sklearn.cluster import KMeans
algorithm = KMeans()
clustered_dataframe = SPACE2.cluster_with_algorithm(algorithm, antibody_models)
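
For intuition about what a greedy scheme does with a distance cutoff, the sketch below clusters a small precomputed distance matrix. It is a generic greedy procedure and may differ from what `SPACE2.greedy_clustering` does internally. The function `greedy_cluster` and the toy distance matrix are assumptions made for illustration.

```python
import numpy as np

def greedy_cluster(distances, cutoff):
    """Generic greedy clustering on a square distance matrix.
    Repeatedly picks the unassigned item with the most unassigned
    neighbours within `cutoff` and groups them into one cluster.
    Returns, for each item, the index of its cluster representative."""
    n = distances.shape[0]
    labels = np.full(n, -1)
    unassigned = np.ones(n, dtype=bool)
    while unassigned.any():
        # neighbour relation restricted to items not yet clustered
        within = (distances <= cutoff) & unassigned[None, :] & unassigned[:, None]
        representative = int(np.argmax(within.sum(axis=1)))
        members = within[representative]
        labels[members] = representative
        unassigned &= ~members
    return labels

# toy pairwise distances: items 0-2 are close, item 3 is far from all
distances = np.array([
    [0.0, 0.5, 0.9, 3.0],
    [0.5, 0.0, 0.6, 3.1],
    [0.9, 0.6, 0.0, 2.8],
    [3.0, 3.1, 2.8, 0.0],
])

labels = greedy_cluster(distances, cutoff=1.25)
```

Unlike agglomerative clustering, a greedy scheme of this kind only guarantees that every member is within the cutoff of its representative, not that all members are within the cutoff of each other.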