# The Distance Matrix

The pairwise distance matrix $D$ is an $n \times n$ matrix, where $n$ is the number of atoms in the RNA molecule and each entry $d_{i, j}$ is the Euclidean distance between atoms $i$ and $j$. Since $d_{i, j} = d_{j, i}$, the matrix is symmetric, and its diagonal is zero because $d_{i, i} = 0$.

In [14]:
from Bio.PDB import PDBParser
from utils import get_coordinates_from_structure
from sklearn.metrics import pairwise_distances

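# Parse the PDB file into a Biopython structure object
structure = PDBParser().get_structure("4xw7", "4xw7.pdb")
# Local helper from utils: collects the atom coordinates, one row per atom
atoms = get_coordinates_from_structure(structure)

# The default metric of pairwise_distances is Euclidean
d = pairwise_distances(atoms)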
d.shape

(1473, 1473)
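The symmetry and zero diagonal of a distance matrix can be checked on a small example. The sketch below uses only NumPy and three made-up points instead of the real structure, so the coordinates and variable names are illustrative assumptions, not values taken from `4xw7.pdb`.

```python
import numpy as np

# Hypothetical coordinates standing in for three atom positions
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 12.0],
])

# Broadcasting builds every pairwise difference vector, shape (n, n, 3)
diff = coords[:, None, :] - coords[None, :, :]
# Euclidean norm along the last axis gives the (n, n) distance matrix
dist = np.sqrt((diff ** 2).sum(axis=-1))

is_symmetric = np.allclose(dist, dist.T)
has_zero_diagonal = np.allclose(np.diag(dist), 0.0)
print(dist.shape, is_symmetric, has_zero_diagonal)
```

Because the matrix is symmetric, only the upper (or lower) triangle carries independent information, which can be useful when storage matters.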

Since using every atom in the molecule produces very large matrices, nucleotides are often reduced to coarse-grained representations made of a few beads. The number and positions of the beads vary: a bead can correspond to one or several atoms, or to the centroid of a group of atoms.

Keeping only the phosphorus atoms (atom name `P`) of the molecule, we have:

In [15]:
# Restrict the selection to atoms named "P"
atoms = get_coordinates_from_structure(structure, ["P"])

d = pairwise_distances(atoms)
d.shape

(65, 65)
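The centroid option mentioned earlier can be sketched in the same spirit. The example below is a minimal illustration, assuming the atom coordinates and a per-atom residue identifier are already available as arrays. The function `centroid_beads` and the toy data are hypothetical and are not part of `utils` or Biopython.

```python
import numpy as np

def centroid_beads(coords, residue_ids):
    """Return one bead per residue: the mean position of its atoms.

    Hypothetical helper for illustration only.
    """
    coords = np.asarray(coords, dtype=float)
    residue_ids = np.asarray(residue_ids)
    # inverse[k] is the index of the bead that atom k belongs to
    unique_ids, inverse = np.unique(residue_ids, return_inverse=True)
    sums = np.zeros((len(unique_ids), coords.shape[1]))
    # Accumulate the coordinates of each atom into its residue's row
    np.add.at(sums, inverse, coords)
    counts = np.bincount(inverse, minlength=len(unique_ids))
    return sums / counts[:, None]

# Toy data: four atoms belonging to two residues (made-up values)
toy_coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 6.0, 0.0]]
toy_residues = [1, 1, 2, 2]

beads = centroid_beads(toy_coords, toy_residues)

# Distance matrix between beads, computed with broadcasting
diff = beads[:, None, :] - beads[None, :, :]
bead_dist = np.sqrt((diff ** 2).sum(axis=-1))
print(beads.shape, bead_dist.shape)
```

With one bead per residue, the distance matrix has one row and one column per nucleotide, as in the phosphorus-only case.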