# Detection of rigid Components in Graphs applied to Proteins


Project on rigidity in RNA and protein structures for the Advanced Methods in Bioinformatics course

Thank you for using our little pebble game and component-detection tool!

We designed it to determine rigid components in proteins and other graphs.
The pebble game and component-detection algorithms are based on the research paper "Pebble game algorithms and sparse graphs"
by Lee and Streinu.
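
For orientation: the pebble game in that paper decides (k,l)-sparsity. A graph is (k,l)-sparse if every subset of n' vertices spans at most k*n' - l edges, and tight if it also has exactly k*n - l edges in total. The brute-force sketch below checks this counting condition directly. It is only an illustration for tiny graphs (its running time is exponential), it is not code from this repository, and the names `is_kl_sparse` and `is_kl_tight` are hypothetical. It assumes a loop-free graph and 0 <= l < 2k, so only subsets with at least two vertices need to be checked.

```python
from itertools import combinations


def is_kl_sparse(edges, k, l):
    """Brute force: every vertex subset of size n' >= 2 spans at most k*n' - l edges."""
    vertices = sorted({v for edge in edges for v in edge})
    for size in range(2, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            # Count the edges with both endpoints inside the subset.
            spanned = sum(1 for u, v in edges if u in chosen and v in chosen)
            if spanned > k * size - l:
                return False
    return True


def is_kl_tight(edges, k, l):
    """Tight = sparse and exactly k*n - l edges overall."""
    n = len({v for edge in edges for v in edge})
    return is_kl_sparse(edges, k, l) and len(edges) == k * n - l


triangle = [("A", "B"), ("B", "C"), ("C", "A")]
k4 = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]

print(is_kl_tight(triangle, 2, 3))  # True: 3 edges = 2*3 - 3
print(is_kl_sparse(k4, 2, 3))       # False: 6 edges > 2*4 - 3
```

The pebble game reaches the same verdict without enumerating subsets, which is what makes it usable on protein-sized graphs.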

Additionally, we built a pipeline that detects rigid components in proteins using the Graphein library.

To run just a basic pebble game, use the function in generic_pebblegame.py.
To run a component detection, use the functions in pebblegame.py.

Functions that wrap rigid-component detection, protein-graph building, and visualization are in PDB_to_Graphein.py.

Use UI.py for a quick-start component detection.

## A little introduction to the main functions of PDB_to_Graphein.py:

pdb_to_graph(path, only_covalent=True, gran="atom")

- output: NetworkX MultiGraph

- loads a PDB file and converts it to a graph
- setting only_covalent to False also takes side-chain interactions, such as hydrogen bonds or ionic interactions, into account and adds them as edges
- gran="atom" makes the atoms of the protein the nodes. Change to gran="centroid" to make the amino acids the nodes.

find_components(G, k=5, l=6)

- output: list of components; each component is a set of edges, and each edge is stored as a frozenset

- performs a (k,l)-pebble-game-based detection of rigid components
- the defaults k=5 and l=6 run the (k,l)-pebble game for 3D protein graphs
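
As a reading aid for the return value, the sketch below builds a list in that shape by hand and collects the nodes of each component. The node names and the helper `component_nodes` are made up for illustration; they are not produced by or part of `find_components`.

```python
# Hand-written stand-in for a find_components result (illustrative values only).
components = [
    {frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"a", "c"})},
    {frozenset({"c", "d"})},
]


def component_nodes(component):
    """Return the set of all nodes touched by the edges of one component."""
    return set().union(*component)


for index, component in enumerate(components):
    # Print the component index, its nodes, and its number of edges.
    print(index, sorted(component_nodes(component)), len(component))
```

Note that node "c" appears in both components here: components are sets of edges, so two components can share a node.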

assign_components(G, components)

- output: NetworkX MultiGraph with the node attribute component, set according to the input components list

- the components list only makes sense as input if it was computed from the same graph G that is passed to this function
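
The note above can be made concrete with a small stand-alone sketch: before labelling nodes, check that every component edge really is an edge of the graph. This is not the body of `assign_components`. The function `label_nodes`, the plain edge-list representation of the graph, and the rule that a node shared by several components keeps the first label are all assumptions made for illustration.

```python
def label_nodes(graph_edges, components):
    """Map node -> component index, refusing components that come from another graph."""
    known = {frozenset(edge) for edge in graph_edges}
    labels = {}
    for index, component in enumerate(components):
        for edge in component:
            if edge not in known:
                raise ValueError(f"edge {sorted(edge)} is not in the graph")
            for node in edge:
                # Assumption: a node shared by several components keeps its first label.
                labels.setdefault(node, index)
    return labels


graph_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
components = [
    {frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"a", "c"})},
    {frozenset({"c", "d"})},
]

labels = label_nodes(graph_edges, components)
print(sorted(labels.items()))
```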

print_attributes(G)

- output: pandas DataFrame with the node attributes

print_component_dataframe(component_list)

- output: pandas DataFrame with the components as index, one column with the nodes and one with the edges contained in each component

## Example Pipeline to perform a Component-Detection based on a PDB-File

In [3]:
from graphein.protein.visualisation import plotly_protein_structure_graph
import PDB_to_Graphein as pdg

path = 'pdb_samples/2mgo.pdb'

G = pdg.pdb_to_graph(path, only_covalent=False, gran="atom")
comp = pdg.find_components(G)
G = pdg.assign_components(G, comp)
attr_df = pdg.print_attributes(G)
comp_df = pdg.print_component_dataframe(comp)




Edge already in rigid component identified
...

In [5]:
p = plotly_protein_structure_graph(
    G,
    colour_edges_by="component",
    colour_nodes_by="component",
    label_node_ids=False,
    plot_title="Peptide backbone graph. Nodes coloured by components",
    node_size_multiplier=1
)
p.show()

In [6]:
attr_df

| | chain_id | residue_name | residue_number | atom_type | element_symbol | coords | b_factor | meiler | component |
|---|---|---|---|---|---|---|---|---|---|
| A:CYS:1:N | A | CYS | 1 | N | N | [4.874, 2.855, 0.366] | 0.0 | dim_1 1.77 dim_2 0.13 dim_3 2.43 dim_... | 0 |
| A:CYS:1:CA | A | CYS | 1 | CA | C | [3.791, 1.923, 0.075] | 0.0 | dim_1 1.77 dim_2 0.13 dim_3 2.43 dim_... | 0 |
| A:CYS:1:C | A | CYS | 1 | C | C | [4.157, 0.518, 0.5] | 0.0 | dim_1 1.77 dim_2 0.13 dim_3 2.43 dim_... | 0 |
| A:CYS:1:O | A | CYS | 1 | O | O | [5.232, 0.286, 1.045] | 0.0 | dim_1 1.77 dim_2 0.13 dim_3 2.43 dim_... | 0 |
| A:CYS:1:CB | A | CYS | 1 | CB | C | [3.435, 1.946, -1.418] | 0.0 | dim_1 1.77 dim_2 0.13 dim_3 2.43 dim_... | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| A:GLY:9:N | A | GLY | 9 | N | N | [-3.406, 8.654, -2.333] | 0.0 | dim_1 0.00 dim_2 0.00 dim_3 0.00 dim_... | 0 |
| A:GLY:9:CA | A | GLY | 9 | CA | C | [-3.689, 10.034, -2.161] | 0.0 | dim_1 0.00 dim_2 0.00 dim_3 0.00 dim_... | 0 |
| A:GLY:9:C | A | GLY | 9 | C | C | [-3.647, 10.353, -0.712] | 0.0 | dim_1 0.00 dim_2 0.00 dim_3 0.00 dim_... | 0 |
| A:GLY:9:O | A | GLY | 9 | O | O | [-3.337, 11.487, -0.337] | 0.0 | dim_1 0.00 dim_2 0.00 dim_3 0.00 dim_... | 0 |


In [7]:
comp_df

| | Nodes | Edges |
|---|---|---|
| 1 | {'A:PRO:7:CB', 'A:PRO:7:CA', 'A:PRO:7:CG', 'A:... | [['A:PRO:7:N', 'A:PRO:7:CD'], ['A:PRO:7:CG', '... |
| 2 | {'A:TYR:2:CD1', 'A:TYR:2:CZ', 'A:TYR:2:CE2', '... | [['A:TYR:2:CD1', 'A:TYR:2:CE2'], ['A:TYR:2:CZ'... |


## Basic Pebble Game with a simple example

In [10]:
import generic_pebblegame as gp
import networkx as nx

In [11]:
sample09 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "A"), ("A", "G"), ("B", "H"), ("C", "H"),
            ("D", "H"), ("E", "G"), ("F", "G"), ("G", "H"),("A","I"),("B","I")]
sample09_graph = nx.from_edgelist(sample09)

In [29]:
gp.generic_pebblegame(sample09_graph, 2, 3)

well-constraint; l pebbles remain. no edge has been left out
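
To show what such a run checks, here is a compact, dependency-free sketch of the basic (k,l)-pebble game as described by Lee and Streinu. It is not the code in generic_pebblegame.py: the function name `pebble_game_sketch`, its plain edge-list input, and its return values are hypothetical, and it assumes a loop-free graph with 0 <= l < 2k. Each vertex starts with k pebbles, and an edge is accepted only if l+1 pebbles can be gathered on its two endpoints.

```python
def pebble_game_sketch(edges, k, l):
    """Play the basic (k,l)-pebble game; return (accepted, rejected, pebbles_left)."""
    vertices = {v for edge in edges for v in edge}
    pebbles = {v: k for v in vertices}  # free pebbles on each vertex
    out = {v: [] for v in vertices}     # out[x]: heads of accepted edges directed x -> y

    def fetch_pebble(root, blocked):
        """Depth-first search from root for a free pebble, never entering `blocked`."""
        parent = {root: None}
        stack = [root]
        while stack:
            x = stack.pop()
            for y in out[x]:
                if y == blocked or y in parent:
                    continue
                parent[y] = x
                if pebbles[y] > 0:
                    # Found one: reverse the path root -> ... -> y to bring it back.
                    pebbles[y] -= 1
                    node = y
                    while parent[node] is not None:
                        prev = parent[node]
                        out[prev].remove(node)  # edge prev -> node ...
                        out[node].append(prev)  # ... becomes node -> prev
                        node = prev
                    pebbles[root] += 1
                    return True
                stack.append(y)
        return False

    accepted, rejected = [], []
    for u, v in edges:
        # Try to gather l+1 pebbles on the two endpoints together.
        while pebbles[u] + pebbles[v] < l + 1:
            if not (fetch_pebble(u, v) or fetch_pebble(v, u)):
                break
        if pebbles[u] + pebbles[v] >= l + 1:
            # Cover the new edge with one pebble from an endpoint that has one.
            tail, head = (u, v) if pebbles[u] > 0 else (v, u)
            pebbles[tail] -= 1
            out[tail].append(head)
            accepted.append((u, v))
        else:
            rejected.append((u, v))
    return accepted, rejected, sum(pebbles.values())


sample09 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "A"),
            ("A", "G"), ("B", "H"), ("C", "H"), ("D", "H"), ("E", "G"), ("F", "G"),
            ("G", "H"), ("A", "I"), ("B", "I")]

accepted, rejected, remaining = pebble_game_sketch(sample09, 2, 3)
print(len(accepted), len(rejected), remaining)
```

For this graph the sketch should accept all 15 edges and leave 3 = l pebbles, which agrees with the message printed above: 9 vertices with 2 pebbles each give 18 pebbles, and 15 of them end up covering edges.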
