# **Generating Protein Structure Networks from PDB files Using the RING server**
---

![1yok_workflow.png](attachment:3df0ad85-c1c9-421e-9ea7-c5923d525162.png)

<font size='4'>The method used in this project for generating protein structure networks starts from a folder or list of Protein Data Bank (PDB) structure files. A PDB file is a standard text format for three-dimensional molecular structures, primarily of proteins and nucleic acids. Each file records the atomic coordinates of the structure together with per-atom details such as residue and chain identifiers, occupancy, and temperature (B) factors. These files are widely used in structural biology, bioinformatics, and drug discovery research. The format is standardized by the Worldwide Protein Data Bank (wwPDB), of which the [RCSB PDB](https://www.rcsb.org/) is a member, so that the same file can be read by different computational tools and databases. The cell below loads one such file into a table to show what it contains:</font>

In [None]:
import pandas as pd
from Bio.PDB import PDBParser

def pdb_to_dataframe(pdb_file):
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('structure', pdb_file)

    data = []
    for model in structure:
        for chain in model:
            for residue in chain:
                for atom in residue:
                    data.append([
                        model.id,
                        chain.id,
                        residue.id[1],
                        residue.resname,
                        atom.name,
                        atom.coord[0],
                        atom.coord[1],
                        atom.coord[2],
                        atom.occupancy,
                        atom.bfactor
                    ])

    columns = [
        'Model', 'Chain', 'Residue_Number', 'Residue_Name',
        'Atom_Name', 'X', 'Y', 'Z', 'Occupancy', 'B_Factor'
    ]

    df = pd.DataFrame(data, columns=columns)
    return df

# Example usage:
pdb_file = 'structure_files/1yok.pdb'
pdb_df = pdb_to_dataframe(pdb_file)
pdb_df

Unnamed: 0,Model,Chain,Residue_Number,Residue_Name,Atom_Name,X,Y,Z,Occupancy,B_Factor
0,0,A,300,SER,N,28.985001,54.266998,34.372002,1.0,88.63
1,0,A,300,SER,CA,27.739000,54.654999,33.648998,1.0,88.62
2,0,A,300,SER,C,27.105000,53.480999,32.896999,1.0,89.53
3,0,A,300,SER,O,27.492001,53.158001,31.774000,1.0,90.12
4,0,A,300,SER,CB,28.028999,55.799000,32.671001,1.0,87.83
...,...,...,...,...,...,...,...,...,...,...
2589,0,N,1,UNK,I,26.409000,23.811001,36.709000,0.0,0.00
2590,0,N,1,UNK,S,30.770000,21.639999,30.334999,0.0,0.00
2591,0,N,1,UNK,H,23.517000,23.597000,28.778000,0.0,0.00
2592,0,N,1,UNK,HN,34.470001,26.077999,34.348999,1.0,0.00
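
The columns in the table above come from fixed-width `ATOM`/`HETATM` records in the PDB file. As a minimal sketch of that layout, the snippet below rebuilds one record from the first row of the table and slices the fields back out by column position. The exact text of the line in `1yok.pdb` is an assumption (the record is assembled from the displayed values, with coordinates rounded to three decimals), and `parse_atom_record` is a hypothetical helper, not part of Biopython.

```python
def parse_atom_record(line):
    """Slice one fixed-width ATOM/HETATM record into named fields."""
    return {
        "record": line[0:6].strip(),        # columns 1-6
        "serial": int(line[6:11]),          # columns 7-11
        "atom_name": line[12:16].strip(),   # columns 13-16
        "res_name": line[17:20].strip(),    # columns 18-20
        "chain": line[21],                  # column 22
        "res_seq": int(line[22:26]),        # columns 23-26
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
        "occupancy": float(line[54:60]),    # columns 55-60
        "b_factor": float(line[60:66]),     # columns 61-66
        "element": line[76:78].strip(),     # columns 77-78
    }

# Assemble the record field by field so every column width is explicit.
example_line = (
    f"{'ATOM':<6}"                               # record name
    f"{1:>5}"                                    # atom serial number
    " "
    f"{' N':<4}"                                 # atom name
    " "                                          # alternate location (blank)
    "SER"                                        # residue name
    " "
    "A"                                          # chain identifier
    f"{300:>4}"                                  # residue sequence number
    " "                                          # insertion code (blank)
    "   "
    f"{28.985:8.3f}{54.267:8.3f}{34.372:8.3f}"   # x, y, z
    f"{1.0:6.2f}{88.63:6.2f}"                    # occupancy, B-factor
    f"{'':10}"
    f"{'N':>2}"                                  # element symbol
)

atom = parse_atom_record(example_line)
print(example_line)
print(atom)
```

In practice `PDBParser` handles these details (and many edge cases), so the sketch is only meant to show where each DataFrame column originates.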


<font size='4'>The [Residue Interaction Network Generator](https://ring.biocomputingup.it/) (RING) server is a computational tool designed to analyze and visualize residue interactions in proteins. It uses protein structure data to create residue interaction networks (RINs), which represent the network of contacts and interactions between amino acid residues within a protein structure.</font>

## **Key Features and Uses:**

#### 1. **Network Visualization:** The RING server visualizes residue interactions in a protein as a network, where nodes represent residues and edges represent interactions (e.g., hydrogen bonds, van der Waals contacts).

#### 2. **Interaction Analysis:** It quantifies and categorizes different types of residue interactions within the protein structure.

#### 3. **Comparative Analysis:** It allows for the comparison of residue interaction networks between different protein structures or between different conformations of the same protein.

#### 4. **Dynamic Analysis:** It can analyze and visualize changes in residue interactions in molecular dynamics simulations or upon ligand binding.

#### 5. **Structural Bioinformatics:** The server provides insights into the structural basis of protein function, stability, and dynamics.

#### 6. **Predictive Modeling:** It can aid in predicting the effects of mutations or protein engineering on residue interactions and stability.

## **Why is it Useful for Research?**

<font size='4'>Protein structure networks are essential tools for researchers because they provide a detailed map of interactions between amino acids within a protein. This information is fundamental for understanding protein function, including enzymatic activity, molecular recognition, and signaling pathways. These networks also aid in predicting important protein properties such as stability, flexibility, and solubility, which are crucial for protein engineering and drug design. By identifying key residues and interactions, researchers can pinpoint critical sites that stabilize the protein structure or mediate specific functional roles, guiding both experimental studies and computational predictions. Comparative analysis of protein structures helps to reveal evolutionary relationships, structural motifs, and functional similarities or differences among proteins. Additionally, protein structure networks facilitate structure-based drug design by identifying potential targets for therapeutic agents. They also support molecular dynamics and simulation studies, enabling researchers to predict how mutations or environmental changes affect protein stability and function. Moreover, these networks allow for the systematic analysis of protein-protein interactions, elucidating how proteins form complexes and cooperate in biological processes. In the field of structural bioinformatics, protein structure networks serve as a foundation for developing algorithms and tools for protein structure prediction, analysis, and visualization, thereby advancing research in molecular biology and biophysics.</font>

<font size='4'>In summary, the RING server is a powerful tool in structural bioinformatics that facilitates the analysis, visualization, and comparison of residue interaction networks within protein structures, contributing significantly to research in molecular biology, biophysics, and drug discovery.</font>

### User-defined functions for creating folders and converting PDB files to CIF:

In [None]:
import os
import subprocess

# Function to verify or create folder
def ensure_folder_exists(directory_path):
    if not os.path.exists(directory_path):
        os.makedirs(directory_path)
        print(f"Folder '{directory_path}' created.")
    else:
        print(f"Folder '{directory_path}' already exists.")

# Function for converting pdb files to cif files (requires the pymol executable on PATH)
def convert_pdb_to_cif(pdb_file, cif_file):
    command = [
        "pymol", "-cq", pdb_file,
        "-d", f"save {cif_file}, state=1; quit;"
    ]
    try:
        subprocess.run(command, check=True)
        print(f"Conversion successful: {pdb_file} -> {cif_file}")
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        # FileNotFoundError is raised when the pymol executable cannot be found
        print(f"Error during conversion: {e}")

# Generate Directory of .cif Files from Directory of Selected PDB Files:
---

In [None]:
import os
import subprocess

# Directories
input_dir = "structure_files/"
output_dir = "cif_files/"

# Ensure directories exist
ensure_folder_exists(input_dir)
ensure_folder_exists(output_dir)

# Loop over all pdb files in the input directory
for filename in os.listdir(input_dir):
    if filename.endswith(".pdb"):
        pdb_file_path = os.path.join(input_dir, filename)
        cif_file_name = os.path.splitext(filename)[0] + ".cif"
        cif_file_path = os.path.join(output_dir, cif_file_name)

        convert_pdb_to_cif(pdb_file_path, cif_file_path)
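
Because `convert_pdb_to_cif` shells out to the `pymol` executable, it can be worth checking once, before looping over the files, that the executable is actually on the `PATH`. The sketch below uses `shutil.which` from the standard library; `tool_available` is a hypothetical helper name introduced here for illustration.

```python
import shutil

def tool_available(executable):
    """Return True if `executable` can be located on the current PATH."""
    return shutil.which(executable) is not None

if tool_available("pymol"):
    print("pymol found; the conversion loop can proceed.")
else:
    print("pymol not found on PATH; install PyMOL or adjust PATH before converting.")
```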


# RING Server API:
---

In [None]:
module_code = '''

import requests
import os
import time
from threading import Thread
import shutil

list_task_url = "https://scheduler.biocomputingup.it/task/"
list_script_url = "https://scheduler.biocomputingup.it/script/"
list_params_url = "https://scheduler.biocomputingup.it/params/"

class Status:
    statusMap = {
        "task has been rejected from the ws": "failed",
        "task has been received from the ws": "pending",
        "task has been created and sent to the DRM": "pending",
        "process status cannot be determined": "pending",
        "job is queued and active": "running",
        "job is queued and in system hold": "running",
        "job is queued and in user hold": "running",
        "job is queued and in user and system hold": "running",
        "job is running": "running",
        "job is system suspended": "pending",
        "job is user suspended": "pending",
        "job finished normally": "success",
        "job finished, but failed": "failed",
        "job has been deleted": "deleted"
    }

    def __init__(self, status):
        self.status = self.decode_status(status)

    def __repr__(self):
        return self.status

    def __eq__(self, other):
        return self.status == other

    def decode_status(self, status_long):
        return self.statusMap[status_long]


class Task:
    _status: [Status, None] = None
    _uuid: [str, None] = None

    def __init__(self, uuid=None, status=None):
        self.uuid = uuid
        self.status = status

    def __repr__(self) -> str:
        return "{} - {}".format(self.uuid, self.status)

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, status):
        self._status = Status(status) if status is not None else status

    @property
    def uuid(self):
        return self._uuid

    @uuid.setter
    def uuid(self, uuid):
        self._uuid = uuid

    def is_finished(self) -> bool:
        return self._status == "failed" or self._status == "deleted" or self._status == "success"


def check_for_job(task):
    try:
        job_url = "{}/{}".format(list_task_url, task.uuid)

        while not task.is_finished():
            response = requests.get(job_url, timeout=5)
            response.raise_for_status()
            task.status = response.json()["status"]
            if not task.is_finished():
                time.sleep(3)

    except requests.exceptions.RequestException as err:
        return err


def post_job(task, file_pth, params):
    try:
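        # Open the structure file in a context manager so the handle is closed after the upload
        with open(file_pth, 'rb') as input_file:
            files = {'input_file': input_file}
            response = requests.post(list_task_url, files=files, data=params, timeout=5)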
        response.raise_for_status()
        task.uuid = response.json()["uuid"]
        task.status = response.json()["status"]

    except requests.exceptions.RequestException as err:
        return err


def download_results(task, extract_pth):
    if task.status == "failed":
        return
    try:
        output_url = "{}/{}/{}".format(list_task_url, task.uuid, "download")

        response = requests.get(output_url, timeout=5)
        response.raise_for_status()
        file_name = response.headers["content-disposition"].split("filename=")[1]
        file_pth = "{}/{}".format(extract_pth, file_name)

        with open(file_pth, "wb") as f:
            f.write(response.content)

        shutil.unpack_archive(file_pth, extract_pth)
        os.remove(file_pth)

    except requests.exceptions.RequestException as err:
        return err


def config_to_parameters(config):
    convert = {
        "-g": "seq_sep",
        "-o": "len_salt",
        "-s": "len_ss",
        "-k": "len_pipi",
        "-a": "len_pica",
        "-b": "len_hbond",
        "-w": "len_vdw"
    }

    new_config = {}

    for key, value in config.items():
        if key in convert:
            new_config[convert[key]] = value.strip("--")

    new_config[config["edges"].strip("--")] = True

    return new_config


def run_ring_api(file_pth, run_config, tmp_dir, progress_f):

    task = Task()

    file_name = os.path.basename(file_pth)

    parameters = {
        "task_name": "ring-plugin-api",
        "original_name": file_name
    }

    parameters.update(config_to_parameters(run_config))

    t_post_job = Thread(target=post_job, args=(task, file_pth, parameters))
    t_post_job.start()

    prev_progress = 0
    while t_post_job.is_alive():
        progress_f(min([prev_progress, 15]))
        prev_progress += 0.01

    t_check_job = Thread(target=check_for_job, args=(task,))
    t_check_job.start()

    prev_progress = 15
    timer = time.time() - 5
    while t_check_job.is_alive():
        if time.time() - timer > 5:
            timer = time.time()

        progress_f(min([prev_progress, 85]))
        prev_progress += 0.00001

    t_download_results = Thread(target=download_results, args=(task, tmp_dir))
    t_download_results.start()

    prev_progress = 85

    while t_download_results.is_alive():
        progress_f(min([prev_progress, 100]))
        prev_progress += 0.01

    progress_f(100)
'''

with open('ring_api_module.py', 'w') as file:
    file.write(module_code)
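
The `Status` class in the module collapses the scheduler's long status messages into a few coarse states, and `Task.is_finished` treats `failed`, `deleted`, and `success` as terminal. A condensed sketch of that logic, using a subset of the messages from `statusMap` (the names `STATUS_MAP`, `decode`, and `is_terminal` are illustrative and not part of the module):

```python
# Subset of the long-message -> coarse-state mapping used by the module
STATUS_MAP = {
    "task has been received from the ws": "pending",
    "job is queued and active": "running",
    "job is running": "running",
    "job finished normally": "success",
    "job finished, but failed": "failed",
    "job has been deleted": "deleted",
}

# States after which polling should stop
TERMINAL_STATES = {"success", "failed", "deleted"}

def decode(status_long):
    """Translate a long scheduler message into its coarse state."""
    return STATUS_MAP[status_long]

def is_terminal(status_long):
    """Return True when the decoded state means the job will not change again."""
    return decode(status_long) in TERMINAL_STATES

for message in STATUS_MAP:
    print(f"{message!r:40} -> {decode(message):8} terminal={is_terminal(message)}")
```

As in the module, a message that is missing from the mapping raises `KeyError` rather than being treated as pending.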

# Run RING API using Strict Thresholds:
---

In [None]:
import os
from ring_api_module import run_ring_api

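_last_reported = -1

def progress(p):
    # run_ring_api calls this callback in a tight loop, so print only
    # when the whole-number percentage changes to avoid flooding the output
    global _last_reported
    if int(p) != _last_reported:
        _last_reported = int(p)
        print(f"Progress: {p:.02f}%")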

dir_name = 'cif_files/'
crnt_dir = os.getcwd()
dir_path = os.path.join(crnt_dir, dir_name)
file_names = os.listdir(dir_path)

'''
-g : sequence separation
-o : salt bridge (ionic bond)
-s : disulphide bond distance
-k : pi-pi stacking distance
-a : pi-cation distance
-b : hydrogen bond distance
-w : van der Waals radii
'''

run_config = {
    "-g": '3',
    "-o": '4.0',
    "-s": '2.5',
    "-k": '6.5',
    "-a": '5.0',
    "-b": '3.5',
    "-w": '0.5',
    "edges": "--best_edge"
}
tmp_dir = 'results/'
ensure_folder_exists(tmp_dir)

cif_files = [file for file in file_names if file.endswith('.cif')]

for file_name in cif_files:
    file_path = os.path.join(dir_path, file_name)
    run_ring_api(file_path, run_config, tmp_dir, progress)
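
Before submission, `run_ring_api` passes `run_config` through `config_to_parameters`, which renames the command-line-style flags to the field names sent with the request and turns the `edges` option into a boolean field. The function is copied here from the module so the mapping for the strict thresholds can be inspected without contacting the server (`strict_config` is a name introduced for this sketch):

```python
def config_to_parameters(config):
    # Copied from ring_api_module so the mapping can be inspected offline
    convert = {
        "-g": "seq_sep",
        "-o": "len_salt",
        "-s": "len_ss",
        "-k": "len_pipi",
        "-a": "len_pica",
        "-b": "len_hbond",
        "-w": "len_vdw",
    }
    new_config = {}
    for key, value in config.items():
        if key in convert:
            new_config[convert[key]] = value.strip("--")
    # "--best_edge" becomes the field name "best_edge" with value True
    new_config[config["edges"].strip("--")] = True
    return new_config

strict_config = {
    "-g": '3', "-o": '4.0', "-s": '2.5', "-k": '6.5',
    "-a": '5.0', "-b": '3.5', "-w": '0.5',
    "edges": "--best_edge",
}

parameters = config_to_parameters(strict_config)
for name, value in parameters.items():
    print(f"{name} = {value}")
```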


# Run RING API Using Relaxed Thresholds:
---

In [None]:
import os
from ring_api_module import run_ring_api

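_last_reported = -1

def progress(p):
    # run_ring_api calls this callback in a tight loop, so print only
    # when the whole-number percentage changes to avoid flooding the output
    global _last_reported
    if int(p) != _last_reported:
        _last_reported = int(p)
        print(f"Progress: {p:.02f}%")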

dir_name = 'cif_files/'
crnt_dir = os.getcwd()
dir_path = os.path.join(crnt_dir, dir_name)
file_names = os.listdir(dir_path)

'''
-g : sequence separation
-o : salt bridge (ionic bond)
-s : disulphide bond distance
-k : pi-pi stacking distance
-a : pi-cation distance
-b : hydrogen bond distance
-w : van der Waals radii
'''

run_config = {
    "-g": '3',
    "-o": '5.0',
    "-s": '3.0',
    "-k": '7.0',
    "-a": '7.0',
    "-b": '5.5',
    "-w": '0.8',
    "edges": "--best_edge"
}
tmp_dir = 'results2/'
ensure_folder_exists(tmp_dir)

cif_files = [file for file in file_names if file.endswith('.cif')]

for file_name in cif_files:
    file_path = os.path.join(dir_path, file_name)
    run_ring_api(file_path, run_config, tmp_dir, progress)
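
To make the difference between the two runs explicit, the sketch below compares the strict and relaxed threshold values used in this notebook and lists only the settings that changed. The dictionary names and the `labels` text are introduced here for illustration; the values are taken from the two `run_config` dictionaries.

```python
strict = {"-g": '3', "-o": '4.0', "-s": '2.5', "-k": '6.5',
          "-a": '5.0', "-b": '3.5', "-w": '0.5'}
relaxed = {"-g": '3', "-o": '5.0', "-s": '3.0', "-k": '7.0',
           "-a": '7.0', "-b": '5.5', "-w": '0.8'}

labels = {
    "-g": "sequence separation",
    "-o": "salt bridge",
    "-s": "disulphide bond",
    "-k": "pi-pi stacking",
    "-a": "pi-cation",
    "-b": "hydrogen bond",
    "-w": "van der Waals",
}

# Collect the flags whose numeric value differs between the two runs
changed = {}
for flag in strict:
    strict_value, relaxed_value = float(strict[flag]), float(relaxed[flag])
    if strict_value != relaxed_value:
        changed[flag] = (strict_value, relaxed_value)

for flag, (strict_value, relaxed_value) in changed.items():
    print(f"{flag} {labels[flag]:<20} {strict_value:>4} -> {relaxed_value:>4}")
```

Only the sequence separation (`-g`) is shared between the two configurations; every distance threshold is loosened in the relaxed run.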


# Result from Executing the RING API Using Relaxed Parameters:
---

In [None]:
import pandas as pd

# Specify the path to your tab-separated file
tsv_file = 'results2/1yok.cif_ringEdges'

# Read the tab-separated file into a DataFrame
df = pd.read_csv(tsv_file, sep='\t')
df

Unnamed: 0,NodeId1,Interaction,NodeId2,Distance,Angle,Energy,Atom1,Atom2,Donor,Positive,Cation,Orientation,Model
0,A:300:_:SER,VDW:MC_SC,A:488:_:ASN,3.976,,6.0,C,ND2,,,,,1
1,A:301:_:ILE,VDW:SC_SC,A:305:_:ILE,3.655,,6.0,CG2,CG2,,,,,1
2,A:301:_:ILE,VDW:SC_SC,A:306:_:LEU,3.774,,6.0,CB,CD2,,,,,1
3,A:301:_:ILE,VDW:SC_SC,A:446:_:ARG,3.918,,6.0,CD1,CD,,,,,1
4,A:302:_:PRO,VDW:SC_SC,A:305:_:ILE,4.186,,6.0,CB,CD1,,,,,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
425,C:745:_:LEU,VDW:MC_SC,C:748:_:LEU,4.197,,6.0,C,CB,,,,,1
426,C:745:_:LEU,VDW:MC_SC,C:749:_:LEU,3.788,,6.0,C,CD2,,,,,1
427,C:746:_:ARG,HBOND:SC_SC,C:750:_:ASP,3.383,17.538,17.0,NE,OD2,C:746:_:ARG,,,,1
428,C:747:_:TYR,HBOND:SC_SC,C:751:_:LYS,3.541,48.081,17.0,OH,NZ,C:747:_:TYR,,,,1
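
The `NodeId` and `Interaction` columns pack several fields into one string. Judging from the rows above, node identifiers appear to follow a `chain:residue number:insertion code:residue name` pattern (with `_` where there is no insertion code), and interaction labels a `type:subtype` pattern such as `VDW:MC_SC`. A minimal sketch of splitting them, under that assumption and using a few rows copied from the table (`parse_node_id` and `parse_interaction` are hypothetical helpers):

```python
from collections import Counter, namedtuple

Residue = namedtuple("Residue", "chain number insertion_code name")

def parse_node_id(node_id):
    """Split an identifier such as 'A:300:_:SER' into its four fields."""
    chain, number, icode, name = node_id.split(":")
    return Residue(chain, int(number), None if icode == "_" else icode, name)

def parse_interaction(label):
    """Split a label such as 'VDW:MC_SC' into (type, subtype)."""
    kind, subtype = label.split(":")
    return kind, subtype

# Rows copied from the output above: (NodeId1, Interaction, NodeId2)
sample_rows = [
    ("A:300:_:SER", "VDW:MC_SC", "A:488:_:ASN"),
    ("A:301:_:ILE", "VDW:SC_SC", "A:305:_:ILE"),
    ("C:746:_:ARG", "HBOND:SC_SC", "C:750:_:ASP"),
    ("C:747:_:TYR", "HBOND:SC_SC", "C:751:_:LYS"),
]

# Count interaction types across the sample rows
kind_counts = Counter(parse_interaction(label)[0] for _, label, _ in sample_rows)

print(parse_node_id("A:300:_:SER"))
print(kind_counts)
```

On the full DataFrame the same split can be applied column-wise, for example with `df['Interaction'].str.split(':', expand=True)`.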


# The Rendered PSN of 1YOK Using NetworkX and Plotly:
---

In [None]:
import pandas as pd
import networkx as nx
import plotly.graph_objects as go
import warnings
import plotly.io as pio

warnings.simplefilter(action='ignore', category=FutureWarning)

# Specify the path to your tab-separated file
tsv_file = 'results2/1yok.cif_ringEdges'

# Read the tab-separated file into a DataFrame
df = pd.read_csv(tsv_file, sep='\t')

# Create the graph from the DataFrame
G = nx.from_pandas_edgelist(df, 'NodeId1', 'NodeId2')

# Create a force-directed layout using NetworkX's spring_layout
pos = nx.spring_layout(G, k=0.75, seed=42)  # Use a fixed seed for reproducibility

# Extract the positions
x_nodes = [pos[node][0] for node in G.nodes()]
y_nodes = [pos[node][1] for node in G.nodes()]

# Extract the edges
edge_trace = []
for edge in G.edges():
    x0, y0 = pos[edge[0]]
    x1, y1 = pos[edge[1]]
    edge_trace.append(go.Scatter(
        x=[x0, x1, None],
        y=[y0, y1, None],
        mode='lines',
        line=dict(width=1, color='gray'),
        hoverinfo='none'
    ))

# Create node trace
node_trace = go.Scatter(
    x=x_nodes,
    y=y_nodes,
    mode='markers+text',
    text=[str(node) for node in G.nodes()],
    textposition="bottom center",
    hoverinfo='text',
    marker=dict(
        showscale=False,
        color='orange',
        size=10,
        line=dict(width=1, color='black')
    )
)

# Create the figure
fig = go.Figure(data=edge_trace + [node_trace],
                layout=go.Layout(
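                    # layout.titlefont is deprecated (removed in recent Plotly releases); set the font via title.font
                    title=dict(
                        text='Protein Structure Network of 1YOK Using RING Server',
                        font=dict(size=16)
                    ),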
                    showlegend=False,
                    hovermode='closest',
                    margin=dict(b=20, l=5, r=5, t=40),
                    # xaxis=dict(showgrid=True, zeroline=False, showticklabels=False, gridcolor='blue'),
                    # yaxis=dict(showgrid=True, zeroline=False, showticklabels=False, gridcolor='blue'),
                    dragmode='select'
                ))

# Update the layout to set width and height, and background color
fig.update_layout(
    width=1000,
    height=1000,
    plot_bgcolor='white',  # Set the color of the plotting area to white
    paper_bgcolor='white',  # Set the background color of the whole figure to white
)

# Save a standalone HTML copy; auto_open=True also opens it in the default browser
pio.write_html(fig, file='network_graph.html', auto_open=True)
# Show the figure inline
fig.show()


In [None]:
from IPython.display import IFrame

# Embed the local HTML file inside an iframe
IFrame(src='network_graph.html', width='100%', height=1000)
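
Beyond rendering, one simple use of the network is ranking residues by how many distinct partners they contact (their degree). The graph built with `nx.from_pandas_edgelist` above is a simple undirected graph, so several interactions between the same residue pair collapse into a single edge; the sketch below mirrors that by storing neighbours in sets. It uses plain Python on a few residue pairs copied from the edge table rather than the full file, so the ranking is illustrative only.

```python
from collections import defaultdict

# Residue pairs copied from the first rows of the RING edge table
edges = [
    ("A:300:_:SER", "A:488:_:ASN"),
    ("A:301:_:ILE", "A:305:_:ILE"),
    ("A:301:_:ILE", "A:306:_:LEU"),
    ("A:301:_:ILE", "A:446:_:ARG"),
    ("A:302:_:PRO", "A:305:_:ILE"),
]

# Undirected adjacency: each residue maps to the set of residues it contacts
neighbours = defaultdict(set)
for first, second in edges:
    neighbours[first].add(second)
    neighbours[second].add(first)

# Degree = number of distinct neighbours
degree = {node: len(adjacent) for node, adjacent in neighbours.items()}

# Sort by descending degree, breaking ties alphabetically
ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
for node, node_degree in ranked[:3]:
    print(f"{node}: {node_degree}")
```

With the full graph, `dict(G.degree())` returns the same quantity for every residue.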