<a href="https://colab.research.google.com/github/Siddhartha96123/Prot-Conf-Het/blob/main/Conf_Het.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### WELCOME to the ***CLASSIC*** Executable **Conformational Heterogeneity Scanner**.

Please press Play on each cell, in order.

First, we need to install the dependencies.

In [None]:
# Required libraries installation
!pip install biopython

Now that the dependencies are installed, let's move on to the actual script.

When the final cell runs, it will first ask for the **Reference PDB**, which must be uploaded from your local system, followed by the **Target PDB**.

A green tick to the left of the Play button means that the cell has run to completion without any errors. In the cell below, we are:

1. Defining the functions for uploading and parsing the files, and initializing the ***Superimposer***.

2. Defining the function for aligning the structures.

3. Defining the function for calculating the per-residue Cα deviation (*RMSD/Res*).

4. Defining the function to write the values to an Excel sheet.

5. Defining the function for plotting the same values as a graph.
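
To make step 3 concrete, here is a minimal, standard-library sketch of the per-residue calculation. It is not the notebook's own code: the dictionaries `reference_ca` and `target_ca` and the helpers `per_residue_distances` and `overall_rmsd` are hypothetical names, and the coordinates are made-up toy values that stand in for two structures that have already been superimposed.

```python
import math

# Hypothetical toy data: residue number -> C-alpha (x, y, z) in angstroms.
# These stand in for two single-chain structures that are already aligned.
reference_ca = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
target_ca = {1: (0.0, 0.0, 0.0), 2: (3.8, 3.0, 4.0), 4: (11.4, 0.0, 0.0)}


def per_residue_distances(ref, tgt):
    """Return (residue number, distance) for residues present in both inputs."""
    shared = sorted(ref.keys() & tgt.keys())  # residues missing on either side are skipped
    return [(res, math.dist(ref[res], tgt[res])) for res in shared]


def overall_rmsd(pairs):
    """Root mean square of the per-residue distances."""
    if not pairs:
        raise ValueError("No shared residues to compare.")
    return math.sqrt(sum(d * d for _, d in pairs) / len(pairs))


pairs = per_residue_distances(reference_ca, target_ca)
for res, dist in pairs:
    print(f"residue {res}: {dist:.2f} angstrom")
print(f"overall RMSD: {overall_rmsd(pairs):.2f} angstrom")
```

Residues 3 and 4 appear in only one of the two toy structures, so they are left out, which mirrors how the script skips residues it cannot match by ID.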

In [6]:
from Bio.PDB import PDBParser, Superimposer
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from google.colab import files

# Function to upload files in Colab
def upload_files():
    uploaded = files.upload()
    file_paths = list(uploaded.keys())
    return file_paths

def align_structures(reference_path, target_path):
    parser = PDBParser(QUIET=True)

    # Load the reference structure
    reference_structure = parser.get_structure("reference", reference_path)

    # Load the target structure
    target_structure = parser.get_structure("target", target_path)

    # Check if structures are empty
    if len(reference_structure) == 0 or len(target_structure) == 0:
        raise ValueError("One or both of the structures are empty.")

    # Initialize the superimposer
    superimposer = Superimposer()
    reference_atoms = []
    target_atoms = []

    # Pair up C-alpha atoms for residues that exist in both structures
    for ref_chain in reference_structure[0]:
        for ref_residue in ref_chain:
            ref_residue_id = ref_residue.get_id()
            try:
                target_residue = target_structure[0][ref_chain.id][ref_residue_id]
                ref_atom = ref_residue['CA']        # C-alpha atom in the reference
                target_atom = target_residue['CA']  # C-alpha atom in the target
            except KeyError:
                # Skip residues missing from the target, and residues
                # (e.g. waters, ligands) that have no C-alpha atom
                continue
            # Append atoms for alignment only if both exist
            reference_atoms.append(ref_atom)
            target_atoms.append(target_atom)

    # Check if atoms are available for superimposition
    if not reference_atoms or not target_atoms:
        raise ValueError("No atoms available for reference superimposition.")

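    # Fit the target C-alpha atoms onto the reference ones, then apply the
    # resulting rotation and translation to every atom of the target model
    superimposer.set_atoms(reference_atoms, target_atoms)
    superimposer.apply(list(target_structure[0].get_atoms()))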

    return reference_structure, target_structure

def calculate_distances(reference_structure, target_structure):
    distances = []

    # Iterate over residues and calculate distances only for CA atoms
    for ref_chain in reference_structure[0]:
        for ref_residue in ref_chain:
            ref_residue_id = ref_residue.get_id()
            try:
                target_residue = target_structure[0][ref_chain.id][ref_residue_id]
                ref_atom = ref_residue['CA']        # C-alpha atom in the reference
                target_atom = target_residue['CA']  # C-alpha atom in the target
            except KeyError:
                # Skip residues missing from the target, and residues
                # that have no C-alpha atom in either structure
                continue
            distance = np.linalg.norm(ref_atom.get_coord() - target_atom.get_coord())
            # Residues are matched by ID, so both columns carry the same residue number
            distances.append((ref_residue_id[1], ref_residue_id[1], distance))

    return distances

def save_distances_to_excel(distances, output_path):
    df = pd.DataFrame(distances, columns=["Reference Residue", "Aligned Residue", "Distance"])
    df.to_excel(output_path, index=False)

def plot_rmsf(distances, output_plot_path):
    # Note: with only two structures, the plotted values are per-residue
    # C-alpha distances after superposition, not an ensemble RMSF
    reference_residues = [d[0] for d in distances]
    rmsf_values = [d[2] for d in distances]

    plt.scatter(reference_residues, rmsf_values)
    plt.xlabel("Reference Residue")
    plt.ylabel("Cα distance (Å)")
    plt.title("Per-residue Cα deviation after superposition")
    plt.savefig(output_plot_path)
    plt.close()

The green tick on the block above signals that all the functions used by this tool have been ***Defined & Initialised***.

We can now finally move on to uploading the ***SINGLE CHAIN PDBs*** for all further analyses.
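
Since the tool expects single-chain files, it can help to check a PDB before uploading it. The sketch below is an optional pre-check, not part of the notebook: `chain_ids` and `is_single_chain` are hypothetical helper names, and the sample records are invented for illustration. It relies only on the fixed-column PDB format, in which the chain identifier sits in column 22 of `ATOM` and `HETATM` records.

```python
def chain_ids(pdb_lines):
    """Return the distinct chain identifiers, in order of first appearance."""
    chains = []
    for line in pdb_lines:
        # Column 22 (index 21) of ATOM/HETATM records holds the chain ID
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]
            if chain not in chains:
                chains.append(chain)
    return chains


def is_single_chain(pdb_lines):
    """True when exactly one chain identifier is present."""
    return len(chain_ids(pdb_lines)) == 1


# Invented sample records, laid out in standard PDB columns
single = [
    "ATOM      1  CA  ALA A   1      11.104   6.134  -6.504  1.00  0.00           C",
    "ATOM      2  CA  GLY A   2      12.560   7.210  -5.100  1.00  0.00           C",
]
multi = single + [
    "ATOM      3  CA  SER B   1      20.000   1.000   2.000  1.00  0.00           C",
]

print(chain_ids(single), is_single_chain(single))
print(chain_ids(multi), is_single_chain(multi))
```

For a file on disk, the same helpers accept an open file handle, for example `with open("my_structure.pdb") as handle: is_single_chain(handle)`, where the file name is a placeholder.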

In [7]:
# Upload the PDB files individually
print("Please upload the reference PDB file.")
reference_pdb = upload_files()[0]

print("Please upload the target PDB file.")
target_pdb = upload_files()[0]

try:
    # Align structures
    reference_structure, aligned_structure = align_structures(reference_pdb, target_pdb)

    # Calculate distances between atoms
    distances = calculate_distances(reference_structure, aligned_structure)

    # Save distances to Excel
    output_excel = "temp_PDB_RMSF.xlsx"
    save_distances_to_excel(distances, output_excel)

    # Plot RMSF
    output_plot = "temp_PDB_RMSF_plot.png"
    plot_rmsf(distances, output_plot)

    # Download the files
    print("Script execution completed. \n The temp Excel file and the Plot shall be auto-downloaded to your local system now. \n Please allow permissions.")
    files.download(output_excel)
    files.download(output_plot)

except Exception as e:
    print("An error occurred during script execution:", str(e))

Please upload the reference PDB file.


Saving 1e22_1.pdb to 1e22_1 (2).pdb
Please upload the target PDB file.


Saving 4h02_A.pdb to 4h02_A (2).pdb
Script execution completed. 
 The Excel file and the Plot will be auto-downloaded to your system.



# ***HAVE FUN.***