<a href="https://colab.research.google.com/github/eoinleen/PDB-tools/blob/main/Copy_of_PDB_Chain_Renamer_Super_Simple_Version.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
"""
PDB Chain Renamer - Super Simple Version
========================================

What does this thing do?
-----------------------
Renames any chain in a PDB file to any other chain ID.
The default mapping changes B→A and C→B, but you can map ANY chain to ANY other chain.
Changes the chain ID (column 22) and, on ATOM/HETATM records, the segment ID (columns 73-76) at the same time.
Saves a new file with 'mod_' at the start of the name.

What do you need?
---------------
1. A PDB file with any chain designations you want to change
2. The file must be in your Google Drive
3. That's it!

Where do I put stuff?
-------------------
1. Your PDB file goes in a folder in your Google Drive
2. The script will save the new file in the same folder
3. Example path: /content/drive/MyDrive/your_folder/your_file.pdb

What do I need to change in the code?
----------------------------------
TWO things:

1. The chain mapping (if you want something other than B→A and C→B):
  Look for this line:
  def process_pdb_file(input_file, chain_map={'B': 'A', 'C': 'B'}):

  Change the chain_map part to whatever you want, for example:
  chain_map={'A': 'X', 'B': 'Y'}  # This would change chain A to X and B to Y
  chain_map={'C': 'A'}  # This would just change chain C to A

2. The file location:
  Look for this line near the bottom:
  input_pdb_file = "/content/drive/MyDrive/PDB-files/PDB-files-for-mod/RA-MF/RA-MF_Ub2-K11-str.pdb"

  Change it to point to your PDB file like:
  input_pdb_file = "/content/drive/MyDrive/your_folder/your_file.pdb"

What will I get?
--------------
1. A new PDB file that:
  - Has the same name as your input but with 'mod_' at the start
    (characters other than letters and digits, such as '-', become '_')
  - Has all chain designations changed according to your chain_map
  - Works properly in PyMOL

Example:
Input:  protein.pdb (with chains C and D)
Output: mod_protein.pdb (with chains X and Y, if you set chain_map={'C': 'X', 'D': 'Y'})
The chain_map can contain any number of chains, not just two.

How do I know it worked?
----------------------
The script will:
1. Tell you how many lines it processed and how many records it modified
2. Show you the first five lines of the new file
3. Tell you exactly where it saved the new file

Common Problems:
--------------
1. "File not found": Double-check your file path
2. "Error reading file": Make sure your PDB file isn't corrupted
3. "Drive not mounted": Click the link that appears to connect to Google Drive

Created by: Claude (Anthropic)
Version: 1.0
Last Updated: 2024
"""

# Mount Google Drive
from google.colab import drive
import os
drive.mount('/content/drive')

def clean_pdb_name(filename):
    """
    Create a PyMOL-friendly filename: every character that is not a
    letter or digit becomes an underscore, and the result ends in .pdb.
    """
    # Strip the file extension (whatever it is)
    base = os.path.splitext(filename)[0]
    # Replace anything that is not a letter or digit with an underscore
    clean = ''.join(c if c.isalnum() else '_' for c in base)
    # Add the .pdb extension
    return f"{clean}.pdb"

def rename_chains_in_pdb(input_file, chain_map):
    """
    Rename chains in PDB file according to chain_map.
    """
    modified_lines = []
    line_count = 0
    modified_count = 0

    try:
        with open(input_file, 'r') as file:
            for line in file:
                line_count += 1
                if not line.strip():
                    modified_lines.append(line)
                    continue

                if len(line) < 22:
                    modified_lines.append(line)
                    continue

                if line.startswith(("ATOM", "HETATM")):
                    chain_id = line[21]
                    if chain_id in chain_map:
                        modified_count += 1
                        # Pad the record to 80 columns so the segment ID
                        # field (columns 73-76) exists, then put back the
                        # line ending that was stripped for padding
                        body = line.rstrip('\r\n').ljust(80)
                        new_line = (body[:21] +
                                  chain_map[chain_id] +
                                  body[22:72] +
                                  chain_map[chain_id].ljust(4) +
                                  body[76:] + '\n')
                        modified_lines.append(new_line)
                        continue
                elif line.startswith(("TER", "ANISOU")):
                    # Only the chain ID (column 22) is changed on these records
                    chain_id = line[21]
                    if chain_id in chain_map:
                        modified_count += 1
                        new_line = line[:21] + chain_map[chain_id] + line[22:]
                        modified_lines.append(new_line)
                        continue

                modified_lines.append(line)

        print(f"Processed {line_count} lines")
        print(f"Modified {modified_count} ATOM/HETATM/ANISOU/TER records")
        return modified_lines

    except Exception as e:
        print(f"Error reading file: {e}")
        print(f"Error occurred at line {line_count}")
        return None

def process_pdb_file(input_file, chain_map={'B': 'A', 'C': 'B'}):
    """
    Process a PDB file and save modified version.
    """
    if not os.path.exists(input_file):
        print(f"Error: Input file not found: {input_file}")
        return False

    directory = os.path.dirname(input_file)
    filename = os.path.basename(input_file)
    clean_name = clean_pdb_name(filename)
    output_file = os.path.join(directory, f"mod_{clean_name}")

    print(f"Processing: {filename}")
    print(f"Input path: {input_file}")
    print(f"Output path: {output_file}")

    modified_content = rename_chains_in_pdb(input_file, chain_map)

    if modified_content:
        try:
            with open(output_file, 'w') as file:
                file.writelines(modified_content)
            print(f"\nSuccessfully saved modified PDB to: {output_file}")

            # Print the first five lines of the output for verification
            print("\nFirst few lines of modified file:")
            with open(output_file, 'r') as f:
                for i, line in enumerate(f):
                    if i >= 5:
                        break
                    print(line.rstrip())
            return True
        except Exception as e:
            print(f"Error saving file: {e}")
            return False
    return False

# Path to your PDB file
input_pdb_file = "/content/drive/MyDrive/PDB-files/PDB-files-for-mod/RA-MF/RA-MF_Ub2-K11-str.pdb"

# Process the file
success = process_pdb_file(input_pdb_file)
if success:
    print("\nChain renaming completed successfully!")
else:
    print("\nFailed to process PDB file.")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Processing: RA-MF_Ub2-K11-str.pdb
Input path: /content/drive/MyDrive/PDB-files/PDB-files-for-mod/RA-MF/RA-MF_Ub2-K11-str.pdb
Output path: /content/drive/MyDrive/PDB-files/PDB-files-for-mod/RA-MF/mod_RA_MF_Ub2_K11_str.pdb
Processed 2340 lines
Modified 1168 ATOM/HETATM/TER records

Successfully saved modified PDB to: /content/drive/MyDrive/PDB-files/PDB-files-for-mod/RA-MF/mod_RA_MF_Ub2_K11_str.pdb

First few lines of modified file:
CRYST1  157.232  157.232  157.232  90.00  90.00  90.00 P 43 3 2      1
ATOM      1  N   MET A   1       4.453  15.829 -59.881  1.00 59.64      A    N
ANISOU    1  N   MET B   1     8141   5192   9328    921    491   -209       N
ATOM      2  CA  MET A   1       3.866  14.881 -58.897  1.00 60.38      A    C
ANISOU    2  CA  MET B   1     8098   5605   9239    983    603   -416       C

Chain renaming completed successfully!
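
The first few lines only show a small part of the file, so one way to check the result more thoroughly is to tally chain IDs per record type. The sketch below is not part of the script above: `count_chain_ids` is a hypothetical helper, it assumes the fixed-column PDB layout with the chain ID in column 22, and the two sample records are copied from the output above.

```python
from collections import Counter

def count_chain_ids(pdb_lines):
    """Tally chain IDs (column 22) per record type for coordinate records."""
    counts = Counter()
    for line in pdb_lines:
        record = line[:6].strip()
        # Skip records that are too short to hold a chain ID
        if record in {"ATOM", "HETATM", "ANISOU", "TER"} and len(line) > 21:
            counts[(record, line[21])] += 1
    return counts

# Two records copied from the notebook output above
sample = [
    "ATOM      1  N   MET A   1       4.453  15.829 -59.881  1.00 59.64      A    N\n",
    "ANISOU    1  N   MET B   1     8141   5192   9328    921    491   -209       N\n",
]

for (record, chain), n in sorted(count_chain_ids(sample).items()):
    print(f"{record:<6} chain {chain}: {n}")
```

To check a whole file, pass an open file handle instead of `sample`. In the sample, the ATOM record is on chain A while its ANISOU record is still on chain B, which is the kind of mismatch a per-record tally makes visible.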


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
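
With Drive mounted, listing the PDB files in a folder is a quick way to avoid the "File not found" problem before setting `input_pdb_file`. This is a minimal sketch, assuming the files sit directly in the folder (no subfolders are searched); `list_pdb_files` is a hypothetical helper, and the folder below is the placeholder path from the docstring.

```python
from pathlib import Path

def list_pdb_files(folder):
    """Return sorted paths of .pdb files directly inside folder (empty list if it is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdb"
    )

# Placeholder folder; replace with your own Drive folder
for path in list_pdb_files("/content/drive/MyDrive/your_folder"):
    print(path)
```

Any path it prints can be copied as-is into `input_pdb_file`.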
