<html>
    <summary></summary>
         <div> <p></p> </div>
         <div style="font-size: 20px; width: 800px;"> 
              <h1>
               <left>Process MUSTANG Structural Alignment for Dihedral Angle Diffusion (DAD) Model</left>
              </h1>
              <p><left>============================================================================</left> </p>
<pre>May, 2025
Dihedral Angle Diffusion (DAD) model for structural phylogenetics
Clementine Yan, Walter Xie, Alex Popinga, Alexei Drummond
Notebook by: Alex Popinga
</pre>
         </div>

</html>

<details>
  <summary>Copyright info</summary>
</details>



#### This notebook processes the ferritin protein structural alignment in the following steps:
- ##### Step 1 - Extract the individual structures and the amino acid sequence alignment from the PDB structural alignment (Input: PDB alignment file; Output: PDB individual structure files, FASTA file) and ensure they are the same length.
    * ##### Step 1.a - Re-add structure metadata (protein ID) to the MUSTANG alignment file (MUSTANG does not add IDs to each structure in its results).
    * ##### Step 1.b - Extract the individual structures and output as separate PDB files, and extract the FASTA amino acid sequence alignment.
    * ##### Step 1.c - Truncate the FASTA alignment to ensure equal length of sequences.
    * ##### Step 1.d - Truncate the PDBs to ensure equal length of structures.
- ##### Step 2 - Compute alignment scores (RMSD, TM-score, Q-score).
- ##### **Step 3 - Convert the Cartesian backbone coordinates in the individual PDB structures to phi and psi dihedral angles, ready for structural phylogenetic inference.**

### Preliminary - install/import

In [2]:
# Ensure the Bio module is installed
#%pip install biopython

# Import the necessary libraries
import os
import re
import csv
import math
import Bio
from Bio import SeqIO
from Bio import PDB
import numpy as np
from itertools import combinations

### Step 1.a - Add a HEADER line carrying the protein structure ID (from PDB) to each structure within the MUSTANG alignment by matching it against the downloaded X-ray diffraction structures.

In [None]:
from Bio.PDB.Polypeptide import PPBuilder

def extract_sequence(pdb_file):
    """
    Extracts the amino acid sequence from a PDB file.
    Returns the sequence as a string.
    """
    parser = PDB.PDBParser(QUIET=True)
    structure = parser.get_structure("protein", pdb_file)
    ppb = PPBuilder()
    
    sequence = ""
    for pp in ppb.build_peptides(structure):
        sequence += pp.get_sequence()

    return sequence

def match_pdb_sequences(results_pdb, pdb_folder):
    """
    Matches structures in results.pdb with structures in pdb_folder based on sequence similarity.
    Returns a list of structure IDs in the correct order.
    """
    # Extract sequences from reference PDB files
    pdb_sequences = {}
    for pdb_file in os.listdir(pdb_folder):
        if pdb_file.endswith(".pdb"):
            pdb_path = os.path.join(pdb_folder, pdb_file)
            pdb_sequences[pdb_file] = extract_sequence(pdb_path)

    # Extract sequences from results.pdb
    parser = PDB.PDBParser(QUIET=True)
    structure = parser.get_structure("results", results_pdb)
    ppb = PPBuilder()

    results_sequences = []
    for model in structure:
        for chain in model:
            for pp in ppb.build_peptides(chain):
                results_sequences.append(pp.get_sequence())

    # Match structures by sequence
    matched_pdbs = []
    for res_seq in results_sequences:
        best_match = None
        best_identity = 0

        for pdb_id, pdb_seq in pdb_sequences.items():
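            # Positional identity over the shorter sequence (no re-alignment is performed)
            shorter = min(len(res_seq), len(pdb_seq))
            if shorter == 0:
                continue  # Skip empty sequences to avoid division by zero
            identity = sum(1 for a, b in zip(res_seq, pdb_seq) if a == b) / shorter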

            if identity > best_identity:
                best_match = pdb_id
                best_identity = identity

        if best_match:
            matched_pdbs.append(best_match)

    return matched_pdbs

def insert_headers(results_pdb, pdb_folder, output_pdb):
    """
    Processes the MUSTANG results PDB file and inserts HEADER lines with structure IDs
    based on sequence matching.
    """
    matched_pdbs = match_pdb_sequences(results_pdb, pdb_folder)
    
    new_lines = []
    first_header_inserted = False
    pdb_index = 0

    with open(results_pdb, "r") as infile:
        lines = infile.readlines()

    for i, line in enumerate(lines):
        new_lines.append(line)

        # Insert the first HEADER after the last REMARK line
        if not first_header_inserted and line.startswith("REMARK") and (i == len(lines) - 1 or not lines[i + 1].startswith("REMARK")):
            if pdb_index < len(matched_pdbs):
                new_lines.append(f"HEADER    {matched_pdbs[pdb_index]}\n")
                pdb_index += 1
                first_header_inserted = True

        # Insert HEADER after each TER, except before END
        if line.startswith("TER") and i < len(lines) - 1 and not lines[i + 1].startswith("END"):
            if pdb_index < len(matched_pdbs):
                new_lines.append(f"HEADER    {matched_pdbs[pdb_index]}\n")
                pdb_index += 1

    with open(output_pdb, "w") as outfile:
        outfile.writelines(new_lines)

# Process our MUSTANG alignment of ferritin structures
results_pdb = "MUSTANG Structural Alignments/results.pdb"
pdb_folder = "ferritin_xray_structures_PDB/"
output_pdb = "MUSTANG Structural Alignments/ferritins_MUSTANG_alignment.pdb"

insert_headers(results_pdb, pdb_folder, output_pdb)
print(f"Processed {results_pdb} -> {output_pdb}")


Processed MUSTANG Structural Alignments/results.pdb -> MUSTANG Structural Alignments/ferritins_MUSTANG_alignment.pdb
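
The matching in `match_pdb_sequences` scores each candidate by positional identity: residues are compared index by index with `zip`, and the number of matches is divided by the length of the shorter sequence. A minimal sketch of that scoring on toy strings is shown here. The function name `positional_identity`, the toy file names, and the toy sequences are illustrative assumptions, not part of the notebook or its data.

```python
def positional_identity(seq_a, seq_b):
    """Fraction of identical residues at matching positions, over the shorter length."""
    shorter = min(len(seq_a), len(seq_b))
    if shorter == 0:
        return 0.0  # Nothing to compare; avoids a ZeroDivisionError
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / shorter

# Toy data (hypothetical): one query sequence and two candidate reference sequences
query = "MKGDKK"
candidates = {"toyA.pdb": "MKGDRR", "toyB.pdb": "AKGD"}

scores = {name: positional_identity(query, seq) for name, seq in candidates.items()}
best = max(scores, key=scores.get)
print(scores)
print("Best match:", best)
```

In this toy case the shorter candidate scores higher (3 of 4 positions) than the longer one (4 of 6), which illustrates that the score depends on the shorter length. Because no alignment is performed, a single insertion near the start of a sequence shifts every later position and can lower the identity sharply.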


### Step 1.b - Extract the individual ferritin structures and amino acid sequences and output as individual PDB files and FASTA file, respectively.

In [35]:
def extract_ferritin_structures(input_pdb_file):
    with open(input_pdb_file, 'r') as file:
        lines = file.readlines()
    
    current_structure = []
    structure_id = None
    sequences = {}
    
    for line in lines:
        if line.startswith("HEADER"):  # Identify the start of a new structure
            if structure_id and current_structure:
                write_pdb(structure_id, current_structure)
                sequences[structure_id] = extract_sequence(current_structure)
            structure_id = line.split()[-1]  # Last word in HEADER is assumed to be the ID
            structure_id = re.sub(r"-tran\.pdb$", "", structure_id)  # Remove "-tran.pdb" suffix
            current_structure = [line]
        elif line.startswith("TER"):  # Termination of the current structure
            current_structure.append(line)
            if structure_id:
                write_pdb(structure_id, current_structure)
                sequences[structure_id] = extract_sequence(current_structure)
            current_structure = []
            structure_id = None
        elif structure_id:
            current_structure.append(line)
    
    write_fasta(input_pdb_file, sequences)

def write_pdb(structure_id, pdb_lines):
    output_dir = "MUSTANG Structural Alignments/individual ferritin structures after alignment"
    os.makedirs(output_dir, exist_ok=True)  # Ensure the directory exists
    filename = os.path.join(output_dir, f"{structure_id}")
    with open(filename, 'w') as out_file:
        out_file.writelines(pdb_lines)
    print(f"Extracted: {structure_id}")

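# Note: this redefines the file-based extract_sequence from Step 1.a.
# Re-run the Step 1.a cell before calling match_pdb_sequences again.
def extract_sequence(pdb_lines):
    """Builds the one-letter sequence from the CA ATOM records in a list of PDB lines."""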
    amino_acids = {
        'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
        'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
        'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
        'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V'
    }
    sequence = []
    seen_residues = set()
    
    for line in pdb_lines:
        if line.startswith("ATOM") and line[13:15].strip() == "CA":
            res_name = line[17:20].strip()
            res_num = line[22:26].strip()
            
            if (res_name, res_num) not in seen_residues:
                seen_residues.add((res_name, res_num))
                if res_name in amino_acids:
                    sequence.append(amino_acids[res_name])
    
    return "".join(sequence)

def write_fasta(input_pdb_file, sequences):
    fasta_filename = os.path.splitext(os.path.basename(input_pdb_file))[0] + "_sequence_alignment.fasta"
    with open(fasta_filename, 'w') as fasta_file:
        for structure_id, sequence in sequences.items():
            fasta_file.write(f">{structure_id}\n{sequence}\n")
    print(f"FASTA file created: {fasta_filename}")
    
# Specify the alignment path and file name and call our extract_ferritin_structures function.   
if __name__ == "__main__":
    input_pdb_file_m = "MUSTANG Structural Alignments/ferritins_MUSTANG_alignment.pdb"
    extract_ferritin_structures(input_pdb_file_m)

Extracted: 1BCFA.pdb
Extracted: 1jigA.pdb
Extracted: 1nfvA.pdb
Extracted: 1uvhA.pdb
Extracted: 2jd70.pdb
Extracted: 1o9rA.pdb
Extracted: 1vlgA.pdb
Extracted: 2uw1A.pdb
Extracted: 1dpsA.pdb
Extracted: 1jtsA.pdb
Extracted: 1otkA.pdb
Extracted: 1eumA.pdb
Extracted: 1krqA.pdb
Extracted: 1qghA.pdb
Extracted: 2chpA.pdb
Extracted: 2vzbA.pdb
Extracted: 1jgcA.pdb
Extracted: 1lb3A.pdb
Extracted: 1r03A.pdb
Extracted: 1ji4A.pdb
Extracted: 2fkzA.pdb
Extracted: 3e6sA.pdb
Extracted: 1ji5A.pdb
Extracted: 1tjoA.pdb
Extracted: 2fzfA.pdb
FASTA file created: ferritins_MUSTANG_alignment_sequence_alignment.fasta
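
The line-based `extract_sequence` above reads fields from fixed character columns of each `ATOM` record rather than splitting on whitespace, because PDB is a fixed-width format. As a sketch of that layout, the block below builds one synthetic `ATOM` line and slices the same fields back out. The helper names `format_atom_record` and `parse_atom_record` and the coordinate values are illustrative assumptions, not part of the notebook.

```python
def format_atom_record(serial, atom_name, res_name, chain_id, res_seq, x, y, z):
    """Build a synthetic ATOM line using the PDB fixed-width columns."""
    return (
        f"{'ATOM':<6}{serial:>5} {atom_name:<4} {res_name:>3} {chain_id:1}"
        f"{res_seq:>4}    {x:>8.3f}{y:>8.3f}{z:>8.3f}"
    )

def parse_atom_record(line):
    """Slice the fields used in this notebook out of one ATOM line (0-based slices)."""
    return {
        "atom_name": line[12:16].strip(),  # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain_id": line[21],              # column 22
        "res_seq": int(line[22:26]),       # columns 23-26
        "coord": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }

# Hypothetical record: a C-alpha atom of MET 1 in chain A
line = format_atom_record(2, " CA ", "MET", "A", 1, 27.340, 24.430, 2.614)
record = parse_atom_record(line)
print(line)
print(record)
```

The notebook's narrower slice `line[13:15]` yields the same `CA` for this line, since the name occupies the middle two of the four atom-name columns.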


### Step 1.c - Truncate the FASTA alignment so that all sequences have the same length as the shortest sequence.

In [30]:
def read_sequences(fasta_file):
    """Parse FASTA file and save sequences."""
    sequences = list(SeqIO.parse(fasta_file, "fasta"))
    
    return sequences

def truncate_sequences(sequences):
    """Finds the shortest sequence length and truncates all sequences to that length."""
    min_length = min(len(record.seq) for record in sequences)
    
    truncated_sequences = []
    for record in sequences:
        truncated_record = record[:min_length]  # Truncate sequence
        truncated_sequences.append(truncated_record)
    
    return truncated_sequences, min_length

def save_truncated_fasta(sequences, output_file):
    """Writes truncated sequences to a new FASTA file."""
    with open(output_file, "w") as f:
        SeqIO.write(sequences, f, "fasta")

def main():
    input_fasta = "ferritins_MUSTANG_alignment_sequence_alignment.fasta"  # FASTA written in Step 1.b; change if your file name differs
    output_fasta = "ferritins_truncated_alignment.fasta"
    
    # Read sequences from input FASTA file
    sequences = read_sequences(input_fasta)

    # Truncate sequences
    truncated_sequences, min_length = truncate_sequences(sequences)
    
    # Display shortest sequence length
    print(f"\nShortest sequence length: {min_length}")
    
    # Save truncated alignment
    save_truncated_fasta(truncated_sequences, output_fasta)
    print(f"Truncated alignment saved to: {output_fasta}")

if __name__ == "__main__":
    main()


Shortest sequence length: 142
Truncated alignment saved to: ferritins_truncated_alignment.fasta


### Step 1.d - Truncate PDB structures according to truncated sequences.

In [31]:
class PDBTruncator:
    def __init__(self, fasta_file, pdb_folder, output_folder):
        """
        Initialise with the truncated FASTA file and the folder containing the PDB files.
        """
        self.fasta_file = fasta_file
        self.pdb_folder = pdb_folder
        self.output_folder = output_folder
        self.parser = PDB.PDBParser(QUIET=True)
        self.io = PDB.PDBIO()
        os.makedirs(self.output_folder, exist_ok=True)

    def get_truncated_lengths(self):
        """
        Extracts sequence lengths from the truncated FASTA file.
        Returns a dictionary mapping sequence IDs to their truncated length.
        """
        lengths = {}
        for record in SeqIO.parse(self.fasta_file, "fasta"):
            seq_id = record.id.split()[0]  
            seq_id = seq_id.replace(".pdb", "")
            lengths[seq_id] = len(record.seq)  # Get the truncated length
        return lengths

    def truncate_pdb(self, pdb_file, truncated_length):
        """
        Truncates a PDB file to match the corresponding truncated sequence length.
        """
        structure = self.parser.get_structure("protein", pdb_file)
        model = structure[0]  # Assume single model
        chain = next(iter(model))  # Only the first chain is used; each extracted file holds one structure

        # Keep the first `truncated_length` residues that have a C-alpha atom
        residues = [res for res in chain.get_residues() if "CA" in res]
        truncated_residues = residues[:truncated_length]

        # Build a new structure containing only the truncated residues
        truncated_model = PDB.Model.Model(0)
        truncated_chain = PDB.Chain.Chain(chain.id)
        for res in truncated_residues:
            truncated_chain.add(res.copy())
        truncated_model.add(truncated_chain)

        truncated_structure = PDB.Structure.Structure("truncated")
        truncated_structure.add(truncated_model)

        return truncated_structure
    
    def process_pdb_files(self):
        """
        Loops through PDB files and truncates them based on the truncated FASTA alignment.
        """
        truncated_lengths = self.get_truncated_lengths()

        for pdb_filename in os.listdir(self.pdb_folder):
            if pdb_filename.endswith(".pdb"):
                pdb_id = os.path.splitext(pdb_filename)[0]  # Extract sequence ID
                pdb_path = os.path.join(self.pdb_folder, pdb_filename)

                if pdb_id in truncated_lengths:
                    truncated_length = truncated_lengths[pdb_id]
                    truncated_structure = self.truncate_pdb(pdb_path, truncated_length)

                    # Save truncated PDB file
                    output_pdb_path = os.path.join(self.output_folder, pdb_filename)
                    self.io.set_structure(truncated_structure)
                    self.io.save(output_pdb_path)
                    print(f"Truncated PDB saved: {output_pdb_path}")
                else:
                    print(f"Skipping {pdb_filename}: No matching sequence in FASTA.")

if __name__ == "__main__":
    fasta_file = "ferritins_truncated_alignment.fasta"
    pdb_folder = "MUSTANG Structural Alignments/individual ferritin structures after alignment/"
    output_folder = "truncated_ferritins/"

    truncator = PDBTruncator(fasta_file, pdb_folder, output_folder)
    truncator.process_pdb_files()


Truncated PDB saved: truncated_ferritins/1jgcA.pdb
Truncated PDB saved: truncated_ferritins/1krqA.pdb
Truncated PDB saved: truncated_ferritins/2vzbA.pdb
Truncated PDB saved: truncated_ferritins/1o9rA.pdb
Truncated PDB saved: truncated_ferritins/1nfvA.pdb
Truncated PDB saved: truncated_ferritins/1lb3A.pdb
Truncated PDB saved: truncated_ferritins/2chpA.pdb
Truncated PDB saved: truncated_ferritins/1jtsA.pdb
Truncated PDB saved: truncated_ferritins/1uvhA.pdb
Truncated PDB saved: truncated_ferritins/1dpsA.pdb
Truncated PDB saved: truncated_ferritins/1jigA.pdb
Truncated PDB saved: truncated_ferritins/1vlgA.pdb
Truncated PDB saved: truncated_ferritins/3e6sA.pdb
Truncated PDB saved: truncated_ferritins/2uw1A.pdb
Truncated PDB saved: truncated_ferritins/1otkA.pdb
Truncated PDB saved: truncated_ferritins/2fkzA.pdb
Truncated PDB saved: truncated_ferritins/1r03A.pdb
Truncated PDB saved: truncated_ferritins/1ji5A.pdb
Truncated PDB saved: truncated_ferritins/1qghA.pdb
Truncated PDB saved: truncated_

### Step 2 - Compute MUSTANG alignment scores (RMSD, TM-score, Q-score).

In [22]:
class ProteinAlignmentEvaluator:
    def __init__(self, pdb_files):
        """Initialize with a list of PDB files containing aligned structures."""
        self.pdb_files = pdb_files
        self.parser = PDB.PDBParser(QUIET=True)
        
    def get_ca_coordinates(self, pdb_file):
        """Extracts C-alpha coordinates from a PDB file."""
        structure = self.parser.get_structure("protein", pdb_file)
        ca_coords = []
        for model in structure:
            for chain in model:
                for residue in chain:
                    if "CA" in residue:
                        ca_coords.append(residue["CA"].coord)
        return np.array(ca_coords)

    def compute_rmsd(self, coords1, coords2):
        """Computes RMSD between two aligned coordinate sets."""
        return np.sqrt(np.mean(np.sum((coords1 - coords2) ** 2, axis=1)))

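    def compute_tm_score(self, coords1, coords2):
        """Computes a TM-score-style value on the existing superposition (no re-superposing)."""
        L = min(len(coords1), len(coords2))
        # Distance scale d0. Note: the published TM-score uses 1.24 * (L - 15) ** (1/3) - 1.8;
        # the form below omits the -15 offset, so scores are somewhat higher than with the published d0.
        d0 = 1.24 * (L ** (1/3)) - 1.8
        distances = np.sqrt(np.sum((coords1 - coords2) ** 2, axis=1))
        tm_score = np.sum(1 / (1 + (distances / d0) ** 2)) / L
        return tm_score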

    def compute_q_score(self, coords1, coords2, threshold=4.0):
        """Computes Q-score: fraction of residue pairs within a distance threshold."""
        distances = np.sqrt(np.sum((coords1 - coords2) ** 2, axis=1))
        q_score = np.sum(distances < threshold) / len(distances)
        return q_score

    def evaluate(self):
        """Computes RMSD, TM-score, and Q-score for all pairs of aligned structures."""
        all_coords = {pdb: self.get_ca_coordinates(pdb) for pdb in self.pdb_files}

        results = []
        for pdb1, pdb2 in combinations(self.pdb_files, 2):
            coords1 = all_coords[pdb1]
            coords2 = all_coords[pdb2]
            
            if len(coords1) != len(coords2):
                print(f"Skipping {pdb1} and {pdb2} due to mismatched lengths.")
                continue

            rmsd = self.compute_rmsd(coords1, coords2)
            tm_score = self.compute_tm_score(coords1, coords2)
            q_score = self.compute_q_score(coords1, coords2)

            results.append((pdb1, pdb2, rmsd, tm_score, q_score))

        return results

# Example usage
if __name__ == "__main__":
    pdb_folder = "truncated_ferritins/"

    pdb_files = sorted([os.path.join(pdb_folder, f) for f in os.listdir(pdb_folder) if f.endswith(".pdb")])

    evaluator = ProteinAlignmentEvaluator(pdb_files)
    results = evaluator.evaluate()

    # Print results
    print("PDB1\tPDB2\tRMSD\tTM-score\tQ-score")
    for r in results:
        print(f"{r[0]}\t{r[1]}\t{r[2]:.3f}\t{r[3]:.3f}\t{r[4]:.3f}")

PDB1	PDB2	RMSD	TM-score	Q-score
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1dpsA.pdb	23.038	0.064	0.000
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1eumA.pdb	5.008	0.506	0.296
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1jgcA.pdb	0.723	0.977	1.000
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1ji4A.pdb	8.230	0.278	0.014
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1ji5A.pdb	7.740	0.332	0.007
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1jigA.pdb	8.472	0.367	0.268
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1jtsA.pdb	19.545	0.081	0.000
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1krqA.pdb	3.312	0.714	0.746
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1lb3A.pdb	11.930	0.178	0.000
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1nfvA.pdb	5.733	0.410	0.000
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1o9rA.pdb	20.926	0.074	0.000
truncated_ferritins/1BCFA.pdb	truncated_ferritins/1otkA.pdb	9.577	0.269	0.007
truncated_ferritins/1BCFA.pd
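
All three scores in Step 2 are functions of the per-residue C-alpha distances between two structures that are already superposed and have equal length. The sketch below reproduces the three formulas on synthetic coordinates so their behaviour is easy to check by hand. The function names, the `length_offset` parameter, and the toy traces are assumptions made for illustration; `length_offset=0` mirrors the notebook's `d0`, while `length_offset=15` gives the `d0` of the published TM-score definition.

```python
import numpy as np

def pairwise_distances(coords1, coords2):
    """Per-residue distance between two pre-superposed, equal-length C-alpha traces."""
    return np.linalg.norm(np.asarray(coords1) - np.asarray(coords2), axis=1)

def rmsd(distances):
    """Root-mean-square of the per-residue distances."""
    return float(np.sqrt(np.mean(distances ** 2)))

def tm_style_score(distances, length_offset=0):
    """Mean of 1 / (1 + (d / d0)^2); length_offset=15 gives the published d0."""
    n = len(distances)
    d0 = 1.24 * (n - length_offset) ** (1 / 3) - 1.8
    return float(np.mean(1.0 / (1.0 + (distances / d0) ** 2)))

def fraction_within(distances, threshold=4.0):
    """Fraction of residue pairs closer than `threshold` angstroms."""
    return float(np.mean(distances < threshold))

# Toy data (assumption): a straight 142-residue trace and a copy shifted 1 angstrom along x
n_residues = 142
trace_a = np.zeros((n_residues, 3))
trace_a[:, 0] = 3.8 * np.arange(n_residues)
trace_b = trace_a + np.array([1.0, 0.0, 0.0])

d = pairwise_distances(trace_a, trace_b)
print(f"RMSD: {rmsd(d):.3f}")
print(f"TM-style score (notebook d0): {tm_style_score(d):.3f}")
print(f"TM-style score (published d0): {tm_style_score(d, length_offset=15):.3f}")
print(f"Fraction within 4 angstroms: {fraction_within(d):.3f}")
```

With a uniform 1 angstrom shift, the RMSD is 1 and every pair falls within the 4 angstrom threshold, while the two `d0` choices give slightly different TM-style scores. Neither variant searches for a better superposition, so values may differ from those reported by dedicated TM-score tools.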

### **Step 3 - Convert protein structures into phi and psi dihedral angles**

In [21]:
import os
import csv
import math
from Bio import PDB

# Hydrophobicity scale (Kyte-Doolittle)
HYDROPHOBICITY_MAP = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3
}

# Three-letter to one-letter amino acid mapping
THREE_TO_ONE_MAP = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y"
}

def compute_phi_psi(structure, pdb_filename):
    """
    Extracts phi and psi angles from a PDB structure.
    """
    model = structure[0]  # Only use first model
    angles = []
    
    for chain in model:
        polypeptides = PDB.PPBuilder().build_peptides(chain)
        for poly in polypeptides:
            phi_psi_list = poly.get_phi_psi_list()
            for i, (phi, psi) in enumerate(phi_psi_list):
                residue = poly[i]
                resname = residue.resname
                one_letter_code = THREE_TO_ONE_MAP.get(resname, "?")
                hydrophobicity = HYDROPHOBICITY_MAP.get(one_letter_code, 0.0)
                
                phi_val = math.degrees(phi) if phi is not None else None
                psi_val = math.degrees(psi) if psi is not None else None
                
                angles.append([
                    pdb_filename, chain.id, resname, residue.id[1],
                    phi_val, psi_val, hydrophobicity
                ])
    return angles

def write_csv(output_file, all_angles):
    """
    Writes extracted angles from all PDB files into a single CSV file.
    """
    with open(output_file, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["PDB_File", "Chain", "Residue", "ResidueNumber", "Phi", "Psi", "Hydrophobicity"])
        for row in all_angles:
            writer.writerow(row)

def process_pdb_files(input_folder, output_csv):
    """
    Processes multiple PDB files in a directory and writes a single CSV file.
    """
    parser = PDB.PDBParser(QUIET=True)
    all_angles = []

    pdb_files = [f for f in os.listdir(input_folder) if f.endswith(".pdb")]

    for pdb_filename in pdb_files:
        pdb_path = os.path.join(input_folder, pdb_filename)
        try:
            structure = parser.get_structure(pdb_filename, pdb_path)
            angles = compute_phi_psi(structure, pdb_filename)
            all_angles.extend(angles)
            print(f"Processed {pdb_filename} ({len(angles)} residues)")
        except Exception as e:
            print(f"Error processing {pdb_filename}: {e}")

    # Save all data to CSV
    write_csv(output_csv, all_angles)
    print(f"\nFinal CSV saved: {output_csv} with {len(all_angles)} total entries.")

if __name__ == "__main__":
    input_folder = "truncated_ferritins/"  
    output_csv = "ferritin_dihedral_angles.csv"   
    process_pdb_files(input_folder, output_csv)

Processed 1jgcA.pdb (142 residues)
Processed 1krqA.pdb (142 residues)
Processed 2vzbA.pdb (142 residues)
Processed 1o9rA.pdb (142 residues)
Processed 1nfvA.pdb (142 residues)
Processed 1lb3A.pdb (142 residues)
Processed 2chpA.pdb (142 residues)
Processed 1jtsA.pdb (142 residues)
Processed 1uvhA.pdb (142 residues)
Processed 1dpsA.pdb (142 residues)
Processed 1jigA.pdb (142 residues)
Processed 1vlgA.pdb (142 residues)
Processed 3e6sA.pdb (142 residues)
Processed 2uw1A.pdb (142 residues)
Processed 1otkA.pdb (142 residues)
Processed 2fkzA.pdb (142 residues)
Processed 1r03A.pdb (142 residues)
Processed 1ji5A.pdb (142 residues)
Processed 1qghA.pdb (142 residues)
Processed 2jd70.pdb (142 residues)
Processed 2fzfA.pdb (142 residues)
Processed 1tjoA.pdb (142 residues)
Processed 1eumA.pdb (142 residues)
Processed 1BCFA.pdb (142 residues)
Processed 1ji4A.pdb (142 residues)

Final CSV saved: ferritin_dihedral_angles.csv with 3550 total entries.
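
For reference, phi of residue i is the dihedral angle defined by the atoms C(i-1), N(i), CA(i), C(i), and psi is the one defined by N(i), CA(i), C(i), N(i+1). This is why the first residue of each polypeptide has no phi and the last has no psi, which appear as empty cells in the CSV. Step 3 relies on Biopython's `get_phi_psi_list` for these values; the block below is only a sketch of the underlying four-point calculation using NumPy. The function name `dihedral_degrees` and the test points are assumptions for illustration, and the sign convention should be checked against Biopython before mixing outputs.

```python
import numpy as np

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, about the p1->p2 axis."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)  # Unit vector along the central bond

    # Project the outer bonds onto the plane perpendicular to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1

    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

# Toy geometry (hypothetical points): central bond along z, first point on +x
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, quarter)
```

The three toy cases correspond to eclipsed, opposite, and perpendicular arrangements of the outer points around the central bond.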
