## Downloading PDB Files Using Biopython

This notebook shows how to download the structure files for a list of unique PDB IDs using Biopython, and how to rename the downloaded files so they carry a `.pdb` extension.

### Steps:
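1. Define the list of PDB IDs.
2. Install Biopython.
3. Use Biopython's `PDBList` class to download each entry from the Protein Data Bank archive in legacy PDB format.
4. Rename the downloaded `.ent` files to `.pdb`.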


In [1]:
# List of unique PDB IDs
pdb_ids = ['6ki6', '6kbs', '5zvb', '5zva', '5xs0', '5ke6', '5ed4', '5co8', 
           '4yg4', '4x23', '4wcg', '4qtk', '4qju', '4nm6', '4m9e', '4lll', 
           '4ljr', '4l0z', '4kmf', '4k4g', '4hqb', '4hn5', '4gzn', '4ch1', 
           '4bnc', '3wpd', '3wpc', '3qi5', '3ode', '3od8', '3mx9', '3gqc', 
           '3bs1', '2vs7', '2vla', '2qby', '2moe', '2i05', '1w7a', '1qzg', 
           '1qbj', '1pvi', '1puf', '1par', '1mse', '1lmb', '1k82', '1j5n', 
           '1io4', '1g9z', '1az0']
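
Because the download loop trusts this list as written, it can help to normalise it first. The sketch below is not part of the original notebook: `clean_pdb_ids` and `PDB_ID_PATTERN` are hypothetical names, and the pattern assumes the classic four-character ID form (a digit followed by three alphanumeric characters), which all the IDs above follow.

```python
import re

# Assumption: classic PDB IDs are a digit followed by three alphanumerics.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def clean_pdb_ids(ids):
    """Return lowercase, de-duplicated IDs in their original order.

    Raises ValueError for any entry that is not a four-character PDB ID.
    """
    seen = set()
    cleaned = []
    for raw in ids:
        pdb_id = raw.strip().lower()
        if not PDB_ID_PATTERN.fullmatch(pdb_id):
            raise ValueError(f"Not a four-character PDB ID: {raw!r}")
        if pdb_id not in seen:
            seen.add(pdb_id)
            cleaned.append(pdb_id)
    return cleaned


print(clean_pdb_ids(["6KI6", "6kbs ", "6ki6"]))  # ['6ki6', '6kbs']
```

In this notebook it could be applied as `pdb_ids = clean_pdb_ids(pdb_ids)` before any download is attempted.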


In [2]:
# Install Biopython (if not already installed); %pip installs into the running kernel's environment
%pip install biopython







In [3]:
from Bio.PDB import PDBList

# Initialize PDBList
pdb_list = PDBList()

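# Download each entry in legacy PDB format.
# Biopython saves each file in pdb_files as pdb<id>.ent (for example pdb6ki6.ent).
for pdb_id in pdb_ids:
    pdb_list.retrieve_pdb_file(pdb_id, file_format='pdb', pdir='pdb_files')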


Downloading PDB structure '6ki6'...
Downloading PDB structure '6kbs'...
Downloading PDB structure '5zvb'...
Downloading PDB structure '5zva'...
Downloading PDB structure '5xs0'...
Downloading PDB structure '5ke6'...
Downloading PDB structure '5ed4'...
Downloading PDB structure '5co8'...
Downloading PDB structure '4yg4'...
Downloading PDB structure '4x23'...
Downloading PDB structure '4wcg'...
Downloading PDB structure '4qtk'...
Downloading PDB structure '4qju'...
Downloading PDB structure '4nm6'...
Downloading PDB structure '4m9e'...
Downloading PDB structure '4lll'...
Downloading PDB structure '4ljr'...
Downloading PDB structure '4l0z'...
Downloading PDB structure '4kmf'...
Downloading PDB structure '4k4g'...
Downloading PDB structure '4hqb'...
Downloading PDB structure '4hn5'...
Downloading PDB structure '4gzn'...
Downloading PDB structure '4ch1'...
Downloading PDB structure '4bnc'...
Downloading PDB structure '3wpd'...
Downloading PDB structure '3wpc'...
Downloading PDB structure '3
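
The log above only says that each download started, not that it succeeded. A failed download may show up only as a printed message, so a quick check of the output directory is a reasonable safeguard. The following is a minimal sketch, assuming the `pdb<id>.ent` naming that appears in the output later in this notebook; `find_missing` is a hypothetical helper, not a Biopython function.

```python
from pathlib import Path


def find_missing(pdb_ids, directory="pdb_files"):
    """Return the IDs that have no non-empty pdb<id>.ent file in `directory`."""
    folder = Path(directory)
    missing = []
    for pdb_id in pdb_ids:
        # Assumed naming scheme: pdb_files/pdb6ki6.ent for ID '6ki6'
        path = folder / f"pdb{pdb_id.lower()}.ent"
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(pdb_id)
    return missing
```

Called as `find_missing(pdb_ids, 'pdb_files')` right after the download loop, it returns the IDs that may need a retry or a different file format.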

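### Renaming .ent Files to .pdb

The downloaded `.ent` files already contain PDB-format text, so no conversion is needed; only the file name changes.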

In [4]:
from Bio.PDB import PDBList
import os

# Initialize PDBList
pdb_list = PDBList()

# Directory to save the PDB files
save_dir = 'pdb_files'

# Create the directory if it doesn't exist
os.makedirs(save_dir, exist_ok=True)

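# Download each PDB file
for pdb_id in pdb_ids:
    # Download in legacy PDB format; Biopython saves the file as pdb<id>.ent
    pdb_file = pdb_list.retrieve_pdb_file(pdb_id, file_format='pdb', pdir=save_dir)

    # Rename only if the download actually produced an .ent file
    if pdb_file and pdb_file.endswith('.ent') and os.path.exists(pdb_file):
        new_name = os.path.join(save_dir, f'{pdb_id}.pdb')
        # os.replace overwrites an existing target, so re-running the cell also works on Windows
        os.replace(pdb_file, new_name)
        print(f'Renamed {pdb_file} to {new_name}')
    else:
        print(f'No .ent file to rename for {pdb_id}')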


Structure exists: 'pdb_files\pdb6ki6.ent' 
Renamed pdb_files\pdb6ki6.ent to pdb_files\6ki6.pdb
Structure exists: 'pdb_files\pdb6kbs.ent' 
Renamed pdb_files\pdb6kbs.ent to pdb_files\6kbs.pdb
Structure exists: 'pdb_files\pdb5zvb.ent' 
Renamed pdb_files\pdb5zvb.ent to pdb_files\5zvb.pdb
Structure exists: 'pdb_files\pdb5zva.ent' 
Renamed pdb_files\pdb5zva.ent to pdb_files\5zva.pdb
Structure exists: 'pdb_files\pdb5xs0.ent' 
Renamed pdb_files\pdb5xs0.ent to pdb_files\5xs0.pdb
Structure exists: 'pdb_files\pdb5ke6.ent' 
Renamed pdb_files\pdb5ke6.ent to pdb_files\5ke6.pdb
Structure exists: 'pdb_files\pdb5ed4.ent' 
Renamed pdb_files\pdb5ed4.ent to pdb_files\5ed4.pdb
Structure exists: 'pdb_files\pdb5co8.ent' 
Renamed pdb_files\pdb5co8.ent to pdb_files\5co8.pdb
Structure exists: 'pdb_files\pdb4yg4.ent' 
Renamed pdb_files\pdb4yg4.ent to pdb_files\4yg4.pdb
Structure exists: 'pdb_files\pdb4x23.ent' 
Renamed pdb_files\pdb4x23.ent to pdb_files\4x23.pdb
Structure exists: 'pdb_files\pdb4wcg.ent' 
Renamed
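
One side effect of renaming is worth noting: the `Structure exists` messages above suggest that Biopython looks for `pdb<id>.ent` to decide whether a download can be skipped, so once the files are renamed a later run would likely fetch everything again. A possible alternative, sketched below under that assumption, is to copy each `.ent` file to a `.pdb` name and leave the original in place. `pdb_name_for` and `copy_ent_as_pdb` are hypothetical helpers written for this illustration.

```python
import shutil
from pathlib import Path


def pdb_name_for(ent_path):
    """Map a name such as 'pdb6ki6.ent' to '6ki6.pdb'."""
    stem = Path(ent_path).stem  # e.g. 'pdb6ki6'
    pdb_id = stem[3:] if stem.startswith("pdb") else stem
    return f"{pdb_id}.pdb"


def copy_ent_as_pdb(directory="pdb_files"):
    """Copy every pdb<id>.ent in `directory` to <id>.pdb; return the new paths."""
    folder = Path(directory)
    copies = []
    for ent_file in sorted(folder.glob("pdb*.ent")):
        target = folder / pdb_name_for(ent_file)
        shutil.copyfile(ent_file, target)  # overwrites an existing target
        copies.append(target)
    return copies
```

The trade-off is disk space: every structure is stored twice, in exchange for downloads that can be skipped on later runs.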