# AI-PSCI-010: Protein Structure Visualization

**AI in Pharmaceutical Sciences: Bench to Bedside**  
VCU School of Pharmacy | VIP Program | Spring 2026

---

**Week 5 | Module: AI Tools Overview | Estimated Time: 60-90 minutes**

**Prerequisites**: AI-PSCI-009 (Protein Data Acquisition & Target Selection)

---

**🎯 This talktorial uses YOUR chosen target from AI-PSCI-009!**

You'll learn to create publication-quality visualizations of your target protein, highlighting binding sites, ligands, and key structural features.

## 🎯 Learning Objectives

After completing this talktorial, you will be able to:

1. Visualize protein structures interactively using py3Dmol
2. Apply different representation styles (cartoon, surface, stick, sphere)
3. Use coloring schemes to highlight structural features
4. Identify and visualize binding site residues
5. Create publication-quality molecular graphics
6. Compare experimental and AlphaFold structures

---

## 📚 Background

### Why Visualize Protein Structures?

Protein structure visualization is essential for:

- **Understanding drug binding**: See exactly how drugs fit into binding pockets
- **Identifying key residues**: Find amino acids critical for binding or catalysis
- **Communicating findings**: Create figures for papers, posters, and presentations
- **Analyzing mutations**: Visualize how mutations might affect structure

### Visualization Tools

**py3Dmol**
- Python wrapper around 3Dmol.js, a JavaScript-based 3D viewer that embeds in Jupyter notebooks
- Interactive: rotate, zoom, pan with mouse
- Lightweight and fast
- Best for quick exploration and simple figures

**PyMOL** (external tool, not covered here)
- Industry standard for publication figures
- More features but requires separate installation
- We'll focus on py3Dmol for in-notebook work

### Representation Styles

| Style | Shows | Best For |
|-------|-------|----------|
| **Cartoon** | Secondary structure (α-helices, β-sheets) | Overall fold, architecture |
| **Surface** | Molecular surface | Binding pockets, shape |
| **Stick** | Bonds and atoms | Detailed residue/ligand views |
| **Sphere** | Van der Waals radii | Space-filling, size comparison |
| **Line** | Bonds only | Fast rendering, many atoms |

### Coloring Schemes

- **Spectrum**: N-terminus (blue) to C-terminus (red)
- **Chain**: Different colors for each chain
- **Secondary structure**: Helices, sheets, loops in different colors
- **B-factor**: Flexibility/mobility (blue=rigid, red=flexible)
- **Element**: Standard CPK colors (C=gray, O=red, N=blue)

### Key Concepts

- **Binding site**: Region where drug/substrate binds (active site for enzymes)
- **Selection syntax**: py3Dmol uses selection dictionaries like `{"resi": "100-120"}`
- **Heteroatoms (HETATM)**: Non-protein atoms (ligands, waters, cofactors)
- **Chain ID**: Letter identifying each polypeptide chain (A, B, C...)
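
To make the selection syntax concrete, here is a minimal sketch of a few selection dictionaries. The keys `chain`, `resi`, `resn`, and `hetflag` are 3Dmol.js selection fields; the helper name `residue_selection` is hypothetical and exists only for illustration.

```python
# Selection dictionaries are plain Python dicts that py3Dmol hands to 3Dmol.js.
whole_chain = {"chain": "A"}             # every atom in chain A
residue_range = {"resi": "100-120"}      # a contiguous range, given as a string
residue_list = {"resi": [5, 6, 7, 27]}   # specific residue numbers
hetero_atoms = {"hetflag": True}         # HETATM records (ligands, waters, ions)


def residue_selection(residues, chain=None):
    """Build a selection for specific residue numbers, optionally on one chain.

    Hypothetical helper for illustration; it is not part of py3Dmol.
    """
    selection = {"resi": sorted(set(residues))}
    if chain is not None:
        selection["chain"] = chain
    return selection


print(residue_selection([27, 5, 6, 5], chain="A"))
```

Combining keys in one dictionary narrows the selection: an atom must match every key to be selected.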

---

## 🛠️ Setup

Run this cell to install required packages:

In [None]:
#@title üõ†Ô∏è Install Packages
!pip install biopython py3Dmol requests -q
print("‚úÖ Packages installed successfully!")

Import the required libraries:

In [None]:
#@title üì¶ Import Libraries
import requests
import py3Dmol
import pandas as pd
import numpy as np
from Bio.PDB import PDBParser, Selection
from collections import Counter
import warnings
warnings.filterwarnings('ignore')

print("✅ All libraries imported!")

---

## 🎯 Target Configuration

Select the same target you chose in AI-PSCI-009:

In [None]:
#@title üéØ Select Your Drug Target

TARGET = "DHFR" #@param ["DHFR", "ABL1", "EGFR", "AChE", "COX-2", "DPP-4"]

# Complete target configuration with binding site residues
TARGET_CONFIG = {
    "DHFR": {
        "pdb": "1RX1",
        "uniprot": "P0ABQ4",
        "chembl": "CHEMBL202",
        "drug": "Trimethoprim",
        "drug_3letter": "TOP",
        "organism": "Escherichia coli",
        "full_name": "Dihydrofolate reductase",
        "binding_site_residues": [5, 6, 7, 27, 28, 30, 31, 32, 57, 94, 100],
        "catalytic_residues": [],
        "description": "Folate metabolism enzyme, target of trimethoprim antibiotics"
    },
    "ABL1": {
        "pdb": "1IEP",
        "uniprot": "P00519",
        "chembl": "CHEMBL1862",
        "drug": "Imatinib",
        "drug_3letter": "STI",
        "organism": "Homo sapiens",
        "full_name": "Tyrosine-protein kinase ABL1",
        "binding_site_residues": [253, 255, 271, 286, 290, 315, 317, 318, 355, 360, 380, 381, 382],
        "catalytic_residues": [363],
        "description": "Oncogenic kinase in CML, revolutionized cancer therapy"
    },
    "EGFR": {
        "pdb": "1M17",
        "uniprot": "P00533",
        "chembl": "CHEMBL203",
        "drug": "Erlotinib",
        "drug_3letter": "AQ4",
        "organism": "Homo sapiens",
        "full_name": "Epidermal growth factor receptor",
        "binding_site_residues": [718, 719, 720, 721, 726, 743, 790, 791, 793, 797, 800, 854, 855],
        "catalytic_residues": [837],
        "description": "Receptor tyrosine kinase, major target in lung cancer"
    },
    "AChE": {
        "pdb": "4EY7",
        "uniprot": "P22303",
        "chembl": "CHEMBL220",
        "drug": "Donepezil",
        "drug_3letter": "E20",
        "organism": "Homo sapiens",
        "full_name": "Acetylcholinesterase",
        "binding_site_residues": [86, 124, 202, 203, 295, 297, 337, 341, 449, 450],
        "catalytic_residues": [203, 337, 450],
        "description": "Neurotransmitter-degrading enzyme, Alzheimer's target"
    },
    "COX-2": {
        "pdb": "3LN1",
        "uniprot": "P35354",
        "chembl": "CHEMBL230",
        "drug": "Celecoxib",
        "drug_3letter": "CEL",
        "organism": "Homo sapiens",
        "full_name": "Prostaglandin G/H synthase 2",
        "binding_site_residues": [89, 90, 96, 120, 355, 359, 509, 513, 516, 523, 527, 530, 531],
        "catalytic_residues": [385],
        "description": "Inflammatory enzyme, selective inhibition reduces side effects"
    },
    "DPP-4": {
        "pdb": "1X70",
        "uniprot": "P27487",
        "chembl": "CHEMBL284",
        "drug": "Sitagliptin",
        "drug_3letter": "715",
        "organism": "Homo sapiens",
        "full_name": "Dipeptidyl peptidase 4",
        "binding_site_residues": [125, 186, 203, 205, 206, 207, 226, 228, 229, 630, 708, 710, 740],
        "catalytic_residues": [630, 708, 740],
        "description": "Metabolic enzyme, incretin degradation in diabetes"
    }
}

# Get configuration for selected target
config = TARGET_CONFIG[TARGET]

print("=" * 60)
print(f"🎯 Target: {TARGET}")
print("=" * 60)
print(f"\n🧬 Full Name: {config['full_name']}")
print(f"🦠 Organism: {config['organism']}")
print(f"\n📊 PDB: {config['pdb']}")
print(f"💊 Reference Drug: {config['drug']} ({config['drug_3letter']})")
print(f"\n🔬 Binding Site Residues: {config['binding_site_residues']}")
if config['catalytic_residues']:
    print(f"⚗️ Catalytic Residues: {config['catalytic_residues']}")
print(f"\n📝 {config['description']}")
print("\n✅ Target configuration loaded!")

---

## 🔬 Guided Inquiry 1: Loading and Basic Visualization

### Context

Before we can visualize our protein, we need to load the structure. py3Dmol can load structures directly from the PDB or from local files. Let's start with the basics: loading your target and displaying it in cartoon representation.

### Your Task

Using your AI assistant, write code to:

1. Download the PDB structure for your target using the `requests` library
2. Create a py3Dmol viewer with appropriate size (800x600 pixels)
3. Display the structure in cartoon representation
4. Add zoom and centering to show the full structure

💡 **Prompting Tips**:
- Ask: "How do I load a PDB structure into py3Dmol?"
- Your PDB ID is stored in `config['pdb']`
- Use `view.setStyle()` to set representation
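
As one possible starting point to compare against what your AI assistant produces, the sketch below downloads a structure and shows it as a cartoon. It assumes the standard RCSB download URL pattern; the function names `rcsb_pdb_url`, `fetch_pdb_text`, and `show_cartoon` are hypothetical helpers, not part of py3Dmol.

```python
def rcsb_pdb_url(pdb_id):
    """Return the RCSB download URL for a PDB-format entry."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def fetch_pdb_text(pdb_id, timeout=30):
    """Download a PDB file and return its contents as text."""
    import requests  # already imported in the Setup cell

    response = requests.get(rcsb_pdb_url(pdb_id), timeout=timeout)
    response.raise_for_status()  # fail loudly on a bad ID or network error
    return response.text


def show_cartoon(pdb_text, width=800, height=600):
    """Create a viewer showing the structure as a rainbow-colored cartoon."""
    import py3Dmol  # already imported in the Setup cell

    view = py3Dmol.view(width=width, height=height)
    view.addModel(pdb_text, "pdb")
    view.setStyle({}, {"cartoon": {"color": "spectrum"}})  # {} selects all atoms
    view.zoomTo()  # center and fit the whole structure
    return view


# In the notebook:
# pdb_text = fetch_pdb_text(config["pdb"])
# show_cartoon(pdb_text).show()
```

Keeping the downloaded text in a variable such as `pdb_text` means later cells can reuse it without downloading the file again.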

### Verification

After running your code, confirm:
- [ ] 3D structure is displayed and interactive
- [ ] You can rotate, zoom, and pan with mouse
- [ ] Structure shows helices and sheets (cartoon style)

📓 **Lab Notebook**: Take a screenshot of your target. How many chains do you see? What secondary structure elements are prominent?

In [None]:
# Your code here



---

## 🔬 Guided Inquiry 2: Representation Styles

### Context

Different representations reveal different aspects of protein structure. **Cartoon** shows the fold, **surface** shows the shape, **stick** shows the chemistry. Let's compare them side-by-side.

### Your Task

Using your AI assistant, write code to:

1. Create a function that displays the structure with a specified representation
2. Display your target in four different styles:
   - Cartoon (secondary structure)
   - Surface (molecular surface)
   - Stick (all bonds)
   - Sphere (space-filling)

💡 **Prompting Tips**:
- Ask: "What representation styles are available in py3Dmol?"
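- Each atom style takes an options dictionary: `{'cartoon': {}}`, `{'stick': {}}`, `{'sphere': {}}`, etc.
- Surface is not a `setStyle()` style; add it with the `addSurface()` method, e.g. `view.addSurface(py3Dmol.VDW, {'opacity': 0.8})`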
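
A minimal sketch of a representation-switching function follows, assuming `pdb_text` already holds your downloaded structure. The names `STYLE_SPECS`, `apply_representation`, and `show_representation` are hypothetical, and the colors and opacity are arbitrary choices.

```python
# Atom styles that can be passed straight to setStyle().
STYLE_SPECS = {
    "cartoon": {"cartoon": {"color": "spectrum"}},
    "stick": {"stick": {}},
    "sphere": {"sphere": {}},
    "line": {"line": {}},
}


def apply_representation(view, name):
    """Apply one named representation to every atom in an existing viewer."""
    if name == "surface":
        import py3Dmol  # needed only for the surface-type constant

        # A surface is added on top of a style rather than set as a style.
        view.setStyle({}, {"cartoon": {"color": "gray"}})
        view.addSurface(py3Dmol.VDW, {"opacity": 0.85, "color": "white"})
    elif name in STYLE_SPECS:
        view.setStyle({}, STYLE_SPECS[name])
    else:
        raise ValueError(f"Unknown representation: {name!r}")
    return view


def show_representation(pdb_text, name, width=400, height=300):
    """Build a fresh viewer for pdb_text in the requested representation."""
    import py3Dmol

    view = py3Dmol.view(width=width, height=height)
    view.addModel(pdb_text, "pdb")
    apply_representation(view, name)
    view.zoomTo()
    return view


# In the notebook:
# for name in ["cartoon", "surface", "stick", "sphere"]:
#     print(name)
#     show_representation(pdb_text, name).show()
```

Building a fresh viewer per style keeps the four views independent, at the cost of loading the model four times.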

### Verification

After running your code, confirm:
- [ ] All four representations display correctly
- [ ] Surface shows the overall shape/cavities
- [ ] Stick shows individual atom connections
- [ ] Sphere shows space-filling view

📓 **Lab Notebook**: Which representation best shows the binding pocket? Which is best for showing protein fold?

In [None]:
# Your code here



---

## 🔬 Guided Inquiry 3: Coloring Schemes

### Context

Color is a powerful way to encode information in molecular graphics. We can color by **chain**, **secondary structure**, **B-factor** (flexibility), or **custom** schemes to highlight specific regions.

### Your Task

Using your AI assistant, write code to:

1. Color the structure by **chain** (different color for each chain)
2. Color by **secondary structure** (helix, sheet, loop in different colors)
3. Color by **B-factor** (temperature factor showing flexibility)
4. Explain what each coloring reveals about the structure

💡 **Prompting Tips**:
- Ask: "How do I color a protein by chain in py3Dmol?"
- Use `colorscheme` parameter or explicit `color` values
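- B-factor coloring uses a nested scheme: `{'colorscheme': {'prop': 'b', 'gradient': 'rwb', 'min': ..., 'max': ...}}`. The `rwb` gradient runs from red through white to blue as values go from `min` to `max`, so pass the larger B-factor as `min` and the smaller as `max` to get the blue=rigid, red=flexible convention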
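
The three colorings can be sketched as style dictionaries, as below. This is one possible approach, not the only one: `b_factor_range` and `coloring_style` are hypothetical helpers, the B-factor range is read from the fixed PDB columns 61-66 rather than with Bio.PDB to keep the sketch dependency-free, and `ssJmol` is just one of the built-in secondary-structure schemes in 3Dmol.js.

```python
def b_factor_range(pdb_text):
    """Return (min, max) B-factor over ATOM records, read from columns 61-66."""
    values = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) >= 66:
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("No ATOM records with B-factors found")
    return min(values), max(values)


def coloring_style(scheme, b_range=None):
    """Return a cartoon style dict for 'chain', 'secondary', or 'bfactor'."""
    if scheme == "chain":
        return {"cartoon": {"colorscheme": "chain"}}
    if scheme == "secondary":
        return {"cartoon": {"colorscheme": "ssJmol"}}
    if scheme == "bfactor":
        low, high = b_range
        # rwb runs red, white, blue from min to max; giving the high value
        # as min reverses it, so blue = rigid and red = flexible.
        gradient = {"prop": "b", "gradient": "rwb", "min": high, "max": low}
        return {"cartoon": {"colorscheme": gradient}}
    raise ValueError(f"Unknown coloring scheme: {scheme!r}")


# In the notebook, with view and pdb_text from the earlier steps:
# view.setStyle({}, coloring_style("bfactor", b_factor_range(pdb_text)))
# view.show()
```

Setting the gradient limits from the structure's own B-factor range uses the full color scale; fixed limits would be the better choice when comparing several structures.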

### Verification

After running your code, confirm:
- [ ] Chain coloring shows distinct chains
- [ ] Secondary structure elements are distinguishable
- [ ] B-factor coloring shows variation across structure

📓 **Lab Notebook**: What does B-factor coloring reveal? Are some regions more flexible than others?

In [None]:
# Your code here



---

## 🔬 Guided Inquiry 4: Highlighting the Binding Site

### Context

The **binding site** is where drugs bind to the protein. Understanding its location and composition is crucial for drug design. We'll highlight the binding site residues and show them in stick representation.

### Your Task

Using your AI assistant, write code to:

1. Display the protein in cartoon representation
2. Highlight binding site residues in a different style (e.g., sticks)
3. Color binding site residues to stand out (e.g., red or orange)
4. List the amino acid types in the binding site

💡 **Prompting Tips**:
- Ask: "How do I select specific residues in py3Dmol?"
- Use `{'resi': [5, 6, 7, 27]}` for residue selection
- Your binding site residues are in `config['binding_site_residues']`
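
A minimal sketch of both halves of this task follows: highlighting the residues and listing their amino acid types. The helper names are hypothetical. Residue names are read from the fixed PDB columns (resName 18-20, chain 22, resSeq 23-26) to keep the example dependency-free; Bio.PDB's `PDBParser` would work as well. It assumes the residue numbers in your configuration match the numbering in the PDB file, which is worth checking for your target.

```python
from collections import Counter


def binding_site_residue_names(pdb_text, residues, chain=None):
    """Map residue number to three-letter residue name for the given residues.

    If chain is None and several chains share a residue number,
    the first one encountered in the file is kept.
    """
    wanted = set(residues)
    names = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if chain is not None and line[21] != chain:
            continue
        number = int(line[22:26])
        if number in wanted:
            names.setdefault(number, line[17:20].strip())
    return names


def highlight_binding_site(view, residues, color="orange"):
    """Gray cartoon for the protein, colored sticks for binding-site residues."""
    selection = {"resi": list(residues)}
    view.setStyle({}, {"cartoon": {"color": "gray"}})
    view.addStyle(selection, {"stick": {"color": color}})  # sticks on top of cartoon
    view.zoomTo(selection)
    return view


# In the notebook:
# names = binding_site_residue_names(pdb_text, config["binding_site_residues"], chain="A")
# print(Counter(names.values()))
# highlight_binding_site(view, config["binding_site_residues"]).show()
```

Using `addStyle()` rather than `setStyle()` for the binding site keeps the cartoon visible underneath the sticks.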

### Verification

After running your code, confirm:
- [ ] Binding site residues are visually distinct from rest of protein
- [ ] You can see the sidechains of binding site residues
- [ ] Binding site forms a pocket or groove

📓 **Lab Notebook**: What types of amino acids are in your binding site? Are they hydrophobic, polar, or charged?

In [None]:
# Your code here



---

## 🔬 Guided Inquiry 5: Visualizing the Bound Ligand

### Context

Most PDB structures of drug targets include a **bound ligand** (drug or inhibitor). Visualizing how the ligand sits in the binding pocket is fundamental to understanding drug action and designing new compounds.

### Your Task

Using your AI assistant, write code to:

1. Display the protein with the ligand highlighted
2. Show the ligand in stick representation with different colors
3. Show binding site residues that contact the ligand
4. Add a transparent surface around the binding pocket

💡 **Prompting Tips**:
- Ask: "How do I select and display a ligand in py3Dmol?"
- Ligands are selected with `{'hetflag': True}` (all HETATM records, including waters and ions) or by residue name, e.g. `{'resn': 'LIG'}`
- Your drug's 3-letter code is in `config['drug_3letter']`
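
Before styling the ligand, it is worth confirming that the 3-letter code actually appears in your structure file, since not every PDB entry contains the reference drug. The sketch below first lists the hetero residues present, then styles the ligand and its contacts. All function names are hypothetical; the 5 Å contact cutoff and the colors are arbitrary choices; `within` and `byres` are 3Dmol.js selection fields.

```python
from collections import Counter


def hetero_residue_counts(pdb_text, skip=("HOH",)):
    """Count HETATM atoms per residue name, ignoring waters by default."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and len(line) >= 20:
            name = line[17:20].strip()
            if name not in skip:
                counts[name] += 1
    return counts


def style_ligand(view, ligand_code, contact_distance=5.0):
    """Show the ligand as thick sticks and nearby protein residues as thin sticks."""
    ligand = {"resn": ligand_code}
    # Whole protein residues with any atom within contact_distance of the ligand.
    contacts = {
        "hetflag": False,
        "within": {"distance": contact_distance, "sel": ligand},
        "byres": True,
    }
    view.setStyle({}, {"cartoon": {"color": "gray"}})
    view.addStyle(contacts, {"stick": {"radius": 0.15}})
    view.addStyle(ligand, {"stick": {"colorscheme": "greenCarbon", "radius": 0.3}})
    view.zoomTo(ligand)
    return contacts


def add_pocket_surface(view, contacts, opacity=0.6):
    """Add a transparent surface over the contact residues only."""
    import py3Dmol  # needed for the surface-type constant

    view.addSurface(py3Dmol.VDW, {"opacity": opacity, "color": "white"}, contacts)


# In the notebook:
# print(hetero_residue_counts(pdb_text))   # is config["drug_3letter"] listed?
# contacts = style_ligand(view, config["drug_3letter"])
# add_pocket_surface(view, contacts)
# view.show()
```

If your drug code is missing from the counts, pick the ligand that is present and note the difference in your lab notebook.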

### Verification

After running your code, confirm:
- [ ] Ligand is visible in the binding pocket
- [ ] Ligand atoms are colored by element (C, N, O, etc.)
- [ ] Binding site residues surround the ligand
- [ ] Can see complementarity between ligand and pocket

📓 **Lab Notebook**: How does the ligand fit into the binding pocket? What interactions can you see?

In [None]:
# Your code here



---

## 🔬 Guided Inquiry 6: Comparing Structures

### Context

Comparing structures helps us understand flexibility, conformational changes, and prediction accuracy. Let's compare the experimental PDB structure with the AlphaFold prediction we downloaded in AI-PSCI-009.

### Your Task

Using your AI assistant, write code to:

1. Download the AlphaFold structure for your target
2. Display both structures side-by-side or overlaid
3. Color differently to distinguish them
4. Focus on the binding site region

💡 **Prompting Tips**:
- Ask: "How do I overlay two PDB structures in py3Dmol?"
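- AlphaFold URL: `https://alphafold.ebi.ac.uk/files/AF-{UNIPROT}-F1-model_v4.pdb` (the version suffix can change between database releases; if the URL returns an error, use the download link on your target's AlphaFold DB entry page)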
- Use different colors for each structure
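
A side-by-side comparison can be sketched as below. Two coordinate files generally do not share a frame of reference, so a true overlay needs a superposition step first (for example with Bio.PDB's `Superimposer`); the two-panel grid used here sidesteps that. The function names are hypothetical, the `v4` default simply follows the URL in the tips above, and the sketch assumes the AlphaFold (UniProt) residue numbering matches your PDB numbering, which does not hold for every target.

```python
def alphafold_pdb_url(uniprot_id, version="v4"):
    """Return the AlphaFold DB file URL following the pattern in the tips above."""
    return f"https://alphafold.ebi.ac.uk/files/AF-{uniprot_id}-F1-model_{version}.pdb"


def show_side_by_side(experimental_pdb, alphafold_pdb, residues=None,
                      width=900, height=450):
    """Show the experimental structure (left) and AlphaFold model (right)."""
    import py3Dmol  # already imported in the Setup cell

    view = py3Dmol.view(width=width, height=height, viewergrid=(1, 2), linked=False)
    view.addModel(experimental_pdb, "pdb", viewer=(0, 0))
    view.addModel(alphafold_pdb, "pdb", viewer=(0, 1))
    view.setStyle({}, {"cartoon": {"color": "blue"}}, viewer=(0, 0))
    view.setStyle({}, {"cartoon": {"color": "orange"}}, viewer=(0, 1))
    if residues:
        site = {"resi": list(residues)}
        view.addStyle(site, {"stick": {}})  # no viewer given: applies to both panels
        view.zoomTo(site)
    else:
        view.zoomTo()
    return view


# In the notebook:
# import requests
# af_text = requests.get(alphafold_pdb_url(config["uniprot"]), timeout=30).text
# show_side_by_side(pdb_text, af_text, config["binding_site_residues"]).show()
```

Note that AlphaFold files store the per-residue confidence score (pLDDT) in the B-factor column, so B-factor coloring on the predicted model shows confidence rather than flexibility.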

### Verification

After running your code, confirm:
- [ ] Both structures are visible
- [ ] Can distinguish experimental from predicted
- [ ] Binding site regions are comparable
- [ ] Any differences are noted

📓 **Lab Notebook**: How well does AlphaFold predict the binding site? Are there significant differences?

In [None]:
# Your code here



---

## ✅ Checkpoint

Before moving on to the next talktorial, confirm you can:

- [ ] Load and display PDB structures in py3Dmol
- [ ] Apply different representation styles (cartoon, surface, stick, sphere)
- [ ] Use coloring schemes (spectrum, chain, B-factor)
- [ ] Select and highlight specific residues
- [ ] Visualize bound ligands in the binding pocket
- [ ] Compare experimental and predicted structures

### Your lab notebook should include:

- [ ] Screenshots of your target in multiple representations
- [ ] Binding site composition analysis
- [ ] Drug-binding site visualization
- [ ] Notes on experimental vs AlphaFold comparison
- [ ] Key binding site residues for your target

---

## 🤔 Reflection Questions

Answer these in your lab notebook:

1. **Representation Choice**: You're preparing a figure showing how your drug fits in the binding pocket. Which representation styles would you combine? Why?

2. **Binding Site Analysis**: Based on the amino acid composition of your binding site, what types of interactions do you expect between the drug and protein?

3. **Structure Comparison**: How similar is the AlphaFold prediction to the experimental structure? Where are the biggest differences? Does this affect drug design?

---

## 📖 Further Reading

- [3Dmol.js Documentation](https://3dmol.csb.pitt.edu/) - Official guide to the library that py3Dmol wraps
- [RCSB PDB Mol* Viewer](https://www.rcsb.org/3d-view) - Alternative web viewer
- [PyMOL Wiki](https://pymolwiki.org/) - Industry-standard visualization
- [Protein Visualization Best Practices](https://proteopedia.org/wiki/index.php/Molecular_graphics) - Tips for publication figures

---

## 🔗 Connection to Research

Protein structure visualization is fundamental to drug discovery:

- **Lead optimization**: Visualize SAR to guide chemical modifications
- **Selectivity design**: Compare binding sites of related proteins
- **Resistance analysis**: See how mutations affect drug binding
- **Communication**: Create figures for papers, patents, and presentations

### What's Next?

In **AI-PSCI-011: AlphaFold2 for Structure Prediction**, you will:
1. Run AlphaFold2/ColabFold on your target sequence
2. Interpret confidence scores (pLDDT, PAE)
3. Predict structures of mutant proteins
4. Compare predicted vs experimental structures quantitatively

The visualization skills you learned today will be essential for analyzing AlphaFold predictions!

---

*AI-PSCI-010 Complete. Proceed to AI-PSCI-011: AlphaFold2 for Structure Prediction.*