# Chapter 1: The PatientOne Story - Hands-On Notebook

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lynnlangit/precision-medicine-mcp/blob/main/docs/book/companion-notebooks/chapter-01-patientone-story.ipynb)

**Book**: *Building AI-Orchestrated Precision Oncology Systems*

**Learning Objectives**:
- Run the complete PatientOne analysis using Claude or Gemini
- Understand AI orchestration of MCP servers
- Compare traditional vs. AI-orchestrated workflows
- Explore multi-modal precision medicine analysis

**Time**: 30-40 minutes

---

## Setup

First, install required packages and set up authentication.

In [None]:
# Install dependencies
!pip install -q anthropic google-generativeai requests pandas matplotlib seaborn

In [None]:
import os
import json
from getpass import getpass
import anthropic
import google.generativeai as genai

# Choose your AI provider
AI_PROVIDER = "claude"  # Options: "claude" or "gemini"

# Set API key
if AI_PROVIDER == "claude":
    if "ANTHROPIC_API_KEY" not in os.environ:
        os.environ["ANTHROPIC_API_KEY"] = getpass("Enter your Anthropic API key: ")
    client = anthropic.Anthropic()
    print("✓ Claude client initialized")
elif AI_PROVIDER == "gemini":
    if "GOOGLE_API_KEY" not in os.environ:
        os.environ["GOOGLE_API_KEY"] = getpass("Enter your Google API key: ")
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    print("✓ Gemini client initialized")

print(f"\nUsing AI provider: {AI_PROVIDER.upper()}")

## PatientOne: Sarah Anderson

**Clinical Scenario**:
- 58-year-old woman
- Stage IV high-grade serous ovarian cancer (HGSOC)
- Initial response to platinum chemotherapy
- Platinum-resistant recurrence at 8 months
- CA-125 rising: 45 → 310 U/mL

**Question**: What treatment should she receive next?

**Available Data**:
- Clinical records (FHIR)
- Somatic genomic variants (VCF)
- Multi-omics (RNA, protein, phospho from PDX models)
- Spatial transcriptomics (10x Visium, 900 spots)
- Histology imaging (H&E and MxIF)

---

## Part 1: Quick Clinical Overview (5 minutes)

We'll start with a simple analysis using just clinical and genomic data.

In [None]:
# Define the PatientOne quick analysis prompt
QUICK_PROMPT = """
I want to analyze patient PAT001-OVC-2025 for a quick clinical and genomic overview.

Please use the MCP servers to:
1. Get clinical summary (demographics, medications, CA-125 trends)
2. Parse somatic variants from VCF file
3. Identify pathogenic mutations
4. Compare to TCGA ovarian cancer cohort

Data location: gs://sample-inputs-patientone/PAT001-OVC-2025/

Provide a concise summary of key findings.
"""

print("Prompt prepared. This will call:")
print("  - mcp-mockepic (clinical data)")
print("  - mcp-fgbio (genomic variants)")
print("  - mcp-tcga (cohort comparison)")
print("\nExpected time: 3-5 minutes")

In [None]:
# MCP server endpoints (deployed on Cloud Run)
MCP_SERVERS = {
    "mockepic": "https://mcp-mockepic-ondu7mwjpa-uc.a.run.app/sse",
    "fgbio": "https://mcp-fgbio-ondu7mwjpa-uc.a.run.app/sse",
    "tcga": "https://mcp-tcga-ondu7mwjpa-uc.a.run.app/sse",
    "multiomics": "https://mcp-multiomics-ondu7mwjpa-uc.a.run.app/sse",
    "spatialtools": "https://mcp-spatialtools-ondu7mwjpa-uc.a.run.app/sse",
    "deepcell": "https://mcp-deepcell-ondu7mwjpa-uc.a.run.app/sse",
    "openimagedata": "https://mcp-openimagedata-ondu7mwjpa-uc.a.run.app/sse",
}

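def run_claude_analysis(prompt, servers_to_use=("mockepic", "fgbio", "tcga")):
    """Run analysis using Claude with MCP servers.

    Each call is a fresh single-turn request; no conversation state
    is carried over from earlier calls.
    """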
    
    # Configure MCP servers for Claude
    mcp_config = [
        {"type": "url", "url": MCP_SERVERS[name], "name": name}
        for name in servers_to_use
    ]
    
    # Call Claude with MCP integration
    response = client.beta.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
        mcp_servers=mcp_config,
        tools=[{"type": "mcp_toolset", "mcp_server_name": name} for name in servers_to_use],
        betas=["mcp-client-2025-11-20"]
    )
    
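    # The response can interleave text with MCP tool-call and tool-result
    # blocks, so collect every text block instead of assuming content[0] is text.
    return "\n".join(
        block.text for block in response.content if block.type == "text"
    )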

print("✓ Claude + MCP integration configured")
print(f"\nAvailable MCP servers: {len(MCP_SERVERS)}")
for name, url in MCP_SERVERS.items():
    print(f"  - {name}: {url}")

In [None]:
# Run the quick analysis
print("Starting PatientOne quick analysis...\n")
print("This will take 3-5 minutes. Claude is:")
print("  1. Connecting to MCP servers")
print("  2. Loading clinical data")
print("  3. Parsing genomic variants")
print("  4. Comparing to TCGA cohort\n")

if AI_PROVIDER == "claude":
    result = run_claude_analysis(QUICK_PROMPT)
    print("\n" + "="*80)
    print("ANALYSIS RESULTS")
    print("="*80 + "\n")
    print(result)
else:
    print("Note: Gemini requires a custom MCP client implementation.")
    print("See Chapter 2 for details on building Gemini MCP integration.")

### Expected Findings

Your analysis should reveal:

**Clinical Summary**:
- Patient: Sarah Anderson, 58 years old
- Diagnosis: Stage IV HGSOC
- BRCA1 germline mutation carrier
- CA-125 trajectory: 1200 → 45 → 310 U/mL (platinum resistance)
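
The CA-125 trajectory above is easier to interpret as relative changes. The following is a minimal sketch of that arithmetic, using only the three values quoted in this notebook; the helper names are illustrative and are not part of any MCP server. It is not a clinical response criterion.

```python
# CA-125 values (U/mL) as quoted in the clinical summary above.
ca125 = {"baseline": 1200.0, "nadir": 45.0, "current": 310.0}


def percent_change(old: float, new: float) -> float:
    """Signed percent change going from old to new."""
    return (new - old) / old * 100.0


def fold_change(old: float, new: float) -> float:
    """Ratio of new to old."""
    return new / old


# Response to first-line platinum: baseline down to nadir.
decline_to_nadir = percent_change(ca125["baseline"], ca125["nadir"])
# Recurrence: nadir up to the current value.
rise_from_nadir = fold_change(ca125["nadir"], ca125["current"])

print(f"Baseline to nadir: {decline_to_nadir:.2f}%")
print(f"Nadir to current:  {rise_from_nadir:.2f}-fold")
```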

**Genomic Alterations**:
- TP53 R175H (pathogenic, loss of tumor-suppressor function)
- PIK3CA E545K (pathogenic, PI3K pathway activation)
- PTEN loss of heterozygosity (tumor suppressor loss)
- BRCA1 germline mutation (homologous recombination deficiency)

**TCGA Comparison**:
- Molecular subtype: C1 Immunoreactive
- Similar to 30% of TCGA ovarian cancer cohort
- High immune infiltration signature (but clinically immune-excluded)

---

## Part 2: Multi-Omics Integration (10 minutes)

Now we'll add multi-omics data to understand resistance mechanisms.

In [None]:
MULTIOMICS_PROMPT = """
Continue analyzing PAT001-OVC-2025. Now add multi-omics data:

Using mcp-multiomics server:
1. Load PDX model data (RNA, protein, phospho)
2. Compare resistant vs. sensitive samples
3. Run Stouffer meta-analysis across all three modalities
4. Identify consistently dysregulated pathways
5. Highlight druggable targets

Data: gs://sample-inputs-patientone/PAT001-OVC-2025/multiomics/

Focus on: PI3K/AKT/mTOR pathway, DNA repair, drug efflux
"""

print("Multi-omics analysis prompt prepared.")
print("\nThis integrates:")
print("  - RNA-seq (15 PDX samples: 7 sensitive, 8 resistant)")
print("  - Proteomics (same 15 samples)")
print("  - Phosphoproteomics (same 15 samples)")
print("\nStatistical method: Stouffer's Z-score meta-analysis")
print("Expected time: 5-8 minutes")

In [None]:
# Run multi-omics analysis
if AI_PROVIDER == "claude":
    print("Running multi-omics integration...\n")
    multiomics_result = run_claude_analysis(
        MULTIOMICS_PROMPT, 
        servers_to_use=["multiomics", "fgbio"]
    )
    print("\n" + "="*80)
    print("MULTI-OMICS RESULTS")
    print("="*80 + "\n")
    print(multiomics_result)

### Expected Multi-Omics Findings

**PI3K/AKT/mTOR Pathway Activation** (p < 0.001):
- PIK3CA ↑ (consistent with E545K mutation)
- AKT1 ↑ (protein level)
- mTOR ↑ (phosphorylation increased)
- RPS6KB1 ↑ (downstream target)

**Drug Efflux**:
- ABCB1 (MDR1) ↑ 3.2-fold (RNA and protein)

**Anti-Apoptotic**:
- BCL2L1 ↑ 2.1-fold

**DNA Repair** (dysregulated):
- BRCA1 ↓ (expected with germline mutation)
- PTEN ↓ (genomic loss confirmed at protein level)
- ATM ↓

**Key Insight**: The genomic PIK3CA mutation is functionally *active*: the pathway is upregulated at the protein and phospho levels. This supports PIK3CA as a treatment target.
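
To make the statistical step concrete, here is a minimal sketch of unweighted Stouffer's Z-score meta-analysis, which combines one p-value per modality into a single score. It is written under stated assumptions: one-sided p-values, equal weights, and hypothetical example numbers. The mcp-multiomics server may use weights or signed, two-sided inputs; its implementation is not shown in this notebook.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def stouffer_combine(p_values):
    """Combine one-sided p-values with unweighted Stouffer's method.

    Each p-value must lie strictly between 0 and 1.
    Returns (combined_z, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    # Convert each p-value to a Z-score (small p gives a large positive Z).
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Sum the Z-scores and rescale so the result is again standard normal.
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    combined_p = 1.0 - _STD_NORMAL.cdf(combined_z)
    return combined_z, combined_p


# Hypothetical p-values for one gene across RNA, protein, and phospho data.
example_p = [0.01, 0.02, 0.03]
z, p = stouffer_combine(example_p)
print(f"combined Z = {z:.2f}, combined p = {p:.2g}")
```

When all three modalities point the same way, the combined p-value is smaller than any single one, which is why consistent but individually modest signals can still reach significance.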

---

## Part 3: Spatial Transcriptomics (10 minutes)

Where are the resistant cells hiding? Spatial data reveal the structure of the tumor microenvironment.

In [None]:
SPATIAL_PROMPT = """
Analyze spatial transcriptomics for PAT001-OVC-2025:

Using mcp-spatialtools server:
1. Load 10X Visium data (900 spots, 6 annotated regions)
2. Run spatial differential expression
3. Calculate Moran's I for spatial autocorrelation
4. Generate spatial heatmap for top genes
5. Quantify immune cell distribution

Data: gs://sample-inputs-patientone/PAT001-OVC-2025/spatial/

Regions: tumor_core, tumor_proliferative, tumor_interface, stroma_reactive, stroma_collagen, immune_infiltrated

Focus on: Where are CD8+ T cells? Where is high proliferation (Ki67)?
"""

print("Spatial transcriptomics prompt prepared.")
print("\nVisium data:")
print("  - 900 spatial spots across tissue")
print("  - 31 key genes measured per spot")
print("  - 6 annotated tissue regions")
print("\nExpected time: 4-6 minutes")

In [None]:
# Run spatial analysis
if AI_PROVIDER == "claude":
    print("Running spatial transcriptomics analysis...\n")
    spatial_result = run_claude_analysis(
        SPATIAL_PROMPT,
        servers_to_use=["spatialtools"]
    )
    print("\n" + "="*80)
    print("SPATIAL TRANSCRIPTOMICS RESULTS")
    print("="*80 + "\n")
    print(spatial_result)

### Expected Spatial Findings

**Immune Exclusion Phenotype**:
- CD8+ T cells: HIGH in immune_infiltrated region, LOW in tumor_core
- Thick stromal barrier (stroma_collagen) separates immune cells from tumor
- Distance: ~500-800 μm between CD8+ cells and tumor_core

**Proliferative Heterogeneity**:
- MKI67 (Ki67), PCNA, TOP2A: HIGH in tumor_proliferative region
- MKI67 expression: 2.5-fold higher in proliferative vs. core

**Resistance Signature Spatial Pattern**:
- PI3K/AKT markers: Highest in tumor_core (matches multi-omics)
- ABCB1 (drug efflux): Concentrated in tumor_interface

**Clinical Implication**: Immunotherapy alone is unlikely to work because T cells cannot reach the tumor. It would need to be combined with stroma-targeting agents or PI3K inhibitors.
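
The spatial prompt asks for Moran's I, which measures whether neighboring spots have similar expression values. Below is a minimal sketch on a square grid with rook (edge-sharing) neighbors and binary weights. This is a simplification: Visium spots sit on a hexagonal lattice, and the neighbor definition used by mcp-spatialtools is not shown here. The two example grids are hypothetical.

```python
def morans_i(grid):
    """Moran's I for a 2-D grid with rook adjacency and weights of 1."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]  # deviations from the mean

    cross_sum = 0.0   # sum over neighbor pairs of w_ij * dev_i * dev_j
    weight_sum = 0.0  # W, the total of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross_sum += dev[r][c] * dev[rr][cc]
                    weight_sum += 1.0

    variance_sum = sum(d * d for row in dev for d in row)
    if variance_sum == 0 or weight_sum == 0:
        raise ValueError("Moran's I is undefined for constant values or no neighbors")
    return (n / weight_sum) * (cross_sum / variance_sum)


# Hypothetical 4x4 expression patterns.
clustered = [[1, 1, 0, 0] for _ in range(4)]                         # high on the left
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]   # alternating

print("clustered:   ", morans_i(clustered))
print("checkerboard:", morans_i(checkerboard))
```

Values near +1 indicate spatial clustering (as expected for a gene confined to one tissue region), values near 0 indicate no spatial structure, and negative values indicate an alternating pattern.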

---

## Part 4: Integration and Recommendations

Synthesize all findings into an actionable treatment plan.

In [None]:
INTEGRATION_PROMPT = """
Synthesize all findings for PAT001-OVC-2025 and generate treatment recommendations:

Integrate:
- Clinical: Platinum-resistant, CA-125 rising, BRCA1 germline mutation
- Genomics: PIK3CA E545K, TP53 R175H, PTEN loss
- Multi-omics: PI3K/AKT/mTOR pathway activation confirmed
- Spatial: Immune exclusion, high proliferation, stromal barrier
- Imaging: (if analyzed) 45% Ki67+, low CD8+ density

Provide:
1. Primary treatment recommendation
2. Combination therapy rationale
3. Clinical trial matches (if available)
4. Alternative options if primary fails
5. Monitoring strategy

Format as clinical decision support report.
"""

print("Integration prompt prepared.")
print("\nThis synthesizes:")
print("  ✓ Clinical context")
print("  ✓ Genomic alterations")
print("  ✓ Multi-omics resistance mechanisms")
print("  ✓ Spatial microenvironment")
print("\nExpected time: 2-3 minutes")

In [None]:
# Generate final recommendations
if AI_PROVIDER == "claude":
    print("Generating treatment recommendations...\n")
    final_report = run_claude_analysis(INTEGRATION_PROMPT)
    print("\n" + "="*80)
    print("FINAL TREATMENT RECOMMENDATIONS")
    print("="*80 + "\n")
    print(final_report)

### Expected Treatment Recommendations

**Primary Recommendation**: PI3K inhibitor (Alpelisib) targeting PIK3CA E545K

**Rationale**:
1. Genomic mutation (PIK3CA E545K) is present
2. Pathway is *active* (multi-omics confirms upregulation)
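3. FDA-approved for PIK3CA-mutated, HR-positive/HER2-negative advanced breast cancer (use in ovarian cancer would be off-label or investigational)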
4. Preclinical data in ovarian cancer

**Combination Therapy**: Alpelisib + Paclitaxel
- Clinical trial: NCT03602859
- Synergistic in platinum-resistant ovarian cancer

**Secondary Recommendation**: Anti-PD-1 immunotherapy
- Rationale: C1 immunoreactive subtype suggests immune-active tumor
- Caveat: Spatial data shows immune exclusion—consider stroma-modulating agents
- Consider: Alpelisib + pembrolizumab (addresses both resistance mechanisms)

**PARP Inhibitor Consideration**:
- Patient has BRCA1 germline mutation → HRD-positive
- However: PI3K/AKT pathway activation may limit PARP inhibitor efficacy
- Recommendation: Reserve for later line or combine with PI3K inhibitor

**Monitoring**:
- CA-125 every 2 weeks
- CT imaging at 8 weeks
- ctDNA for PIK3CA E545K variant allele fraction

---

## Summary: Traditional vs. AI-Orchestrated

Let's compare what we just did to the traditional workflow.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Workflow comparison data
comparison_data = {
    "Aspect": ["Time", "Specialists", "Code Written", "Tools Used", "Cost"],
    "Traditional": ["40 hours", "3-4", "500-1000 lines", "15-20", "$3,200"],
    "AI-Orchestrated": ["35 minutes", "1", "0 lines", "12 MCP servers", "$1-2"],
    # Percentages derived from the two columns above
    # (35 min vs. 2,400 min; $1-2 vs. $3,200).
    "Improvement": ["~98.5% faster", "67-75% reduction", "100% reduction", "Integrated", ">99.9% cheaper"]
}

df = pd.DataFrame(comparison_data)
print("\n" + "="*80)
print("WORKFLOW COMPARISON")
print("="*80 + "\n")
print(df.to_string(index=False))

# Visualize time comparison
fig, ax = plt.subplots(1, 2, figsize=(12, 4))

# Time comparison
times = [40*60, 35]  # Convert to minutes
labels = ['Traditional\n(40 hours)', 'AI-Orchestrated\n(35 minutes)']
colors = ['#e74c3c', '#27ae60']
ax[0].bar(labels, times, color=colors)
ax[0].set_ylabel('Time (minutes)')
ax[0].set_title('Analysis Time Comparison')
ax[0].set_yscale('log')

# Cost comparison
costs = [3200, 1.5]
ax[1].bar(labels, costs, color=colors)
ax[1].set_ylabel('Cost (USD)')
ax[1].set_title('Cost per Analysis Comparison')
ax[1].set_yscale('log')

plt.tight_layout()
plt.savefig('workflow_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

print("\n✓ Comparison chart saved as 'workflow_comparison.png'")

## Key Takeaways from Chapter 1

**What You Learned**:
1. ✅ Traditional precision oncology analysis takes 40 hours and costs $3,200
2. ✅ AI-orchestrated analysis takes 35 minutes and costs $1-2
3. ✅ Claude/Gemini coordinates 12 MCP servers via natural language prompts
4. ✅ Multi-modal integration (5 data types) becomes feasible at scale
5. ✅ You ran the PatientOne analysis yourself using real deployed servers

**What Changed**:
- **Time**: ~98.5% reduction (40 hours → 35 minutes)
- **Cost**: >99.9% reduction ($3,200 → $1-2)
- **Expertise**: 1 oncologist instead of 3-4 specialists
- **Code**: 0 lines written (natural language only)
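
The reductions listed above can be recomputed directly from the raw figures quoted in this chapter. This is a small sketch of that arithmetic; the variable and function names are illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percent reduction going from before to after."""
    return (before - after) / before * 100.0


# Time: 40 hours (converted to minutes) vs. 35 minutes.
time_reduction = percent_reduction(40 * 60, 35)

# Cost: $3,200 vs. the $1-2 range, evaluated at both ends of the range.
cost_reduction_min = percent_reduction(3200, 2)
cost_reduction_max = percent_reduction(3200, 1)

# Specialists: 3-4 people vs. 1, evaluated at both ends of the range.
staff_reduction_min = percent_reduction(3, 1)
staff_reduction_max = percent_reduction(4, 1)

print(f"Time:        {time_reduction:.1f}%")
print(f"Cost:        {cost_reduction_min:.2f}% to {cost_reduction_max:.2f}%")
print(f"Specialists: {staff_reduction_min:.0f}% to {staff_reduction_max:.0f}%")
```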

**Why This Matters**:
- Precision medicine becomes accessible to community hospitals
- Treatment decisions can be made during clinic visits
- Multi-modal analysis becomes the default, not the exception

---

## Exercises

**Exercise 1**: Modify the quick analysis prompt to focus only on DNA repair pathway genes. What additional mutations would be clinically relevant?

**Exercise 2**: The multi-omics analysis identified ABCB1 (MDR1) upregulation. Research what drug this might confer resistance to. Would you change the treatment recommendation?

**Exercise 3**: Based on the spatial analysis showing immune exclusion, propose a combination therapy that addresses both PI3K activation AND the stromal barrier. Hint: Look up FAP inhibitors or CXCL12 blockers.

**Exercise 4**: Calculate the cost savings if your institution analyzed 100 patients/year using AI orchestration vs. traditional methods.

---

## Next Steps

**Chapter 2**: Learn how the Model Context Protocol works and how Claude coordinates multiple servers.

**Try with Your Data**: Replace PatientOne data with your own files:
- Clinical: FHIR JSON from your EHR
- Genomics: Your VCF files
- Multi-omics: Your RNA/protein data
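
One way to point the quick analysis at your own data is to template the prompt rather than edit it by hand. The sketch below mirrors the wording of `QUICK_PROMPT`; the function name, the example patient ID, and the bucket path are hypothetical placeholders. It assumes your files live in a location the deployed MCP servers are permitted to read, which this notebook does not verify.

```python
def build_quick_prompt(patient_id: str, data_uri: str) -> str:
    """Return a quick-analysis prompt for the given patient and data location."""
    if not patient_id.strip():
        raise ValueError("patient_id must not be empty")
    # Normalize to exactly one trailing slash so the location reads as a folder.
    data_uri = data_uri.rstrip("/") + "/"
    return f"""
I want to analyze patient {patient_id} for a quick clinical and genomic overview.

Please use the MCP servers to:
1. Get clinical summary (demographics, medications, CA-125 trends)
2. Parse somatic variants from VCF file
3. Identify pathogenic mutations
4. Compare to TCGA ovarian cancer cohort

Data location: {data_uri}

Provide a concise summary of key findings.
"""


# Hypothetical patient ID and bucket; replace both with your own values.
my_prompt = build_quick_prompt("PAT002-EXAMPLE", "gs://your-bucket/PAT002-EXAMPLE")
print(my_prompt)
```

The resulting string can be passed to `run_claude_analysis` in place of `QUICK_PROMPT`.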

**Explore Other Notebooks**:
- Chapter 4: Clinical data deep dive (FHIR integration)
- Chapter 7: Spatial transcriptomics algorithms (STAR alignment, ComBat)
- Chapter 8: DeepCell segmentation (cell phenotyping)

---

**Questions or issues?**
- GitHub Issues: https://github.com/lynnlangit/precision-medicine-mcp/issues
- Documentation: https://github.com/lynnlangit/precision-medicine-mcp

**License**: Apache 2.0
