# 🏥 Medical LLMs: Hands-On Practice

## Table of Contents
1. [Environment Setup and Package Installation](#practice-1-environment-setup-and-package-installation)
2. [Loading and Using Pre-trained Medical Models](#practice-2-loading-and-using-pre-trained-medical-models)
3. [Clinical Text Processing and NER](#practice-3-clinical-text-processing-and-ner)
4. [Medical Question Answering System](#practice-4-medical-question-answering-system)
5. [Text Classification for Clinical Notes](#practice-5-text-classification-for-clinical-notes)
6. [Building a Simple Medical Chatbot](#practice-6-building-a-simple-medical-chatbot)
7. [Model Evaluation on MedQA Dataset](#practice-7-model-evaluation-on-medqa-dataset)
8. [HIPAA Compliance: Data Anonymization](#practice-8-hipaa-compliance-data-anonymization)

## Installing and Importing Essential Libraries

In [None]:
# Install required packages (run once)
# !pip install transformers datasets torch accelerate sentencepiece
# !pip install scikit-learn pandas numpy matplotlib seaborn
# !pip install presidio-analyzer presidio-anonymizer

# Import essential libraries
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
import torch

# Visualization settings
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 11
sns.set_style('whitegrid')

print("✅ All libraries loaded successfully!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

---
## Practice 1: Environment Setup and Package Installation

### 🎯 Learning Objectives
- Set up the development environment for medical NLP
- Understand the key libraries used in medical AI
- Verify GPU availability for model training

### 📖 Key Concepts
**Transformers Library:** Hugging Face's library for loading pre-trained models such as BioGPT and BioMedLM (formerly PubMedGPT)  
**CUDA:** NVIDIA's parallel computing platform for GPU acceleration

In [None]:
# 1.1 Check system capabilities
def check_system_info():
    """Display system information relevant to ML tasks"""
    print("System Information")
    print("=" * 50)
    
    # Python version
    import sys
    print(f"Python version: {sys.version.split()[0]}")
    
    # PyTorch info
    print(f"\nPyTorch version: {torch.__version__}")
    print(f"CUDA available: {torch.cuda.is_available()}")
    
    if torch.cuda.is_available():
        print(f"CUDA version: {torch.version.cuda}")
        print(f"GPU device: {torch.cuda.get_device_name(0)}")
        print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    else:
        print("⚠️  No GPU detected. Training will use CPU (slower).")
    
    print("\n✅ Environment check complete!")

check_system_info()
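
Before loading any models, it can help to confirm that the packages from the install cell are actually importable. The following is a minimal sketch using only the standard library; the `REQUIRED` list and the `check_packages` helper are assumptions made for this example (note that scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names assumed from the pip commands in the install cell
# (scikit-learn is imported as "sklearn")
REQUIRED = ["transformers", "datasets", "torch", "sklearn", "pandas", "numpy"]

def check_packages(names):
    """Return {import_name: bool} telling whether each top-level package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

for name, found in check_packages(REQUIRED).items():
    print(f"{name:<14} {'installed' if found else 'MISSING'}")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries such as PyTorch.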

---
## Practice 2: Loading and Using Pre-trained Medical Models

### 🎯 Learning Objectives
- Load pre-trained medical language models
- Understand model architectures (BERT, GPT)
- Generate text using medical models

### 📖 Key Concepts
**BioGPT:** GPT-style generative model pre-trained on PubMed abstracts (about 347M parameters; the BioGPT-Large variant has about 1.5B)  
**Model Pipeline:** High-level API for easy model usage

In [None]:
# 2.1 Load a pre-trained clinical language model (an encoder, not a text generator)
def load_medical_model():
    """Load and test a pre-trained medical language model"""
    
    print("Loading Medical Language Model...")
    print("=" * 50)
    
    # Bio_ClinicalBERT is a BERT-base encoder, small enough for a quick demonstration
    # For text generation, consider "microsoft/biogpt" or "stanford-crfm/BioMedLM"
    model_name = "emilyalsentzer/Bio_ClinicalBERT"
    
    try:
        # Load the tokenizer and encoder weights from the Hugging Face Hub
        from transformers import AutoTokenizer, AutoModel
        
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModel.from_pretrained(model_name)
        
        print(f"✅ Model loaded: {model_name}")
        print(f"Model type: {type(model).__name__}")
        print(f"Number of parameters: {sum(p.numel() for p in model.parameters()):,}")
        
        return tokenizer, model
    
    except Exception as e:
        print(f"❌ Error loading model: {e}")
        print("Note: Some models require authentication or special access.")
        return None, None

tokenizer, model = load_medical_model()

In [None]:
# 2.2 Test text encoding with medical tokenizer
def test_medical_tokenization():
    """Demonstrate how medical text is tokenized"""
    
    if tokenizer is None:
        print("⚠️  Model not loaded. Skipping tokenization test.")
        return
    
    # Sample medical text
    medical_text = "Patient presents with fever, cough, and shortness of breath. Diagnosis: pneumonia."
    
    print("Medical Text Tokenization")
    print("=" * 50)
    print(f"Original text: {medical_text}")
    print()
    
    # Tokenize
    tokens = tokenizer.tokenize(medical_text)
    token_ids = tokenizer.encode(medical_text)
    
    print(f"Number of tokens: {len(tokens)}")
    print(f"Tokens: {tokens[:15]}...")  # Show first 15 tokens
    print(f"\nToken IDs (first 10): {token_ids[:10]}")
    
    # Decode back
    decoded = tokenizer.decode(token_ids)
    print(f"\nDecoded text: {decoded}")
    
    print("\n✅ Tokenization test complete!")

test_medical_tokenization()
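
BERT-style tokenizers such as the one loaded above use WordPiece, which is why rare clinical terms tend to come out as several `##`-prefixed pieces. Here is a minimal sketch of greedy longest-match-first splitting, assuming a tiny hypothetical vocabulary (`toy_vocab` is invented for illustration; the real tokenizer also normalizes text and uses the model's own vocabulary file).

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into subword pieces by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word maps to the unknown token
        pieces.append(match)
        start = end
    return pieces

# Hypothetical vocabulary, chosen only to make the example readable
toy_vocab = {"fever", "cough", "pneu", "##mon", "##ia", "##s"}

for word in ["fever", "fevers", "pneumonia", "dyspnea"]:
    print(word, "->", wordpiece_tokenize(word, toy_vocab))
```

A word the vocabulary cannot cover at all ("dyspnea" here) collapses to `[UNK]`, which is one reason domain-specific vocabularies matter for clinical text.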

---
## Practice 3: Clinical Text Processing and NER

### 🎯 Learning Objectives
- Extract medical entities from clinical text
- Identify diseases, medications, and symptoms
- Understand Named Entity Recognition (NER) in healthcare

### 📖 Key Concepts
**Named Entity Recognition:** Identifying and classifying medical terms  
**Clinical Entities:** Diseases, medications, procedures, symptoms, lab results

In [None]:
# 3.1 Medical Named Entity Recognition
def medical_ner_demo():
    """Demonstrate medical NER using a pre-trained model"""
    
    print("Medical Named Entity Recognition")
    print("=" * 50)
    
    # Sample clinical note
    clinical_note = """
    Patient is a 65-year-old male with history of type 2 diabetes mellitus and hypertension.
    Currently taking metformin 1000mg twice daily and lisinopril 10mg once daily.
    Blood pressure: 145/90 mmHg. HbA1c: 7.8%. Recommending adjustment of diabetes medications.
    """
    
    print("Clinical Note:")
    print(clinical_note)
    print()
    
    try:
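        # Load NER pipeline (a general-domain CoNLL-2003 model, for demonstration only)
        # It tags persons, organizations, locations, and misc, so it may find few or no entities here.
        # For clinical entities, use a model fine-tuned for biomedical NER
        # (a base encoder such as "allenai/biomed_roberta_base" would need NER fine-tuning first)
        ner_pipeline = pipeline("ner", model="dbmdz/bert-large-cased-finetuned-conll03-english", aggregation_strategy="simple")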
        
        entities = ner_pipeline(clinical_note)
        
        print("Extracted Entities:")
        print("-" * 50)
        for entity in entities[:10]:  # Show first 10 entities
            print(f"Text: '{entity['word']}' | Type: {entity['entity_group']} | Score: {entity['score']:.3f}")
        
        print("\n✅ NER extraction complete!")
        return entities
        
    except Exception as e:
        print(f"❌ Error in NER: {e}")
        print("Note: Using a general NER model. For medical NER, use specialized models.")
        return []

entities = medical_ner_demo()
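
Token-classification models typically emit one B-/I-/O tag per token, and the pipeline's grouping step merges those tags into entity spans with an `entity_group` and a `word`. A rough sketch of that merging idea follows; the tokens, the `DISEASE` label, and the tag sequence are hypothetical and are not output from the model above.

```python
def merge_bio(tokens, tags):
    """Merge per-token BIO tags into a list of {'entity_group', 'word'} spans."""
    entities = []
    current = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one
            if current is not None:
                entities.append(current)
            current = {"entity_group": tag[2:], "word": token}
        elif tag.startswith("I-") and current is not None and current["entity_group"] == tag[2:]:
            # An I- tag of the same type extends the open entity
            current["word"] += " " + token
        else:
            # O tags (and stray I- tags) close any open entity
            if current is not None:
                entities.append(current)
                current = None
    if current is not None:
        entities.append(current)
    return entities

# Hypothetical tagged sentence fragment
tokens = ["type", "2", "diabetes", "mellitus", "and", "hypertension"]
tags = ["B-DISEASE", "I-DISEASE", "I-DISEASE", "I-DISEASE", "O", "B-DISEASE"]
print(merge_bio(tokens, tags))
```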

In [None]:
# 3.2 Manual entity extraction using regex patterns
def simple_medical_entity_extraction():
    """Extract common medical entities using pattern matching"""
    import re
    
    clinical_text = """
    Patient reports chest pain and shortness of breath. History of hypertension.
    Prescribed aspirin 81mg daily and atorvastatin 20mg at bedtime.
    BP: 140/85, HR: 72 bpm, Temp: 98.6°F.
    """
    
    print("Simple Medical Entity Extraction")
    print("=" * 50)
    print(f"Text: {clinical_text}\n")
    
    # Define patterns for common medical entities
    patterns = {
        'Medication': r'\b(aspirin|atorvastatin|metformin|lisinopril|insulin)\b',
        'Dosage': r'\d+\s*mg',
        'Vital Signs': r'(?:BP|HR|Temp):\s*[\d./]+',  # non-capturing group so findall returns the full match
        'Symptoms': r'\b(pain|fever|cough|nausea|dizziness|shortness of breath)\b',
    }
    
    results = {}
    for entity_type, pattern in patterns.items():
        matches = re.findall(pattern, clinical_text, re.IGNORECASE)
        results[entity_type] = matches
        print(f"{entity_type}: {matches}")
    
    print("\n✅ Pattern-based extraction complete!")
    return results

extracted_entities = simple_medical_entity_extraction()
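
The pattern matching above returns medications and dosages as separate lists, so the link between a drug and its dose is lost. One possible refinement, sketched here under the assumption that the dose directly follows the drug name, is to capture both in a single pattern and keep character offsets with `re.finditer`. The `MED_DOSE` pattern and `extract_med_doses` helper are hypothetical names for this example.

```python
import re

# Hypothetical pattern: a drug name from a short illustrative list, followed by a dose in mg
MED_DOSE = re.compile(
    r"\b(aspirin|atorvastatin|metformin|lisinopril)\s+(\d+)\s*mg\b",
    re.IGNORECASE,
)

def extract_med_doses(text):
    """Return one dict per medication mention, with its dose and character offsets."""
    return [
        {
            "drug": m.group(1).lower(),
            "dose_mg": int(m.group(2)),
            "start": m.start(),
            "end": m.end(),
        }
        for m in MED_DOSE.finditer(text)
    ]

text = "Prescribed aspirin 81mg daily and atorvastatin 20mg at bedtime."
for item in extract_med_doses(text):
    print(item, "->", text[item["start"]:item["end"]])
```

Keeping `start` and `end` makes it possible to highlight matches in the original note or to compare them against spans from a model-based NER system.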

---
## Practice 4: Medical Question Answering System

### 🎯 Learning Objectives
- Build a simple medical QA system
- Use context-based question answering
- Evaluate answer quality

### 📖 Key Concepts
**Question Answering:** Finding answers in medical text  
**Context Window:** The text passage used to find answers

In [None]:
# 4.1 Medical Question Answering
def medical_qa_demo():
    """Demonstrate medical question answering"""
    
    print("Medical Question Answering System")
    print("=" * 50)
    
    # Medical context
    context = """
    Diabetes mellitus is a chronic metabolic disorder characterized by elevated blood glucose levels.
    Type 2 diabetes is the most common form, accounting for 90-95% of all diabetes cases.
    Treatment typically involves lifestyle modifications, oral medications like metformin,
    and in some cases, insulin therapy. Regular monitoring of HbA1c levels is essential,
    with a target of less than 7% for most patients.
    """
    
    questions = [
        "What is diabetes mellitus?",
        "What is the most common type of diabetes?",
        "What is the target HbA1c level?",
        "What medications are used for diabetes?"
    ]
    
    try:
        # Load QA pipeline
        qa_pipeline = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
        
        print("Context:")
        print(context)
        print("\nQuestions and Answers:")
        print("-" * 50)
        
        for question in questions:
            result = qa_pipeline(question=question, context=context)
            print(f"\nQ: {question}")
            print(f"A: {result['answer']} (confidence: {result['score']:.3f})")
        
        print("\n✅ QA demonstration complete!")
        
    except Exception as e:
        print(f"❌ Error in QA: {e}")
        print("Note: Make sure the model is downloaded correctly.")

medical_qa_demo()
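
Extractive QA models like the one used above do not generate text; they score every context token as a possible answer start and answer end, and the answer is the span with the best combined score. A simplified sketch of that span selection follows. The tokens and scores are made up for illustration, and the real pipeline works on model logits and applies extra handling for long contexts.

```python
def best_span(start_scores, end_scores, max_len=5):
    """Return (start, end) token indices maximizing start score + end score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    best_score = float("-inf")
    best = (0, 0)
    for i, start_score in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start_score + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best

# Hypothetical tokens and per-token scores for "What is the target HbA1c level?"
tokens = ["target", "of", "less", "than", "7", "%"]
start_scores = [0.1, 0.0, 2.0, 0.3, 1.0, 0.0]
end_scores = [0.0, 0.1, 0.2, 0.5, 1.0, 2.5]

start, end = best_span(start_scores, end_scores)
print(" ".join(tokens[start:end + 1]))
```

The `max_len` cap is a common guard against implausibly long answers; its value here is arbitrary.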

---
## Practice 5: Text Classification for Clinical Notes

### 🎯 Learning Objectives
- Classify clinical notes by urgency or specialty
- Build a simple text classifier
- Evaluate classification performance

### 📖 Key Concepts
**Text Classification:** Categorizing medical text  
**Sentiment/Urgency Analysis:** Determining priority levels

In [None]:
# 5.1 Clinical note classification
def classify_clinical_urgency():
    """Classify clinical notes by urgency level"""
    
    print("Clinical Note Urgency Classification")
    print("=" * 50)
    
    # Sample clinical notes with varying urgency
    notes = [
        "Patient reports mild headache, resolving with rest. Follow-up in 2 weeks.",
        "URGENT: Patient experiencing severe chest pain, diaphoresis, and dyspnea. Call 911.",
        "Routine follow-up for hypertension. Blood pressure well controlled on current medications.",
        "Patient fell and unable to move right leg. Possible fracture. Immediate orthopedic consult needed.",
        "Annual physical exam. All vitals within normal limits. Patient is healthy."
    ]
    
    # Simple rule-based urgency classification
    urgent_keywords = ['urgent', 'severe', 'emergency', 'acute', 'immediate', 'call 911', 'unable to']
    moderate_keywords = ['follow-up', 'consult', 'monitor', 'possible']
    
    print("Classifying clinical notes...\n")
    
    for i, note in enumerate(notes, 1):
        note_lower = note.lower()
        
        if any(keyword in note_lower for keyword in urgent_keywords):
            urgency = "🔴 URGENT"
        elif any(keyword in note_lower for keyword in moderate_keywords):
            urgency = "🟡 MODERATE"
        else:
            urgency = "🟢 ROUTINE"
        
        print(f"Note {i}: {urgency}")
        print(f"Text: {note[:80]}...")
        print()
    
    print("✅ Classification complete!")

classify_clinical_urgency()
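
Plain substring checks like `keyword in note_lower` can misfire, because "acute" also appears inside "subacute". A possible variation, sketched below with the same keyword lists, compiles the keywords into word-boundary regular expressions. The helper names are assumptions for this example, and the trade-off is that inflected forms such as "monitoring" no longer match "monitor".

```python
import re

URGENT_KEYWORDS = ["urgent", "severe", "emergency", "acute", "immediate", "call 911", "unable to"]
MODERATE_KEYWORDS = ["follow-up", "consult", "monitor", "possible"]

def compile_keywords(keywords):
    """Build one case-insensitive regex that matches any keyword as a whole word or phrase."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

URGENT_RE = compile_keywords(URGENT_KEYWORDS)
MODERATE_RE = compile_keywords(MODERATE_KEYWORDS)

def classify(note):
    """Return 'URGENT', 'MODERATE', or 'ROUTINE' using whole-word keyword matches."""
    if URGENT_RE.search(note):
        return "URGENT"
    if MODERATE_RE.search(note):
        return "MODERATE"
    return "ROUTINE"

note = "Subacute cough, improving."
print("substring match on 'acute':", "acute" in note.lower())
print("word-boundary classification:", classify(note))
```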

---
## Practice 6: Building a Simple Medical Chatbot

### 🎯 Learning Objectives
- Create an interactive medical information chatbot
- Handle common medical queries
- Understand chatbot architecture

### 📖 Key Concepts
**Conversational AI:** Interactive dialogue systems  
**Intent Recognition:** Understanding user queries

In [None]:
# 6.1 Simple rule-based medical chatbot
def simple_medical_chatbot():
    """A simple rule-based medical information chatbot"""
    
    print("Simple Medical Information Chatbot")
    print("=" * 50)
    print()
    
    # Knowledge base (simple dictionary)
    knowledge_base = {
        'diabetes': "Diabetes is a chronic condition affecting blood sugar regulation. Common types include Type 1 and Type 2.",
        'hypertension': "Hypertension (high blood pressure) is a condition where blood pressure is consistently elevated above 130/80 mmHg.",
        'fever': "Fever is a temporary increase in body temperature, often due to infection. Normal: 98.6°F (37°C).",
        'covid': "COVID-19 is caused by SARS-CoV-2 virus. Common symptoms include fever, cough, and fatigue.",
        'medication': "Always take medications as prescribed. Consult your doctor before stopping any medication.",
    }
    
    # Sample questions for demonstration
    sample_questions = [
        "What is diabetes?",
        "Tell me about hypertension",
        "What should I know about fever?",
    ]
    
    print("Demo mode - asking sample questions:\n")
    
    for question in sample_questions:
        print(f"User: {question}")
        
        # Simple keyword matching
        question_lower = question.lower()
        response_found = False
        
        for keyword, response in knowledge_base.items():
            if keyword in question_lower:
                print(f"Bot: {response}")
                response_found = True
                break
        
        if not response_found:
            print("Bot: I don't have information about that. Please consult a healthcare professional.")
        
        print()
    
    print("⚠️  DISCLAIMER: This is a simple demo. Always consult healthcare professionals for medical advice.")
    print("✅ Chatbot demo complete!")

simple_medical_chatbot()
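
The chatbot above answers with the first knowledge-base key that appears verbatim in the question, so "Is my blood pressure too high?" would get no answer. A small step toward intent recognition, sketched here, is to score each intent by how many of its keywords overlap with the question's tokens. The `INTENT_KEYWORDS` sets are hypothetical synonym lists invented for this example.

```python
import re

# Hypothetical keyword sets per intent; a real system would curate or learn these
INTENT_KEYWORDS = {
    "diabetes": {"diabetes", "diabetic", "glucose", "insulin"},
    "hypertension": {"hypertension", "bp", "pressure"},
    "fever": {"fever", "temperature"},
}

def recognize_intent(question):
    """Return the intent whose keywords overlap most with the question, or None."""
    tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    scores = {intent: len(tokens & keywords) for intent, keywords in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

for question in ["Is my blood pressure too high?", "What is diabetes?", "What is the weather?"]:
    print(f"{question!r} -> {recognize_intent(question)}")
```

Returning `None` when nothing overlaps lets the caller fall back to the "consult a healthcare professional" response instead of guessing.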

---
## Practice 7: Model Evaluation on MedQA Dataset

### 🎯 Learning Objectives
- Evaluate model performance on medical benchmarks
- Calculate accuracy, precision, and recall
- Understand medical AI evaluation metrics

### 📖 Key Concepts
**MedQA:** Medical question answering benchmark dataset  
**Performance Metrics:** Accuracy, F1 score, AUROC

In [None]:
# 7.1 Simulate model evaluation
def evaluate_medical_model():
    """Simulate model evaluation on medical QA tasks"""
    from sklearn.metrics import accuracy_score, precision_recall_fscore_support, confusion_matrix
    
    print("Medical Model Evaluation")
    print("=" * 50)
    
    # Simulated predictions (in real scenario, these would come from your model)
    np.random.seed(42)
    n_samples = 100
    
    # Simulated binary labels (class 0 / class 1); real MedQA items are multiple-choice
    y_true = np.random.randint(0, 2, n_samples)
    
    # Predicted labels (simulating 85% accuracy)
    y_pred = y_true.copy()
    flip_indices = np.random.choice(n_samples, size=15, replace=False)
    y_pred[flip_indices] = 1 - y_pred[flip_indices]
    
    # Calculate metrics
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average='binary')
    cm = confusion_matrix(y_true, y_pred)
    
    print("\nPerformance Metrics:")
    print("-" * 50)
    print(f"Accuracy:  {accuracy:.3f} (0.850 by construction: 15 of 100 labels were flipped)")
    print(f"Precision: {precision:.3f}")
    print(f"Recall:    {recall:.3f}")
    print(f"F1 Score:  {f1:.3f}")
    
    print("\nConfusion Matrix:")
    print(cm)
    
    # Visualization
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
                xticklabels=['Incorrect', 'Correct'],
                yticklabels=['Incorrect', 'Correct'])
    plt.title('Confusion Matrix - Medical QA Model')
    plt.ylabel('True Label')
    plt.xlabel('Predicted Label')
    plt.tight_layout()
    plt.show()
    
    print("\n✅ Model evaluation complete!")
    print("\n📊 Benchmark Comparison (published figures, approximate; they vary with prompting setup):")
    print("  - GPT-4 on MedQA: ~87%")
    print("  - Med-PaLM 2 on MedQA: 86.5%")
    print("  - Human physicians: ~80% (rough estimate)")

evaluate_medical_model()
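
The scikit-learn calls above hide how the metrics are derived. As a sketch, the same binary metrics can be computed directly from the four confusion-matrix counts; the labels below are a small made-up example, and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels for illustration
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

In clinical settings recall on the positive class often deserves particular attention, since a false negative can mean a missed condition.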

---
## Practice 8: HIPAA Compliance: Data Anonymization

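### 🎯 Learning Objectives
- Understand HIPAA requirements for PHI
- Implement data de-identification
- Remove common PHI identifiers from text (the HIPAA Safe Harbor method lists 18 identifier types)

### 📖 Key Concepts
**PHI (Protected Health Information):** Individually identifiable health information; the HIPAA Safe Harbor method lists 18 identifier types to remove  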
**De-identification:** Process of removing or masking PHI

In [None]:
# 8.1 Simple PHI detection and removal
def anonymize_medical_text():
    """Demonstrate simple PHI detection and anonymization"""
    import re
    
    print("Medical Text Anonymization (HIPAA Compliance)")
    print("=" * 50)
    
    # Sample clinical note with PHI
    clinical_note = """
    Patient: John Doe
    DOB: 01/15/1980
    SSN: 123-45-6789
    Address: 123 Main Street, Boston, MA 02101
    Phone: (555) 123-4567
    Email: john.doe@email.com
    
    Chief Complaint: Chest pain
    Patient reports onset of chest pain on 11/14/2024.
    Medical Record Number: MRN-987654
    """
    
    print("Original Clinical Note (with PHI):")
    print(clinical_note)
    print()
    
    # Define anonymization patterns
    anonymization_patterns = [
        (r'(?<=Patient: )[A-Z][a-z]+ [A-Z][a-z]+', '[NAME]'),  # Names, anchored to the "Patient:" label so headings are not matched
        (r'\b\d{2}/\d{2}/\d{4}\b', '[DATE]'),  # Dates (MM/DD/YYYY)
        (r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]'),  # SSN
        (r'\d+ [A-Za-z ]+ (?:Street|St|Avenue|Ave|Road|Rd)\b(?:, [A-Za-z ]+, [A-Z]{2} \d{5})?', '[ADDRESS]'),  # Street address plus optional city, state, ZIP
        (r'\([0-9]{3}\) [0-9]{3}-[0-9]{4}', '[PHONE]'),  # Phone numbers
        (r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '[EMAIL]'),  # Email
        (r'MRN-\d+', '[MRN]'),  # Medical record numbers
    ]
    
    # Apply anonymization
    anonymized_note = clinical_note
    for pattern, replacement in anonymization_patterns:
        anonymized_note = re.sub(pattern, replacement, anonymized_note)
    
    print("De-identified Clinical Note (demo patterns only):")
    print(anonymized_note)
    print()
    
    print("✅ Anonymization complete!")
    print("\n⚠️  Note: This is a simplified demo. Production systems should use:")
    print("   - Microsoft Presidio or similar tools")
    print("   - Comprehensive PHI detection (all 18 identifiers)")
    print("   - Audit logging and access controls")
    print("   - Encryption at rest and in transit")

anonymize_medical_text()

In [None]:
# 8.2 Visualize PHI detection statistics
def phi_detection_statistics():
    """Show statistics about PHI detection"""
    
    print("PHI Detection Statistics")
    print("=" * 50)
    
    # Simulated detection results
    phi_types = ['Names', 'Dates', 'SSN', 'Addresses', 'Phone', 'Email', 'MRN', 'Other']
    detection_counts = [45, 38, 12, 25, 30, 18, 42, 15]
    
    # Create DataFrame
    df = pd.DataFrame({
        'PHI Type': phi_types,
        'Detected': detection_counts
    })
    
    print(df.to_string(index=False))
    print(f"\nTotal PHI instances detected: {sum(detection_counts)}")
    
    # Visualization
    plt.figure(figsize=(10, 6))
    plt.bar(phi_types, detection_counts, color='#1E64C8')
    plt.title('PHI Detection by Type', fontsize=14, fontweight='bold')
    plt.xlabel('PHI Type')
    plt.ylabel('Number Detected')
    plt.xticks(rotation=45)
    plt.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()
    
    print("\n✅ Statistics visualization complete!")

phi_detection_statistics()
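
The detection counts plotted above are simulated. As a sketch of how real counts could be gathered, `re.subn` returns both the redacted text and the number of substitutions, so each pattern can report how many instances it replaced. The `PATTERNS` list reuses a subset of the regexes from 8.1, and the sample string is invented for this example.

```python
import re

# (label, regex, replacement) triples; a subset of the demo patterns from 8.1
PATTERNS = [
    ("Dates", r"\b\d{2}/\d{2}/\d{4}\b", "[DATE]"),
    ("SSN", r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]"),
    ("Phone", r"\(\d{3}\) \d{3}-\d{4}", "[PHONE]"),
    ("Email", r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b", "[EMAIL]"),
    ("MRN", r"MRN-\d+", "[MRN]"),
]

def redact_and_count(text):
    """Apply each pattern in order; return the redacted text and per-type replacement counts."""
    counts = {}
    for label, pattern, replacement in PATTERNS:
        text, n_replaced = re.subn(pattern, replacement, text)
        counts[label] = n_replaced
    return text, counts

# Invented sample text containing fake identifiers
sample = "DOB: 01/15/1980, seen 11/14/2024. SSN: 123-45-6789. Phone: (555) 123-4567. MRN-987654"
redacted, counts = redact_and_count(sample)
print(redacted)
print(counts)
```

The resulting `counts` dictionary could feed the bar chart in 8.2 in place of the hard-coded numbers.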

---
## 🎯 Practice Complete!

### Summary of What We Learned:

1. **Environment Setup**: Installing and configuring medical NLP libraries
2. **Pre-trained Models**: Loading and using medical language models (BioGPT, ClinicalBERT)
3. **Clinical NER**: Extracting medical entities from clinical text
4. **Medical QA**: Building question-answering systems for healthcare
5. **Text Classification**: Categorizing clinical notes by urgency
6. **Medical Chatbot**: Creating interactive medical information systems
7. **Model Evaluation**: Assessing performance on medical benchmarks
8. **HIPAA Compliance**: Anonymizing PHI for data privacy

### Key Insights:
- Medical AI requires specialized models trained on clinical data
- HIPAA compliance is critical - always anonymize PHI
- Leading LLMs have reported roughly 85-90% accuracy on MedQA-style benchmarks; results depend heavily on the model and prompting setup
- Evaluation should include clinical validation, not just technical metrics

### Next Steps:
- Fine-tune models on your specific medical domain
- Implement comprehensive PHI detection using Presidio
- Deploy models with proper security and audit logging
- Conduct clinical validation studies
- Prepare for FDA regulatory approval (if applicable)

### Assignment (from Lecture):
**Build a Medical Question-Answering System**
- Dataset: MedQA or PubMedQA
- Deadline: 2 weeks
- Evaluation: Accuracy, code quality, documentation
- Bonus: Add HIPAA-compliant data handling

### Resources:
- Hugging Face Medical Models: https://huggingface.co/models?filter=medical
- MedQA Dataset: https://github.com/jind11/MedQA
- HIPAA Guidelines: https://www.hhs.gov/hipaa
- Microsoft Presidio: https://microsoft.github.io/presidio/

---
**⚠️ Important Disclaimer:**

This notebook is for educational purposes only. Medical AI systems must:
- Undergo rigorous clinical validation
- Comply with HIPAA and local regulations
- Receive appropriate FDA clearance (if applicable)
- Be supervised by licensed healthcare professionals
- Never replace professional medical judgment

Always consult healthcare professionals for medical advice.