# 🩺 Physician Notetaker - Interactive Demo

This notebook demonstrates all features of the Physician Notetaker system:
1. **Medical NLP Summarization** - Extract medical entities, summarize conversations
2. **Sentiment & Intent Analysis** - Analyze patient emotions and intentions
3. **SOAP Note Generation** - Generate structured clinical notes

## Setup
First, let's import all necessary modules and initialize the system.

In [None]:
# Import required libraries
import sys
import json
from pathlib import Path
import warnings
# Silence library warnings so the demo output stays readable
warnings.filterwarnings('ignore')

# Import our modules
from src.medical_nlp import MedicalNLPProcessor
from src.sentiment_analysis import SentimentIntentAnalyzer
from src.soap_generator import SOAPNoteGenerator
from config import SAMPLE_CONVERSATION

print("✅ Modules imported successfully!")

In [None]:
# Initialize all components
print("Initializing Medical NLP Processor...")
nlp_processor = MedicalNLPProcessor()

print("\nInitializing Sentiment Analyzer...")
sentiment_analyzer = SentimentIntentAnalyzer()

print("\nInitializing SOAP Generator...")
soap_generator = SOAPNoteGenerator()

print("\n✅ All components initialized!")

## Sample Conversation

Let's view the sample physician-patient conversation we'll be analyzing:

In [None]:
print("SAMPLE MEDICAL CONVERSATION")
print("=" * 80)
print(SAMPLE_CONVERSATION)

---

# 1️⃣ Medical NLP Summarization

Extract key medical details using:
- Named Entity Recognition (NER)
- Text Summarization
- Keyword Extraction

In [None]:
# Generate structured medical summary
print("📋 STRUCTURED MEDICAL SUMMARY")
print("=" * 80)

medical_summary = nlp_processor.generate_structured_summary(SAMPLE_CONVERSATION)
print(json.dumps(medical_summary, indent=2))

In [None]:
# Extract medical entities
print("🔍 EXTRACTED MEDICAL ENTITIES")
print("=" * 80)

entities = nlp_processor.extract_entities(SAMPLE_CONVERSATION)

for entity_type, items in entities.items():
    if items:
        print(f"\n{entity_type.upper()}:")
        for item in items:
            print(f"  • {item}")

In [None]:
# Extract keywords
print("🔑 MEDICAL KEYWORDS")
print("=" * 80)

keywords = nlp_processor.extract_keywords(SAMPLE_CONVERSATION)
print("\nTop Medical Keywords:")
for keyword, score in keywords[:10]:
    print(f"  • {keyword}: {score:.4f}")

In [None]:
# Handle ambiguous data
print("⚠️  AMBIGUOUS DATA HANDLING")
print("=" * 80)

ambiguous_result = nlp_processor.handle_ambiguous_data(SAMPLE_CONVERSATION)

print("\nConfidence Scores:")
for field, score in ambiguous_result['confidence_scores'].items():
    status = "✅" if score >= 0.7 else "⚠️"
    print(f"  {status} {field}: {score:.2f}")

if ambiguous_result['low_confidence_fields']:
    print(f"\nLow Confidence Fields: {', '.join(ambiguous_result['low_confidence_fields'])}")
    print("\nRecommendations:")
    for rec in ambiguous_result['recommendations']:
        print(f"  • {rec}")
else:
    print("\n✅ All fields have high confidence!")

---

# 2️⃣ Sentiment & Intent Analysis

Analyze patient sentiment and detect their intentions throughout the conversation.

In [None]:
# Test individual patient statements
print("😊 PATIENT STATEMENT ANALYSIS")
print("=" * 80)

test_statements = [
    "I'm a bit worried about my back pain, but I hope it gets better soon.",
    "Good morning, doctor. I'm doing better, but I still have some discomfort now and then.",
    "The first four weeks were rough. My neck and back pain were really bad.",
    "That's a relief!",
    "Thank you, doctor. I appreciate it."
]

# Map each sentiment label to an emoji once, outside the loop
sentiment_emoji = {
    'Anxious': '😰',
    'Neutral': '😐',
    'Reassured': '😊'
}

for i, statement in enumerate(test_statements, 1):
    print(f"\n{'='*80}")
    print(f"Statement {i}: \"{statement}\"")
    print("-" * 80)

    result = sentiment_analyzer.analyze_patient_dialogue(statement)

    # Fall back to the neutral emoji for any unexpected label
    emoji = sentiment_emoji.get(result['Sentiment'], '😐')
    print(f"  {emoji} Sentiment: {result['Sentiment']} (Confidence: {result['Sentiment_Confidence']})")
    print(f"  🎯 Intent: {result['Intent']} (Confidence: {result['Intent_Confidence']})")

    if result['Secondary_Intents']:
        print(f"  📌 Secondary Intents: {', '.join(result['Secondary_Intents'])}")

In [None]:
# Analyze full conversation
print("\n\n📊 FULL CONVERSATION ANALYSIS")
print("=" * 80)

conversation_summary = sentiment_analyzer.get_conversation_summary(SAMPLE_CONVERSATION)

print(f"\nTotal Patient Statements: {conversation_summary['Total_Patient_Statements']}")
print(f"Overall Sentiment: {conversation_summary['Overall_Sentiment']}")
print(f"Most Common Intent: {conversation_summary['Most_Common_Intent']}")

print("\nSentiment Distribution:")
for sentiment, count in conversation_summary['Sentiment_Distribution'].items():
    bar = '█' * count
    print(f"  {sentiment:12s}: {bar} ({count})")

print("\nIntent Distribution:")
for intent, count in conversation_summary['Intent_Distribution'].items():
    bar = '█' * count
    print(f"  {intent:25s}: {bar} ({count})")

In [None]:
# Visualize sentiment over time
import matplotlib.pyplot as plt

analyses = conversation_summary['Detailed_Analysis']

sentiments_order = [a['Sentiment'] for a in analyses]
sentiment_map = {'Anxious': 1, 'Neutral': 2, 'Reassured': 3}
sentiment_values = [sentiment_map.get(s, 2) for s in sentiments_order]

plt.figure(figsize=(12, 4))
plt.plot(range(1, len(sentiment_values) + 1), sentiment_values, marker='o', linewidth=2, markersize=8)
plt.yticks([1, 2, 3], ['Anxious', 'Neutral', 'Reassured'])
plt.xlabel('Patient Statement Number')
plt.ylabel('Sentiment')
plt.title('Patient Sentiment Throughout Conversation')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("\n📈 The sentiment trend shows the patient's emotional journey during the consultation.")

---

# 3️⃣ SOAP Note Generation (Bonus)

Generate a structured SOAP (Subjective, Objective, Assessment, Plan) note from the conversation.

In [None]:
# Generate SOAP note
print("📝 SOAP NOTE GENERATION")
print("=" * 80)

soap_note = soap_generator.generate_soap_note(SAMPLE_CONVERSATION)
print(json.dumps(soap_note, indent=2))

In [None]:
# Display SOAP note in formatted text
print("\n📋 FORMATTED SOAP NOTE")
print("=" * 80)

print("\n[S] SUBJECTIVE")
print("-" * 80)
print(f"Chief Complaint: {soap_note['Subjective']['Chief_Complaint']}")
print("\nHistory of Present Illness:")
print(f"  {soap_note['Subjective']['History_of_Present_Illness']}")

print("\n[O] OBJECTIVE")
print("-" * 80)
print(f"Physical Exam: {soap_note['Objective']['Physical_Exam']}")
print(f"Observations: {soap_note['Objective']['Observations']}")

print("\n[A] ASSESSMENT")
print("-" * 80)
print(f"Diagnosis: {soap_note['Assessment']['Diagnosis']}")
print(f"Severity: {soap_note['Assessment']['Severity']}")
if 'Additional_Notes' in soap_note['Assessment']:
    print(f"Notes: {soap_note['Assessment']['Additional_Notes']}")

print("\n[P] PLAN")
print("-" * 80)
print("Treatment:")
for treatment in soap_note['Plan']['Treatment']:
    print(f"  • {treatment}")
print(f"\nFollow-Up: {soap_note['Plan']['Follow_Up']}")
if 'Patient_Education' in soap_note['Plan']:
    print(f"Patient Education: {soap_note['Plan']['Patient_Education']}")

---

# 🎯 Complete Analysis Report

Let's generate a comprehensive report combining all analyses:

In [None]:
# Generate complete report
from main import PhysicianNotetaker

print("🩺 COMPREHENSIVE MEDICAL ANALYSIS REPORT")
print("=" * 80)

app = PhysicianNotetaker()
complete_results = app.process_conversation(SAMPLE_CONVERSATION)

# Save to file
app.save_results(complete_results, 'analysis_results.json')
print("\n💾 Complete results saved to: analysis_results.json")

---

# 🧪 Try Your Own Conversation

You can test the system with your own medical conversation:

In [None]:
# Custom conversation input
custom_conversation = """
Doctor: How are you feeling today?
Patient: I had a car accident. My neck and back hurt a lot for four weeks.
Doctor: Did you receive treatment?
Patient: Yes, I had ten physiotherapy sessions, and now I only have occasional back pain.
"""

print("Analyzing custom conversation...\n")

# Quick analysis
custom_summary = nlp_processor.generate_structured_summary(custom_conversation)
custom_soap = soap_generator.generate_soap_note(custom_conversation)

print("Medical Summary:")
print(json.dumps(custom_summary, indent=2))

print("\n\nSOAP Note:")
print(json.dumps(custom_soap, indent=2))

---

# 📚 Answers to Technical Questions

## Question 1: Handling Ambiguous or Missing Medical Data

**Approach:**
1. **Confidence Scoring**: Assign confidence scores to each extracted field
2. **Multiple Evidence Sources**: Use pattern matching + NER + contextual analysis
3. **Flagging Low Confidence**: Alert when data quality is below threshold
4. **Recommendations**: Provide actionable suggestions for verification
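
The four steps above can be sketched as follows. This is a minimal illustration, not the actual `handle_ambiguous_data` implementation: the `score_fields` helper, the evidence-source names, and the equal weighting of sources are all assumptions. The 0.7 threshold and the result keys mirror the ones used in the demo cell earlier in this notebook.

```python
from typing import Dict, List

CONFIDENCE_THRESHOLD = 0.7  # same cut-off the demo cell uses for its status flag


def score_fields(evidence: Dict[str, Dict[str, bool]]) -> Dict[str, object]:
    """Score each field by the fraction of evidence sources that support it.

    `evidence` maps a field name to {source_name: found?}. The sources and
    the equal weighting are hypothetical, chosen only to illustrate the idea.
    """
    scores: Dict[str, float] = {}
    for field, sources in evidence.items():
        # A field with no evidence at all gets the lowest possible score
        scores[field] = sum(sources.values()) / len(sources) if sources else 0.0

    # Flag every field whose score falls below the threshold
    low = [field for field, s in scores.items() if s < CONFIDENCE_THRESHOLD]
    recommendations: List[str] = [
        f"Verify '{field}' with the patient or the source record" for field in low
    ]
    return {
        "confidence_scores": scores,
        "low_confidence_fields": low,
        "recommendations": recommendations,
    }


example = score_fields({
    "Symptoms": {"pattern_match": True, "ner": True, "context": True},
    "Diagnosis": {"pattern_match": True, "ner": False, "context": False},
})
print(example["low_confidence_fields"])
```

Here `Symptoms` is supported by all three sources, while `Diagnosis` is supported by only one and is therefore flagged for verification.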

## Question 2: Pre-trained NLP Models for Medical Summarization

**Recommended Models:**
- **SciBERT / BioClinicalBERT**: Domain pre-training on scientific papers (SciBERT) and clinical notes (BioClinicalBERT)
- **scispaCy**: Biomedical NER and entity linking
- **BART / T5**: Abstractive summarization
- **DistilBERT**: Efficient sentiment analysis

## Question 3: Fine-tuning BERT for Medical Sentiment

**Process:**
1. Start with BioClinicalBERT (medical domain)
2. Collect labeled patient conversations (5K-10K examples)
3. Fine-tune with learning rate 2e-5, 3-5 epochs
4. Use medical-specific sentiment labels: Anxious, Neutral, Reassured
5. Validate with clinical experts
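
To make steps 3 and 4 concrete, the sketch below fixes the label mapping and computes a linear warmup and decay learning-rate schedule for the stated hyperparameters. The dataset size, batch size, and 10% warmup fraction are illustrative assumptions, and the schedule is a common choice for BERT fine-tuning rather than something this notebook prescribes.

```python
import math

# Label set from step 4; the integer ids are an arbitrary but fixed assumption.
LABEL2ID = {"Anxious": 0, "Neutral": 1, "Reassured": 2}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}

PEAK_LR = 2e-5          # learning rate from step 3
EPOCHS = 3              # lower end of the 3-5 epoch range
NUM_EXAMPLES = 5000     # assumed: lower end of the 5K-10K range
BATCH_SIZE = 16         # assumed
WARMUP_FRACTION = 0.1   # assumed

steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(total_steps * WARMUP_FRACTION)


def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


print(f"{total_steps} optimizer steps, {warmup_steps} warmup steps")
```

With only a few thousand examples the whole run is under a thousand optimizer steps, which is one reason a small learning rate and a short warmup are typical for this kind of fine-tuning.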

## Question 4: Datasets for Healthcare Sentiment Model

**Recommended Datasets:**
- MedDialog (medical conversations)
- MIMIC-III Clinical Notes (de-identified; requires credentialed PhysioNet access)
- Custom annotated patient conversations
- Reddit medical communities (with proper annotation)

## Question 5: Training NLP for SOAP Format

**Approach:**
1. **Sequence-to-Sequence Model**: T5 or BART for generation
2. **Section Classification**: BERT to classify sentences into SOAP sections
3. **Template-based**: Rules + NER for structured output
4. **Hybrid**: Combine deep learning entity extraction with rule-based structuring
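
As a rough illustration of the template-based option (item 3), the sketch below routes each utterance into a SOAP section with keyword rules. The cue lists and the helpers `classify_utterance` and `build_soap_skeleton` are hypothetical and far too small for real use. In the hybrid approach, a trained classifier or NER model would replace the keyword lookup.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical cue phrases per section, checked in order; first match wins.
SECTION_CUES = [
    ("Plan", ("recommend", "follow up", "follow-up", "prescribe")),
    ("Assessment", ("diagnosis", "appears to be", "consistent with", "prognosis")),
    ("Objective", ("examination", "range of motion", "tenderness", "vitals")),
]


def classify_utterance(speaker: str, text: str) -> str:
    """Return the SOAP section for one utterance.

    Patient speech is treated as Subjective. Doctor speech is matched against
    the cue lists and marked Unclassified when nothing matches.
    """
    if speaker.lower() == "patient":
        return "Subjective"
    lowered = text.lower()
    for section, cues in SECTION_CUES:
        if any(cue in lowered for cue in cues):
            return section
    return "Unclassified"


def build_soap_skeleton(conversation: str) -> Dict[str, List[str]]:
    """Group 'Speaker: text' lines by SOAP section."""
    note: Dict[str, List[str]] = defaultdict(list)
    for line in conversation.strip().splitlines():
        speaker, sep, text = line.partition(":")
        if not sep or not text.strip():
            continue  # skip lines that are not 'Speaker: text'
        note[classify_utterance(speaker.strip(), text)].append(text.strip())
    return dict(note)


demo = """
Doctor: On examination, your range of motion is full.
Patient: I still have occasional back pain.
Doctor: I recommend a follow-up if the pain worsens.
"""
print(build_soap_skeleton(demo))
```

Keeping an explicit `Unclassified` bucket makes it easy to see which doctor utterances the rules miss, which is useful input when deciding where a learned classifier is needed.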

## Question 6: Improving SOAP Note Generation Accuracy

**Techniques:**
- **Rule-based constraints**: Enforce SOAP structure
- **Entity preservation**: Ensure medical terms are not lost
- **Multi-task learning**: Train on section classification + generation simultaneously
- **Post-processing**: Validate with medical ontologies (UMLS, SNOMED CT)
- **Expert validation**: Clinical review and feedback loop
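
The entity-preservation check is simple to prototype. The sketch below flattens a generated note and reports which source entities no longer appear in it. `flatten_text`, `find_missing_entities`, and the sample data are hypothetical, and the case-insensitive substring test is a stand-in: a production check would first normalize terms against an ontology such as UMLS or SNOMED CT.

```python
from typing import Any, Iterable, List


def flatten_text(value: Any) -> str:
    """Concatenate every string found in a nested dict/list structure."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return " ".join(flatten_text(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return " ".join(flatten_text(v) for v in value)
    return ""  # ignore numbers, None, and other non-text values


def find_missing_entities(entities: Iterable[str], note: Any) -> List[str]:
    """Return the entities that do not occur anywhere in the note text."""
    note_text = flatten_text(note).lower()
    return [e for e in entities if e.lower() not in note_text]


sample_note = {
    "Subjective": {"Chief_Complaint": "Neck and back pain"},
    "Plan": {"Treatment": ["Physiotherapy as needed", "Painkillers"]},
}
missing = find_missing_entities(
    ["back pain", "physiotherapy", "whiplash injury"], sample_note
)
print(missing)
```

Any entity returned by the check can be routed to the clinical review step instead of being silently dropped from the note.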

---

# ✅ Summary

This notebook demonstrated:
1. ✅ Medical NLP with entity extraction and summarization
2. ✅ Sentiment analysis detecting patient emotional states
3. ✅ Intent detection identifying patient communication goals
4. ✅ SOAP note generation for clinical documentation
5. ✅ Ambiguity handling with per-field confidence scores

## Next Steps
- Fine-tune models on medical datasets
- Add more entity types (medications, dosages, etc.)
- Implement real-time transcription integration
- Add multi-language support
- Deploy as web API using FastAPI