# 🩺 Physician Notetaker - Demo Notebook

This notebook demonstrates the complete NLP pipeline for medical transcription analysis.

**Features:**
1. Medical Named Entity Recognition (NER)
2. Text Summarization
3. Keyword Extraction
4. Sentiment & Intent Analysis
5. SOAP Note Generation

**Author:** Himanshu Sharma  
**For:** Emitrr AI Engineer Intern Assignment

## Setup

In [None]:
# Install dependencies (uncomment if needed)
# !pip install -r ../requirements.txt
# !python -m spacy download en_core_web_lg

In [None]:
import sys
import json
from pathlib import Path

# Add parent directory to path for imports
sys.path.insert(0, str(Path.cwd().parent))

# Import our modules
from src.medical_ner import MedicalNERExtractor, extract_medical_entities
from src.summarizer import MedicalSummarizer, summarize_transcript
from src.sentiment_analyzer import MedicalSentimentAnalyzer, analyze_sentiment
from src.keyword_extractor import MedicalKeywordExtractor, extract_keywords
from src.soap_generator import SOAPNoteGenerator, generate_soap_note
from src.pipeline import PhysicianNotetaker

print("✅ All modules imported successfully!")

## Load Sample Conversation

In [None]:
# Load the sample conversation from the assignment
SAMPLE_CONVERSATION = """
Physician: Good morning, Ms. Jones. How are you feeling today?

Patient: Good morning, doctor. I'm doing better, but I still have some discomfort now and then.

Physician: I understand you were in a car accident last September. Can you walk me through what happened?

Patient: Yes, it was on September 1st, around 12:30 in the afternoon. I was driving from Cheadle Hulme to Manchester when I had to stop in traffic. Out of nowhere, another car hit me from behind, which pushed my car into the one in front.

Physician: That sounds like a strong impact. Were you wearing your seatbelt?

Patient: Yes, I always do.

Physician: What did you feel immediately after the accident?

Patient: At first, I was just shocked. But then I realized I had hit my head on the steering wheel, and I could feel pain in my neck and back almost right away.

Physician: Did you seek medical attention at that time?

Patient: Yes, I went to Moss Bank Accident and Emergency. They checked me over and said it was a whiplash injury, but they didn't do any X-rays. They just gave me some advice and sent me home.

Physician: How did things progress after that?

Patient: The first four weeks were rough. My neck and back pain were really bad; I had trouble sleeping and had to take painkillers regularly. It started improving after that, but I had to go through ten sessions of physiotherapy to help with the stiffness and discomfort.

Physician: That makes sense. Are you still experiencing pain now?

Patient: It's not constant, but I do get occasional backaches. It's nothing like before, though.

Physician: That's good to hear. Have you noticed any other effects, like anxiety while driving or difficulty concentrating?

Patient: No, nothing like that. I don't feel nervous driving, and I haven't had any emotional issues from the accident.

Physician: And how has this impacted your daily life? Work, hobbies, anything like that?

Patient: I had to take a week off work, but after that, I was back to my usual routine. It hasn't really stopped me from doing anything.

Physician: That's encouraging. Let's go ahead and do a physical examination to check your mobility and any lingering pain.

[Physical Examination Conducted]

Physician: Everything looks good. Your neck and back have a full range of movement, and there's no tenderness or signs of lasting damage. Your muscles and spine seem to be in good condition.

Patient: That's a relief!

Physician: Yes, your recovery so far has been quite positive. Given your progress, I'd expect you to make a full recovery within six months of the accident. There are no signs of long-term damage or degeneration.

Patient: That's great to hear. So, I don't need to worry about this affecting me in the future?

Physician: That's right. I don't foresee any long-term impact on your work or daily life. If anything changes or you experience worsening symptoms, you can always come back for a follow-up. But at this point, you're on track for a full recovery.

Patient: Thank you, doctor. I appreciate it.

Physician: You're very welcome, Ms. Jones. Take care, and don't hesitate to reach out if you need anything.
"""

print("📄 Sample conversation loaded!")
print(f"Length: {len(SAMPLE_CONVERSATION)} characters")
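
The transcript alternates `Physician:` and `Patient:` turns, with an occasional bracketed stage note. As a minimal sketch of how it could be split into turns before analysis (the helper `split_turns` and its regex are illustrative assumptions, not functions from `src`):

```python
import re
from typing import List, Tuple

# Matches lines such as "Physician: Good morning." (speaker label, colon, utterance).
TURN_PATTERN = re.compile(r"^(?P<speaker>[A-Za-z]+):\s*(?P<text>.+)$")


def split_turns(transcript: str) -> List[Tuple[str, str]]:
    """Return (speaker, utterance) pairs; lines without a speaker label are skipped."""
    turns = []
    for line in transcript.splitlines():
        match = TURN_PATTERN.match(line.strip())
        if match:
            turns.append((match.group("speaker"), match.group("text")))
    return turns


# Small stand-in for the full transcript so the sketch runs on its own.
demo_transcript = """
Physician: How are you feeling today?

Patient: I'm doing better, but I still have some discomfort now and then.

[Physical Examination Conducted]
"""

demo_turns = split_turns(demo_transcript)
patient_utterances = [text for speaker, text in demo_turns if speaker == "Patient"]
print(demo_turns)
print(patient_utterances)
```

Keeping only the patient utterances can be useful when sentiment should reflect the patient rather than the physician.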

---
## 1. Medical Named Entity Recognition (NER)

Extract medical entities from the transcript: symptoms, treatments, diagnoses, and prognosis.

In [None]:
# Initialize NER extractor
ner_extractor = MedicalNERExtractor()

# Extract entities
entities = ner_extractor.extract(SAMPLE_CONVERSATION)
entities = ner_extractor.handle_ambiguous_data(entities, SAMPLE_CONVERSATION)

print("🔍 Extracted Medical Entities:\n")
print(json.dumps(entities.to_dict(), indent=2))

In [None]:
# Show confidence scores for each entity
print("\n📊 Entity Confidence Scores:\n")

for symptom in entities.symptoms:
    print(f"  Symptom: {symptom.text:<20} Confidence: {symptom.confidence:.2f}")

for treatment in entities.treatments:
    print(f"  Treatment: {treatment.text:<18} Confidence: {treatment.confidence:.2f}")

for diagnosis in entities.diagnoses:
    print(f"  Diagnosis: {diagnosis.text:<18} Confidence: {diagnosis.confidence:.2f}")
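
The internals of `MedicalNERExtractor` are not shown in this notebook. As a rough illustration of how pattern-based extraction with per-entity confidence can work, here is a minimal sketch; the `Entity` class, the `LEXICON` patterns, and the confidence values are all hypothetical and chosen only for the demo.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    text: str
    label: str
    confidence: float


# Hypothetical lexicon: regex pattern -> (label, fixed confidence). Illustrative values only.
LEXICON = {
    r"\bneck pain\b": ("SYMPTOM", 0.90),
    r"\bback ?aches?\b": ("SYMPTOM", 0.85),
    r"\bwhiplash( injury)?\b": ("DIAGNOSIS", 0.95),
    r"\bphysiotherapy\b": ("TREATMENT", 0.90),
    r"\bpainkillers?\b": ("TREATMENT", 0.85),
}


def extract_entities(text: str) -> List[Entity]:
    """Scan the text with every lexicon pattern and collect the matches."""
    found = []
    for pattern, (label, confidence) in LEXICON.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append(Entity(match.group(0).lower(), label, confidence))
    return found


sample_text = (
    "They said it was a whiplash injury. "
    "I took painkillers and had ten sessions of physiotherapy."
)
entities_demo = extract_entities(sample_text)
for entity in entities_demo:
    print(f"{entity.label:<10} {entity.text:<20} {entity.confidence:.2f}")
```

A fixed score per pattern is the simplest possible choice; a real extractor would more plausibly derive confidence from a model or from context such as negation.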

---
## 2. Text Summarization

Convert the transcript into a structured medical report.

In [None]:
# Initialize summarizer (uses Gemini API if available, else extractive)
summarizer = MedicalSummarizer()

# Generate summary
summary = summarizer.summarize(SAMPLE_CONVERSATION)

print("📋 Structured Medical Summary:\n")
print(json.dumps(summary.to_dict(), indent=2))

### Expected Output Format (from Assignment)

```json
{
  "Patient_Name": "Janet Jones",
  "Symptoms": ["Neck pain", "Back pain", "Head impact"],
  "Diagnosis": "Whiplash injury",
  "Treatment": ["10 physiotherapy sessions", "Painkillers"],
  "Current_Status": "Occasional backache",
  "Prognosis": "Full recovery expected within six months"
}
```
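
One way to compare a generated summary against this format is to check for missing or empty fields. The sketch below assumes the summary is available as a plain dict with the key names shown above; `check_summary` is a hypothetical helper, not part of `src`.

```python
EXPECTED_KEYS = [
    "Patient_Name",
    "Symptoms",
    "Diagnosis",
    "Treatment",
    "Current_Status",
    "Prognosis",
]


def check_summary(summary_dict):
    """Return a list of problems: keys that are missing or hold empty values."""
    problems = []
    for key in EXPECTED_KEYS:
        if key not in summary_dict:
            problems.append(f"missing key: {key}")
        elif not summary_dict[key]:
            problems.append(f"empty value: {key}")
    return problems


# Deliberately incomplete candidate to show both kinds of problem.
candidate_summary = {
    "Patient_Name": "Janet Jones",
    "Symptoms": ["Neck pain", "Back pain"],
    "Diagnosis": "Whiplash injury",
    "Treatment": [],
    "Current_Status": "Occasional backache",
}
summary_problems = check_summary(candidate_summary)
print(summary_problems)
```

An empty list means every expected field is present and non-empty; it says nothing about whether the values are clinically correct.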

---
## 3. Keyword Extraction

Identify important medical phrases and keywords.

In [None]:
# Initialize keyword extractor
keyword_extractor = MedicalKeywordExtractor()

# Extract keywords
keywords = keyword_extractor.extract(SAMPLE_CONVERSATION)

print("🔑 Extracted Keywords:\n")
print("Top Keywords (by TF-IDF score):")
for term, score in keywords.keywords[:10]:
    print(f"  {term:<25} Score: {score:.3f}")

print("\nMedical Terms Found:")
print(f"  {', '.join(keywords.medical_terms[:10])}")

print("\nKey Phrases:")
for phrase in keywords.phrases[:5]:
    print(f"  - {phrase}")
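
The cell above labels its scores as TF-IDF. How `MedicalKeywordExtractor` computes them is not shown here, so the following is only a sketch of the general idea using the standard library. It assumes each speaker turn is treated as one document, uses a common smoothed IDF, `ln((1 + N) / (1 + df)) + 1`, and does no stopword removal.

```python
import math
import re
from collections import Counter


def tfidf_keywords(documents, top_n=5):
    """Score each term by its highest TF-IDF value across the documents."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            scores[term] = max(scores[term], tf * idf)
    return scores.most_common(top_n)


demo_documents = [
    "my neck hurt",
    "my back hurt",
    "physiotherapy physiotherapy helped my back",
]
tfidf_results = tfidf_keywords(demo_documents, top_n=3)
for term, score in tfidf_results:
    print(f"  {term:<25} Score: {score:.3f}")
```

Terms that are frequent in one turn but rare across turns rank highest, which is why `physiotherapy` outranks `my` in this toy input.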

---
## 4. Sentiment & Intent Analysis

Classify patient sentiment and detect intent.

In [None]:
# Initialize sentiment analyzer
sentiment_analyzer = MedicalSentimentAnalyzer()

# Analyze full conversation
sentiment_result = sentiment_analyzer.analyze(SAMPLE_CONVERSATION)

print("🎭 Sentiment Analysis Results:\n")
print(json.dumps(sentiment_result.to_dict(), indent=2))

In [None]:
# Test with specific patient statements
test_statements = [
    "I'm a bit worried about my back pain, but I hope it gets better soon.",
    "That's a relief! I'm so glad to hear that.",
    "My neck and back were really hurting for weeks."
]

print("🔬 Testing Individual Statements:\n")
for statement in test_statements:
    result = sentiment_analyzer.analyze(statement)
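    # Only append an ellipsis when the statement is actually truncated
    preview = statement if len(statement) <= 50 else statement[:50] + "..."
    print(f"Statement: \"{preview}\"")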
    print(f"  Sentiment: {result.sentiment.sentiment}")
    print(f"  Intent: {result.intent.primary_intent}")
    print()

### Expected Output Format (from Assignment)

```json
{
  "Sentiment": "Anxious",
  "Intent": "Seeking reassurance"
}
```
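
To make the mapping from a statement to this output shape concrete, here is a small keyword-based sketch. It is not how `MedicalSentimentAnalyzer` works; the cue lists, the `Reassured` and `Neutral` labels, and the intents other than `Seeking reassurance` are assumptions made for illustration.

```python
# Hypothetical cue phrases; a real system would use a trained classifier.
ANXIOUS_CUES = ("worried", "nervous", "afraid", "concerned", "anxious")
REASSURED_CUES = ("relief", "glad", "great to hear", "thank you", "appreciate")


def classify_statement(statement):
    """Return a dict with 'Sentiment' and 'Intent' based on simple cue counts."""
    text = statement.lower()
    anxious = sum(cue in text for cue in ANXIOUS_CUES)
    reassured = sum(cue in text for cue in REASSURED_CUES)

    if anxious > reassured:
        sentiment = "Anxious"
    elif reassured > anxious:
        sentiment = "Reassured"
    else:
        sentiment = "Neutral"

    # Worry combined with hope or a question is read as a request for reassurance.
    if anxious and ("hope" in text or "?" in text):
        intent = "Seeking reassurance"
    elif reassured:
        intent = "Expressing relief"
    else:
        intent = "Reporting symptoms"
    return {"Sentiment": sentiment, "Intent": intent}


demo_statements = [
    "I'm a bit worried about my back pain, but I hope it gets better soon.",
    "That's a relief! I'm so glad to hear that.",
    "My neck and back were really hurting for weeks.",
]
demo_labels = [classify_statement(s) for s in demo_statements]
for statement, labels in zip(demo_statements, demo_labels):
    print(labels, "<-", statement)
```

Cue matching like this misses negation ("I'm not worried"), which is one reason a model-based analyzer is preferable outside a demo.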

---
## 5. SOAP Note Generation (Bonus)

Generate a structured SOAP note from the transcript.

In [None]:
# Initialize SOAP generator
soap_generator = SOAPNoteGenerator()

# Generate SOAP note
soap_note = soap_generator.generate(SAMPLE_CONVERSATION)

print("📋 Generated SOAP Note:\n")
print(json.dumps(soap_note.to_dict(), indent=2))

In [None]:
# Display in clinical format
print("\n" + soap_note.to_clinical_format())

### Expected Output Format (from Assignment)

```json
{
  "Subjective": {
    "Chief_Complaint": "Neck and back pain",
    "History_of_Present_Illness": "Patient had a car accident, experienced pain for four weeks, now occasional back pain."
  },
  "Objective": {
    "Physical_Exam": "Full range of motion in cervical and lumbar spine, no tenderness.",
    "Observations": "Patient appears in normal health, normal gait."
  },
  "Assessment": {
    "Diagnosis": "Whiplash injury and lower back strain",
    "Severity": "Mild, improving"
  },
  "Plan": {
    "Treatment": "Continue physiotherapy as needed, use analgesics for pain relief.",
    "Follow_Up": "Patient to return if pain worsens or persists beyond six months."
  }
}
```
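
A nested dict in this shape can be turned into a plain-text note with a few lines of code. The sketch below is one possible rendering; `render_soap` is a hypothetical helper and may differ from what `to_clinical_format()` produces.

```python
def render_soap(note):
    """Render a {section: {field: value}} dict as indented plain text."""
    lines = []
    for section, fields in note.items():
        lines.append(f"{section.upper()}:")
        for name, value in fields.items():
            label = name.replace("_", " ")
            lines.append(f"  {label}: {value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip()


soap_demo = {
    "Subjective": {"Chief_Complaint": "Neck and back pain"},
    "Plan": {"Follow_Up": "Patient to return if pain worsens."},
}
rendered_note = render_soap(soap_demo)
print(rendered_note)
```

Because Python dicts preserve insertion order, the sections appear in the order they were added, so building the dict as Subjective, Objective, Assessment, Plan keeps the conventional SOAP sequence.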

---
## 6. Full Pipeline Demo

Run the complete analysis pipeline.

In [None]:
# Initialize the complete pipeline
pipeline = PhysicianNotetaker()

# Process the transcript
full_result = pipeline.process(SAMPLE_CONVERSATION)

print("🚀 Complete Pipeline Results:\n")
print(full_result.to_json())

---
## 7. Testing with Sample Input from Assignment

In [None]:
# Test with the sample input from the assignment
ASSIGNMENT_SAMPLE = """
Doctor: How are you feeling today?
Patient: I had a car accident. My neck and back hurt a lot for four weeks.
Doctor: Did you receive treatment?
Patient: Yes, I had ten physiotherapy sessions, and now I only have occasional back pain.
"""

print("📝 Testing with Assignment Sample Input\n")
print("=" * 50)

# Medical Summary
summary = summarize_transcript(ASSIGNMENT_SAMPLE)
print("\n📋 Medical Summary:")
print(json.dumps(summary, indent=2))

# Sentiment
sentiment = analyze_sentiment("I'm a bit worried about my back pain, but I hope it gets better soon.")
print("\n🎭 Sentiment Analysis:")
print(json.dumps({"Sentiment": sentiment["Sentiment"], "Intent": sentiment["Intent"]}, indent=2))

# SOAP Note
soap = generate_soap_note(ASSIGNMENT_SAMPLE)
print("\n📋 SOAP Note:")
print(json.dumps(soap, indent=2))

---
## Summary

This notebook demonstrated:

1. ✅ **Medical NER** - Extracted symptoms, treatments, diagnoses with confidence scores
2. ✅ **Text Summarization** - Generated structured medical summary in JSON
3. ✅ **Keyword Extraction** - Identified important medical phrases
4. ✅ **Sentiment Analysis** - Classified patient sentiment and intent
5. ✅ **SOAP Note Generation** - Created structured clinical notes

### Key Technical Features:
- Confidence scoring for extractions
- Fallback mechanisms (works without API keys)
- Modular, extensible architecture
- Clinical formatting for healthcare use
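
The fallback idea can be sketched as follows: try an API-backed summarizer only when a key is configured, and drop to a local extractive method if the key is absent or the call fails. The environment variable name `DEMO_LLM_API_KEY`, the `api_summarize` callable, and the very simple extractive rule are all assumptions for this illustration, not the project's actual configuration.

```python
import os


def extractive_summary(transcript, max_utterances=1):
    """Fallback: keep the first few patient utterances verbatim."""
    patient_lines = [
        line.split(":", 1)[1].strip()
        for line in transcript.splitlines()
        if line.startswith("Patient:")
    ]
    return " ".join(patient_lines[:max_utterances])


def summarize_with_fallback(transcript, api_summarize=None, env_var="DEMO_LLM_API_KEY"):
    """Return (summary, mode), where mode is 'api' or 'extractive'."""
    if api_summarize is not None and os.environ.get(env_var):
        try:
            return api_summarize(transcript), "api"
        except Exception:
            # Any API failure (network, quota, bad response) falls through to the local path.
            pass
    return extractive_summary(transcript), "extractive"


fallback_demo_text = (
    "Doctor: How are you feeling today?\n"
    "Patient: I had a car accident. My neck and back hurt a lot for four weeks.\n"
    "Doctor: Did you receive treatment?\n"
    "Patient: Yes, I had ten physiotherapy sessions, and now I only have occasional back pain."
)
fallback_summary, fallback_mode = summarize_with_fallback(fallback_demo_text)
print(fallback_mode, "->", fallback_summary)
```

Returning the mode alongside the summary makes it easy to log which path produced a given result.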

### For Production Use:
- Add HIPAA compliance measures
- Fine-tune models on domain-specific data
- Implement human-in-the-loop verification
- Add audit logging and data encryption
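
As one illustration of human-in-the-loop verification, extractions below a confidence threshold could be routed to a reviewer instead of being written straight into the record. The sketch assumes entities are plain dicts with a `confidence` field; the 0.80 threshold and the `split_for_review` helper are hypothetical.

```python
def split_for_review(entities, threshold=0.80):
    """Partition entities into auto-accepted and needs-review lists by confidence."""
    accepted, needs_review = [], []
    for entity in entities:
        if entity["confidence"] >= threshold:
            accepted.append(entity)
        else:
            needs_review.append(entity)
    return accepted, needs_review


review_demo = [
    {"text": "whiplash injury", "label": "DIAGNOSIS", "confidence": 0.95},
    {"text": "lower back strain", "label": "DIAGNOSIS", "confidence": 0.55},
    {"text": "physiotherapy", "label": "TREATMENT", "confidence": 0.80},
]
auto_accepted, flagged = split_for_review(review_demo)
print("Accepted:", [e["text"] for e in auto_accepted])
print("Needs review:", [e["text"] for e in flagged])
```

The right threshold depends on how well the confidence scores are calibrated, so it would need to be tuned against reviewed examples.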