<a href="https://colab.research.google.com/github/Shashwat2303/PhysicianNotetaker/blob/main/PhysicianNotetaker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# 🩺 Physician Notetaker: Technical Assessment, SHASHWAT PANDEY

## Overview
This notebook implements an end-to-end NLP pipeline for converting physician–patient conversations into structured clinical documentation.  
The system focuses on **information extraction, intent and sentiment analysis, summarization, and SOAP note generation** using modern transformer-based models.

The goal is to demonstrate a practical, working approach rather than a conceptual mock-up.

---



## 1. Pipeline Design

**Stages**
1. Text preprocessing  
2. Named Entity Recognition (NER)  
3. Sentiment analysis  
4. Intent classification  
5. Structured medical summarization  
6. SOAP note generation  
7. Validation and uncertainty checks  

The pipeline is designed so that each step can be improved or replaced independently.
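
One way to make each step independently replaceable is to treat every stage as a function over a shared state dictionary. The sketch below is illustrative only: `run_pipeline`, `preprocess`, and `count_sentences` are hypothetical names that do not appear elsewhere in this notebook, and the placeholder stages stand in for the model calls.

```python
from typing import Any, Callable, Dict, List, Tuple

# A stage takes the shared state dict and returns an updated dict.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(text: str, stages: List[Tuple[str, Stage]]) -> Dict[str, Any]:
    """Run named stages in order, threading one state dict through them."""
    state: Dict[str, Any] = {"text": text, "completed": []}
    for name, stage in stages:
        # Pass a shallow copy so a stage does not rebind keys on the caller's dict.
        state = stage(dict(state))
        state["completed"] = state["completed"] + [name]
    return state


# Hypothetical placeholder stages; real ones would wrap the models.
def preprocess(state: Dict[str, Any]) -> Dict[str, Any]:
    state["text"] = " ".join(state["text"].split())  # collapse whitespace
    return state


def count_sentences(state: Dict[str, Any]) -> Dict[str, Any]:
    state["n_sentences"] = state["text"].count(".")  # crude period count
    return state


result = run_pipeline(
    "  Neck pain.\nHeadaches.  ",
    [("preprocess", preprocess), ("count_sentences", count_sentences)],
)
print(result["completed"])
```

Swapping a stage then means changing one entry in the list, with no edits to the stages around it.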


In [None]:

import json
import numpy as np
from transformers import pipeline
import spacy





## 2. Model Initialization


In [None]:

nlp = spacy.load("en_core_web_sm")

ner_model = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",
    aggregation_strategy="simple"
)

sentiment_model = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english"
)

intent_model = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli"
)

summarizer = pipeline(
    "summarization",
    model="facebook/bart-large-cnn"
)





## 3. Input Conversation


In [None]:

conversation_text = """
Patient was involved in a car accident on September 1st at 12:30 PM.
She reports neck pain, headaches, and anxiety since the incident.
She is seeking reassurance about potential long-term effects.
"""



## 4. Named Entity Recognition


In [None]:

entities = ner_model(conversation_text)

entities_output = [
    {
        "entity": e["entity_group"],
        "text": e["word"],
        "confidence": round(e["score"], 2)
    }
    for e in entities
]

entities_output


[]
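
The empty list is expected rather than a bug in the cell: `dslim/bert-base-NER` is trained on CoNLL-2003 and tags only persons, organizations, locations, and miscellaneous names, so symptoms fall outside its label set. As a stopgap until a clinical NER model is swapped in, a keyword matcher can recover the symptoms. This is a minimal sketch; `SYMPTOM_TERMS` and `find_symptoms` are hypothetical, and the vocabulary is deliberately tiny.

```python
import re
from typing import Dict, List

# Hypothetical vocabulary for illustration; a real system would use a clinical lexicon.
SYMPTOM_TERMS = ["neck pain", "back pain", "headache", "anxiety", "dizziness"]


def find_symptoms(text: str, terms: List[str] = SYMPTOM_TERMS) -> List[Dict[str, object]]:
    """Return case-insensitive matches of known symptom terms with character offsets."""
    found = []
    for term in terms:
        # Allow an optional plural "s" so "headaches" matches "headache".
        pattern = re.compile(r"\b" + re.escape(term) + r"s?\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            found.append({"entity": "SYMPTOM", "text": match.group(0), "start": match.start()})
    # Order matches by their position in the text.
    return sorted(found, key=lambda e: e["start"])


sample = "She reports neck pain, headaches, and anxiety since the incident."
symptoms = find_symptoms(sample)
print([s["text"] for s in symptoms])  # ['neck pain', 'headaches', 'anxiety']
```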


## 5. Sentiment and Intent Analysis


In [None]:

sentiment = sentiment_model(conversation_text)[0]

intent_labels = [
    "seeking reassurance",
    "reporting pain",
    "expressing anxiety",
    "requesting treatment",
    "sharing accident details"
]

intent_result = intent_model(conversation_text, intent_labels)

intent_output = {
    "primary_intent": intent_result["labels"][0],
    "confidence": round(intent_result["scores"][0], 2)
}

sentiment, intent_output


({'label': 'NEGATIVE', 'score': 0.9887658953666687},
 {'primary_intent': 'reporting pain', 'confidence': 0.4})
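
A top score of 0.4 suggests that several intents apply to the same text, which is plausible here since the patient reports pain, expresses anxiety, and seeks reassurance at once. Rather than keeping only the first label, the full ranking can be inspected. The sketch below assumes the result dictionary has parallel `labels` and `scores` lists, as the zero-shot pipeline returns; `rank_intents` is a hypothetical helper, and the scores in `example_result` are made up for illustration and are not output from this notebook.

```python
from typing import Dict, List


def rank_intents(result: Dict[str, list], min_score: float = 0.2) -> List[Dict[str, object]]:
    """Keep every label whose score reaches min_score, highest first."""
    pairs = sorted(zip(result["labels"], result["scores"]), key=lambda p: p[1], reverse=True)
    return [
        {"intent": label, "confidence": round(score, 2)}
        for label, score in pairs
        if score >= min_score
    ]


# Illustrative values only, shaped like a zero-shot result; not real model output.
example_result = {
    "labels": ["reporting pain", "seeking reassurance", "expressing anxiety", "requesting treatment"],
    "scores": [0.40, 0.30, 0.25, 0.05],
}
ranked = rank_intents(example_result)
print(ranked)
```

Passing `multi_label=True` to `intent_model` is another option worth trying, since it scores each label independently instead of forcing the scores to sum to one.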


## 6. Structured Medical Summary


In [None]:

summary_prompt = f"""
Extract a structured medical summary from the text below.
Return JSON-like content with:
- Chief Complaint
- Accident Details
- Symptoms
- Patient Concerns

Text:
{conversation_text}
"""

summary_text = summarizer(
    summary_prompt,
    max_length=180,
    min_length=80,
    do_sample=False
)[0]["summary_text"]

summary_text


Your max_length is set to 180, but your input_length is only 93. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=46)


'Extract a structured medical summary from the text below. Return JSON-like content with: Chief Complaint, Accident Details, Symptoms, and Patient Concerns. Text: "Patient was involved in a car accident on September 1st at 12:30 PM. She reports neck pain, headaches, and anxiety since the incident.  \xa0 \xa0 \xa0 \xa0 \xa0 \xa0 \xa0 \xa0\xa0She is seeking reassurance about potential long-term effects.'
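
The output above repeats the instructions because `facebook/bart-large-cnn` is a summarization model, not an instruction-following one: it treats the whole prompt as text to condense. The length warning arises because `max_length` exceeds the input length. A possible mitigation, sketched below under the assumption that only `conversation_text` is passed to the summarizer, is to derive the length bounds from the input token count. `summary_length_bounds` is a hypothetical helper; the 0.5 ratio mirrors the suggestion in the warning message.

```python
def summary_length_bounds(input_tokens: int, ratio: float = 0.5, floor: int = 8) -> dict:
    """Derive max_length and min_length from the input length instead of fixed values."""
    # Cap the summary at a fraction of the input, but never below the floor.
    max_length = max(floor, int(input_tokens * ratio))
    # Keep min_length strictly below max_length and at least 1.
    min_length = max(1, min(floor, max_length - 1))
    return {"max_length": max_length, "min_length": min_length}


bounds = summary_length_bounds(93)
print(bounds)  # {'max_length': 46, 'min_length': 8}
```

The token count could come from `len(summarizer.tokenizer(conversation_text)["input_ids"])`, and the result could then be passed as `summarizer(conversation_text, **bounds, do_sample=False)`.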


## 7. SOAP Note Generation


In [None]:

soap_prompt = f"""
Generate a clinical SOAP note from the conversation below.

Conversation:
{conversation_text}
"""

soap_note = summarizer(
    soap_prompt,
    max_length=220,
    min_length=100,
    do_sample=False
)[0]["summary_text"]

soap_note


Your max_length is set to 220, but your input_length is only 70. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=35)


'Patient was involved in a car accident on September 1st at 12:30 PM. She reports neck pain, headaches, and anxiety since the incident. She is seeking reassurance about potential long-term effects.Generate a clinical SOAP note from the conversation below. For confidential support call the Samaritans on 08457 90 90 90 or visit a local Samaritans branch, see www.samaritans.org for details. In the U.S. call the National Suicide Prevention Line on 1-800-273-8255.'
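
The generated text copies the input, echoes the instruction, and then appends helpline boilerplate that appears nowhere in the conversation, most likely an artifact of the news articles the summarizer was trained on. Because the model cannot be steered into the SOAP layout by a prompt, one alternative is to fill a fixed template from already-extracted fields and mark the sections the conversation does not cover. The following is a minimal sketch; `build_soap_note` and its parameters are hypothetical, and the field values are taken by hand from the sample conversation.

```python
from typing import Dict, List

PENDING = "Not documented"  # marker for sections the conversation does not cover


def build_soap_note(history: str, symptoms: List[str], concerns: str) -> Dict[str, str]:
    """Fill a SOAP skeleton from extracted fields; missing sections are flagged, not invented."""
    subjective_parts = []
    if history:
        subjective_parts.append(history)
    if symptoms:
        subjective_parts.append("Reports " + ", ".join(symptoms) + ".")
    if concerns:
        subjective_parts.append(concerns)
    return {
        "Subjective": " ".join(subjective_parts) or PENDING,
        "Objective": PENDING,   # no exam findings in the sample conversation
        "Assessment": PENDING,  # left for the clinician
        "Plan": PENDING,        # left for the clinician
    }


note = build_soap_note(
    history="Car accident on September 1st at 12:30 PM.",
    symptoms=["neck pain", "headaches", "anxiety"],
    concerns="Seeking reassurance about potential long-term effects.",
)
for section, content in note.items():
    print(f"{section}: {content}")
```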


## 8. Validation and Uncertainty Checks


In [None]:

warnings = []

if intent_output["confidence"] < 0.6:
    warnings.append("Low confidence in detected intent")

if sentiment["score"] < 0.6:
    warnings.append("Low confidence in sentiment classification")

warnings


['Low confidence in detected intent']
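
The two confidence thresholds do not catch the other problems visible in this run: the empty entity list, the instruction text echoed in the generated output, and sentences that never occurred in the conversation. A sketch of additional checks follows. `extra_checks` is a hypothetical helper, and its verbatim grounding test is a naive assumption that only suits a summarizer that mostly copies its input.

```python
from typing import List


def extra_checks(entities: list, generated: str, instruction: str, source: str) -> List[str]:
    """Flag empty extraction, prompt leakage, and sentences absent from the source."""
    issues = []
    if not entities:
        issues.append("No entities extracted")
    if instruction.strip() and instruction.strip() in generated:
        issues.append("Prompt text leaked into generated output")
    # Naive grounding check: each generated sentence should appear verbatim in the source.
    normalized_source = " ".join(source.split())
    for sentence in generated.split(". "):
        sentence = sentence.strip().rstrip(".")
        if sentence and sentence not in normalized_source and sentence not in instruction:
            issues.append(f"Ungrounded sentence: {sentence[:40]}")
    return issues


# Toy inputs for illustration; not taken from the notebook outputs.
issues = extra_checks(
    entities=[],
    generated="She reports neck pain. Generate a clinical SOAP note. Call the helpline today.",
    instruction="Generate a clinical SOAP note.",
    source="She reports neck pain. She is anxious.",
)
print(issues)
```

The returned messages could be appended to `warnings` alongside the confidence checks.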


## 9. Basic Evaluation Metrics


In [None]:

evaluation_metrics = {
    "entities_detected": len(entities_output),
    "intent_confidence": intent_output["confidence"],
    "sentiment_confidence": round(sentiment["score"], 2),
    "warnings_count": len(warnings)
}

evaluation_metrics


{'entities_detected': 0,
 'intent_confidence': 0.4,
 'sentiment_confidence': 0.99,
 'warnings_count': 1}


## 10. Conclusion

This notebook demonstrates a complete NLP workflow for transforming unstructured physician–patient conversations into structured clinical documentation. By combining entity extraction, sentiment and intent analysis, and transformer-based summarization, the system is designed to produce clinically useful outputs, such as structured summaries and SOAP notes, directly from raw text.

The modular design allows individual components to be refined or replaced, making the pipeline adaptable to real-world clinical data and evolving requirements. With further domain-specific fine-tuning and validation, this approach can support clinicians by reducing documentation overhead while preserving accuracy and interpretability.
