<a href="https://colab.research.google.com/github/ShlokKaushik23/Medical-NLP-Web-App/blob/main/%F0%9F%A9%BA_Medical_NLP_Web_App.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install vaderSentiment

Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl.metadata (572 bytes)
Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2


In [None]:
import json
import spacy
from transformers import pipeline
from keybert import KeyBERT
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer



# Load pre-trained NLP model
nlp = spacy.load("en_core_web_sm")

# Load Summarization & Sentiment Models
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
kw_model = KeyBERT()
sentiment_analyzer = SentimentIntensityAnalyzer()

# Sample conversation transcript
transcript = """
Patient: I was in a car accident last September. I had neck and back pain, especially in the first four weeks. It was really bad, and I had trouble sleeping.

Doctor: Did you receive any medical treatment?

Patient: Yes, I went to Moss Bank Accident and Emergency. They diagnosed me with whiplash but didn’t do any X-rays. They gave me some advice and sent me home.

Doctor: How did your condition progress after that?

Patient: I had to take painkillers regularly at first. Later, I went through ten physiotherapy sessions, which helped with the stiffness and discomfort.

Doctor: Are you still experiencing pain now?

Patient: It’s not constant anymore, just occasional backaches, but nothing severe.

Doctor: Have you had any emotional or mental effects, like anxiety while driving?

Patient: No, I don’t feel nervous driving, and I haven’t had any emotional issues from the accident.

Doctor: Has this affected your daily life or work?

Patient: I took a week off work, but after that, I returned to my usual routine. It hasn’t really stopped me from doing anything.

Doctor: That’s encouraging. Let’s do a physical examination.


"""

### **Step 1: Extract Named Entities (NER)**
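def extract_medical_entities(text):
    # NOTE: en_core_web_sm only emits general-purpose labels (PERSON, ORG, DATE, ...),
    # so the medical labels checked below never match and every list stays empty.
    # Swap in a medical NER model for this function to return results.
    doc = nlp(text)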
    symptoms = []
    treatments = []
    diagnosis = []

    for ent in doc.ents:
        if ent.label_ in ["SYMPTOM", "DISEASE"]:
            symptoms.append(ent.text)
        elif ent.label_ in ["TREATMENT", "MEDICATION"]:
            treatments.append(ent.text)
        elif ent.label_ in ["DIAGNOSIS"]:
            diagnosis.append(ent.text)

    return {
        "Symptoms": list(set(symptoms)),
        "Diagnosis": list(set(diagnosis)),
        "Treatment": list(set(treatments)),
    }

### **Step 2: Summarization**
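def summarize_text(text):
    # Scale max_length to the input (word count as a rough proxy for tokens)
    # and keep min_length below it so short inputs still get valid bounds
    max_len = max(30, min(150, len(text.split()) // 2))
    min_len = min(20, max_len - 1)
    summary = summarizer(text, max_length=max_len, min_length=min_len, do_sample=False)
    return summary[0]['summary_text']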

### **Step 3: Keyword Extraction**
def extract_keywords(text):
    keywords = kw_model.extract_keywords(text, keyphrase_ngram_range=(1, 2), stop_words="english", top_n=5)
    return [kw[0] for kw in keywords]

### **Step 4: Sentiment Analysis**
def analyze_sentiment(text):
    sentiment_score = sentiment_analyzer.polarity_scores(text)["compound"]
    if sentiment_score > 0.2:
        return "Reassured"
    elif sentiment_score < -0.2:
        return "Anxious"
    else:
        return "Neutral"

### **Step 5: Intent Detection (Rule-Based)**
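def detect_intent(text):
    text = text.lower()  # match keywords regardless of capitalization
    # Plain substring checks; "worried" is listed because "worry" does not cover it
    if any(k in text for k in ("worry", "worried", "concerned", "anxious")):
        return "Seeking reassurance"
    elif "pain" in text or "symptom" in text:
        return "Reporting symptoms"
    return "General conversation"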

### **Step 6: SOAP Note Generation**
def generate_soap_note(text):
    summary = summarize_text(text)
    entities = extract_medical_entities(text)

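    # NOTE: the Objective, Severity, and Follow-Up values and the fallback
    # diagnosis/treatment strings are hard-coded for the sample transcript;
    # they are placeholders, not values derived from the input text.
    soap_note = {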
        "Subjective": {
            "Chief_Complaint": ", ".join(entities["Symptoms"]),
            "History_of_Present_Illness": summary
        },
        "Objective": {
            "Physical_Exam": "Normal range of motion, no tenderness.",
            "Observations": "Patient appears in normal health."
        },
        "Assessment": {
            "Diagnosis": ", ".join(entities["Diagnosis"]) if entities["Diagnosis"] else "Whiplash injury",
            "Severity": "Mild, improving"
        },
        "Plan": {
            "Treatment": ", ".join(entities["Treatment"]) if entities["Treatment"] else "Continue physiotherapy, painkillers as needed.",
            "Follow-Up": "Return if symptoms worsen."
        }
    }

    return soap_note

### **Run the Pipeline**
entities = extract_medical_entities(transcript)
summary = summarize_text(transcript)
keywords = extract_keywords(transcript)
sentiment = analyze_sentiment(transcript)
intent = detect_intent(transcript)
soap_note = generate_soap_note(transcript)

### **Print Output in JSON Format**
output = {
    "Entities": entities,
    "Summary": summary,
    "Keywords": keywords,
    "Sentiment": sentiment,
    "Intent": intent,
    "SOAP_Note": soap_note
}

print(json.dumps(output, indent=4))


Device set to use cpu


{
    "Entities": {
        "Symptoms": [],
        "Diagnosis": [],
        "Treatment": []
    },
    "Summary": "I was in a car accident last September. I had neck and back pain, especially in the first four weeks. It was really bad, and I had trouble sleeping. I went to Moss Bank Accident and Emergency. They diagnosed me with whiplash but didn't do any X-rays.",
    "Keywords": [
        "diagnosed whiplash",
        "neck pain",
        "experiencing pain",
        "pain patient",
        "stiffness discomfort"
    ],
    "Sentiment": "Anxious",
    "Intent": "Reporting symptoms",
    "SOAP_Note": {
        "Subjective": {
            "Chief_Complaint": "",
            "History_of_Present_Illness": "I was in a car accident last September. I had neck and back pain, especially in the first four weeks. It was really bad, and I had trouble sleeping. I went to Moss Bank Accident and Emergency. They diagnosed me with whiplash but didn't do any X-rays."
        },
        "Objective": {
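
The empty `Symptoms`, `Diagnosis`, and `Treatment` lists above follow from the model choice: `en_core_web_sm` only emits general-purpose labels such as `PERSON`, `ORG`, and `DATE`, so none of the medical labels checked in `extract_medical_entities` can match. As a minimal sketch, assuming a small hand-written lexicon (the terms and the `extract_with_lexicon` name are illustrative and not part of the original notebook), a plain regex matcher can fill the same dictionary shape until a medical NER model is available:

```python
import re

# Hypothetical lexicon for illustration only; a real system would rely on
# a medical NER model or a curated vocabulary.
MEDICAL_LEXICON = {
    "Symptoms": ["neck pain", "back pain", "backache", "stiffness", "trouble sleeping"],
    "Diagnosis": ["whiplash"],
    "Treatment": ["physiotherapy", "painkillers"],
}

def extract_with_lexicon(text, lexicon=MEDICAL_LEXICON):
    """Return the lexicon terms found in text, grouped by category, in lexicon order."""
    lowered = text.lower()
    found = {}
    for category, terms in lexicon.items():
        hits = []
        for term in terms:
            # Word boundaries avoid partial-word matches; "s?" allows simple plurals
            if re.search(r"\b" + re.escape(term) + r"s?\b", lowered):
                hits.append(term)
        found[category] = hits
    return found

sample = ("They diagnosed me with whiplash. I took painkillers and had ten "
          "physiotherapy sessions for the stiffness and occasional backaches.")
print(extract_with_lexicon(sample))
```

A lexicon like this misses paraphrases such as "neck and back pain", so it is a stopgap rather than a replacement for a trained model.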


In [None]:
import torch
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

def load_model(model_name):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    return tokenizer, model

def classify_sentiment(text, tokenizer, model):
    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    result = classifier(text)[0]
    label = result['label']

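    # NOTE: this mapping assumes a three-class model that emits LABEL_0/1/2.
    # The SST-2 checkpoint loaded below emits "NEGATIVE"/"POSITIVE" instead,
    # so every prediction falls through to 'Unknown' until the model is replaced.
    sentiment_mapping = {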
        'LABEL_0': 'Anxious',
        'LABEL_1': 'Neutral',
        'LABEL_2': 'Reassured'
    }

    return sentiment_mapping.get(label, 'Unknown')

def detect_intent(text, tokenizer, model):
    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    result = classifier(text)[0]
    label = result['label']

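    # NOTE: bert-base-uncased has no trained classification head (see the
    # "newly initialized" warning in the output), so the predicted label is
    # arbitrary until a fine-tuned intent model is loaded.
    intent_mapping = {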
        'LABEL_0': 'Seeking reassurance',
        'LABEL_1': 'Reporting symptoms',
        'LABEL_2': 'Expressing concern'
    }

    return intent_mapping.get(label, 'Unknown')

# Load sentiment and intent models
sentiment_model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # Replace with fine-tuned medical model
intent_model_name = "bert-base-uncased"  # Replace with fine-tuned intent model

sentiment_tokenizer, sentiment_model = load_model(sentiment_model_name)
intent_tokenizer, intent_model = load_model(intent_model_name)

# Example input
patient_text = "I'm a bit worried about my back pain, but I hope it gets better soon."

# Run Sentiment and Intent Analysis
sentiment = classify_sentiment(patient_text, sentiment_tokenizer, sentiment_model)
intent = detect_intent(patient_text, intent_tokenizer, intent_model)

# Output result
output = {
    "Sentiment": sentiment,
    "Intent": intent
}

print(output)


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu
Device set to use cpu


{'Sentiment': 'Unknown', 'Intent': 'Reporting symptoms'}
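
The `'Unknown'` sentiment above is a label mismatch rather than a model failure: the SST-2 checkpoint reports `NEGATIVE` or `POSITIVE`, while `sentiment_mapping` expects `LABEL_0` to `LABEL_2`. A minimal sketch of a mapping that works on the `label`/`score` dictionary the pipeline returns, assuming a confidence cutoff of 0.75 below which the result is treated as `Neutral` (the cutoff and the `map_sentiment` name are assumptions for illustration, not tuned values):

```python
def map_sentiment(result, neutral_below=0.75):
    """Map a binary classifier result such as {'label': 'NEGATIVE', 'score': 0.98}
    to the notebook's three categories."""
    label = result["label"].upper()
    score = result["score"]
    if score < neutral_below:
        return "Neutral"      # low-confidence predictions count as neutral
    if label == "NEGATIVE":
        return "Anxious"
    if label == "POSITIVE":
        return "Reassured"
    return "Unknown"          # label set differs from what this sketch assumes

print(map_sentiment({"label": "NEGATIVE", "score": 0.98}))  # Anxious
print(map_sentiment({"label": "POSITIVE", "score": 0.60}))  # Neutral
```

Equating negative polarity with anxiety is itself a simplification; a model fine-tuned on patient language would be the more reliable fix.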


In [None]:
import spacy
from transformers import pipeline
import json

# Load medical NER model (use 'en_core_web_sm' for general NLP, or a medical-specific model)
nlp = spacy.load("en_core_web_sm")  # Replace with a medical NER model if available

# Load text summarization model
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def extract_medical_info(text):
    doc = nlp(text)
    symptoms, diagnosis, treatment = [], [], []

    for ent in doc.ents:
        if ent.label_ in ["SYMPTOM", "DISEASE", "CONDITION"]:
            symptoms.append(ent.text)
        elif ent.label_ in ["DIAGNOSIS", "MEDICAL_CONDITION"]:
            diagnosis.append(ent.text)
        elif ent.label_ in ["TREATMENT", "PROCEDURE", "MEDICATION"]:
            treatment.append(ent.text)

    return {
        "Symptoms": list(set(symptoms)),
        "Diagnosis": list(set(diagnosis)),
        "Treatment": list(set(treatment))
    }

def generate_soap_note(conversation):
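    # Summarize the conversation. NOTE: these fixed bounds suit long transcripts;
    # on short inputs min_length=50 can push the model to pad the summary with
    # sentences that are not in the conversation, so scale both bounds to the input.
    summary = summarizer(conversation, max_length=200, min_length=50, do_sample=False)[0]['summary_text']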

    # Extract key medical details
    medical_info = extract_medical_info(conversation)

    soap_note = {
        "Subjective": {
            "Chief_Complaint": ", ".join(medical_info["Symptoms"]),
            "History_of_Present_Illness": summary
        },
        "Objective": {
            "Physical_Exam": "Details of physical exam if available.",
            "Observations": "General health observations."
        },
        "Assessment": {
            "Diagnosis": ", ".join(medical_info["Diagnosis"]),
            "Severity": "Mild, improving"  # Placeholder, refine with context
        },
        "Plan": {
            "Treatment": ", ".join(medical_info["Treatment"]),
            "Follow-Up": "Patient to return if symptoms persist or worsen."
        }
    }

    return json.dumps(soap_note, indent=4)

# Example Input
conversation_text = """
Doctor: How are you feeling today?
Patient: I had a car accident. My neck and back hurt a lot for four weeks.
Doctor: Did you receive treatment?
Patient: Yes, I had ten physiotherapy sessions, and now I only have occasional back pain.
"""

# Generate SOAP Note
soap_output = generate_soap_note(conversation_text)
print(soap_output)


Device set to use cpu
Your max_length is set to 200, but your input_length is only 64. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=32)


{
    "Subjective": {
        "Chief_Complaint": "",
        "History_of_Present_Illness": "Patient: I had a car accident. My neck and back hurt a lot for four weeks. I had ten physiotherapy sessions, and now I only have occasional back pain. I'm very happy with the treatment I've received. I feel like I've come out the other side."
    },
    "Objective": {
        "Physical_Exam": "Details of physical exam if available.",
        "Observations": "General health observations."
    },
    "Assessment": {
        "Diagnosis": "",
        "Severity": "Mild, improving"
    },
    "Plan": {
        "Treatment": "",
        "Follow-Up": "Patient to return if symptoms persist or worsen."
    }
}
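
The `max_length` warning above appears because the fixed bounds (`max_length=200`, `min_length=50`) are larger than the 64-token input, and the summary ends with sentences the patient never said. One way to avoid both, sketched here with word count standing in for token count (an approximation; the `summary_length_bounds` helper and its default values are assumptions, not part of the original notebook), is to derive both bounds from the input size:

```python
def summary_length_bounds(text, ratio=0.5, floor=10, ceiling=150):
    """Return (min_length, max_length) scaled to the input size.

    Word count stands in for token count here, which is only an approximation.
    """
    n_words = len(text.split())
    # Clamp the target length between floor and ceiling
    max_len = max(floor, min(ceiling, int(n_words * ratio)))
    # Keep min_length strictly below max_length
    min_len = max(1, min(max_len - 1, max_len // 2))
    return min_len, max_len

short_text = "word " * 40
print(summary_length_bounds(short_text))  # (10, 20)
```

The returned pair can then be passed as `min_length` and `max_length` when calling `summarizer`.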


In [None]:
!pip install streamlit transformers spacy
!python -m spacy download en_core_web_sm
!curl -fsSL https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o cloudflared
!chmod +x cloudflared


Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [None]:
%%writefile app.py
import streamlit as st
import spacy
from transformers import pipeline
import json

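# Load NLP models once and reuse them across Streamlit reruns
@st.cache_resource
def load_models():
    nlp = spacy.load("en_core_web_sm")
    sentiment_pipeline = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")
    return nlp, sentiment_pipeline

nlp, sentiment_pipeline = load_models()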

# Function for Named Entity Recognition (NER)
# NOTE: needs a model that emits these medical labels; en_core_web_sm does not
def extract_medical_entities(text):
    doc = nlp(text)
    symptoms, treatments, diagnosis, prognosis = [], [], [], []

    for ent in doc.ents:
        if ent.label_ in ["SYMPTOM", "DISEASE"]:
            symptoms.append(ent.text)
        elif ent.label_ in ["TREATMENT", "PROCEDURE"]:
            treatments.append(ent.text)
        elif ent.label_ == "DIAGNOSIS":
            diagnosis.append(ent.text)
        elif ent.label_ == "PROGNOSIS":
            prognosis.append(ent.text)

    return {
        "Symptoms": symptoms,
        "Treatments": treatments,
        "Diagnosis": diagnosis,
        "Prognosis": prognosis
    }

# Function for Sentiment & Intent Analysis
def analyze_sentiment(text):
    # truncation=True keeps long transcripts within the model's input limit
    result = sentiment_pipeline(text, truncation=True)[0]
    label = result["label"].lower()
    sentiment = "Neutral"
    intent = "General statement"

    if "neg" in label:
        sentiment = "Anxious"
        intent = "Seeking reassurance"
    elif "pos" in label:
        sentiment = "Reassured"

    return {
        "Sentiment": sentiment,
        "Intent": intent
    }

# Function for SOAP Note Generation
def generate_soap_note(text):
    entities = extract_medical_entities(text)
    return {
        "Subjective": {
            "Chief_Complaint": entities["Symptoms"],
            "History_of_Present_Illness": text
        },
        "Objective": {
            "Physical_Exam": "Pending Examination",
            "Observations": "Patient appears in normal health"
        },
        "Assessment": {
            "Diagnosis": entities["Diagnosis"],
            "Severity": "Mild"
        },
        "Plan": {
            "Treatment": entities["Treatments"],
            "Follow-Up": "Return if symptoms worsen."
        }
    }

# Streamlit UI
st.title("🩺 Medical NLP Web App")

st.header("Enter Medical Transcript")
user_input = st.text_area("Paste the physician-patient conversation below:")

if st.button("Process Transcript"):
    if user_input.strip():
        # Extracting Medical Details
        entities = extract_medical_entities(user_input)
        st.subheader("📌 Extracted Medical Information")
        st.json(entities)

        # Sentiment & Intent Analysis
        sentiment_analysis = analyze_sentiment(user_input)
        st.subheader("📊 Sentiment & Intent Analysis")
        st.json(sentiment_analysis)

        # SOAP Note Generation
        soap_note = generate_soap_note(user_input)
        st.subheader("📝 Generated SOAP Note")
        st.json(soap_note)
    else:
        st.warning("Please enter a valid transcript.")


Overwriting app.py
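
`analyze_sentiment` in `app.py` scores the whole transcript, so the doctor's questions influence the result as much as the patient's answers. One option, sketched here under the assumption that each turn starts with a `Patient:` or `Doctor:` prefix as in the sample transcripts (the `split_turns` and `patient_text` helpers are hypothetical), is to separate the turns first and analyze only the patient's side:

```python
import re

# Matches a speaker prefix at the start of a line, e.g. "Patient: ..."
TURN_PATTERN = re.compile(r"^(Patient|Doctor):\s*(.*)$")

def split_turns(transcript):
    """Return a list of (speaker, utterance) pairs; lines without a prefix
    are appended to the previous turn."""
    turns = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TURN_PATTERN.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif turns:
            speaker, utterance = turns[-1]
            turns[-1] = (speaker, utterance + " " + line)
    return turns

def patient_text(transcript):
    """Join only the patient's utterances into one string."""
    return " ".join(u for s, u in split_turns(transcript) if s == "Patient")

demo = """
Doctor: Did you receive treatment?
Patient: Yes, I had ten physiotherapy sessions,
and now I only have occasional back pain.
"""
print(patient_text(demo))
```

The result of `patient_text(user_input)` could then be passed to `analyze_sentiment` in place of the raw transcript.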


In [None]:
!streamlit run app.py & ./cloudflared tunnel --url http://localhost:8501


2025-03-17T13:22:34Z INF Thank you for trying Cloudflare Tunnel. Doing so, without a Cloudflare account, is a quick way to experiment and try it out. However, be aware that these account-less Tunnels have no uptime guarantee, are subject to the Cloudflare Online Services Terms of Use (https://www.cloudflare.com/website-terms/), and Cloudflare reserves the right to investigate your use of Tunnels for violations of such terms. If you intend to use Tunnels in production you should use a pre-created named tunnel by following: https://developers.cloudflare.com/cloudflare-one/connections/connect-apps
2025-03-17T13:22:34Z INF Requesting new quick Tunnel on trycloudflare.com...

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.0.12:8501
  Ext