<a href="https://colab.research.google.com/github/DanishM10/HighRiskProject-SimplifiedSummarizationOfSyntheticClinicalNotes/blob/main/HighRiskProject.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import getpass
import os

import openai
import pandas as pd
from google.colab import files

# Upload conditions.csv, observations.csv, and medications.csv from the local machine
uploaded = files.upload()

# Load CSVs
conditions = pd.read_csv("conditions.csv")
observations = pd.read_csv("observations.csv")
medications = pd.read_csv("medications.csv")

# Set API Key
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

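# Keep one record per patient from each table.
# Note: groupby(...).first() takes the first non-null value in each column.
cond_sampled = conditions.groupby("PATIENT").first().reset_index()
obs_sampled = observations.groupby("PATIENT").first().reset_index()
meds_sampled = medications.groupby("PATIENT").first().reset_index()

# Inner-join all three on PATIENT; only patients present in every table remain.
merged = cond_sampled.merge(obs_sampled, on="PATIENT").merge(meds_sampled, on="PATIENT")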

Saving conditions.csv to conditions.csv
Saving medications.csv to medications.csv
Saving observations.csv to observations.csv
Enter your OpenAI API key: ··········
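
The column names used in the next cell (`DESCRIPTION_x`, `DESCRIPTION_y`, `DESCRIPTION`) follow from how pandas suffixes overlapping columns across the two merges. A minimal sketch on toy data, assuming each table has at least `PATIENT` and `DESCRIPTION` columns as in the cell above; the toy rows and the `first_per_patient` helper are illustrative and not part of the notebook:

```python
import pandas as pd

def first_per_patient(df: pd.DataFrame) -> pd.DataFrame:
    """Keep one row per patient (first non-null value in each column)."""
    return df.groupby("PATIENT").first().reset_index()

# Toy stand-ins for conditions.csv, observations.csv, medications.csv (made-up rows).
conditions = pd.DataFrame({
    "PATIENT": ["p1", "p1", "p2"],
    "DESCRIPTION": [
        "Chronic sinusitis (disorder)",
        "Seizure disorder (disorder)",
        "Housing unsatisfactory (finding)",
    ],
})
observations = pd.DataFrame({
    "PATIENT": ["p1", "p2"],
    "DESCRIPTION": ["Body Height", "Body Height"],
    "VALUE": [172.0, 165.0],
    "UNITS": ["cm", "cm"],
})
medications = pd.DataFrame({
    "PATIENT": ["p1"],
    "DESCRIPTION": ["Hydrochlorothiazide 25 MG Oral Tablet"],
})

merged = (
    first_per_patient(conditions)
    .merge(first_per_patient(observations), on="PATIENT")   # DESCRIPTION clashes -> _x / _y
    .merge(first_per_patient(medications), on="PATIENT")    # no clash left -> stays DESCRIPTION
)
print(list(merged.columns))
print(len(merged))
```

Because both merges are inner joins, a patient missing from any one table (here `p2`, who has no medication row) is dropped, so `merged` can be shorter than every input table.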


In [2]:
# Generate clinical notes with condition, observation, and medication
notes = []

for i in range(10):  # Use the first 10 merged patients
    condition = merged.loc[i, 'DESCRIPTION_x']    # from conditions.csv
    observation = merged.loc[i, 'DESCRIPTION_y']  # from observations.csv
    value = merged.loc[i, 'VALUE']
    unit = merged.loc[i, 'UNITS']
    medication = merged.loc[i, 'DESCRIPTION']     # from medications.csv

    note = (f"Patient diagnosed with {condition}. "
            f"Observation shows {observation} at {value} {unit}. "
            f"Started treatment with {medication}.")
    notes.append(note)

# Display notes
for idx, note in enumerate(notes):
    print(f"\nNote {idx+1}:\n{note}")


Note 1:
Patient diagnosed with Seizure disorder (disorder). Observation shows Glucose [Mass/volume] in Serum or Plasma at 92.8 mg/dL. Started treatment with Naproxen sodium 220 MG Oral Tablet.

Note 2:
Patient diagnosed with Housing unsatisfactory (finding). Observation shows Hemoglobin A1c/Hemoglobin.total in Blood at 6.0 %. Started treatment with Naproxen sodium 220 MG Oral Tablet.

Note 3:
Patient diagnosed with Chronic sinusitis (disorder). Observation shows Hemoglobin A1c/Hemoglobin.total in Blood at 6.4 %. Started treatment with Naproxen sodium 220 MG Oral Tablet.

Note 4:
Patient diagnosed with Received higher education (finding). Observation shows Hemoglobin A1c/Hemoglobin.total in Blood at 6.3 %. Started treatment with Vitamin B12 5 MG/ML Injectable Solution.

Note 5:
Patient diagnosed with Received higher education (finding). Observation shows Body Height at 172.0 cm. Started treatment with Hydrochlorothiazide 25 MG Oral Tablet.

Note 6:
Patient diagnosed with Risk activity 
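
Several generated notes read oddly, for example "Patient diagnosed with Received higher education (finding)", because the template treats every condition row as a diagnosis. One possible refinement is sketched below, under the assumption that condition descriptions end in a parenthesised tag such as `(disorder)` or `(finding)`, as they do in the output above. The `build_note` helper and its alternative wording are hypothetical, not the notebook's method:

```python
import math
import re

# Trailing parenthesised tag, e.g. "Chronic sinusitis (disorder)" -> "disorder".
_TAG_RE = re.compile(r"\(([^()]+)\)\s*$")

def build_note(condition, observation, value, unit, medication):
    """Assemble one synthetic note; the wording is an illustrative choice."""
    match = _TAG_RE.search(condition)
    tag = match.group(1) if match else None
    # Only call it a diagnosis when the tag says it is a disorder.
    lead = "Patient diagnosed with" if tag == "disorder" else "Patient record lists"
    # Units can be missing (NaN after pandas reads the CSV); omit them instead of printing "nan".
    unit_missing = unit is None or (isinstance(unit, float) and math.isnan(unit))
    reading = f"{value}" if unit_missing else f"{value} {unit}"
    return (f"{lead} {condition}. "
            f"Observation shows {observation} at {reading}. "
            f"Started treatment with {medication}.")

print(build_note("Received higher education (finding)", "Body Height", 172.0, "cm",
                 "Hydrochlorothiazide 25 MG Oral Tablet"))
```

For rows tagged `(disorder)` the output matches the original template exactly, so existing notes for true diagnoses are unchanged.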

In [3]:
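# GPT API call
def call_gpt(prompt, model="gpt-3.5-turbo"):
    """Send one user prompt to the chat completions endpoint and return the reply text."""
    # The key was stored in the environment by the first cell.
    client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a helpful medical assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.5  # non-zero temperature: outputs can vary between runs
    )
    return response.choices[0].message.content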

# Zero-shot
def zero_shot_prompt(note):
    return f"Summarize the following clinical note in simple, easy-to-understand language:\n\n{note}"

# Few-shot
def few_shot_prompt(note):
    return (
        "Simplify clinical notes for patients.\n"
        "Example:\n"
        "Note: 'Patient diagnosed with hypertension and prescribed ACE inhibitors.'\n"
        "Simplified: 'The patient has high blood pressure and was given medicine to control it.'\n\n"
        "Note: 'CT scan indicates pulmonary embolism.'\n"
        "Simplified: 'A scan found a blood clot in the lung.'\n\n"
        "Now simplify:\n"
        f"{note}"
    )

# Chain-of-Thought
def cot_prompt(note):
    return f"What does this clinical note mean? Let's think step by step.\n\n{note}"

# Tree-of-Thought
def tot_prompt(note):
    return (f"Given the following clinical note: {note}\n"
            "Please:\n"
            "1. List three possible interpretations of the note.\n"
            "2. For each interpretation, suggest a possible summary.\n"
            "3. Select the best summary based on patient understanding.")

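Each note is sent through `call_gpt` once per prompting strategy, so a transient API failure partway through can abort the whole run. A minimal retry sketch with exponential backoff follows. It is not part of the original notebook: `with_retries` and its parameters are hypothetical names, and it wraps any callable rather than assuming specific OpenAI exception types.

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Return a wrapper that retries fn on any exception, doubling the delay each time."""
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                if attempt == attempts - 1:
                    raise  # out of attempts: surface the last error
                sleep(base_delay * 2 ** attempt)  # waits 1x, 2x, 4x, ... base_delay
    return wrapper
```

In the notebook this could be applied as `call_gpt_safe = with_retries(call_gpt)`. Catching every exception is a simplification; narrowing it to rate-limit and connection errors would avoid retrying requests that can never succeed.
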

In [4]:
# Store results
summarization_results = []

for idx, note in enumerate(notes):
    print(f"\nProcessing Note {idx+1}")

    zs_summary = call_gpt(zero_shot_prompt(note))
    fs_summary = call_gpt(few_shot_prompt(note))
    cot_summary = call_gpt(cot_prompt(note))
    tot_summary = call_gpt(tot_prompt(note))

    summarization_results.append({
        "Note": note,
        "Zero-shot Summary": zs_summary,
        "Few-shot Summary": fs_summary,
        "Chain-of-Thought Summary": cot_summary,
        "Tree-of-Thought Summary": tot_summary
    })

# Save to CSV
results_df = pd.DataFrame(summarization_results)
results_df.to_csv("final_gpt_summarization_results.csv", index=False)

print("\nSummarization Completed and saved.")


Processing Note 1

Processing Note 2

Processing Note 3

Processing Note 4

Processing Note 5

Processing Note 6

Processing Note 7

Processing Note 8

Processing Note 9

Processing Note 10

Summarization Completed and saved.
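
The loop above repeats one line per prompting strategy. One way to make the set of strategies easier to extend is to drive it from a mapping of column name to prompt builder. This is a sketch rather than the notebook's code: `summarize_notes` is a hypothetical helper, and the model call is passed in as `llm` so the example runs without an API key.

```python
from typing import Callable, Dict, List

def summarize_notes(notes: List[str],
                    builders: Dict[str, Callable[[str], str]],
                    llm: Callable[[str], str]) -> List[dict]:
    """Run every prompt builder on every note and collect one row per note."""
    rows = []
    for note in notes:
        row = {"Note": note}
        for column, build_prompt in builders.items():
            row[column] = llm(build_prompt(note))
        rows.append(row)
    return rows

# Stand-in builders and a fake model (str.upper), for demonstration only.
demo_builders = {
    "Zero-shot Summary": lambda n: f"Summarize: {n}",
    "Chain-of-Thought Summary": lambda n: f"Step by step: {n}",
}
demo_rows = summarize_notes(["note A"], demo_builders, llm=str.upper)
print(demo_rows)
```

In the notebook, `builders` would map the four column names to `zero_shot_prompt`, `few_shot_prompt`, `cot_prompt`, and `tot_prompt`, with `llm=call_gpt`; the returned list can be passed to `pd.DataFrame` as before.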


In [4]:
from IPython.display import display
# Read the saved CSV
results_df = pd.read_csv("final_gpt_summarization_results.csv")

# Display the table
display(results_df)

Unnamed: 0,Note,Zero-shot Summary,Few-shot Summary,Chain-of-Thought Summary,Tree-of-Thought Summary
0,Patient diagnosed with Seizure disorder (disor...,The patient has a seizure disorder. Their bloo...,The patient has a seizure disorder. Blood suga...,Let's break down the clinical note step by ste...,1. Possible Interpretations:\n a. The patien...
1,Patient diagnosed with Housing unsatisfactory ...,The patient has been diagnosed with unsatisfac...,Simplified: 'Patient has unstable housing. Blo...,Let's break down the clinical note step by ste...,1. Possible interpretations:\n a. The patien...
2,Patient diagnosed with Chronic sinusitis (diso...,The patient has a condition called chronic sin...,The patient has a long-term sinus infection. B...,Let's break down the clinical note step by ste...,1. Possible interpretations:\n- The patient ha...
3,Patient diagnosed with Received higher educati...,The patient has a higher level of education. B...,The patient has completed higher education. Te...,Let's break down the clinical note step by ste...,1. Possible Interpretations:\n a. The patien...
4,Patient diagnosed with Received higher educati...,The patient has been diagnosed with high blood...,The patient has completed higher education. Th...,Let's break down the clinical note step by ste...,1. Possible interpretations:\n a. The patien...
5,Patient diagnosed with Risk activity involveme...,The patient was found to have a condition rela...,"Simplified: ""Patient engaged in risky activiti...",Let's break down the clinical note step by ste...,1. Possible interpretations of the note:\n a...
6,Patient diagnosed with Received higher educati...,The patient has been diagnosed with a conditio...,"Simplified: ""Patient completed higher educatio...",Let's break down the clinical note step by ste...,1. Possible interpretations of the note:\n a...
7,Patient diagnosed with Chronic sinusitis (diso...,The patient has chronic sinusitis. Some blood ...,The patient has chronic sinusitis. Blood tests...,Let's break down the clinical note step by ste...,1. Possible interpretations:\n- The patient ha...
8,Patient diagnosed with Received higher educati...,"The patient has high blood sugar levels, so th...",Simplified: Patient found to have completed hi...,Let's break down the clinical note step by ste...,1. Possible Interpretations:\n a. The patien...
9,Patient diagnosed with Medication review due (...,The patient needs to review their medications....,"Simplified: ""Patient needs medication review. ...",Let's break down the clinical note step by ste...,1. Possible Interpretations:\n a. The patien...
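
The rendered table truncates every cell, so the summaries cannot be compared side by side. A small sketch for printing one row in full is shown below, assuming the column names written by the summarization cell; `format_result` is a hypothetical helper and the demo row is made up.

```python
import pandas as pd

def format_result(row: pd.Series) -> str:
    """Render one results row as a labelled, untruncated block of text."""
    lines = [f"Note: {row['Note']}"]
    for column, text in row.items():
        if column != "Note":
            lines.append(f"\n[{column}]\n{text}")
    return "\n".join(lines)

# Made-up row standing in for one line of final_gpt_summarization_results.csv.
demo = pd.DataFrame([{"Note": "Example note.", "Zero-shot Summary": "Example summary."}])
print(format_result(demo.iloc[0]))
```

Calling `print(format_result(results_df.iloc[0]))` would show the first note with all of its summaries. Alternatively, `pd.set_option("display.max_colwidth", None)` stops pandas from shortening cell text in the displayed table.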
