# Brief Hospital Course Generation

This notebook implements step 2 of the Brief Hospital Course pipeline: it generates a brief hospital course by passing a set of service-level SOAP notes to a GPT-3.5 model.

In [3]:
import pandas as pd
import ast
import numpy as np

from tqdm import tqdm
# from tqdm.auto import tqdm  # for notebooks
tqdm.pandas()

import os
import openai

In [4]:
from dotenv import load_dotenv
load_dotenv()  # take environment variables from .env.

True
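
The Azure OpenAI configuration later in this notebook reads `OPENAI_API_BASE` and `OPENAI_API_KEY` from the environment, so it can help to confirm they were loaded before any API call is made. A minimal sketch; the helper name `missing_env_vars` is hypothetical and not part of `python-dotenv`:

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Variable names match the os.getenv calls used later in this notebook.
REQUIRED_VARS = ["OPENAI_API_BASE", "OPENAI_API_KEY"]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing `env` explicitly keeps the check easy to exercise without touching the real environment.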

## Read in GPT-Generated SOAP Notes

In [5]:
soap_notes = pd.read_csv("soap_note_sample.csv")

## Read in Radiology Reports

In [6]:
radiology = pd.read_csv("/gpfs/milgram/project/rtaylor/shared/DischargeMe/public/train/radiology.csv.gz")

## Read in Encounter-Level Structured Data

In [7]:
# discharge summaries
discharges = pd.read_csv("/gpfs/milgram/project/rtaylor/shared/DischargeMe/public/train/discharge.csv.gz")

# ed stays
edstays = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/public/train/edstays.csv.gz')

# triage
triage = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/public/train/triage.csv.gz')

# ward transfers
transfers = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/mimiciv/hosp/transfers.csv.gz')

# higher-level services (ICU, CARD, etc)
services = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/mimiciv/hosp/services.csv.gz')

# get patient info
pts = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/mimiciv/hosp/patients.csv.gz')

# admission demographics
admissions = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/mimiciv/hosp/admissions.csv.gz')

In [44]:

# diagnoses
diags = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/mimiciv/hosp/diagnoses_icd.csv.gz')
diags_icd = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/mimiciv/hosp/d_icd_diagnoses.csv.gz')


In [45]:
# drop any potential repeats
diags_icd = diags_icd.groupby(["icd_code", "icd_version"]).first().reset_index()

# attach the long_title description to each diagnosis code
diags = diags.merge(diags_icd, on=["icd_code", "icd_version"], how="left")

### Clean up/type cast data

In [8]:
discharges = discharges.astype({"charttime":"datetime64[ns]",
                               "storetime":"datetime64[ns]"})

### Define Extraction Functions

In [13]:
radiology

Unnamed: 0,note_id,subject_id,hadm_id,note_type,note_seq,charttime,storetime,text
0,10000032-RR-22,10000032,22841357,RR,22,2180-06-26 17:15:00,2180-06-26 19:28:00,EXAMINATION: LIVER OR GALLBLADDER US (SINGLE ...
1,10000032-RR-23,10000032,22841357,RR,23,2180-06-26 17:17:00,2180-06-26 17:28:00,EXAMINATION: CHEST (PA AND LAT)\n\nINDICATION...
2,10000117-RR-13,10000117,22927623,RR,13,2181-11-15 00:40:00,2181-11-15 07:54:00,EXAMINATION: CHEST (PA AND LAT)\n\nINDICATIO...
3,10000117-RR-14,10000117,22927623,RR,14,2181-11-15 00:47:00,2181-11-15 01:12:00,EXAMINATION: NECK SOFT TISSUES\n\nINDICATION...
4,10000935-RR-71,10000935,21738619,RR,71,2187-07-11 11:16:00,2187-07-11 11:42:00,"HISTORY: Recurrent vomiting, subjective fever..."
...,...,...,...,...,...,...,...,...
284397,19999987-RR-17,19999987,23865745,RR,17,2145-11-02 22:37:00,2145-11-03 18:55:00,"HISTORY: ___, with left occipital bleeding. ..."
284398,19999987-RR-18,19999987,23865745,RR,18,2145-11-03 04:35:00,2145-11-03 10:46:00,INDICATION: ___ female intubated for head ble...
284399,19999987-RR-19,19999987,23865745,RR,19,2145-11-03 16:40:00,2145-11-04 08:36:00,HISTORY: ___ woman with left occipital hemorr...
284400,19999987-RR-20,19999987,23865745,RR,20,2145-11-04 05:10:00,2145-11-04 08:58:00,PORTABLE CHEST OF ___\n\nCOMPARISON: ___ radi...


In [14]:
# create initial input text prompt
def get_demos(subject_id):
    # has gender, anchor-age, date-of-death if exists
    return pts[pts['subject_id'] == subject_id].squeeze()
    
def get_transfers(hadm_id):
    # return a DataFrame even for a single transfer row; callers filter on its columns
    return transfers[transfers['hadm_id'] == hadm_id].sort_values("intime")

def get_triage_info(stay_id):
    return triage[triage['stay_id'] == stay_id]
    
def get_diags(hadm_id):
    adm_diags = diags[diags['hadm_id'] == hadm_id]
    return adm_diags.sort_values("seq_num")

def get_soap_notes(hadm_id):
    return soap_notes[soap_notes['hadm_id'] == hadm_id]

def get_radiology_reports(hadm_id):
    adm_radiology = radiology[radiology['hadm_id'] == hadm_id]
    return adm_radiology.sort_values("note_seq")
    

In [58]:
def create_pt_prompt(discharge_row):

    pt_edstays = edstays[edstays['hadm_id'] == discharge_row['hadm_id']]

    demos = get_demos(discharge_row['subject_id'])
    if demos.empty:
        age = r"[UNKNOWN AGE]"
        sex = r"[UNKNOWN SEX]"
    else:
        age = demos['anchor_age']
        sex = demos['gender']

    ccs = []
    for stay_id in pt_edstays['stay_id'].tolist():
        # TODO: Drop in Triage Vitals/Pain Scale, etc. Currently only using CCs
        triage_info = get_triage_info(stay_id)
        ccs.append(triage_info['chiefcomplaint'].squeeze())
        
    chief_complaints = ", ".join(ccs)    
    
    # The "[UNKNOWN SEX]" placeholder is a non-empty (truthy) string, so compare
    # against the recorded codes explicitly instead of testing truthiness.
    if sex == "M":
        pronoun = ["he", "his"]
    elif sex == "F":
        pronoun = ["she", "her"]
    else:
        pronoun = ["they", "their"]

    # care units visited during the admission, in chronological order
    transfers = get_transfers(discharge_row['hadm_id'])
    transfers = transfers[~transfers['careunit'].isna()]
    careunits = transfers['careunit'].tolist()

    diags = get_diags(discharge_row['hadm_id'])

    # get GPT-created SOAP Notes
    soap_notes = get_soap_notes(discharge_row['hadm_id'])
    # get radiology reports
    radiology_reps = get_radiology_reports(discharge_row['hadm_id'])
    
    prompt = f"___ is a {age} year old {sex} presenting to the ED with {chief_complaints}. Over the course of {pronoun[1]} hospital course, ___ started at {careunits[0]} and then visited {', '.join(careunits[1:])}. Over the course of their hospital stay, {pronoun[0]} was given the following diagnoses: {', '.join(diags['long_title'])} in order of importance to this admission.----------------------------------------------------The patient received the following radiology consult reports:{'----------------'.join(radiology_reps['text'].tolist())}.----------------------------------------------------Please use the following SOAP notes from the patient's complete hospital course to create a brief hospital course summary: {'----------------'.join(soap_notes['gpt_SOAP_note'].tolist())}"
    
    return prompt
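
`create_pt_prompt` indexes `careunits[0]` and joins the chief complaints directly, so an admission with no recorded care unit or a missing chief complaint would raise an error. The sketch below shows one way to guard those cases on plain Python lists; `describe_careunits` and `join_complaints` are hypothetical helpers, and the fallback wording is an assumption rather than part of the original prompt.

```python
import math

def describe_careunits(careunits):
    """Phrase the care-unit path, tolerating zero or one recorded unit."""
    if not careunits:
        return "the care units visited were not recorded"
    if len(careunits) == 1:
        return f"___ stayed at {careunits[0]}"
    return f"___ started at {careunits[0]} and then visited {', '.join(careunits[1:])}"

def join_complaints(complaints):
    """Join chief complaints, skipping None/NaN/blank entries (e.g. missing triage rows)."""
    cleaned = []
    for cc in complaints:
        if cc is None or (isinstance(cc, float) and math.isnan(cc)):
            continue
        text = str(cc).strip()
        if text:
            cleaned.append(text)
    return ", ".join(cleaned) if cleaned else "[UNKNOWN CHIEF COMPLAINT]"

print(describe_careunits(["Emergency Department", "Medicine"]))
print(join_complaints(["Abd pain, Vomiting", float("nan")]))
```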
    

## Generate SOAP Note List Prompt

In [59]:
soap_notes

Unnamed: 0.1,Unnamed: 0,subject_id,hadm_id,transfer_id,eventtype,careunit,intime,outtime,service_prompts,gpt_SOAP_note
0,1275771,16752762,22661610.0,37688078,ED,Emergency Department,2174-10-14 14:04:00,2174-10-14 23:08:00,___ is a 83 year old F that initially presente...,"Subjective:\nThe patient, ___, an 83-year-old ..."
1,1275769,16752762,22661610.0,31268773,admit,Medicine,2174-10-14 23:08:00,2174-10-21 12:26:37,___ is a 83 year old F that initially presente...,"Subjective:\nThe patient, an 83-year-old femal..."


In [60]:
discharges[discharges['hadm_id'] == soap_notes.iloc[0]['hadm_id']].squeeze()

note_id                                          16752762-DS-21
subject_id                                             16752762
hadm_id                                                22661610
note_type                                                    DS
note_seq                                                     21
charttime                                   2174-10-21 00:00:00
storetime                                   2174-10-21 13:50:00
text           \nName:  ___                 Unit No:   ___\n...
Name: 48389, dtype: object

In [62]:
bhc_prompt = create_pt_prompt(discharges[discharges['hadm_id'] == soap_notes.iloc[0]['hadm_id']].squeeze())

## Generate Brief Hospital Course from GPT API (0-shot)

In [63]:
import openai
openai.api_type = "azure"
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_version = "2023-07-01-preview"
openai.api_key = os.getenv("OPENAI_API_KEY")
engine = "decile-gpt-35-turbo-16k"


In [64]:
message_text = [{"role":"system","content":f"You are the discharging physician that is reviewing a patient's SOAP notes over their hospital course, starting in the ED and being discharged from your unit and writing a brief hospital course summary based on this information."},]

gpt_bhc_prompt = {"role":"user",
                 "content":bhc_prompt}

message_text.append(gpt_bhc_prompt)

print(f"Brief Hospital Course Prompt: {message_text}")

Brief Hospital Course Prompt: [{'role': 'system', 'content': "You are the discharging physician that is reviewing a patient's SOAP notes over their hospital course, starting in the ED and being discharged from your unit and writing a brief hospital course summary based on this information."}, {'role': 'user', 'content': "___ is a 83 year old F presenting to the ED with Abd pain, Vomiting. Over the course of her hospital course, ___ started at Emergency Department and then visited Medicine. Over the course of their hospital stay, she was given the following diagnoses: Sepsis, unspecified organism, Acidosis, Other ascites, Chronic atrial fibrillation, Chronic obstructive pulmonary disease, unspecified, Urinary tract infection, site not specified, Severe sepsis without septic shock, Unspecified Escherichia coli [E. coli] as the cause of diseases classified elsewhere, Do not resuscitate, Long term (current) use of anticoagulants, Personal history of nicotine dependence, Unspecified dementi

In [65]:
completion = openai.ChatCompletion.create(
  engine=engine,
  messages = message_text,
)
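
Calls to a hosted model can fail transiently (rate limits, timeouts). A minimal retry-with-exponential-backoff sketch follows; `call_with_retries` is a hypothetical helper, it is not part of the `openai` package, and which exception types are worth retrying is an assumption that should be narrowed for real use.

```python
import time

def call_with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep,
                      retry_on=(Exception,)):
    """Call fn(); on a retryable error wait base_delay * 2**attempt and try again."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))

# Usage sketch (commented out because it needs the configured openai client):
# completion = call_with_retries(
#     lambda: openai.ChatCompletion.create(engine=engine, messages=message_text)
# )
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.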

In [78]:
zero_shot_output = completion['choices'][0]['message']['content']
print(zero_shot_output)

Hospital Course Summary:
The patient, an 83-year-old female, presented to the ED with abdominal pain and vomiting. She was diagnosed with sepsis of unspecified organism, acidosis, other ascites, chronic atrial fibrillation, chronic obstructive pulmonary disease, urinary tract infection, and major depressive disorder, among other diagnoses. During her stay in the Medicine ward, she underwent PICC line placement and received various medications to manage her conditions. The patient's vital signs, laboratory values, and clinical progress were closely monitored. Microbiology cultures were ordered to identify the infectious organism. Collaborative care was provided by the medical team, nursing staff, and other healthcare professionals. Discharge planning and follow-up appointments were arranged.


## Generate Brief Hospital Course from GPT API (1-shot)

In [20]:
import openai
openai.api_type = "azure"
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_version = "2023-07-01-preview"
openai.api_key = os.getenv("OPENAI_API_KEY")
engine = "decile-gpt-35-turbo-16k"


In [84]:
targets = pd.read_csv('/gpfs/milgram/project/rtaylor/shared/DischargeMe/public/train/discharge_target.csv.gz')

In [88]:
bhc_oneshot_sample = targets.sample(1)['brief_hospital_course'].squeeze()
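
`targets.sample(1)` draws a different example on every run and could, in principle, pick the very admission being evaluated. A sketch of a seeded draw that excludes the evaluated admission, written on plain dictionaries; `pick_one_shot_example` and the toy records are hypothetical:

```python
import random

def pick_one_shot_example(records, exclude_hadm_id, seed=0):
    """Pick one brief hospital course reproducibly, never from the excluded admission."""
    candidates = [r for r in records if r["hadm_id"] != exclude_hadm_id]
    if not candidates:
        raise ValueError("no candidate examples left after exclusion")
    return random.Random(seed).choice(candidates)["brief_hospital_course"]

# Toy records standing in for rows of the targets table.
toy_records = [
    {"hadm_id": 1, "brief_hospital_course": "course A"},
    {"hadm_id": 2, "brief_hospital_course": "course B"},
    {"hadm_id": 3, "brief_hospital_course": "course C"},
]
example = pick_one_shot_example(toy_records, exclude_hadm_id=2, seed=42)
print(example)
```

With pandas, a comparable effect could likely be had by filtering out the evaluated `hadm_id` and passing `random_state` to `DataFrame.sample`.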

In [89]:
message_text2 = [{"role":"system","content":f"You are the discharging physician that is reviewing a patient's SOAP notes over their hospital course, starting in the ED and being discharged from your unit and writing a brief hospital course summary based on this information. Using the following brief hospital course as an example: {bhc_oneshot_sample}"},]

gpt_bhc_prompt2 = {"role":"user",
                 "content":bhc_prompt}

message_text2.append(gpt_bhc_prompt2)

print(f"Brief Hospital Course Prompt: {message_text2}")

Brief Hospital Course Prompt: [{'role': 'system', 'content': "You are the discharging physician that is reviewing a patient's SOAP notes over their hospital course, starting in the ED and being discharged from your unit and writing a brief hospital course summary based on this information. Using the following brief hospital course as an example: ___ with stage IIIB NSCLC on chemo and XRT presenting with \northostasis and a fall in the setting of poor PO intake due to \nthroat pain.\n\n# Near Syncope s/p Fall: Patient was found to be orthostatic in \nthe setting of poor PO intake due to throat pain.  There was no \nevidence of arrhythmia on telemetry, head CT was negative, and \nwas not on any true opiates (only on tramadol).  He was volume \nrepleted shortly thereafter with resolution of orthostasis.  \n\n# Pain/Malnutrition: Patient presented with significant throat \npain, mostly with swallowing, likely due to his active \nmalignancy and his ongoing radiation treatment that was \neve
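
The zero-shot and one-shot message lists differ only in whether an example is appended to the system prompt, so they can be built by one function. A minimal sketch; `build_messages` is a hypothetical helper and the short strings in the usage lines are placeholders, not real notebook data:

```python
def build_messages(system_prompt, user_prompt, example=None):
    """Return a chat message list; if an example is given, append it to the system prompt."""
    system_content = system_prompt
    if example is not None:
        system_content = (
            f"{system_prompt} Using the following brief hospital course as an example: {example}"
        )
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": user_prompt},
    ]

zero_shot_messages = build_messages("You are the discharging physician.", "PROMPT")
one_shot_messages = build_messages("You are the discharging physician.", "PROMPT",
                                   example="EXAMPLE BHC")
print(one_shot_messages[0]["content"])
```

Building both lists through one function avoids the two cells drifting apart when the system prompt is edited.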

In [90]:
completion2 = openai.ChatCompletion.create(
  engine=engine,
  messages = message_text2,
)

In [91]:
one_shot_output = completion2['choices'][0]['message']['content']
print(one_shot_output)

Summary:
___ is an 83-year-old female with multiple comorbidities who presented to the emergency department with abdominal pain and vomiting. She was diagnosed with sepsis, acidosis, other ascites, chronic atrial fibrillation, chronic obstructive pulmonary disease, urinary tract infection, severe sepsis, and more. During her hospital stay, the patient underwent PICC line placement and received various medications and treatments. Collaborative care was provided by multiple specialties, and close monitoring of vital signs and laboratory values was conducted. Discharge planning and follow-up appointments are underway.


In [97]:
zero_shot_output

"Hospital Course Summary:\nThe patient, an 83-year-old female, presented to the ED with abdominal pain and vomiting. She was diagnosed with sepsis of unspecified organism, acidosis, other ascites, chronic atrial fibrillation, chronic obstructive pulmonary disease, urinary tract infection, and major depressive disorder, among other diagnoses. During her stay in the Medicine ward, she underwent PICC line placement and received various medications to manage her conditions. The patient's vital signs, laboratory values, and clinical progress were closely monitored. Microbiology cultures were ordered to identify the infectious organism. Collaborative care was provided by the medical team, nursing staff, and other healthcare professionals. Discharge planning and follow-up appointments were arranged."

## Compute ROUGE Scores from Output

In [76]:
from rouge_score import rouge_scorer


In [96]:
# discharges[discharges['hadm_id'] == soap_notes.iloc[0]['hadm_id']].squeeze()
pt_target = targets[targets['hadm_id'] == soap_notes.iloc[0]['hadm_id']]['brief_hospital_course'].squeeze()

In [98]:
scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)
# RougeScorer.score takes (target, prediction), in that order
scores1 = scorer.score(pt_target,
                      zero_shot_output)

In [99]:
scores1

{'rouge1': Score(precision=0.26126126126126126, recall=0.18831168831168832, fmeasure=0.2188679245283019),
 'rougeL': Score(precision=0.13513513513513514, recall=0.09740259740259741, fmeasure=0.11320754716981132)}

In [100]:
scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)
# RougeScorer.score takes (target, prediction), in that order
scores2 = scorer.score(pt_target,
                      one_shot_output)

In [101]:
scores2

{'rouge1': Score(precision=0.3176470588235294, recall=0.17532467532467533, fmeasure=0.22594142259414227),
 'rougeL': Score(precision=0.18823529411764706, recall=0.1038961038961039, fmeasure=0.13389121338912133)}
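
To make the reported numbers easier to interpret, here is a simplified ROUGE-1 computed by hand. It is only a sketch: it lowercases and splits on non-alphanumeric characters and skips stemming, so its values will not exactly match `rouge_score` with `use_stemmer=True`. The function name `rouge1` is hypothetical.

```python
import re
from collections import Counter

def rouge1(target, prediction):
    """Unigram-overlap precision, recall and F1 of a prediction against a target."""
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    target_counts = Counter(tokenize(target))
    prediction_counts = Counter(tokenize(prediction))
    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((target_counts & prediction_counts).values())
    precision = overlap / max(sum(prediction_counts.values()), 1)
    recall = overlap / max(sum(target_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f = rouge1("patient admitted with sepsis from uti",
                 "patient admitted with sepsis")
print(p, r, f)
```

Swapping the two arguments swaps precision and recall but leaves F1 unchanged, which is why argument order matters when reading these scores.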

# Qualitative Eval

## Final outputs


In [102]:
print(one_shot_output)

Summary:
___ is an 83-year-old female with multiple comorbidities who presented to the emergency department with abdominal pain and vomiting. She was diagnosed with sepsis, acidosis, other ascites, chronic atrial fibrillation, chronic obstructive pulmonary disease, urinary tract infection, severe sepsis, and more. During her hospital stay, the patient underwent PICC line placement and received various medications and treatments. Collaborative care was provided by multiple specialties, and close monitoring of vital signs and laboratory values was conducted. Discharge planning and follow-up appointments are underway.


In [104]:
print(zero_shot_output)

Hospital Course Summary:
The patient, an 83-year-old female, presented to the ED with abdominal pain and vomiting. She was diagnosed with sepsis of unspecified organism, acidosis, other ascites, chronic atrial fibrillation, chronic obstructive pulmonary disease, urinary tract infection, and major depressive disorder, among other diagnoses. During her stay in the Medicine ward, she underwent PICC line placement and received various medications to manage her conditions. The patient's vital signs, laboratory values, and clinical progress were closely monitored. Microbiology cultures were ordered to identify the infectious organism. Collaborative care was provided by the medical team, nursing staff, and other healthcare professionals. Discharge planning and follow-up appointments were arranged.


In [105]:
print(pt_target)

___ female with history of COPD and atrial fibrillation on 
apixaban who presents with one day of weakness, 1x emesis, 1x 
diarrhea, found to have sepsis with lactic acidosis (lactate 4) 
with UTI as suspected source. 

#Severe sepsis: Patient presented with elevated lactate, WBC 
>13K, T 96.8, with UTI as suspected source. She received IV 
cipro in ED, which was broadened to zosyn given history of 
multidrug resistant E.coli UTI. After receiving 2L IVF, her 
lactate resolved to 1.8. She completed a 6 days of Zosyn in the 
hospital followed by a final dose of ertapenem to complete a ___fib: CHADS2 of 2 for age and HTN; CHADSVasc of 4. Dig level on 
admission 1.6. Continued metoprolol succinate25 mg daily, 
digoxin 0.125, and Apixaban 2.5 mg BID. 

#Dementia: continued home donepezil 5 mg

#Depression: continued home sertraline 50 mg, mirtazapine 15 mg 
qHS  

#COPD: not on home maintenance medications


## SOAP Notes

In [111]:
print("\n----------------------------------------------------------------------\n")
print(soap_notes['gpt_SOAP_note'].tolist()[0])
print("\n----------------------------------------------------------------------\n")
print(soap_notes['gpt_SOAP_note'].tolist()[1])


----------------------------------------------------------------------

Subjective:
The patient, ___, an 83-year-old female, presented to the Emergency Department with complaints of abdominal pain and vomiting. She appeared distressed and reported a recent decline in her overall health. According to her family, she has a history of chronic atrial fibrillation, chronic obstructive pulmonary disease, urinary tract infections, and recurrent episodes of depression. They also mentioned that she has a diagnosis of unspecified dementia without behavioral disturbance. The patient's family stated that she currently uses anticoagulants and has a history of nicotine dependence. It is important to note that the patient has a Do Not Resuscitate directive in place.

Objective:
During her stay in the Emergency Department ward, the patient's vital signs were stable within normal limits. However, she exhibited signs of sepsis, such as fever, elevated heart rate, and generalized abdominal tenderness. L

## Full Discharge Summary

In [116]:
print(discharges[discharges['hadm_id'] == soap_notes.iloc[0]['hadm_id']]['text'].squeeze())

 
Name:  ___                 Unit No:   ___
 
Admission Date:  ___              Discharge Date:   ___
 
Date of Birth:  ___             Sex:   F
 
Service: MEDICINE
 
Allergies: 
Percocet / Sulfa (Sulfonamide Antibiotics)
 
Attending: ___
 
Chief Complaint:
Weakness, nausea, vomiting, diarrhea
 
Major Surgical or Invasive Procedure:
None

 
History of Present Illness:
diarrhea. No known fever. Denies chest pain or shortness of 
breath. Reports chronic cough.  
 In the ED, initial vitals were: 96.8 84 128/76 16 96% RA  
 Labs were significant for lactate of 4.0, WBC 13.1 (85% PMNs), 
UA w/ > 182 WBCs, few bacteria, neg nitrites, INR 1.4. Trop neg, 
LFTs normal, dig level 1.6.  
 CXR and CT abd/pel w/ con were w/o acute processes.  
 Patient was given:  
 ___ 16:41 IVF 500 mL NS 500 mL  
 ___ 18:09 PO Thiamine 100 mg  
 ___ 18:09 PO FoLIC Acid 1 mg  
 ___ 18:09 PO Multivitamins 1 TAB  
 ___ 21:21 IVF 500 mL NS 500 mL  
 ___ 21:21 IV Ciprofloxacin 400 mg  
 Repeat lactate was 3.3.  
 Vita