## Summarising patient-doctor dialogue

This example uses LM Studio and an off-the-shelf large language model (LLM) to summarise patient-doctor dialogue into a structured report.

Tools used:
- Python Data Analysis Library (Pandas) ([link](https://pandas.pydata.org/))
- LM Studio Python SDK ([link](https://lmstudio.ai/docs/python)). Note that the LM Studio desktop application must be running with its server enabled.

Data used:
- Clinical visit note summarisation corpus ([link](https://github.com/microsoft/clinical_visit_note_summarization_corpus))


In [1]:
# This cell loads required packages
import pandas as pd
import lmstudio as lms
from pydantic import BaseModel
from pprint import pprint


### Step 1: Load data as a Pandas DataFrame

In [None]:
df = pd.read_csv(
    filepath_or_buffer = "clinical_notes_corpus/data/mts-dialog/MTS_Dataset_TrainingSet.csv",
    index_col          = 0
    )

df.head(5)


| ID | section_header | section_text | dialogue |
|---|---|---|---|
| 0 | GENHX | The patient is a 76-year-old white female who ... | Doctor: What brings you back into the clinic t... |
| 1 | GENHX | The patient is a 25-year-old right-handed Cauc... | Doctor: How're you feeling today? \nPatient: ... |
| 2 | GENHX | This is a 22-year-old female, who presented to... | Doctor: Hello, miss. What is the reason for yo... |
| 3 | MEDICATIONS | Prescribed medications were Salmeterol inhaler... | Doctor: Are you taking any over the counter me... |
| 4 | CC | Burn, right arm. | Doctor: Hi, how are you? \nPatient: I burned m... |


### Step 2: Select data

The dataset contains 20 types of dialogue, identified by the `section_header` column.

In [3]:
df["section_header"].unique()


array(['GENHX', 'MEDICATIONS', 'CC', 'PASTMEDICALHX', 'ALLERGY',
       'FAM/SOCHX', 'PASTSURGICAL', 'OTHER_HISTORY', 'ASSESSMENT', 'ROS',
       'DISPOSITION', 'EXAM', 'PLAN', 'DIAGNOSIS', 'EDCOURSE',
       'IMMUNIZATIONS', 'LABS', 'IMAGING', 'PROCEDURES', 'GYNHX'],
      dtype=object)
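
To see how many dialogues fall under each header, and not just which headers exist, `value_counts()` can be used in place of `unique()`. The sketch below runs on a small made-up DataFrame (`toy_df` is hypothetical, not the real corpus), so the counts it prints say nothing about the actual dataset.

```python
import pandas as pd

# Toy stand-in for the corpus; these rows are invented for illustration only.
toy_df = pd.DataFrame(
    {"section_header": ["GENHX", "GENHX", "CC", "MEDICATIONS", "GENHX"]}
)

# Count the rows for each section header, most frequent first.
header_counts = toy_df["section_header"].value_counts()
print(header_counts)
```

On the real data, the same call on `df["section_header"]` would show which section types have enough examples to be worth experimenting with.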

In this cell, we select an example dialogue from the dataset.

In [4]:
# Specify the type of data we need
header = "GENHX"

# Specify which record to use (zero-based position among the rows with this header, not the ID)
record = 4

# This line fetches the dialogue text
dialogue = df.query(f"section_header=='{header}'")["dialogue"].iloc[record]

# This prints out the dialogue line by line, while adding line numbers
for i, line in enumerate(dialogue.splitlines()):
    print(f"{i:03d} {line}")


000 Doctor: I hear someone has a birthday coming up, how old are you turning, ma'am?
001 Patient: I'll be ninety later this month, doctor. 
002 Doctor: You're here for a follow up not very long after your first visit. What seems to be the problem today? 
003 Patient: Well, actually, my high blood pressure and right arm symptoms are basically gone. 
004 Doctor: Great, I saw your blood pressure reading for today, and it's much better. Are you taking the higher dose of Mavik we recommended? 
005 Patient: No, I'm not taking that right now. 
006 Doctor: How did you tolerate the higher dose? 
007 Patient: I didn't feel well at all, so I just went back to the old dose. 
008 Doctor: That was one M G per day, correct? 
009 Patient: Yes, doctor. I do think I could have some white coat high blood pressure. Doctors' offices make me nervous. 
010 Doctor: That does happen. Here, try this sample of Tarka, please. 
011 Patient: What is this for, doctor? 
012 Doctor: Well, since you're taking the Mavik
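
Note that `.iloc[record]` selects by position within the filtered rows, not by the `ID` index. In the table from Step 1, ID 4 is a `CC` record, so the dialogue selected here is the fifth `GENHX` row. The sketch below illustrates the difference on a made-up DataFrame (`toy_df` and its contents are hypothetical).

```python
import pandas as pd

# Toy stand-in for the corpus; IDs and dialogue text are invented.
toy_df = pd.DataFrame(
    {
        "section_header": ["GENHX", "CC", "GENHX", "GENHX"],
        "dialogue": ["d0", "d1", "d2", "d3"],
    },
    index=pd.Index([0, 1, 2, 3], name="ID"),
)

# Keep only the GENHX rows; their original IDs (0, 2, 3) are preserved.
subset = toy_df.query("section_header == 'GENHX'")

# .iloc counts positions within the filtered subset ...
by_position = subset["dialogue"].iloc[1]
# ... whereas .loc looks up the original ID label.
by_label = toy_df["dialogue"].loc[1]

print(by_position, by_label)
```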

### Step 3: Load the LLM

In this cell, we link to a model that has been loaded in the desktop version of LM Studio. Make sure that the LM Studio application is running and the server is turned on!

In [5]:
model = lms.llm()


### Step 4: Use the LLM

The **system_prompt** holds the instructions that tell the LLM how to build our structured report.

In [6]:
system_prompt = "Your job is to extract information from a conversation between a Doctor and Patient (tagged [DIALOGUE]) into a structured and concise report. The report must contain the headers: 'Age', 'Gender', 'Reason for visit', 'Current symptoms', 'Diagnosis', 'Investigations', 'Medications', 'Medical History'. For 'current symptoms', use one bullet point for each symptom. For 'diagnosis', use one bullet for each condition explicitly stated by the doctor. For 'investigations', use one bullet point for each examination, test or procedure conducted that is explicitly stated by the doctor. For 'medications', use one bullet point for each 'medication', and sub-bullets for 'dosage', 'currently taking (yes/no)', 'remarks'. For 'medical history', use one bullet point for each 'condition', and sub-bullets for 'diagnosed date', 'remarks'. If any information is not present in the dialogue, indicate it with 'Not specified'. Respond only with the report text. All medication names, condition names, information, remarks must indicate the source line number from the dialogue in the format '(xx line yy)' where xx is either Patient or Doctor and yy is the line number. Do not include information not explicitly stated in the dialogue and indicated by source."
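
The prompt asks the model to cite source line numbers, while the raw `dialogue` string carries none. One option, sketched here as an assumption and not as part of the original workflow, is to number the lines before building the prompt, using the same zero-padded format printed in Step 2. The helper name `number_dialogue_lines` and the sample text are hypothetical.

```python
def number_dialogue_lines(dialogue: str) -> str:
    """Prefix each dialogue line with a zero-padded, zero-based line number."""
    return "\n".join(
        f"{i:03d} {line}" for i, line in enumerate(dialogue.splitlines())
    )


# Invented two-line dialogue used only to show the output format.
sample = "Doctor: How are you?\nPatient: Fine, thank you."
numbered = number_dialogue_lines(sample)
print(numbered)
```

Passing `number_dialogue_lines(dialogue)` in place of `dialogue` when building the prompt would give the model explicit numbers to cite. Whether this improves the citations would need to be checked against real outputs.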


Here we will send our instructions and dialogue to the LLM and retrieve our report.

In [None]:
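# Combine the instructions and the dialogue into a single prompt
prompt = (
    f"{system_prompt}\n"
    f"[DIALOGUE]\n{dialogue}"
    )

# Schema for one entry under 'Medical History'
class Condition(BaseModel):
    condition:      str
    diagnosed_date: str
    remarks:        str

# Schema for one entry under 'Medications'
class Medication(BaseModel):
    medication:       str
    dosage:           str
    currently_taking: bool

# Schema for the full report
class StructuredNote(BaseModel):
    age:              int
    gender:           str
    reason_for_visit: list[str]
    current_symptoms: list[str]
    diagnosis:        list[str]
    investigations:   list[str]
    medications:      list[Medication]
    medical_history:  list[Condition]

# Stream the response; response_format constrains it to the StructuredNote schema
output = ""
prediction_stream = model.respond_stream(
    prompt,
    config={"temperature": 0.0},
    response_format=StructuredNote
    )

# Print each fragment as it arrives and keep the raw text
for fragment in prediction_stream:
    print(fragment.content, end="", flush=True)
    output += fragment.content
print()

# Retrieve the parsed structured result once streaming has finished
structured_output = prediction_stream.result().parsed
pprint(structured_output)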


{ "age": 90, "current_symptoms": [ "Stomach pain", "Mouth sores", "Tremors in upper body", "Pain in right arm" ], "diagnosis": [ "High blood pressure", "Degenerative changes in neck and right shoulder", "Possible rotator cuff injury" ], "gender": "Female", "investigations": [ "X-rays of the neck and right shoulder", "Blood pressure reading" ], "medical_history": [ { "condition": "High blood pressure", "diagnosed_date": "Not specified", "remarks": "Improved with medication adjustment" }, { "condition": "Muscle problem in right shoulder blade", "diagnosed_date": "Not specified", "remarks": "Pain basically gone" }, { "condition": "Mouth sores", "diagnosed_date": "Not specified", "remarks": "Recurring" }, { "condition": "Tremors", "diagnosed_date": "Since taking Lexapro", "remarks": "Upper body tremors" } ], "medications": [ { "currently_taking": true, "dosage": "Tylenol as needed", "medication": "Tylenol" }, { "currently_taking": true, "dosage": "Tums three times a day", "medication": "Tu
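
If the parsed result is a plain dictionary (an assumption here, suggested by the printed output), it can be turned back into a typed object and checked against the schema. The sketch below assumes Pydantic v2, where `model_validate` is available, and redefines `Medication` so that it runs on its own. The `good` and `bad` dictionaries are hypothetical, not real model output.

```python
from pydantic import BaseModel, ValidationError


class Medication(BaseModel):
    medication: str
    dosage: str
    currently_taking: bool


# Hypothetical entries standing in for items of the parsed output.
good = {"medication": "Tylenol", "dosage": "as needed", "currently_taking": True}
bad = {"medication": "Tylenol", "dosage": "as needed"}  # 'currently_taking' is missing

# A valid entry becomes a typed object with attribute access.
med = Medication.model_validate(good)
print(med.medication, med.currently_taking)

# An incomplete entry raises ValidationError instead of passing silently.
try:
    Medication.model_validate(bad)
    is_valid = True
except ValidationError as err:
    is_valid = False
    print(f"{len(err.errors())} validation error(s)")
```

The same pattern would apply to `StructuredNote`, for instance to flag reports where a required field is absent before they are used downstream.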