## Summarising patient-doctor dialogue

This example uses LM Studio and an off-the-shelf LLM to summarise free-form patient-doctor dialogue into a structured report.

### Load data

In [1]:
import pandas as pd
ds = pd.read_csv(
    filepath_or_buffer="clinical_notes_corpus/data/mts-dialog/MTS_Dataset_TrainingSet.csv",
    index_col=0
)
ds.head(5)


ID,section_header,section_text,dialogue
0,GENHX,The patient is a 76-year-old white female who ...,Doctor: What brings you back into the clinic t...
1,GENHX,The patient is a 25-year-old right-handed Cauc...,Doctor: How're you feeling today? \nPatient: ...
2,GENHX,"This is a 22-year-old female, who presented to...","Doctor: Hello, miss. What is the reason for yo..."
3,MEDICATIONS,Prescribed medications were Salmeterol inhaler...,Doctor: Are you taking any over the counter me...
4,CC,"Burn, right arm.","Doctor: Hi, how are you? \nPatient: I burned m..."


In [2]:
ds["section_header"].unique()


array(['GENHX', 'MEDICATIONS', 'CC', 'PASTMEDICALHX', 'ALLERGY',
       'FAM/SOCHX', 'PASTSURGICAL', 'OTHER_HISTORY', 'ASSESSMENT', 'ROS',
       'DISPOSITION', 'EXAM', 'PLAN', 'DIAGNOSIS', 'EDCOURSE',
       'IMMUNIZATIONS', 'LABS', 'IMAGING', 'PROCEDURES', 'GYNHX'],
      dtype=object)

In [3]:
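header = "GENHX"
record = 4  # zero-based position within the rows that match the header
# Select the dialogue of the chosen record under the chosen section header.
dialogue = ds.query(f"section_header=='{header}'")["dialogue"].iloc[record]
# Print each dialogue line with a zero-based, zero-padded line number.
for i, line in enumerate(dialogue.splitlines()):
    print(f"{i:03d} {line}")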


000 Doctor: I hear someone has a birthday coming up, how old are you turning, ma'am?
001 Patient: I'll be ninety later this month, doctor. 
002 Doctor: You're here for a follow up not very long after your first visit. What seems to be the problem today? 
003 Patient: Well, actually, my high blood pressure and right arm symptoms are basically gone. 
004 Doctor: Great, I saw your blood pressure reading for today, and it's much better. Are you taking the higher dose of Mavik we recommended? 
005 Patient: No, I'm not taking that right now. 
006 Doctor: How did you tolerate the higher dose? 
007 Patient: I didn't feel well at all, so I just went back to the old dose. 
008 Doctor: That was one M G per day, correct? 
009 Patient: Yes, doctor. I do think I could have some white coat high blood pressure. Doctors' offices make me nervous. 
010 Doctor: That does happen. Here, try this sample of Tarka, please. 
011 Patient: What is this for, doctor? 
012 Doctor: Well, since you're taking the Mavik
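
The numbering printed above exists only on screen. A minimal sketch follows, assuming one wants to keep the numbered dialogue as a string for reuse, for instance as the text sent to a model that is asked to cite line numbers. The helper name `number_dialogue` and the two-line sample are hypothetical and not part of the original notebook.

```python
def number_dialogue(dialogue: str, width: int = 3) -> str:
    """Prefix each dialogue line with a zero-based, zero-padded index."""
    return "\n".join(
        f"{i:0{width}d} {line}" for i, line in enumerate(dialogue.splitlines())
    )


# Hypothetical two-line dialogue, used only to illustrate the helper.
sample = "Doctor: How are you feeling today?\nPatient: Much better, thank you."
numbered = number_dialogue(sample)
print(numbered)
```

The default width of three digits mirrors the `{i:03d}` format used in the cell above, so the numbers in the string match the numbers shown in the printed listing.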

### Load the LLM

In [4]:
import lmstudio as lms
model = lms.llm("llama-3.3-70b-instruct")


The **system_prompt** holds the instructions the LLM follows to produce the structured report.

In [5]:
system_prompt = "Your job is to extract information from a conversation between a Doctor and Patient (tagged [DIALOGUE]) into a structured and concise report. The report must contain the headers: 'Age', 'Gender', 'Reason for visit', 'Current symptoms', 'Diagnosis', 'Investigations', 'Medications', 'Medical History'. For 'current symptoms', use one bullet point for each symptom. For 'diagnosis', use one bullet for each condition explicitly stated by the doctor. For 'investigations', use one bullet point for each examination, test or procedure conducted explicitly stated by the doctor. For 'medications', use one bullet point for each 'medication', and sub-bullets for 'dosage', 'currently taking (yes/no)', 'remarks'. For 'medical history', use one bullet point for each 'condition', and sub-bullets for 'diagnosed date', 'remarks'. If any information is not present in the dialogue, indicate it with 'Not specified'. Respond only with the report text. All medication names, condition names, information, remarks must indicate the source line number from the dialogue in the format '(xx line yy)' where xx is either Patient or Doctor and yy is the line number. Do not include information not explicitly stated in the dialogue and indicated by source."
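
The prompt asks for citations in the format '(xx line yy)', which makes them easy to check mechanically once a report exists. The sketch below is one possible check, not part of the original notebook. It assumes line numbers are zero-based, as in the printed dialogue listing, and that every dialogue line starts with a `Doctor:` or `Patient:` tag. The names `CITATION` and `check_citations` and the toy inputs are hypothetical.

```python
import re

# Matches citations such as "(Patient line 43)" or "(Doctor line 5)".
CITATION = re.compile(r"\((Doctor|Patient) line (\d+)\)")


def check_citations(report: str, dialogue: str) -> list[tuple[str, int, bool]]:
    """Return (speaker, line_number, is_valid) for every citation in the report.

    A citation counts as valid when the line number exists in the dialogue
    (zero-based, an assumption) and that line starts with the cited speaker tag.
    """
    lines = dialogue.splitlines()
    results = []
    for speaker, number in CITATION.findall(report):
        idx = int(number)
        valid = idx < len(lines) and lines[idx].startswith(f"{speaker}:")
        results.append((speaker, idx, valid))
    return results


# Hypothetical dialogue and report fragment, used only for illustration.
toy_dialogue = "Doctor: Any pain?\nPatient: Yes, in my right arm."
toy_report = "* Right arm pain (Patient line 1)\n* Fever (Patient line 7)"
checks = check_citations(toy_report, toy_dialogue)
print(checks)
```

A check like this only confirms that a cited line exists and belongs to the cited speaker. It does not confirm that the line actually supports the claim, which still needs a human or a second pass.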


Here we send the instructions and the dialogue to the LLM as a single prompt and stream back the report.

In [6]:
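# Concatenate the instructions and the tagged dialogue into one prompt.
prompt = (
    f"{system_prompt}\n"
    f"[DIALOGUE]\n{dialogue}"
)

# Stream the response: print each fragment as it arrives and keep the full text.
output = ""
for fragment in model.respond_stream(prompt):
    print(fragment.content, end="", flush=True)
    output += fragment.content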


## Age
90

## Gender
Female

## Reason for visit
Follow up on high blood pressure and right arm symptoms (Doctor line 3)

## Current symptoms
* Stomach pain (Patient line 43)
* Sores in the mouth (Patient line 47)
* Tremors on the upper body, including torso and arms (Patient line 51)

## Diagnosis
* High blood pressure (Patient line 5)
* Possible rotator cuff injury (Doctor line 31)
* Degenerative changes in the neck and right shoulder (Doctor line 31)

## Investigations
* Blood pressure reading (Doctor line 5)
* X rays of the neck and right shoulder (Doctor line 29)

## Medications
* Mavik 
  + Dosage: 1 MG per day (Patient line 13)
  + Currently taking: yes (Patient line 13)
  + Remarks: patient did not tolerate higher dose (Patient line 11)
* Tarka 
  + Dosage: Not specified
  + Currently taking: no (Patient line 19)
  + Remarks: patient cannot take due to size of pills (Patient line 19)
* Verapamil 
  + Dosage: Not specified
  + Currently taking: yes (Doctor line 17)
  + Remarks: 