# Automatic span annotations

We use GPT-4 (the `gpt-4o` model in the code below) to automatically annotate the generated text notes with the spans in which each symptom is mentioned. To do so, we tell GPT-4 which symptoms the patient experiences and ask it to extract from the clinical note every phrase that mentions one of these symptoms, whether the symptom is reported as present or absent. If a symptom is not mentioned in the note at all, no phrase should be extracted for it.

We use a different strategy for the normal notes than for the compact ("advanced") version of these notes.

## Normal notes

We use the function below to generate the prompt for each note. 

In [4]:
import pickle
with open("../data/df_synsum.p", "rb") as file: 
    df = pickle.load(file)

In [9]:
sympt_dict = {"dyspnea": "dysp", "cough": "cough", "nasal symptoms": "nasal", "respiratory pain": "pain", "fever": "fever"}

def generate_prompt(row): 

    text_note = row["text"]
    text_note = text_note.replace("\n", " ")

    prompt = f"The following information is known about the patient's symptoms:\n"
    for sympt, sympt_col_name in sympt_dict.items():  
        sympt_val = row[sympt_col_name]
        prompt += f"- {sympt}: {sympt_val}\n"
    prompt += f"\nFollowing the instructions you received, please extract from the following clinical note all phrases (verbatim) that describe these symptoms:\n\"{text_note}\""

    return prompt

In [10]:
print(generate_prompt(df.loc[0]))

The following information is known about the patient's symptoms:
- dyspnea: no
- cough: no
- nasal symptoms: no
- respiratory pain: no
- fever: high

Following the instructions you received, please extract from the following clinical note all phrases (verbatim) that describe these symptoms:
"**History** Patient reports a significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever. There have been no respiratory symptoms such as pain, dyspnea, or cough. The patient illustrates general malaise and mentions feeling very fatigued due to the fever. No notable changes in daily routine or exposure to environments that might typically contribute to fever are reported. Recent stress levels and potential exposure to infectious agents during travels are also discussed.  **Physical Examination** Vital signs show elevated temperature (103 °F). Heart rate is slightly tachycardic at 98 bpm, corresponding with the fever. Oxygen saturation is within

In the system message, we show the LLM the annotation instructions. We explicitly ask it to reply in JSON, so that we can easily process the extracted phrases further. The annotation instructions are as follows:

---

I will show you a clinical note containing information on a patient's symptoms. For each symptom, I will tell you whether the patient suffers from this symptom or not. 

Your task is to extract phrases from the note that mention these symptoms. The annotation must have the following JSON structure:  
[ 
   {
      "symptom": one of the symptoms ("dyspnea", "cough", "respiratory pain", "fever" or "nasal symptoms")
      "text": phrase in the text that mentions the symptom and whether it is present or absent
   }  
   {
      "symptom": ...
      "text":...
   }
   ...
]

Keep the following instructions in mind:  
- The same symptom may be mentioned multiple times. Include all phrases in which a symptom is mentioned. Consider both the "history" portion of the note, and the "physical examination" portion of the note.
- Also annotate a symptom if the note mentions that the patient does not suffer from it. 
- The phrases do not need to be full sentences, but need to be verbatim as they appear in the note. You are not allowed to alter any words. If you leave out words, use ...
- Order does not matter.
- You will reply only with the JSON itself, and you will not wrap in JSON markers.
- You can only extract phrases from the "clinical note", not from any of the other text in the prompt. 
- Not all symptoms are necessarily mentioned in the note. Do not include a symptom in the JSON if you cannot find any implicit or explicit mention of it in the clinical note. 

---

In [None]:
import openai

SYS_MESSAGE_NORMAL = """I will show you a clinical note containing information on a patient's symptoms. For each symptom, I will tell you whether the patient suffers from this symptom or not. 

Your task is to extract phrases from the note that mention these symptoms. The annotation must have the following JSON structure:  
[ 
   {
      "symptom": one of the symptoms ("dyspnea", "cough", "respiratory pain", "fever" or "nasal symptoms")
      "text": phrase in the text that mentions the symptom and whether it is present or absent
   }  
   {
      "symptom": ...
      "text":...
   }
   ...
]

Keep the following instructions in mind:  
- The same symptom may be mentioned multiple times. Include all phrases in which a symptom is mentioned. Consider both the "history" portion of the note, and the "physical examination" portion of the note.
- Also annotate a symptom if the note mentions that the patient does not suffer from it. 
- The phrases do not need to be full sentences, but need to be verbatim as they appear in the note. You are not allowed to alter any words. If you leave out words, use ...
- Order does not matter.
- You will reply only with the JSON itself, and you will not wrap in JSON markers.
- You can only extract phrases from the "clinical note", not from any of the other text in the prompt. 
- Not all symptoms are necessarily mentioned in the note. Do not include a symptom in the JSON if you cannot find any implicit or explicit mention of it in the clinical note. 
"""

def prompt_GPT_full(row, compl="normal"): 

   messages = []
   system_message = {"role": "system", "content": SYS_MESSAGE_NORMAL}
   messages.append(system_message)

   messages.append({"role": "user", "content": generate_prompt(row)})
   res = openai.chat.completions.create(
      model = "gpt-4o", 
      temperature = 0.2, 
      max_tokens = 2048,
      messages = messages
    )
   response = res.choices[0].message.content # response

   return response

In [14]:
response = prompt_GPT_full(df.loc[0])

In [15]:
print(response)

[
   {
      "symptom": "fever",
      "text": "a significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever"
   },
   {
      "symptom": "fever",
      "text": "feeling very fatigued due to the fever"
   },
   {
      "symptom": "fever",
      "text": "Vital signs show elevated temperature (103 °F)"
   },
   {
      "symptom": "fever",
      "text": "corresponding with the fever"
   },
   {
      "symptom": "fever",
      "text": "there are no evident physical findings to explain the fever apart from the stated temperature elevation"
   },
   {
      "symptom": "dyspnea",
      "text": "There have been no respiratory symptoms such as pain, dyspnea, or cough"
   },
   {
      "symptom": "cough",
      "text": "There have been no respiratory symptoms such as pain, dyspnea, or cough"
   },
   {
      "symptom": "respiratory pain",
      "text": "There have been no respiratory symptoms such as pain, dyspnea, or cough"
   }
]


In [41]:
import json
resp = json.loads(response)
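
Since the reply is free text, `json.loads` can fail or return something with an unexpected shape. The sketch below shows one way such a reply could be validated before further processing. The helper `parse_annotations` and the constant `ALLOWED_SYMPTOMS` are hypothetical names introduced for illustration; they are not part of the original pipeline.

```python
import json

# Symptom names taken from the annotation instructions above.
ALLOWED_SYMPTOMS = {"dyspnea", "cough", "respiratory pain", "fever", "nasal symptoms"}

def parse_annotations(raw):
    """Parse a model reply; return a list of entries, or None if it is unusable."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None  # reply was not valid JSON
    if not isinstance(parsed, list):
        return None  # the instructions ask for a JSON list
    for entry in parsed:
        # every entry must be an object with exactly the two expected keys
        if not isinstance(entry, dict) or set(entry) != {"symptom", "text"}:
            return None
        if entry["symptom"] not in ALLOWED_SYMPTOMS:
            return None  # hallucinated or misspelled symptom name
    return parsed

print(parse_annotations('[{"symptom": "fever", "text": "high fever"}]'))
print(parse_annotations("Sure, here is the JSON you asked for"))
```

A reply that yields `None` could then be set aside and re-prompted, in the same spirit as the "failed" records handled further on.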

We noticed that the LLM often hallucinates phrases that are not actually present in the note. We therefore post-process each response by matching the extracted phrases against the note text. An exact string search is tried first. If it fails, a regex fallback ignores capitalization, tolerates punctuation differences where the phrase contains spaces, and treats "..." as a gap of arbitrary text. We also check that the symptoms in the JSON response are named correctly, to ensure that the LLM is not adding hallucinated symptoms.

In [74]:
import re

def complete_annotations(ann_obj, compl, filter_empty=True): 

    completed_annotations = {}
    failed_attempts = {}

    for i, phrases in ann_obj.items():

        # retrieve note
        if compl == "normal":
            note = df.loc[int(i), "text"]
        elif compl == "adv": 
            note = df.loc[int(i), "advanced_text"]
        
        # find start and end character of phrase in the note
        # if simple regex fixes don't manage to find the phrase, we put the annotation aside for later
        for entry in phrases:

            phrase = entry["text"]

            if (len(phrase) != 0) or not filter_empty: # empty phrases are filtered out later
                start_idx = note.find(phrase) # check the note for the full phrase

                if start_idx == -1: 
                    regex_pattern = re.escape(phrase).replace(r'\.\.\.', r'.+?') # replace "..." with a regex pattern that matches any characters
                    regex_pattern = re.sub(r'\\ ', r'\\W*', regex_pattern)  # allow optional punctuation where spaces exist
                    match = re.search(regex_pattern, note, re.IGNORECASE) # ignore capital letters
                    if match: 
                        start_idx = match.start()
                        end_idx = match.end()
                    else: 
                        # no match: always reset end_idx, so a stale value from a previous phrase is never reused
                        end_idx = -1
                        failed_attempts[i] = phrases
                else: 
                    end_idx = start_idx + len(phrase)

                entry["start"] = start_idx
                entry["end"] = end_idx

                # check if symptoms have the correct names 
                if entry["symptom"] not in ["dyspnea", "cough", "respiratory pain", "fever", "nasal symptoms"]: 
                    failed_attempts[i] = phrases

                if i not in failed_attempts:
                    new_phrase = note[start_idx:end_idx]
                    entry["text"] = new_phrase # make sure the phrase exactly corresponds to what is found in the text

        if i not in failed_attempts:
            if filter_empty:
                completed_annotations[i] = [entry for entry in phrases if len(entry["text"]) != 0] # leave out phrases that are empty ("")
            else: 
                completed_annotations[i] = phrases
            
    return completed_annotations, failed_attempts 
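
To see the matching logic in isolation, the sketch below restates it for a single phrase on a made-up note. The function `find_span` and the string `toy_note` are hypothetical and only serve as illustration; the regex construction is the same as in `complete_annotations`.

```python
import re

def find_span(note, phrase):
    """Return (start, end) of phrase in note, or (-1, -1) if it cannot be located."""
    start = note.find(phrase)  # exact match first
    if start != -1:
        return start, start + len(phrase)
    # fallback: "..." may stand for any text, spaces may hide punctuation
    pattern = re.escape(phrase).replace(r"\.\.\.", r".+?")
    pattern = re.sub(r"\\ ", r"\\W*", pattern)
    match = re.search(pattern, note, re.IGNORECASE)  # ignore capital letters
    return (match.start(), match.end()) if match else (-1, -1)

toy_note = "Patient denies cough, fever, or chills. Lungs are clear."

print(find_span(toy_note, "denies cough"))               # (8, 20): exact match
print(find_span(toy_note, "patient denies ... chills"))  # (0, 38): regex fallback
print(find_span(toy_note, "reports cough"))              # (-1, -1): not in the note
```

The second call only succeeds through the fallback: the phrase is lowercased and uses "..." to skip "cough, fever, or".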

In [43]:
ann, failed = complete_annotations({"0": resp}, compl="normal")

In [44]:
ann

{'0': [{'symptom': 'fever',
   'text': 'a significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever',
   'start': 28,
   'end': 143},
  {'symptom': 'fever',
   'text': 'feeling very fatigued due to the fever',
   'start': 271,
   'end': 309},
  {'symptom': 'fever',
   'text': 'Vital signs show elevated temperature (103 °F)',
   'start': 556,
   'end': 602},
  {'symptom': 'fever',
   'text': 'corresponding with the fever',
   'start': 650,
   'end': 678},
  {'symptom': 'fever',
   'text': 'there are no evident physical findings to explain the fever apart from the stated temperature elevation',
   'start': 985,
   'end': 1088},
  {'symptom': 'dyspnea',
   'text': 'There have been no respiratory symptoms such as pain, dyspnea, or cough',
   'start': 145,
   'end': 216},
  {'symptom': 'cough',
   'text': 'There have been no respiratory symptoms such as pain, dyspnea, or cough',
   'start': 145,
   'end': 216},
  {'symptom': 'respirato

In [45]:
failed

{}

Records for which at least one phrase could not be matched to the text go into the "failed" set. The other records receive "start" and "end" character indices, indicating where each phrase can be found in the note.

Here, the "failed" set is empty. To show what happens to failed records, we change our correct annotation into a partially incorrect one.

In [46]:
resp[0]["text"] = "an increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever"

In [47]:
ann, failed = complete_annotations({"0": resp}, compl="normal")

In [48]:
failed

{'0': [{'symptom': 'fever',
   'text': 'an increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever',
   'start': -1,
   'end': -1},
  {'symptom': 'fever',
   'text': 'feeling very fatigued due to the fever',
   'start': 271,
   'end': 309},
  {'symptom': 'fever',
   'text': 'Vital signs show elevated temperature (103 °F)',
   'start': 556,
   'end': 602},
  {'symptom': 'fever',
   'text': 'corresponding with the fever',
   'start': 650,
   'end': 678},
  {'symptom': 'fever',
   'text': 'there are no evident physical findings to explain the fever apart from the stated temperature elevation',
   'start': 985,
   'end': 1088},
  {'symptom': 'dyspnea',
   'text': 'There have been no respiratory symptoms such as pain, dyspnea, or cough',
   'start': 145,
   'end': 216},
  {'symptom': 'cough',
   'text': 'There have been no respiratory symptoms such as pain, dyspnea, or cough',
   'start': 145,
   'end': 216},
  {'symptom': 'respiratory pain',
  

Note that the phrase which could not be found in the text has start and end index -1. 

Due to the large number of notes, we cannot edit all these mistakes manually. We therefore constructed a second prompt that asks the LLM to correct these phrases to phrases that can actually be found in the text. 

In [49]:
SYS_MESSAGE = """I will show you a clinical note, together with one or more phrases that were extracted from it. However, some mistakes were made in extracting these phrases. You must correct them."""

def create_phrase_string(phrases): 
    str_format = ""
    for phrase in phrases: 
        str_format += f"- {phrase}\n"
    return str_format

def correcting_prompt(note, extracted_phrases): 
    
    messages = []
    system_message = {"role": "system", "content": SYS_MESSAGE}
    messages.append(system_message)

    note = note.replace("\n", " ")
    user_msg = f"The following is a clinical note:\n{note}\n\n"
    user_msg += "The following phrases were extracted from this note. However, they do not exactly match the text:\n"
    user_msg += create_phrase_string(extracted_phrases)
    user_msg += f"\n\nPlease correct the phrases so they map exactly to a phrase in the text. "
    user_msg += """You must reply with the following JSON format: 
{
   original phrase: corrected phrase
}

You will reply only with the JSON itself, and you will not wrap in JSON markers."""

    print(user_msg)

    messages.append({"role": "user", "content": user_msg})
    res = openai.chat.completions.create(
        model = "gpt-4o", 
        temperature = 0.2, 
        max_tokens = 2048,
        messages = messages
        )
    
    response = res.choices[0].message.content
    
    return response

In [52]:
idx = "0"
wrong_phrases = set([phrase["text"] for phrase in failed[idx] if phrase["start"] == -1])
note = df.loc[int(idx), "text"]
response = correcting_prompt(note, wrong_phrases)

The following is a clinical note:
**History** Patient reports a significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever. There have been no respiratory symptoms such as pain, dyspnea, or cough. The patient illustrates general malaise and mentions feeling very fatigued due to the fever. No notable changes in daily routine or exposure to environments that might typically contribute to fever are reported. Recent stress levels and potential exposure to infectious agents during travels are also discussed.  **Physical Examination** Vital signs show elevated temperature (103 °F). Heart rate is slightly tachycardic at 98 bpm, corresponding with the fever. Oxygen saturation is within normal limits at 98%, and lungs are clear to auscultation without any added sounds. Abdominal examination is normal, without tenderness or organomegaly. Skin shows no rashes, warmth, or lesions. Capillary refill time is adequate. Neurological assessment is n

In [53]:
print(response)

{
   "an increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever": "a significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever"
}


We then run the complete_annotations function again to check whether the corrected phrase can be found in the note. In this case it can, so the phrase is corrected. This worked for the majority of incorrectly extracted phrases. A small set of notes (around 10) could not be corrected this way, so we corrected them manually.
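
The step of writing the corrected phrases back into the annotations is not shown in this notebook. A minimal sketch of how it could look is given below, assuming the LLM reply has the `{original phrase: corrected phrase}` format requested above. The helper `apply_corrections` and the toy data are hypothetical.

```python
import json

def apply_corrections(entries, corrections):
    """Return new entries in which each unmatched phrase is replaced by its correction."""
    fixed = []
    for entry in entries:
        # drop the old offsets; they are recomputed when matching is run again
        new_entry = {"symptom": entry["symptom"], "text": entry["text"]}
        if entry.get("start", -1) == -1 and entry["text"] in corrections:
            new_entry["text"] = corrections[entry["text"]]
        fixed.append(new_entry)
    return fixed

toy_failed = [
    {"symptom": "fever", "text": "an increase in body temperature", "start": -1, "end": -1},
    {"symptom": "cough", "text": "no cough", "start": 40, "end": 48},
]
toy_response = '{"an increase in body temperature": "a significant increase in body temperature"}'

toy_fixed = apply_corrections(toy_failed, json.loads(toy_response))
print(toy_fixed)
```

The resulting entries could then be passed to `complete_annotations` once more, which either assigns new offsets or places the note in the "failed" set again.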

The final span annotations for the normal notes can be found in "data/spans/normal_span_annotations.json". 

In [55]:
with open("../data/spans/normal_span_annotations.json", "r") as file:
    ann = json.load(file)

Example:

In [56]:
ann["5623"]

[{'symptom': 'fever',
  'text': 'mild fever onset over the past two days',
  'start': 28,
  'end': 67},
 {'symptom': 'fever',
  'text': 'Temperature: 37.8°C (low-grade fever)',
  'start': 436,
  'end': 473},
 {'symptom': 'respiratory pain',
  'text': 'notable respiratory pain primarily during deep inspiration and chest movements',
  'start': 78,
  'end': 156},
 {'symptom': 'cough',
  'text': 'The patient denies any cough',
  'start': 158,
  'end': 186},
 {'symptom': 'dyspnea',
  'text': 'The patient denies any cough or dyspnea',
  'start': 158,
  'end': 197},
 {'symptom': 'nasal symptoms',
  'text': 'The patient reports no recent travel, sick contacts, or changes in routine that could explain the symptoms',
  'start': 302,
  'end': 408}]

In [58]:
import textwrap
print(textwrap.fill(df.loc[5623, "text"], 100))

**History** Patient reports mild fever onset over the past two days. There is notable respiratory
pain primarily during deep inspiration and chest movements. The patient denies any cough or dyspnea.
It is stated that regular activities have become uncomfortable due to the persistent chest
discomfort. The patient reports no recent travel, sick contacts, or changes in routine that could
explain the symptoms.  **Physical Examination** Temperature: 37.8°C (low-grade fever). Lungs
auscultated with clear breath sounds and no adventitious sounds. Chest palpation reveals tenderness
in the costal region without any apparent soft tissue swelling. Respirations are regular, with no
signs of labored or distressed breathing. Vital signs: blood pressure 120/80 mmHg, heart rate 76
bpm, respiratory rate 16 breaths per minute. Cardiovascular examination is unremarkable. Skin: warm
and dry, no rash or lesions.
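
A quick consistency check on saved annotations is to verify that each stored offset pair reproduces the stored phrase. The sketch below does this on a shortened copy of the note above; `check_spans` and the toy entries are hypothetical, and the second entry is deliberately wrong to show what a mismatch looks like.

```python
def check_spans(note, entries):
    """Return the entries whose (start, end) offsets do not reproduce their text."""
    return [e for e in entries if note[e["start"]:e["end"]] != e["text"]]

toy_note = "**History** Patient reports mild fever onset over the past two days."
toy_entries = [
    # offsets as in the saved annotation above
    {"symptom": "fever", "text": "mild fever onset over the past two days", "start": 28, "end": 67},
    # deliberately inconsistent offsets
    {"symptom": "fever", "text": "mild fever", "start": 0, "end": 10},
]

bad = check_spans(toy_note, toy_entries)
print(bad)  # only the second entry is reported
```

Note that this only checks that the offsets are consistent with the text. It says nothing about whether the phrase actually describes the symptom it is labeled with.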


## Advanced notes

For the advanced notes, getting the LLM to accurately extract symptom spans from scratch proved more challenging. We therefore start from the phrases extracted from the normal note, and ask the LLM to match these to phrases in the advanced note.

The system instructions are as follows.

---

I will show you two versions of a clinical note. The first version describes a patient's visit to the doctor's office. The second one describes the same visit, but in a more compact style (using abbreviations and shortcuts), while preserving the overall message. 

I will show you a set of phrases which were extracted from the first version of the note. Your task is to map these to phrases in the second version of the note. 

You must reply with the following JSON format.
{
   phrase in version 1 : corresponding phrase extracted in version 2
}

Keep the following instructions in mind: 
- Please extract phrases verbatim. 
- Please use the empty string if you cannot find a phrase with the same meaning.
- The phrases you extract must have the same meaning, you cannot simply copy phrases that are in the same spot in the text.
- You will reply only with the JSON itself, and you will not wrap in JSON markers.

---

The full prompt is created as follows:

In [63]:
def create_phrase_string(phrases):
    list_str = "[\n"
    sel_phrases = []
    for phrase in phrases: 
        sent = phrase["text"]
        if sent not in sel_phrases: # only include each phrase once
            sel_phrases.append(sent)
            list_str += f"\"{sent}\",\n"
    list_str = list_str[:-2]
    list_str += "\n]"
    return list_str
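
The function above builds the list by hand, which would produce an invalid list if a phrase itself contained a double quote. As an alternative sketch (not what was used to produce the annotations), the same deduplicated list can be serialized with `json.dumps`, which escapes such characters. The name `phrase_list_json` and the toy entries are hypothetical.

```python
import json

def phrase_list_json(entries):
    """Serialize the unique phrases (in first-seen order) as a JSON list."""
    unique = list(dict.fromkeys(entry["text"] for entry in entries))
    # ensure_ascii=False keeps characters such as "°" readable in the prompt
    return json.dumps(unique, indent=3, ensure_ascii=False)

toy_entries = [
    {"symptom": "dyspnea", "text": "no pain, dyspnea, or cough"},
    {"symptom": "cough", "text": "no pain, dyspnea, or cough"},
    {"symptom": "fever", "text": 'reports "burning up" (103 °F)'},
]

toy_list = phrase_list_json(toy_entries)
print(toy_list)
```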

In [62]:
SYS_MESSAGE_ADV = """I will show you two versions of a clinical note. The first version describes a patient's visit to the doctor's office. The second one describes the same visit, but in a more compact style (using abbreviations and shortcuts), while preserving the overall message. 

I will show you a set of phrases which were extracted from the first version of the note. Your task is to map these to phrases in the second version of the note. 

You must reply with the following JSON format.
{
   phrase in version 1 : corresponding phrase extracted in version 2
}

Keep the following instructions in mind: 
- Please extract phrases verbatim. 
- Please use the empty string if you cannot find a phrase with the same meaning.
- The phrases you extract must have the same meaning, you cannot simply copy phrases that are in the same spot in the text.
- You will reply only with the JSON itself, and you will not wrap in JSON markers.
"""

def prompt_GPT_advanced(note_ver1, note_ver2, phrases): 

   if len(phrases) == 0: 
      return "{}" # if no phrases to be extracted, don't prompt the LLM

   messages = []
   system_message = {"role": "system", "content": SYS_MESSAGE_ADV}
   messages.append(system_message)

   note_ver1 = note_ver1.replace("\n", " ")
   note_ver2 = note_ver2.replace("\n", " ")
   user_msg = f"Clinical note, version 1:\n{note_ver1}\n\n"
   user_msg += "Extracted phrases:\n"
   user_msg += create_phrase_string(phrases)
   user_msg += f"\n\nPlease extract the equivalent phrases from the second version of the note.\n\nClinical note, version 2:\n{note_ver2}"

   print(user_msg)

   messages.append({"role": "user", "content": user_msg})
   res = openai.chat.completions.create(
      model = "gpt-4o", 
      temperature = 0.2, 
      max_tokens = 2048,
      messages = messages
    )
   first_response = res.choices[0].message.content
   messages.append({"role": "assistant", "content": first_response}) # add response of assistant to chat history

   # additional check
   check_msg = "Please check if each extracted phrase has the same meaning as the original phrase. If not, substitute it by the empty string (\"\"). The rest of the JSON must remain unchanged."
   messages.append({"role": "user", "content": check_msg})

   res = openai.chat.completions.create(
      model = "gpt-4o", 
      temperature = 0.2, 
      max_tokens = 2048,
      messages = messages
    )
   final_response = res.choices[0].message.content

   return final_response

In [65]:
normal_note = df.loc[0, "text"]
adv_note = df.loc[0, "advanced_text"]
extracted_phrases = ann["0"]
response = prompt_GPT_advanced(normal_note, adv_note, extracted_phrases)

Clinical note, version 1:
**History** Patient reports a significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever. There have been no respiratory symptoms such as pain, dyspnea, or cough. The patient illustrates general malaise and mentions feeling very fatigued due to the fever. No notable changes in daily routine or exposure to environments that might typically contribute to fever are reported. Recent stress levels and potential exposure to infectious agents during travels are also discussed.  **Physical Examination** Vital signs show elevated temperature (103 °F). Heart rate is slightly tachycardic at 98 bpm, corresponding with the fever. Oxygen saturation is within normal limits at 98%, and lungs are clear to auscultation without any added sounds. Abdominal examination is normal, without tenderness or organomegaly. Skin shows no rashes, warmth, or lesions. Capillary refill time is adequate. Neurological assessment is non-focal

In [66]:
print(response)

{
   "significant increase in body temperature over the last 48 hours, exceeding normal ranges, indicating a high fever": "Pt reports high fever for 48 hrs",
   "feeling very fatigued due to the fever": "Describes significant fatigue and malaise",
   "Vital signs show elevated temperature (103 °F)": "VS: Temp 103 °F",
   "corresponding with the fever": "",
   "There have been no respiratory symptoms such as pain, dyspnea, or cough": "denies resp pain, dyspnea, or cough"
}


Note that since the advanced note is more compact, it sometimes leaves out phrases. In this case, "corresponding with the fever" was extracted from the normal note, but could not be matched with any phrase in the advanced note (since it is simply not there). When there is only one such case in a note, we simply remove it from the annotations. When there is more than one such case in a note, we try again with another LLM call (see further).
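
This rule can be sketched as a small triage step. The function `triage_unmatched` and the toy data are hypothetical, and the sketch assumes that empty matches are counted per annotation entry (an entry being one symptom/phrase pair), which the text does not specify.

```python
def triage_unmatched(entries):
    """Return (entries, needs_retry) following the one-vs-several rule for empty matches."""
    n_empty = sum(1 for entry in entries if entry["text"] == "")
    if n_empty >= 2:
        # several phrases are missing: keep everything and flag the note for another LLM call
        return entries, True
    # at most one phrase is missing: drop it and accept the note
    return [entry for entry in entries if entry["text"] != ""], False

toy_one_missing = [
    {"symptom": "fever", "text": "VS: Temp 103 °F"},
    {"symptom": "fever", "text": ""},
]
toy_two_missing = [
    {"symptom": "fever", "text": ""},
    {"symptom": "cough", "text": ""},
    {"symptom": "dyspnea", "text": "denies dyspnea"},
]

kept, retry = triage_unmatched(toy_one_missing)
print(kept, retry)
kept2, retry2 = triage_unmatched(toy_two_missing)
print(len(kept2), retry2)
```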

Once we have these matched phrase pairs, we go through the spans extracted from the normal note again and look up the matched phrase for each of them. In principle, the LLM could alter the normal-note phrase when repeating it in its response, which would make the lookup fail. In practice, we saw that this never happened.

In [68]:
def annotate_advanced_note(normal_ann, responses):

    adv_ann = {}
    failed = {}
    for i in responses: 
        orig_phrases = normal_ann[i]
        adv_phrases = responses[i]
        adv_entries = []
        for entry in orig_phrases: 
            phrase = entry["text"]
            try: 
                matched_phrase = adv_phrases[phrase]
            except KeyError: 
                # the LLM did not return this phrase as a key:
                # leave the match empty and set the note aside
                print("no exact match found!")
                matched_phrase = ""
                failed[i] = adv_phrases

            adv_entries.append({"symptom": entry["symptom"], "text": matched_phrase})
        adv_ann[i] = adv_entries

    return adv_ann, failed

In [67]:
resp = json.loads(response)

In [77]:
adv_ann, failed = annotate_advanced_note({"0": ann["0"]}, {"0": resp})

In [78]:
adv_ann

{'0': [{'symptom': 'fever', 'text': 'Pt reports high fever for 48 hrs'},
  {'symptom': 'fever', 'text': 'Describes significant fatigue and malaise'},
  {'symptom': 'fever', 'text': 'VS: Temp 103 °F'},
  {'symptom': 'fever', 'text': ''},
  {'symptom': 'dyspnea', 'text': 'denies resp pain, dyspnea, or cough'},
  {'symptom': 'cough', 'text': 'denies resp pain, dyspnea, or cough'},
  {'symptom': 'respiratory pain',
   'text': 'denies resp pain, dyspnea, or cough'}]}

Like before, we then try to match the phrases extracted from the advanced note with the text. If a phrase cannot be matched, the annotation is added to the batch of failed notes, and we correct the phrase using the same prompt we used for the normal notes. 

In [79]:
adv_ann, failed = complete_annotations(adv_ann, compl="adv", filter_empty=False)

In [80]:
adv_ann

{'0': [{'symptom': 'fever',
   'text': 'Pt reports high fever for 48 hrs',
   'start': 12,
   'end': 44},
  {'symptom': 'fever',
   'text': 'Describes significant fatigue and malaise',
   'start': 83,
   'end': 124},
  {'symptom': 'fever', 'text': 'VS: Temp 103 °F', 'start': 245, 'end': 260},
  {'symptom': 'fever', 'text': '', 'start': 0, 'end': 0},
  {'symptom': 'dyspnea',
   'text': 'denies resp pain, dyspnea, or cough',
   'start': 46,
   'end': 81},
  {'symptom': 'cough',
   'text': 'denies resp pain, dyspnea, or cough',
   'start': 46,
   'end': 81},
  {'symptom': 'respiratory pain',
   'text': 'denies resp pain, dyspnea, or cough',
   'start': 46,
   'end': 81}]}

As mentioned before, we keep track of the annotations for which no match could be found in the advanced note. If there are two or more such cases in the same note, we try to fix them with an additional LLM call. Instead of using the original annotations found in the normal note, and matching them with the advanced note, we ask the LLM to extract symptom phrases from the advanced note directly. By way of example, we adapt the annotations in our example so they fall in this category. Imagine the LLM did not find the denial of pain, dyspnea and cough in the advanced note: 

In [81]:
adv_ann["0"][4]["text"] = ""
adv_ann["0"][5]["text"] = ""
adv_ann["0"][6]["text"] = ""

We use the following prompt to correct these cases. The system instructions are:

---

I will show you a clinical note containing information on a patient's symptoms. For each symptom, I will tell you whether the patient suffers from this symptom or not. 

You will get to see phrases that have been extracted from the note, describing some of these symptoms. This input will have the following JSON structure:
[ 
   {
      "symptom": one of the symptoms ("dyspnea", "cough", "respiratory pain", "fever" or "nasal symptoms")
      "text": phrase in the text that mentions the symptom and whether it is present or absent
   }  
   {
      "symptom": ...
      "text":...
   }
   ...
]

Some phrases have not been filled in yet (indicated by "?" in the "text" field). Your task is to fill in these phrases, leaving the rest of the JSON and its structure untouched.

Keep the following instructions in mind:  
- Also annotate a symptom if the note mentions that the patient does not suffer from it. 
- The phrases do not need to be full sentences, but need to be verbatim as they appear in the note. You are not allowed to alter any words. If you leave out words, use ...
- You will reply only with the JSON itself, and you will not wrap in JSON markers.
- You can only extract phrases from the "clinical note", not from any of the other text in the prompt. 
- Not all symptoms are necessarily mentioned in the note. Simply fill in "" if you cannot find any implicit or explicit mention of it in the clinical note. 

---

In [83]:
def create_JSON_str(phrases):
    list_str = "[\n"
    for phrase in phrases: 
        sent = phrase["text"]
        sympt = phrase["symptom"]
        if len(sent) == 0:
            list_str += f"   {{\"symptom\":\"{sympt}\", \"text\": \"?\"}},\n"
        else: 
            list_str += f"   {{\"symptom\":\"{sympt}\", \"text\": \"{sent}\"}},\n"
    list_str = list_str[:-2]
    list_str += "\n]"
    return list_str

SYS_MESSAGE_CORR = """I will show you a clinical note containing information on a patient's symptoms. For each symptom, I will tell you whether the patient suffers from this symptom or not. 

You will get to see phrases that have been extracted from the note, describing some of these symptoms. This input will have the following JSON structure:
[ 
   {
      "symptom": one of the symptoms ("dyspnea", "cough", "respiratory pain", "fever" or "nasal symptoms")
      "text": phrase in the text that mentions the symptom and whether it is present or absent
   }  
   {
      "symptom": ...
      "text":...
   }
   ...
]

Some phrases have not been filled in yet (indicated by "?" in the "text" field). Your task is to fill in these phrases, leaving the rest of the JSON and its structure untouched.

Keep the following instructions in mind:  
- Also annotate a symptom if the note mentions that the patient does not suffer from it. 
- The phrases do not need to be full sentences, but need to be verbatim as they appear in the note. You are not allowed to alter any words. If you leave out words, use ...
- You will reply only with the JSON itself, and you will not wrap in JSON markers.
- You can only extract phrases from the "clinical note", not from any of the other text in the prompt. 
- Not all symptoms are necessarily mentioned in the note. Simply fill in "" if you cannot find any implicit or explicit mention of it in the clinical note. 
"""

sympt_dict = {"dyspnea": "dysp", "cough": "cough", "nasal symptoms": "nasal", "respiratory pain": "pain", "fever": "fever"}

def prompt_GPT_advanced_corrected(row, phrases): 

   if len(phrases) == 0: 
      return "{}" # if no phrases to be extracted, don't prompt the LLM

   messages = []
   system_message = {"role": "system", "content": SYS_MESSAGE_CORR}
   messages.append(system_message)

   note_adv = row["advanced_text"].replace("\n", " ")

   user_msg = f"The following information is known about the patient's symptoms:\n"
   for sympt, sympt_col_name in sympt_dict.items():  
      sympt_val = row[sympt_col_name]
      user_msg += f"- {sympt}: {sympt_val}\n"
   
   user_msg += f"\n Following the instructions you received, please extract from the following clinical note all phrases (verbatim) that describe these symptoms:\n{note_adv}\n\n"
   user_msg += "Please fill in the missing text (indicated with \"?\"). If you cannot find any mention of a symptom, simply fill in \"\".\n"
   user_msg += create_JSON_str(phrases)

   print(user_msg)
   
   messages.append({"role": "user", "content": user_msg})
   res = openai.chat.completions.create(
      model="gpt-4o",
      temperature=0.2,
      max_tokens=2048,
      messages=messages,
   )
   response = res.choices[0].message.content

   return response

In [85]:
response = prompt_GPT_advanced_corrected(df.loc[0], adv_ann["0"])

The following information is known about the patient's symptoms:
- dyspnea: no
- cough: no
- nasal symptoms: no
- respiratory pain: no
- fever: high

 Following the instructions you received, please extract from the following clinical note all phrases (verbatim) that describe these symptoms:
**History** Pt reports high fever for 48 hrs, denies resp pain, dyspnea, or cough. Describes significant fatigue and malaise. No recent routine changes or known infectious exposures. Discusses recent stress and travel.  **Physical Examination** VS: Temp 103 °F, HR 98 bpm (tachycardic), O2 sat 98%. Lungs clear, no adventitious sounds. Abd: non-tender, no organomegaly. Skin: no rashes/lesions, normal CRT. Neuro: non-focal. Overall: WNL apart from elevated fever.

Please fill in the missing text (indicated with "?"). If you cannot find any mention of a symptom, simply fill in "".
[
   {"symptom":"fever", "text": "Pt reports high fever for 48 hrs"},
   {"symptom":"fever", "text": "Describes significant

In [86]:
print(response)

[
   {"symptom":"fever", "text": "Pt reports high fever for 48 hrs"},
   {"symptom":"fever", "text": "Describes significant fatigue and malaise"},
   {"symptom":"fever", "text": "VS: Temp 103 °F"},
   {"symptom":"fever", "text": "Overall: WNL apart from elevated fever"},
   {"symptom":"dyspnea", "text": "denies resp pain, dyspnea, or cough"},
   {"symptom":"cough", "text": "denies resp pain, dyspnea, or cough"},
   {"symptom":"respiratory pain", "text": "denies resp pain, dyspnea, or cough"}
]


In [88]:
resp = json.loads(response)

In [89]:
adv_ann["0"] = resp
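
Before storing a reply, it can help to check that it has the expected structure and that every phrase really occurs verbatim in the note. The sketch below is one possible way to do this; `parse_annotation_response` is a hypothetical helper, not part of the original pipeline. It assumes the reply is either a JSON list of `{"symptom", "text"}` objects or the `"{}"` placeholder returned when there was nothing to extract, and that `...` inside a phrase marks omitted words, as described in the system message.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built here so no literal fence appears in the cell


def parse_annotation_response(response, note_text):
    """Parse an LLM reply and report phrases that are not verbatim in the note.

    Hypothetical helper: returns (annotations, not_verbatim).
    """
    cleaned = response.strip()
    # Assumed safeguard: drop a wrapping code fence if the model added one anyway.
    if cleaned.startswith(FENCE):
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]

    parsed = json.loads(cleaned)
    if parsed == {}:
        # The "nothing to extract" placeholder parses to an empty dict.
        return [], []
    if not isinstance(parsed, list):
        raise ValueError("Expected a JSON list of annotations")

    not_verbatim = []
    for ann in parsed:
        if not isinstance(ann, dict) or not {"symptom", "text"} <= set(ann):
            raise ValueError(f"Malformed annotation: {ann!r}")
        # "..." stands for omitted words, so each remaining fragment is checked separately.
        fragments = [f.strip() for f in ann["text"].split("...") if f.strip()]
        if any(f not in note_text for f in fragments):
            not_verbatim.append(ann)
    return parsed, not_verbatim
```

Annotations returned in `not_verbatim` would be the natural candidates for a corrective follow-up call.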

This time, the LLM was able to extract the phrases pointing to these symptoms. As before, we use the "complete_annotations" function to complete the annotations, and the phrases that could not be found in the text of the failed notes are corrected with an additional LLM call. We now remove the empty annotations, since at this point an empty annotation most likely means that the symptom really is not mentioned in the advanced note.

In [90]:
adv_ann, failed = complete_annotations(adv_ann, compl="adv", filter_empty=True)

In [92]:
adv_ann

{'0': [{'symptom': 'fever',
   'text': 'Pt reports high fever for 48 hrs',
   'start': 12,
   'end': 44},
  {'symptom': 'fever',
   'text': 'Describes significant fatigue and malaise',
   'start': 83,
   'end': 124},
  {'symptom': 'fever', 'text': 'VS: Temp 103 °F', 'start': 245, 'end': 260},
  {'symptom': 'fever',
   'text': 'Overall: WNL apart from elevated fever',
   'start': 425,
   'end': 463},
  {'symptom': 'dyspnea',
   'text': 'denies resp pain, dyspnea, or cough',
   'start': 46,
   'end': 81},
  {'symptom': 'cough',
   'text': 'denies resp pain, dyspnea, or cough',
   'start': 46,
   'end': 81},
  {'symptom': 'respiratory pain',
   'text': 'denies resp pain, dyspnea, or cough',
   'start': 46,
   'end': 81}]}
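
The `start` and `end` offsets in the output above are consistent with a plain substring search over the note, with `end` exclusive. The sketch below illustrates that idea; `locate_spans` is a hypothetical stand-in and not the actual `complete_annotations` implementation. It assumes that only the first occurrence of a phrase is used and that phrases which cannot be found are returned separately for correction.

```python
def locate_spans(annotations, note_text, filter_empty=True):
    """Attach character offsets to annotations via exact substring search.

    Hypothetical sketch: returns (located, failed).
    """
    located, failed = [], []
    for ann in annotations:
        phrase = ann["text"]
        if phrase == "":
            # Empty phrase: the symptom was not found in the note.
            if not filter_empty:
                located.append(dict(ann))
            continue
        start = note_text.find(phrase)  # first occurrence only (assumption)
        if start == -1:
            # Phrase is not verbatim in the note; leave it for a corrective step.
            failed.append(dict(ann))
            continue
        located.append({**ann, "start": start, "end": start + len(phrase)})
    return located, failed


note = ("**History** Pt reports high fever for 48 hrs, denies resp pain, "
        "dyspnea, or cough. Describes significant fatigue and malaise.")
spans, failed = locate_spans(
    [{"symptom": "fever", "text": "Pt reports high fever for 48 hrs"},
     {"symptom": "cough", "text": "denies resp pain, dyspnea, or cough"},
     {"symptom": "nasal symptoms", "text": ""}],
    note,
)
```

With `filter_empty=True`, the empty "nasal symptoms" entry is dropped, mirroring the filtering step described above.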

Finally, we delete duplicate annotations, i.e. entries with exactly the same symptom and exactly the same phrase. This leaves us with the full annotations for the advanced notes, of which an example is shown below. 
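
The deduplication step can be sketched as follows. `drop_duplicate_annotations` is a hypothetical helper, assuming each annotation is a dict with at least the keys `symptom` and `text`. The same phrase annotated for two different symptoms is kept, since only exact (symptom, phrase) pairs count as duplicates.

```python
def drop_duplicate_annotations(annotations):
    """Keep the first annotation for each (symptom, text) pair, preserving order."""
    seen = set()
    unique = []
    for ann in annotations:
        key = (ann["symptom"], ann["text"])
        if key not in seen:
            seen.add(key)
            unique.append(ann)
    return unique


example = [
    {"symptom": "fever", "text": "Temp 103 °F", "start": 245, "end": 256},
    {"symptom": "fever", "text": "Temp 103 °F", "start": 245, "end": 256},
    {"symptom": "cough", "text": "denies resp pain, dyspnea, or cough", "start": 46, "end": 81},
    {"symptom": "dyspnea", "text": "denies resp pain, dyspnea, or cough", "start": 46, "end": 81},
]
deduplicated = drop_duplicate_annotations(example)
```

Applied per note, this would look like `{note_id: drop_duplicate_annotations(anns) for note_id, anns in adv_ann.items()}`.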

In [93]:
with open("../data/spans/adv_span_annotations.json", "r") as file:
    ann = json.load(file)

Example:

In [97]:
ann["8596"]

[{'symptom': 'dyspnea', 'text': 'no dyspnea', 'start': 110, 'end': 120},
 {'symptom': 'cough', 'text': 'no dyspnea or cough', 'start': 110, 'end': 129},
 {'symptom': 'nasal symptoms',
  'text': 'No notable PMHx of resp illnesses or chronic conditions.',
  'start': 285,
  'end': 341},
 {'symptom': 'respiratory pain',
  'text': 'sharp resp pain during deep breaths',
  'start': 73,
  'end': 108},
 {'symptom': 'respiratory pain',
  'text': 'Pain localized to lower right thorax, worsens with movement or cough.',
  'start': 131,
  'end': 200},
 {'symptom': 'fever',
  'text': 'sudden onset of high fever x 2 days',
  'start': 23,
  'end': 58},
 {'symptom': 'fever', 'text': 'Temp 39.2°C', 'start': 368, 'end': 379}]

In [98]:
import textwrap
print(textwrap.fill(df.loc[5623, "advanced_text"], 100))

**History** Pt reports mild fever onset x 2 days. Notable respiratory pain w/ deep inspiration &
chest movements. Denies cough or dyspnea. Stated regular activities uncomfortable due to chest
discomfort. No recent travel, sick contacts, or routine changes explained.  **Physical Examination**
Temp: 37.8°C (low-grade fever). Lungs: clear breath sounds, no adventitious sounds. Chest: tender on
costal palp, no soft tissue swelling. Regular respirations, no labored breathing. Vitals: BP 120/80
mmHg, HR 76 bpm, RR 16 bpm. CV exam unremarkable. Skin: warm, dry, no rash/lesions.
