In [1]:
import json

def get_dataset(dataset_path):
    """Load a JSON dataset from the given file path."""
    with open(dataset_path, 'r') as file:
        return json.load(file)

In [14]:
from approaches.approach2 import approach2
from llms.llm_interaction import GroqClient

patient_records2 = get_dataset('datasets/patient_records2.json')
llm_client = GroqClient(model="llama-3.3-70b-versatile")

Groq client initialized with model: llama-3.3-70b-versatile


In [3]:
system_prompt = """
        You are a helpful AI assistant that strictly outputs a python list of tuples, where each tuple is (<semantic_symbol>, explanation) in the order they appear in the patient record.
        The output should be parseable with ast.literal_eval().
        """
prompt = """
            Given the following patient record, extract the following semantic symbols if they exist: {regex}. 
            Return a machine parseable python list of tuples, where each tuple is (<semantic_symbol>, explanation) in the order they appear in the patient record. Only include semantic symbols that are explicitly represented in the patient record, i.e. their explanation should be the actual text in which they appear. 
            IMPORTANT: Only return the list, nothing else.
            Make sure the order of the list reflects the order in which the semantic symbols appear in the patient record, not the order in which they are listed in the regex.
            The explanation should be a brief description of where/how the symbol appears in the text.
            \n\nPatient Record: {record_text}
            """
results = approach2(
    patient_records2[:1],  # only the first record
    llm_client,
    verbose=True,
    order_sensitive=True,
    system_prompt=system_prompt,
    extraction_prompt_template=prompt,
)

Processing 1 records...

Patient Record:
ADMISSION DIAGNOSIS
Pancytosis with concerns for chronic myeloproliferative disorder versus myeloproliferative neoplasm was diagnosed in an eighty-year-old female retired school teacher, presenting to the emergency department upon advice of primary care for abnormal laboratory findings discovered during evaluation of fatigue.

HISTORY OF PRESENT ILLNESS
The patient reports several weeks of progressive and severe fatigue and weight loss that made daily tasks difficult, necessitating help with managing the household from family members. The primary care provider initially attributed these findings to aging, poor diet, and deconditioning but pursued laboratory evaluations when symptom severity necessitated an ER evaluation. This led to the discovery of abnormal blood work.

PAST MEDICAL HISTORY
Relevant medical conditions include coronary artery disease managed by medications including carvedilol, a prior appendicitis treated surgically at a young 
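
The system prompt in cell In [3] asks for a reply that is parseable with ast.literal_eval(). The following is a minimal sketch of how such a reply could be parsed defensively, assuming the model may add stray text around the list. `parse_symbol_list` is a hypothetical helper and is not part of `approach2`; the symbol names in the example reply are made up for illustration.

```python
import ast

def parse_symbol_list(raw_reply):
    """Parse a reply expected to hold a list of (symbol, explanation) tuples.

    Returns an empty list when the reply cannot be parsed.
    """
    # Keep only the span from the first '[' to the last ']' so that
    # surrounding chatter or code fences do not break parsing.
    start, end = raw_reply.find("["), raw_reply.rfind("]")
    if start == -1 or end < start:
        return []
    try:
        parsed = ast.literal_eval(raw_reply[start:end + 1])
    except (ValueError, SyntaxError):
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only well-formed pairs of strings, preserving their order.
    return [
        (item[0], item[1])
        for item in parsed
        if isinstance(item, tuple)
        and len(item) == 2
        and all(isinstance(part, str) for part in item)
    ]

# Hypothetical reply; real symbols come from the {regex} field of each record.
example_reply = (
    "[('<age>', 'an eighty-year-old female'), "
    "('<symptom>', 'progressive and severe fatigue')]"
)
pairs = parse_symbol_list(example_reply)
```

Returning an empty list on failure keeps a batch run going; raising an exception instead may be preferable when malformed replies should be counted separately.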

## Annotation approach

In [13]:
system_prompt = """
        You are a helpful AI assistant that annotates a patient record with the provided semantic symbols if they exist in the record.
        You surround each text instance of a semantic symbol with <semantic_symbol> text text text </semantic_symbol>.
        You return the entire annotated record as a string.
        """
prompt = """
            Given the following patient record, annotate it with the provided semantic symbols if they exist in the record. Annotate by surrounding each text instance of a semantic symbol with <semantic_symbol> text text text </semantic_symbol>. Semantic symbols: {regex}.
            \n\nPatient Record: {record_text}
            """
record = patient_records2[5]
res = llm_client.generate(
    prompt.format(regex=record['s_regex'], record_text=record['record']),
    system_prompt=system_prompt,
    max_tokens=1000,
)
print(f'regular patient record: {record["record"]}')
print(f'semantic regex: {record["s_regex"]}')
print(f"Annotated Patient Record:\n{res}\n")


regular patient record: ADMISSION DIAGNOSIS 
The patient, a 72-year-old retired machinist, was admitted with a diagnosis of community-acquired pneumonia and associated hypoxemia requiring supplemental oxygen therapy. 

HISTORY OF PRESENT ILLNESS 
The patient had become symptomatic over the preceding week with gradually increasing cough, mild fatigue, and difficulty breathing, exacerbated by exertion. Initial management by the primary care physician included oral antibiotics and supportive care, which unfortunately did not yield the expected improvement. Consequently, upon clinical deterioration and given concerns for respiratory compromise, the patient presented to the emergency department for further evaluation and management.

PAST MEDICAL HISTORY 
The patient has a long-standing history of chronic obstructive pulmonary disease, complicated coronary artery disease status post bypass grafting, hypertension, and gastroesophageal reflux disease. Regular medications include aspirin, a be
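
The annotation prompt asks the model to wrap each matching span in `<semantic_symbol> ... </semantic_symbol>` tags and return the whole record. The following is a minimal sketch of how the tagged spans could be read back out in document order, assuming tag names are simple identifiers and tags are not nested. `extract_annotations` and `remove_tags` are hypothetical helpers, and the `age` and `occupation` tags in the example are made up for illustration.

```python
import re

# Matches <symbol>text</symbol>, where the closing tag repeats the opening name.
TAG_PATTERN = re.compile(
    r"<(?P<symbol>[A-Za-z_][\w-]*)>(?P<text>.*?)</(?P=symbol)>",
    re.DOTALL,
)

def extract_annotations(annotated_record):
    """Return (symbol, text) pairs in the order they appear in the record."""
    return [
        (match.group("symbol"), match.group("text").strip())
        for match in TAG_PATTERN.finditer(annotated_record)
    ]

def remove_tags(annotated_record):
    """Drop opening and closing tags but keep the text between them."""
    return re.sub(r"</?[A-Za-z_][\w-]*>", "", annotated_record)

# Hypothetical annotated snippet based on the first sentence of the record.
annotated = (
    "The patient, a <age>72-year-old</age> retired "
    "<occupation>machinist</occupation>, was admitted."
)
annotations = extract_annotations(annotated)
plain = remove_tags(annotated)
```

Comparing `remove_tags(res)` against `record["record"]` is one way to check that the model did not alter or truncate the text while annotating. If the model pads the tagged text with spaces, as the prompt's example format does, whitespace would need to be normalised before that comparison.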