# Generative example

This example uses the LangChain framework with Ollama to generate synthetic clinical notes from synthetic structured data.

Running this notebook requires a separate install of [Ollama](https://ollama.ai/). The model used is `llama2:latest` (also named `llama2:7b-chat`) from the [Ollama model library](https://ollama.ai/library/llama2/tags).

Open a terminal and run `ollama pull llama2` to download the model.
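
Inference will fail with a refused connection if the Ollama server is not running. The following is a minimal sketch of a reachability check, assuming the server listens on Ollama's default address `http://localhost:11434`. The helper name `ollama_is_running` is hypothetical and is not part of Ollama or LangChain.

```python
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url (assumed Ollama default)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP error statuses.
        return False


print(ollama_is_running())
```

If this prints `False`, starting the Ollama application (or running `ollama serve` in a terminal) before running the inference cells is the usual remedy.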

## Import libraries

In [1]:
import json

from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate

## Load input from json

In [2]:
input_path = "../../privfp-poc-zk/experiments/02_generate_dataset/synthea_dataset_inpatient.json"

with open(input_path) as file:
    data = json.load(file)

batch = []

for i in range(5):
    batch.append({"data": json.dumps(data[i])})

In [4]:
batch[0]

{'data': '{"name": "Ms. Clementine Nienow", "NHS number": "830 864 0705", "address": "1080 Dach Flat Apt 97, Ashford, SP6 3UQ", "date of birth": "1973-04-08", "marital status": "single", "ethnicity": "White - Any other White background", "gender": "female", "visit type": "Encounter Inpatient", "visit date": "1976-10-17T16:25:58Z", "provider": {"doctor": "Hyman Schmeler", "facility": "Fordingbridge  Surgery"}, "visit reason": "Appendicitis", "conditions": ["History of appendectomy"]}'}

## Load input from dict

In [5]:
FIHR_DIR = '../../privfp-poc-zk/experiments/02_generate_dataset'

with open(FIHR_DIR + '/' + 'long_visit_dict.json') as f:
    long_visit_dict = json.load(f)

print(len(long_visit_dict))

# Pick one visit to inspect later; the hard-coded key overrides the first key in the dict.
ex = list(long_visit_dict.keys())[0]
ex = "urn:uuid:6ac2bf6b-6a8c-51e9-a2dd-30eb780264c5"

18


In [5]:
import re
import datetime
import os

# Replace each visit's patient reference with basic demographics from the Synthea FHIR bundle.
for key in long_visit_dict:
    encounter_started = long_visit_dict[key]['Encounter']['Encounter Started']
    uuid = key[9:]  # strip the 9-character 'urn:uuid:' prefix

    # Find the FHIR file whose name contains this UUID (case-insensitive).
    patient_file = [file for file in os.listdir(FIHR_DIR + '/synthea/fhir') if re.search(uuid, file, re.IGNORECASE)]
    with open(FIHR_DIR + '/synthea/fhir/' + patient_file[0]) as f:
        patient_fhir = json.load(f)

    # The first entry of the bundle is read as the Patient resource.
    resource = patient_fhir['entry'][0]['resource']
    patient_dict = {}
    patient_dict['name'] = ' '.join(resource['name'][0]['given']) + ' ' + resource['name'][0]['family']
    patient_dict['birthDate'] = resource['birthDate']
    patient_dict['gender'] = resource['gender']
    # Approximate age in whole years at the start of the encounter.
    patient_dict['age'] = int((datetime.datetime.fromisoformat(encounter_started).date() - datetime.datetime.strptime(patient_dict['birthDate'], '%Y-%m-%d').date()).days / 365.25)
    long_visit_dict[key]['Patient'] = patient_dict
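
The age calculation above divides a day count by 365.25, which is an approximation and can be one year low when the encounter falls exactly on a birthday. The sketch below reproduces that calculation next to an exact calendar comparison. The function names `age_approx` and `age_exact` and the sample dates are illustrative and do not come from the dataset.

```python
from datetime import date, datetime


def age_approx(encounter_started, birth_date):
    """Age in whole years using the days / 365.25 approximation."""
    start = datetime.fromisoformat(encounter_started).date()
    born = datetime.strptime(birth_date, "%Y-%m-%d").date()
    return int((start - born).days / 365.25)


def age_exact(encounter_started, birth_date):
    """Age in whole years by comparing calendar month and day."""
    start = datetime.fromisoformat(encounter_started).date()
    born = date.fromisoformat(birth_date)
    # Subtract one year if the birthday has not yet occurred in the encounter year.
    return start.year - born.year - ((start.month, start.day) < (born.month, born.day))


# Illustrative dates: usually both methods agree...
print(age_approx("2022-05-20 19:37:52", "1973-04-08"), age_exact("2022-05-20 19:37:52", "1973-04-08"))
# ...but on an exact birthday the approximation can fall short by one.
print(age_approx("2001-03-01 09:00:00", "2000-03-01"), age_exact("2001-03-01 09:00:00", "2000-03-01"))
```

For synthetic notes the approximation is probably good enough; the exact version matters only if ages are later checked against birth dates.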

In [11]:
long_visit_dict[ex]

batch = []

for i in range(5):
    ex = list(long_visit_dict.keys())[i]
    batch.append({"data": long_visit_dict[ex]})

batch[0]

{'data': {'Patient': 'urn:uuid:811ef822-fab7-3f91-35dd-106227ca25c1',
  'Encounter': {'Encounter id': 'urn:uuid:ff4668b4-92d0-53a6-4fa2-534e7bfc1edc',
   'Encounter Started': '2022-05-20 19:37:52',
   'Encounter Ended': '2022-06-25 19:52:52',
   'Encounter Duration': 3111300,
   'Hospital Staff': 'Dr. Delcie812 Casper496',
   'Type of admission': 'Admission to skilled nursing facility (procedure)'},
  'Condition': {},
  'Procedure': {'1': {'Text': 'History AND physical examination (procedure)',
    'Started': '2022-05-20T19:37:52+01:00',
    'Ended': '2022-05-20T19:52:52+01:00'},
   '2': {'Text': 'Initial patient assessment (procedure)',
    'Started': '2022-05-20T19:37:52+01:00',
    'Ended': '2022-05-20T19:52:52+01:00'},
   '3': {'Text': 'Development of individualized plan of care (procedure)',
    'Started': '2022-05-20T19:37:52+01:00',
    'Ended': '2022-05-20T19:52:52+01:00'},
   '4': {'Text': 'Nursing care/supplementary surveillance (regime/therapy)',
    'Started': '2022-05-20T1

## Load LLM

Because inference goes through LangChain and Ollama, the model is easy to swap out.

Open a terminal and run `ollama pull <model_name:tag>` to retrieve any model from the Ollama model library, then pass `<model_name:tag>` as the `model` argument when instantiating the LLM.

For example, to use Mistral instead of Llama 2, run `ollama pull mistral` in a terminal and set `model="mistral"` below.

In [12]:
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
llm = Ollama(model="llama2", callback_manager=callback_manager)

## Construct prompt

In [13]:
template = """[INST]
<<SYS>>
Your goal is to give your judgement about a patient's condition, and to make a plan for their admission to the hospital. 
You may be given extensive list of lab results or observations.  Do not restate the results of observations as a list. However, you should refer to them in giving your judgement. 
<</SYS>>

### INSTRUCTIONS ### 

Create a JSON object with the following parameters:

1. "history_of_present_illness": a detailed narrative of the patient's current complaint

2. "physical_examination": findings from a detailed examination of the patient, including vital signs, observations and procedures

3. "assessment_and_plan": interpretations, diagnoses, and proposed management plan

Here is some information about the patient: 

{data}
[/INST]
"""

prompt = PromptTemplate.from_template(template)
chain = prompt | llm
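
The `{data}` placeholder in the template is filled in once per patient record when the chain runs. As a rough illustration of that substitution, the sketch below uses Python's built-in `str.format` rather than LangChain itself; LangChain's default f-string template format should behave similarly, but that is an assumption here. `mini_template` and `record` are made-up stand-ins.

```python
import json

# Hypothetical, shortened stand-in for the notebook's template.
mini_template = "Here is some information about the patient:\n\n{data}\n"

# Hypothetical record; the notebook passes Synthea-derived data instead.
record = {"visit reason": "Appendicitis", "conditions": ["History of appendectomy"]}

# Braces inside the substituted value are inserted literally,
# so a JSON string can safely be passed as the value of `data`.
filled = mini_template.format(data=json.dumps(record))
print(filled)
```

Only braces in the template text itself are treated as placeholders, which is why the record is passed as a variable rather than pasted into the template.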

In [14]:
prompt

PromptTemplate(input_variables=['data'], template='[INST]\n<<SYS>>\nYour goal is to give your judgement about a patient\'s condition, and to make a plan for their admission to the hospital. \nYou may be given extensive list of lab results or observations.  Do not restate the results of observations as a list. However, you should refer to them in giving your judgement. \n<</SYS>>\n\n### INSTRUCTIONS ### \n\nCreate a JSON object with the following parameters:\n\n1. "history_of_present_illness": a detailed narrative of the patient\'s current complaint\n\n2. "physical_examination": findings from a detailed examination of the patient, including vital signs, observations and procedures\n\n3. "assessment_and_plan": interpretations, diagnoses, and proposed management plan\n\nHere is some information about the patient: \n\n{data}\n[/INST]\n')

## Run inference on batch

In [16]:
def collapse_nested_dict_to_text(nested_dict):
    text_parts = []
    for key, value in nested_dict.items():
        if isinstance(value, dict):
            subtext = collapse_nested_dict_to_text(value)
            text_parts.append(f"{key}: {subtext}")
        else:
            text_parts.append(f"{key}: {value}")
    return '; '.join(text_parts)

batch = []
for visit_id, visit in long_visit_dict.items():
    # The chain fills the template's {data} placeholder itself,
    # so each batch item is a dict keyed by the placeholder name.
    batch.append({"data": collapse_nested_dict_to_text(visit)})
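
To make the flattening step concrete, here is a small sketch of what `collapse_nested_dict_to_text` returns for a nested record. The function is repeated so the example runs on its own, and `sample_visit` is an invented record, not taken from the dataset.

```python
def collapse_nested_dict_to_text(nested_dict):
    # Same logic as the notebook cell above, repeated so this sketch is self-contained.
    text_parts = []
    for key, value in nested_dict.items():
        if isinstance(value, dict):
            subtext = collapse_nested_dict_to_text(value)
            text_parts.append(f"{key}: {subtext}")
        else:
            text_parts.append(f"{key}: {value}")
    return '; '.join(text_parts)


# Invented record for illustration only.
sample_visit = {
    "Patient": {"name": "Jane Doe", "age": 49},
    "Encounter": {"Type of admission": "Inpatient"},
}

flattened = collapse_nested_dict_to_text(sample_visit)
print(flattened)
# Patient: name: Jane Doe; age: 49; Encounter: Type of admission: Inpatient
```

Note that the output uses the same `; ` separator at every depth, so the boundary between nested sections is not marked explicitly in the text passed to the model.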

In [19]:
type(batch)

list

In [15]:
outputs = chain.batch(batch)

ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12357b250>: Failed to establish a new connection: [Errno 61] Connection refused'))

In [10]:
batch[0]

{'data': '{"name": "Ms. Clementine Nienow", "NHS number": "830 864 0705", "address": "1080 Dach Flat Apt 97, Ashford, SP6 3UQ", "date of birth": "1973-04-08", "marital status": "single", "ethnicity": "White - Any other White background", "gender": "female", "visit type": "Encounter Inpatient", "visit date": "1976-10-17T16:25:58Z", "provider": {"doctor": "Hyman Schmeler", "facility": "Fordingbridge  Surgery"}, "visit reason": "Appendicitis", "conditions": ["History of appendectomy"]}'}

In [26]:
outputs[0]

'{\n"history_of_present_illness": "Ms. Clementine Nienow presents with sudden onset abdominal pain, nausea, and vomiting. She reports a feeling of generalized weakness and fatigue. The patient has a history of appendectomy 10 years ago. Her vital signs are as follows: temperature 37.8°C, blood pressure 120/80mmHg, heart rate 100bpm, respiratory rate 24bpm. On physical examination, there is tenderness in the right lower quadrant of the abdomen with guarding and rigidity. There are no palpable masses or organomegaly noted. The patient\'s mental status is within normal limits.",\n"physical_examination": "On initial assessment, Ms. Nienow appears uncomfortable and anxious. She is awake and alert, but her mood is tense. Vital signs are as follows: temperature 37.8°C, blood pressure 120/80mmHg, heart rate 100bpm, respiratory rate 24bpm. The patient\'s abdomen is tender to palpation in the right lower quadrant with guarding and rigidity. There are no palpable masses or organomegaly noted. The

In [25]:
batch[-1]

{'data': '{"name": "Mrs. Cherri Sanford", "NHS number": "567 224 7934", "address": "681 Marquardt Bay, Havant, PO9 2SH", "date of birth": "1983-02-22", "marital status": "married", "ethnicity": "Mixed - White and Asian", "gender": "female", "visit type": "Encounter Inpatient", "visit date": "2010-02-26T10:41:30Z", "provider": {"doctor": "Chu Weber", "facility": "Spire Hospital Portsmouth"}, "visit reason": "Appendicitis", "conditions": ["History of appendectomy"]}'}

In [None]:
outputs[-1]
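
The prompt asks for a JSON object with three fields, but a language model may return malformed or truncated text. The following is a minimal sketch of a defensive parser, assuming each element of `outputs` is a plain string. `parse_note` and `EXPECTED_KEYS` are hypothetical names, and the sample strings are invented.

```python
import json

# The three fields requested in the prompt's instructions.
EXPECTED_KEYS = {
    "history_of_present_illness",
    "physical_examination",
    "assessment_and_plan",
}


def parse_note(raw):
    """Return the note as a dict, or None if raw is not a JSON object with all expected fields."""
    try:
        note = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(note, dict) or not EXPECTED_KEYS.issubset(note):
        return None
    return note


# Invented examples: one complete response and one cut off mid-string.
complete = '{"history_of_present_illness": "a", "physical_examination": "b", "assessment_and_plan": "c"}'
truncated = '{"history_of_present_illness": "a", "physical_examination": "b'

print(parse_note(complete) is not None)
print(parse_note(truncated) is None)
```

Applied to the batch, something like `[parse_note(o) for o in outputs]` would leave `None` wherever a response needs to be regenerated or inspected by hand.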