# Retrieval Augmented Clinical Encounter Demo

## Overview
This demo uses Retrieval Augmented Generation (RAG) to surface previously diagnosed conditions from a sample patient's health record and aid a clinician during a clinical encounter. The concept is not meant for real-world use; it aims to highlight the art of the possible with generative AI tools such as Amazon Bedrock in a healthcare environment. It also discusses some prompt engineering techniques for achieving consistent, high-quality responses from large language models such as Anthropic's Claude (v2).

### Prerequisites
* The demo can run either in a local environment or in SageMaker Studio with the **`Data Science 3.0`** kernel on an **`ml.t3.medium`** instance.
* The demo was built and tested using boto3 version 1.33.6. Verify that the environment is running boto3 version 1.33 or higher.
* Verify that model access to Anthropic's Claude v2 is granted to the account being used. See the documentation: [Amazon Bedrock Model Access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html)
* The user or permission context running the notebook needs InvokeModel permissions for the **anthropic.claude-v2** Bedrock model.

Sample IAM Policy:
```
{
    "Sid": "BedrockInvokeClaudeV2",
    "Effect": "Allow",
    "Action": "bedrock:InvokeModel",
    "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2"
}
```
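
The sample above is a single policy statement. IAM expects statements to sit inside a policy document, so a minimal sketch of that wrapping might look like the following. It assumes the standard `2012-10-17` policy language version and that no other statements are needed; the variable names are illustrative only.

```python
import json

# The statement from the sample above, unchanged.
statement = {
    "Sid": "BedrockInvokeClaudeV2",
    "Effect": "Allow",
    "Action": "bedrock:InvokeModel",
    "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
}

# Assumption: a standalone policy document that contains only this statement.
policy_document = {"Version": "2012-10-17", "Statement": [statement]}

policy_json = json.dumps(policy_document, indent=4)
print(policy_json)
```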

## 1. Testing the Environment

### 1.1. Check boto3 version
Ensure the boto3 version is 1.33 or higher
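
One way to make this requirement programmatic is to compare version numbers in Python. The sketch below is not part of the original notebook; `version_tuple` and `meets_minimum` are hypothetical helper names, and the comparison only looks at the numeric parts of the version string.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '1.33.6' into (1, 33, 6); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def meets_minimum(installed: str, minimum: str = "1.33") -> bool:
    """Return True when `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    installed_version = metadata.version("boto3")
    status = "OK" if meets_minimum(installed_version) else "too old"
    print("boto3", installed_version, status)
except metadata.PackageNotFoundError:
    print("boto3 is not installed")
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where `"1.9"` would sort after `"1.33"`.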

In [1]:
!pip list | grep boto3

boto3              1.33.6


### 1.2. Setting up the Bedrock client
Create the Bedrock client and define the parameters that will not change for the rest of the demo. This demo uses Anthropic's Claude v2 model and sends/receives JSON.

In [2]:
import boto3
import json
bedrock_runtime = boto3.client(service_name='bedrock-runtime',region_name='us-east-1')

model_id = 'anthropic.claude-v2'
accept = 'application/json'
content_type = 'application/json'

### 1.3. Test the Bedrock client
We can test permissions and setup by sending a prompt and reading the response. Let's ask the model to explain the United States' Health Insurance Portability and Accountability Act (HIPAA), all 101,000 words of it, to a 5th grader in fewer than 50 words.

In [3]:
prompt = "\n\nHuman: explain HIPAA to a 5th grader in less than 50 words\n\nAssistant:"

body = json.dumps({
    "prompt": prompt,
    "max_tokens_to_sample": 1000,
    "temperature": 0.1,
    "top_p": 0.9,
})

response = bedrock_runtime.invoke_model(body=body, modelId=model_id, accept=accept, contentType=content_type)
response_body = json.loads(response.get('body').read())
print(response_body.get('completion'))

 Here is an explanation of HIPAA for a 5th grader in 47 words:

HIPAA is a law that keeps your private health information safe. Doctors, nurses, and hospitals can't share things about your health with other people without your OK. This stops strangers from seeing your medical records and keeps your health details just between you and your doctor.
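
Because the same request shape is reused later in the demo, the call could be wrapped in a small helper. The following is a sketch rather than the notebook's own code: `invoke_claude` and `FakeBedrockClient` are hypothetical names, and the fake client exists only so the example runs without AWS credentials. The request fields and the `invoke_model` arguments mirror the cell above.

```python
import io
import json


def invoke_claude(client, prompt, model_id="anthropic.claude-v2",
                  temperature=0.1, max_tokens=1000):
    """Send one Human/Assistant turn and return the completion text."""
    body = json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
        "top_p": 0.9,
    })
    response = client.invoke_model(
        body=body, modelId=model_id,
        accept="application/json", contentType="application/json",
    )
    # Unlike the cell above, leading/trailing whitespace is stripped here.
    return json.loads(response["body"].read())["completion"].strip()


class FakeBedrockClient:
    """Hypothetical stand-in for `bedrock_runtime` so the sketch runs offline."""

    def invoke_model(self, body, modelId, accept, contentType):
        self.last_request = json.loads(body)  # kept for inspection
        payload = json.dumps({"completion": " canned reply"}).encode("utf-8")
        return {"body": io.BytesIO(payload)}


fake_client = FakeBedrockClient()
print(invoke_claude(fake_client, "explain HIPAA to a 5th grader in less than 50 words"))
```

With real credentials, the same helper would be called as `invoke_claude(bedrock_runtime, "...")`.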


## 2. Working with FHIR-formatted Healthcare Records
This demo uses sample data generated in Amazon HealthLake using an open-source library called Synthea. This demo will not directly interact with HealthLake, but the companion file `healthlake_sample_records.json` contains this sample data exactly as presented by HealthLake using a GET search. It is an array of FHIR-formatted Conditions for a single patient.

FHIR is a set of rules and specifications intended to standardize exchanging health data. To ensure its flexibility, it contains significant amounts of metadata that may not be useful in many contexts. For our purpose, we want to minimize the amount of extraneous data we send to the large language model to ensure that it will only use the minimum data necessary for our prompt and optimize the efficiency of the request.

### 2.1. FHIR record optimization
LLMs process data in units called tokens. While Claude v2 can accept up to 100,000 tokens, tokens are also used to price usage of the model. Sending only the necessary data reduces the token count, which can contribute to higher-quality output at a lower cost.

The following is a single JSON-formatted FHIR record for a patient Condition:
```
{"resource":{"resourceType":"Condition","id":"71c4183a-390e-496d-b410-81fe13ad8b1d","meta":{"lastUpdated":"2023-09-25T21:19:39.299Z"},"clinicalStatus":{"coding":[{"system":"http://terminology.hl7.org/CodeSystem/condition-clinical","code":"resolved"}]},"verificationStatus":{"coding":[{"system":"http://terminology.hl7.org/CodeSystem/condition-ver-status","code":"confirmed"}]},"code":{"coding":[{"system":"http://snomed.info/sct","code":"302932006","display":"Tear of medial meniscus of knee"}],"text":"Tear of medial meniscus of knee"},"subject":{"reference":"Patient/24042b1f-5acb-449f-9948-462618aba6d6"},"encounter":{"reference":"Encounter/6deb0a82-03f9-45e0-a2e8-51dacb1662ee"},"onsetDateTime":"2022-02-01T10:19:48-08:00","abatementDateTime":"2022-03-15T11:48:48-08:00","recordedDate":"2022-03-15T10:19:48-08:00"},"search":{"mode":"match"}}
```
This is roughly 280 tokens.

If we reduce this record to only the fields strictly necessary for our request, we get the following JSON-formatted output:
```
{"clinicalStatus":"resolved","verificationStatus":"confirmed","code":"302932006","condition":"Tear of medial meniscus of knee","onsetDateTime":"2022-02-01","abatementDateTime":"2022-03-15","recordedDate":"2022-03-15"}
```
This is roughly 60 tokens. We've also reduced the complexity of the record by flattening the JSON to a single level and clearly labeling the condition "Tear of medial meniscus of knee".
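
The size difference can be checked without a tokenizer by comparing character counts. This is only a crude proxy, since actual token counts depend on the model's tokenizer, which this sketch does not use. The two records are the ones shown above; `compact_len` is a hypothetical helper name.

```python
import json

# The full FHIR Condition entry shown above, as a Python dict.
full_entry = {
    "resource": {
        "resourceType": "Condition",
        "id": "71c4183a-390e-496d-b410-81fe13ad8b1d",
        "meta": {"lastUpdated": "2023-09-25T21:19:39.299Z"},
        "clinicalStatus": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/condition-clinical",
            "code": "resolved"}]},
        "verificationStatus": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/condition-ver-status",
            "code": "confirmed"}]},
        "code": {"coding": [{"system": "http://snomed.info/sct",
                             "code": "302932006",
                             "display": "Tear of medial meniscus of knee"}],
                 "text": "Tear of medial meniscus of knee"},
        "subject": {"reference": "Patient/24042b1f-5acb-449f-9948-462618aba6d6"},
        "encounter": {"reference": "Encounter/6deb0a82-03f9-45e0-a2e8-51dacb1662ee"},
        "onsetDateTime": "2022-02-01T10:19:48-08:00",
        "abatementDateTime": "2022-03-15T11:48:48-08:00",
        "recordedDate": "2022-03-15T10:19:48-08:00",
    },
    "search": {"mode": "match"},
}

# The reduced record shown above.
reduced = {
    "clinicalStatus": "resolved",
    "verificationStatus": "confirmed",
    "code": "302932006",
    "condition": "Tear of medial meniscus of knee",
    "onsetDateTime": "2022-02-01",
    "abatementDateTime": "2022-03-15",
    "recordedDate": "2022-03-15",
}


def compact_len(obj):
    """Character count of the JSON with no whitespace between tokens."""
    return len(json.dumps(obj, separators=(",", ":")))


full_chars = compact_len(full_entry)
reduced_chars = compact_len(reduced)
print(f"full: {full_chars} chars, reduced: {reduced_chars} chars, "
      f"ratio: {reduced_chars / full_chars:.2f}")
```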

### 2.2. Optimization code
The following code ingests the FHIR-formatted Condition records and reduces each Condition to this minimal set of fields.

In [4]:
from datetime import datetime
with open('./healthlake_sample_records.json') as file:
    healthlake_data = json.load(file)

field_list = ['clinicalStatus','verificationStatus','code','onsetDateTime','abatementDateTime','recordedDate']
entries = []
for entry in healthlake_data["entry"]:
    entries.append(entry['resource'])

conditions_obj = {}
conditions_obj["conditions"] = []

for condition in entries:
    condition_json_obj = {}
    for field in field_list:
        if condition.get(field) is None:
            continue  # field is absent from this record (e.g. no abatement date)
        if 'Date' in field:
            # keep only the calendar date from the ISO 8601 timestamp
            condition_json_obj[field] = datetime.fromisoformat(condition[field]).strftime('%Y-%m-%d')
        else:
            coding = condition[field]['coding'][0]
            condition_json_obj[field] = coding['code']
            if field == 'code':
                # label the human-readable condition name explicitly
                condition_json_obj["condition"] = coding['display']
    conditions_obj["conditions"].append(condition_json_obj)

formatted_conditions_json = json.dumps(conditions_obj, indent=2) # indented for ease-of-reading for this demo

print(formatted_conditions_json)

{
  "conditions": [
    {
      "clinicalStatus": "resolved",
      "verificationStatus": "confirmed",
      "code": "840544004",
      "condition": "Suspected COVID-19",
      "onsetDateTime": "2020-03-01",
      "abatementDateTime": "2020-03-01",
      "recordedDate": "2020-03-01"
    },
    {
      "clinicalStatus": "active",
      "verificationStatus": "confirmed",
      "code": "233678006",
      "condition": "Childhood asthma",
      "onsetDateTime": "2005-10-15",
      "recordedDate": "2005-10-15"
    },
    {
      "clinicalStatus": "resolved",
      "verificationStatus": "confirmed",
      "code": "444814009",
      "condition": "Viral sinusitis (disorder)",
      "onsetDateTime": "2019-09-20",
      "abatementDateTime": "2019-09-27",
      "recordedDate": "2019-09-20"
    },
    {
      "clinicalStatus": "active",
      "verificationStatus": "confirmed",
      "code": "232353008",
      "condition": "Perennial allergic rhinitis with seasonal variation",
      "onsetDateTime":

## 3. Prompt Engineering

### 3.1. Injecting context into the prompt
Retrieval Augmented Generation (RAG) refers to retrieving relevant data and placing it in the context of the prompt so the model has up-to-date or relevant information. In the code below, the prompt asks the model to list medical conditions the patient has or previously had that may be relevant to the current clinical encounter. The complaint and today's date are added, but the RAG concept is achieved by providing the patient's condition history as context. We also clearly label the context and the task.

In [5]:
from datetime import date

chief_complaint = "lump behind knee"
prompt = """
PATIENT RECORD:
{context}

TASK:
The current date is {date}. Use exact values from the above patient record to answer the question below. 
If the record does not contain relevant conditions, respond with the single word NO.
A patient is currently being examined by a doctor with a chief complaint of: {complaint}.
Respond in JSON format with this template: 
{{
    "FoundRelevantConditions":BOOLEAN,
    "RelevantConditionsWithDates":ARRAY,
    "ShortExplanation":STRING
}}
Using the patient's record, create a list of conditions with a high probability of helping diagnose or treat the chief complaint.
""".format(context = '{"clinicalStatus":"resolved","verificationStatus":"confirmed","code":"302932006","condition":"Tear of medial meniscus of knee","onsetDateTime":"2022-02-01","abatementDateTime":"2022-03-15","recordedDate":"2022-03-15"}', date = date.today(), complaint = chief_complaint)
print(prompt)


PATIENT RECORD:
{"clinicalStatus":"resolved","verificationStatus":"confirmed","code":"302932006","condition":"Tear of medial meniscus of knee","onsetDateTime":"2022-02-01","abatementDateTime":"2022-03-15","recordedDate":"2022-03-15"}

TASK:
The current date is 2023-12-01. Use exact values from the above patient record to answer the question below. 
If the record does not contain relevant conditions, respond with the single word NO.
A patient is currently being examined by a doctor with a chief complaint of: lump behind knee.
Respond in JSON format with this template: 
{
    "FoundRelevantConditions":BOOLEAN,
    "RelevantConditionsWithDates":ARRAY,
    "ShortExplanation":STRING
}
Using the patient's record, create a list of conditions with a high probability of helping diagnose or treat the chief complaint.



### 3.2. Structured responses lead to more consistent results
Two parts of this prompt are important for achieving useful results. The first is clearly stating what to do if the request cannot be completed. Without this, the model will invent its own response, which leads to inconsistent output. The second is requesting that the output be structured as JSON. The property names themselves imply tasks for the LLM to complete, creating a double-check of its work. Removing the `FoundRelevantConditions` BOOLEAN property can lead to interesting results for various chief complaints.
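
Since the prompt allows two kinds of reply (the single word NO, or a JSON object), downstream code has to handle both. The sketch below shows one way to do that. `parse_completion` and `EXPECTED_KEYS` are hypothetical names, and taking the text between the outermost braces is an assumption about how to tolerate stray prose around the JSON, not something the notebook does.

```python
import json

EXPECTED_KEYS = {"FoundRelevantConditions", "RelevantConditionsWithDates",
                 "ShortExplanation"}


def parse_completion(completion):
    """Return the parsed JSON object, or None when the model answered NO.

    Raises ValueError when the reply is neither NO nor the expected JSON.
    """
    text = completion.strip()
    if text.upper() == "NO":
        return None
    # Assumption: tolerate prose around the JSON by taking the outermost braces.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("reply contains no JSON object")
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError as err:
        raise ValueError(f"reply is not valid JSON: {err}") from err
    missing = EXPECTED_KEYS - set(parsed)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return parsed


# Illustrative replies, not real model output.
sample_json_reply = (' {"FoundRelevantConditions": false, '
                     '"RelevantConditionsWithDates": [], '
                     '"ShortExplanation": "None found."}')
print(parse_completion(sample_json_reply)["FoundRelevantConditions"])
print(parse_completion(" NO"))
```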

Experiment with the chief complaint. Try "lower back pain", "migraine", or "difficulty breathing". Also experiment with the prompt: change the wording, add or remove JSON properties from the response template, and change the `temperature` value (0.0-1.0) in the `body` object. Higher temperature leads to higher "creativity" of the LLM.
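
To make this kind of experimentation less repetitive, the prompt construction could be pulled into a function and called once per complaint. This is a sketch, not the notebook's code: `PROMPT_TEMPLATE`, `build_prompt`, and `sample_context` are hypothetical names, and the sample context is a shortened stand-in for `formatted_conditions_json`.

```python
from datetime import date

PROMPT_TEMPLATE = """
PATIENT RECORD:
{context}

TASK:
The current date is {date}. Use exact values from the above patient record to answer the question below.
If the record does not contain relevant conditions, respond with the single word NO.
A patient is currently being examined by a doctor with a chief complaint of: {complaint}.
Respond in JSON format with this template:
{{
    "FoundRelevantConditions":BOOLEAN,
    "RelevantConditionsWithDates":ARRAY,
    "ShortExplanation":STRING
}}
Using the patient's record, create a list of conditions with a high probability of helping diagnose or treat the chief complaint.
"""


def build_prompt(context, complaint, today=None):
    """Fill the template; `today` defaults to the current date."""
    today = today or date.today()
    return PROMPT_TEMPLATE.format(context=context, date=today, complaint=complaint)


# Shortened, illustrative context; the demo would pass formatted_conditions_json.
sample_context = '{"conditions": [{"condition": "Childhood asthma", "clinicalStatus": "active"}]}'

for complaint in ["lower back pain", "migraine", "difficulty breathing"]:
    candidate_prompt = build_prompt(sample_context, complaint)
    print(complaint, "->", len(candidate_prompt), "characters")
```

The doubled braces in the template are what keep `str.format` from treating the JSON template as placeholders; braces inside the substituted context are left alone because substituted values are not re-parsed.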

In [14]:
from datetime import date

chief_complaint = "lower back pain"
prompt = """
PATIENT RECORD:
{context}

TASK:
The current date is {date}. Use exact values from the above patient record to answer the question below. 
If the record does not contain relevant conditions, respond with the single word NO.
A patient is currently being examined by a doctor with a chief complaint of: {complaint}.
Respond in JSON format with this template: 
{{
    "FoundRelevantConditions":BOOLEAN,
    "RelevantConditionsWithDates":ARRAY,
    "ShortExplanation":STRING
}}
Using the patient's record, create a list of conditions with a high probability of helping diagnose or treat the chief complaint.
""".format(context = formatted_conditions_json, date = date.today(), complaint = chief_complaint)

body = json.dumps({
    "prompt": "\n\nHuman: {prompt}\n\nAssistant:".format(prompt = prompt),
    "max_tokens_to_sample": 1000,
    "temperature": 0.1,
    "top_p": 0.9,
})

response = bedrock_runtime.invoke_model(body=body, modelId=model_id, accept=accept, contentType=content_type)
response_body = json.loads(response.get('body').read())
print(response_body.get('completion'))

 {
  "FoundRelevantConditions": true, 
  "RelevantConditionsWithDates": [
    {
      "condition": "Tear of medial meniscus of knee",
      "onsetDateTime": "2022-02-01",
      "abatementDateTime": "2022-03-15"
    },
    {
      "condition": "Acute meniscal tear, medial",
      "onsetDateTime": "2022-08-01",
      "abatementDateTime": "2022-09-15"
    }
  ],
  "ShortExplanation": "The patient has a history of medial meniscus tears which can cause lower back pain."
}
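
The prompt asks the model to use exact values from the patient record, so one simple follow-up is to check each returned condition name against the record. The sketch below assumes each array item is an object with a `condition` key, as in the output above; the template only says ARRAY, so that shape is not guaranteed. `ungrounded_conditions` is a hypothetical helper, and the record and response are shortened, illustrative data rather than real model output.

```python
def ungrounded_conditions(parsed_response, record):
    """Return condition names in the response that do not appear in the record."""
    known = {entry.get("condition") for entry in record["conditions"]}
    returned = parsed_response.get("RelevantConditionsWithDates", [])
    return [item.get("condition") for item in returned
            if isinstance(item, dict) and item.get("condition") not in known]


# Illustrative data only.
record = {"conditions": [
    {"condition": "Tear of medial meniscus of knee", "onsetDateTime": "2022-02-01"},
    {"condition": "Childhood asthma", "onsetDateTime": "2005-10-15"},
]}
response = {
    "FoundRelevantConditions": True,
    "RelevantConditionsWithDates": [
        {"condition": "Tear of medial meniscus of knee", "onsetDateTime": "2022-02-01"},
        {"condition": "Hypothetical condition not in record", "onsetDateTime": "2022-08-01"},
    ],
    "ShortExplanation": "Illustrative only.",
}

print(ungrounded_conditions(response, record))
# prints ['Hypothetical condition not in record']
```

A check like this only confirms that names match the record exactly; it says nothing about whether a matched condition is clinically relevant to the complaint.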
