# Clinical Trials Gen AI Workshop Part 2

**Note:** This notebook was designed to run in SageMaker Studio using *Data Science 3.0* on an *ml.t3.medium* instance with 5 GB of storage, although other configurations may work.

## Introduction

This notebook continues Part 1. We move the use case closer to production by focusing on evaluation logic, chaining LLMs, and cost analysis.



In the real world, patient records typically live in a datastore outside of the LLM. In this notebook we explore querying the data directly from Amazon HealthLake (a FHIR datastore) or from JSON files in the FHIR standard.

**If you prefer to query an Amazon HealthLake resource to get FHIR records, follow the procedure below. If not, jump down to the FHIR record [section](#read_fhir_records).**

Let's continue and get our patient records!
 
_Note: Patient records are synthetic, generated by [Synthea](https://github.com/synthetichealth/synthea), and are not real patient data. All pricing information is provided as an example; refer to https://aws.amazon.com/bedrock/pricing/ for the most up-to-date pricing information._

In [None]:
# Ensure we have the necessary packages; you can use requirements.txt for virtual environments
%pip install boto3 awscrt ipykernel ipython ipywidgets requests
# You may need to restart the kernel for the environment to recognize installed packages

### (Optional) Amazon HealthLake - FHIR Datastore

Let's run the cell below to construct a FHIR query that gets the records related to 4 patient IDs and stores the results in the `patients_records` directory. We use `helper_packages` to sign the FHIR API request to Amazon HealthLake.

_Note: You will need to uncomment the code to run the code._
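
The query string in the cell below is assembled by hand. As a minimal sketch of the same idea, the standard library's `urlencode` can build it from a list of key/value pairs. The function name `build_patient_search_url` and the endpoint used here are hypothetical and only for illustration; the parameter names (`identifier`, `_revinclude`) are the ones used in the cell below.

```python
from urllib.parse import urlencode

def build_patient_search_url(datastore_endpoint, patient_id, resource_types):
    """Build a FHIR Patient search URL that also pulls in related resources."""
    params = [("identifier", patient_id)]
    # One _revinclude entry per related resource type, e.g. "Encounter:patient"
    params += [("_revinclude", f"{r}:patient") for r in resource_types]
    # safe=":" leaves the colon unescaped so the URL stays readable
    return f"{datastore_endpoint}Patient?{urlencode(params, safe=':')}"

# Hypothetical endpoint, for illustration only
url = build_patient_search_url(
    "https://example.invalid/datastore/r4/",
    "04c704c4-5d2d-4308-9c33-1690a6e47a6b",
    ["Encounter", "Condition"],
)
print(url)
```

Building the list programmatically also makes it harder to include the same resource type twice by accident.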

In [3]:
# # Query FHIR Records for Amazon HealthLake
# # uncomment to run
# import boto3
# client =  boto3.client('healthlake')
# datastore_resp = client.list_fhir_datastores(
#     Filter={
#         'DatastoreName': 'demo-store'
#     }
# )
# # Selection of pre-selected FHIR Records in Amazon Healthlake
# patient_id = ["20a70ecf-c423-4318-82c3-40542074d6a8","04c704c4-5d2d-4308-9c33-1690a6e47a6b","fddf3bac-f14d-45a9-a0b0-690435ea799b" , "9ae87a5e-0cd2-4573-b0b6-c37ae5e5e894"]
# print("FHIR Database URLs:")
# for id in patient_id:
#     datastore_url = datastore_resp['DatastorePropertiesList'][0]['DatastoreEndpoint']
#     full = f"{datastore_url}Patient"
#     payload = [
#         ("identifier",f"{id}"),
#         ("_revinclude", "Encounter:patient"),
#         ("_revinclude", "AllergyIntolerance:patient"),
#         ("_revinclude", "CarePlan:patient"),
#         ("_revinclude", "Condition:patient"),
#         ("_revinclude", "DiagnosticReport:patient"),
#         ("_revinclude", "Observation:patient"),
#         ("_revinclude", "Medication:patient"),
#         ("_revinclude", "MedicationDispense:patient"),
#         ("_revinclude", "Procedure:patient")
#     ]
#     payload_str =""
#     for i in payload:
#         key=(i[0])
#         val=(i[1])
#         payload_str = payload_str + f"&{key}={val}"
#     # The first element does not require '&' sign
#     payload_str = (payload_str.replace('&','',1)) 
#     # Construct the full url with query '?'
#     url = full+'?'+payload_str
#     print(url)
#     # Sign the request with SigV4 and fetch the records
#     from helper_packages.sigv4a_sign import SigV4ASign
#     import requests
#     headers = SigV4ASign().get_headers_basic('healthlake', 'us-east-1', 'GET', url)
#     r = requests.get(url, headers=headers)
#     # Write each patient's records to a file in patients_records
#     with open(f'patients_records/{id}.json','w') as f:
#         f.write(r.text)

<a id='read_fhir_records'></a>
### Read FHIR records from JSON files

Let's run the cells below to read the patient records and store them in memory for quick access. You may open the files directly to familiarize yourself with each patient's history.

Can you identify the main differences between the patients? 

Patient Details:

| Patient ID | Name | Sex | Date of Birth | Short History |
| -- | -- | -- | -- | -- |
| **04c704c4-5d2d-4308-9c33-1690a6e47a6b** | Mr. Dexter530 Little434 | Male | 1997-11-22 | History of common flu symptoms; in good general health |
| **20a70ecf-c423-4318-82c3-40542074d6a8** | Dorene845 Fadel536 | Female | 2015-05-01 | Common flu symptoms; in good general health |
| **fddf3bac-f14d-45a9-a0b0-690435ea799b** | Ronnie7 Greenfelder433 | Female | 1984-08-31 | Several illnesses, including breast cancer |
| **9ae87a5e-0cd2-4573-b0b6-c37ae5e5e894** | Mrs. Queenie922 Bechtelar572 | Female | 1983-09-17 | Obesity and related conditions |
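
To compare patients without reading the whole file, the demographic fields can be pulled out of each record programmatically. The sketch below assumes the Patient resource is the first entry of the bundle (as the later cells in this notebook do) and that `birthDate` is a full `YYYY-MM-DD` date. The helper name `patient_demographics` and the tiny sample bundle are made up for illustration.

```python
import json
from datetime import date

def patient_demographics(bundle_json, today=None):
    """Return name, gender, birth date and age from a FHIR Bundle JSON string."""
    today = today or date.today()
    bundle = json.loads(bundle_json)
    # Assumption: the Patient resource is the first entry in the bundle
    patient = bundle["entry"][0]["resource"]
    birth = date.fromisoformat(patient["birthDate"])
    # Subtract one year if this year's birthday has not happened yet
    age = today.year - birth.year - ((today.month, today.day) < (birth.month, birth.day))
    name = patient["name"][0]
    full_name = " ".join(name.get("given", []) + [name.get("family", "")]).strip()
    return {
        "name": full_name,
        "gender": patient.get("gender"),
        "birth_date": patient["birthDate"],
        "age": age,
    }

# Minimal made-up bundle containing only the fields used above
sample = json.dumps({"resourceType": "Bundle", "entry": [{"resource": {
    "resourceType": "Patient", "gender": "male", "birthDate": "1997-11-22",
    "name": [{"given": ["Dexter530"], "family": "Little434"}]}}]})
info = patient_demographics(sample, today=date(2024, 6, 1))
print(info)
```

A deterministic age calculation like this can also serve as a cross-check on the age the model computes later in the notebook.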

In [4]:
# File FHIR Resources (Option 2)
import os
import json
# Load records from the patients_records folder into memory
list_files = os.listdir('patients_records')
patient_data = {}
for i in list_files:
    if '.json' in i:
        with open(f'patients_records/{i}') as f:
            result = json.load(f)
            # Condense JSON Object 
            result = json.dumps(result)
            patient_data[i] = result

In [5]:
# Display UI to Switch Between Patient Records
from helper_packages.choice import Prompt
from IPython.display import HTML, display
patient_choice = Prompt(patient_data)
display(HTML("<h1>Choose Patient Record</h1>\n<h3>Select a patient record to continue, you may revisit this cell to choose another patient record.</h3>"))
# If the button does not load up, you need to reload the window
display(patient_choice.get_buttons())


### Read Study Information from ClinicalTrials.gov

In the next cell, we will download Study details directly from ClinicalTrials.gov. We have opted to download only `EligibilityCriteria`,`HealthyVolunteers`,`Sex`,`MinimumAge`,`MaximumAge`,`BriefTitle`, and `BriefSummary` sections.

Study Details

| Study ID | Name |
| -- | --|
| NCT04510376 | Allergy Potential of Omeza Collagen Matrix in Human Subjects Using the Skin Prick Method |
| NCT02340468 | Breast Tumor Oxygenation During Exercise |
| NCT05174689 | Epigenetic Regulation of Exercise Induced Asthma |
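
The download URL in the next cell carries the field list as a URL-encoded, comma-separated `fields` parameter. As a small sketch, the same URL can be built from a Python list so that fields are easy to add or remove. The base URL and field names are taken from this notebook; the helper name `study_download_url` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint and field names as used in this notebook
BASE_URL = "https://clinicaltrials.gov/api/int/studies/download"
FIELDS = [
    "EligibilityCriteria", "HealthyVolunteers", "Sex",
    "MinimumAge", "MaximumAge", "BriefTitle", "BriefSummary",
]

def study_download_url(nct_id, fields=FIELDS):
    """Return the JSON download URL for one study, limited to the given fields."""
    # urlencode escapes the commas as %2C, matching the hand-written URL
    query = urlencode({"format": "json", "fields": ",".join(fields)})
    return f"{BASE_URL}/{nct_id}?{query}"

study_url = study_download_url("NCT04510376")
print(study_url)
```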

In [6]:
# Display UI to Switch Between Clinical Studies
from helper_packages.choice import Prompt
from IPython.display import HTML, display
import requests

list_studies = ['NCT04510376', 'NCT02340468','NCT05174689']
study = {}
for i in list_studies:
    study[i] = requests.get(f"https://clinicaltrials.gov/api/int/studies/download/{i}?format=json&fields=EligibilityCriteria%2CHealthyVolunteers%2CSex%2CMinimumAge%2CMaximumAge%2CBriefTitle%2CBriefSummary").text
study_choice = Prompt(study)
display(HTML("<h1>Choose Clinical Study</h1>\n<h3>Select a Clinical Study to continue, you may revisit this cell to choose another Clinical Study</h3>"))
display(study_choice.get_buttons())


In [7]:
# If the ClinicalTrials.gov URLs have network issues, uncomment and run this cell. If not, skip this cell.

# import os
# list_files = os.listdir('study')
# study_data = {}
# for i in list_files:
#     if '.json' in i:
#         with open(f'study/{i}','r') as f:
#             result = json.load(f)
#             result = json.dumps(result)
#             study_data[i] = result
# # Reuse the name study_choice so the cells below work unchanged
# study_choice = Prompt(study_data)
# display(HTML("<h1>Choose Clinical Study</h1>\n<h3>Select a Clinical Study to continue, you may revisit this cell to choose another Clinical Study</h3>"))
# display(study_choice.get_buttons())
    

### Prompt Engineering 

In this section we deconstruct the prompt to explain why it is written this way:

- Provide dynamic or important information in XML tags. It is important to provide prompts in the form the model is most [familiar with](https://docs.anthropic.com/claude/docs/use-xml-tags). Claude was trained on prompts with XML tags.

- _"You are a medical researcher checking if a patient is eligible for a clinical study."_ - Give the LLM a [role](https://docs.anthropic.com/claude/docs/give-claude-a-role#when-to-use-role-prompting) in this highly technical task.

- _FHIR healthcare record within `<patient>` tags_ - Help the LLM identify the type of information it is given. Validate the model's knowledge in one-off prompts.
- _"Please follow these steps to evaluate if the patient is suitable for the study in `<study>` tags:"_ - In a highly technical workflow, we can teach the model how to [think](https://docs.anthropic.com/claude/docs/let-claude-think#how-to-prompt-for-thinking-step-by-step).
    - It is important to guide the LLM to make decisions based on its understanding and, when information is missing, to tell the model which path it should take.

In [None]:
from datetime import datetime
import json
from botocore.config import Config
from helper_packages.Tokencounter import PrettyPrintModel
# To ensure boto3 has time to get a response 
config = Config(read_timeout=1000)

# Get choice from buttons above
result = patient_choice.get_choice()
trial = study_choice.get_choice()


today = datetime.today().strftime('%Y-%m-%d')
input_prompt = f"""
You are a medical researcher checking if a patient is eligible for a clinical study. You are given a FHIR healthcare record within <patient> tags.

<patient>{result}</patient>

Please follow these steps to evaluate if the patient is suitable for the study in <study> tags:

Step 1: Calculate the age of the patient based on today's date of {today}.
Step 2: Validate if the patient's age is eligible for the study. If the patient does not meet the study requirements, then the patient is not eligible; skip the other steps.
Step 3: Validate if the patient meets the study's gender requirements. If the patient does not meet the gender requirements, then the patient is not eligible; skip the other steps.
Step 4: Validate if the patient meets all inclusion requirements.
Step 5: If there is no information about whether the patient meets the criteria, then say the patient is not eligible for the study and skip the other steps.
Step 6: If there is no indication the patient meets the inclusion requirements, then say the patient is not eligible for the study and skip the other steps.
Step 7: If the patient meets any exclusion requirements, then the patient is not eligible; skip the other steps.
Step 8: If the study accepts healthy patients and the patient is healthy, then the patient is eligible for the study. If the patient's health status is unclear, then the patient is not eligible.


<study>{trial}</study>


Please provide your reasoning in a <reason> tag. In a <result> tag, answer "true" if the patient is eligible, "false" if the patient is not eligible, and "possible" if the patient may be eligible.
"""
import boto3
# Create the Bedrock runtime client with the extended read timeout defined above
brt = boto3.client(service_name='bedrock-runtime',config=config)


body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "messages": [
        {
            "role": "user",
            "content":[
                {"type": "text","text": input_prompt}
            ]
        }
    ],
    "max_tokens": 4096,  # Claude 3 models generate at most 4096 output tokens
    "temperature": 0.0,
    "top_p": 0.9,
    "top_k": 50
})

model_id = 'anthropic.claude-3-sonnet-20240229-v1:0'
accept = 'application/json'
content_type = 'application/json'
print(f"Invoking {model_id}.....")
print(f'Study: {json.loads(trial)["protocolSection"]["identificationModule"]["briefTitle"]}')
print(f'Patient Name: {json.loads(result)["entry"][0]["resource"]["name"]}')
response = brt.invoke_model(body=body, modelId=model_id, accept=accept, contentType=content_type)
# Print input prompt
# print(input_prompt)
# print("=" * 70)
single_token = PrettyPrintModel(response,model_id)
print(single_token)


### Review Claude Sonnet Response

In the response we can view the model's reasoning and final output. **Do you agree with the thought process it took to arrive at the conclusion in the `<result>` tag?**
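
For evaluation at scale, the verdict needs to be read by code rather than by eye. The sketch below extracts the `<reason>` and `<result>` tags with a regular expression and accepts only the three values the prompt asks for. The function name `parse_eligibility` and the sample text are illustrative assumptions, not part of the workshop's helper packages.

```python
import re

ALLOWED_RESULTS = {"true", "false", "possible"}

def parse_eligibility(text):
    """Extract the <reason> and <result> tag contents from a model response."""
    def tag(name):
        # DOTALL lets the reasoning span multiple lines
        match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
        return match.group(1).strip() if match else None

    result = tag("result")
    if result is not None:
        # The model may or may not wrap the value in quotes
        result = result.strip('"').lower()
    return {
        "reason": tag("reason"),
        # Anything outside the expected vocabulary is treated as unparsed
        "result": result if result in ALLOWED_RESULTS else None,
    }

# Made-up response text, for illustration only
sample = '<reason>Age 26 is within range.</reason>\n<result>"true"</result>'
parsed = parse_eligibility(sample)
print(parsed)
```

In this notebook the text to parse would presumably be the `content` attribute of the `PrettyPrintModel` object, which the chaining cells further down also rely on.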

We also see the cost of running the prompt 1,000 times, that is, matching 1,000 similarly sized FHIR records against this study. In the real world we might receive a list of 100 patients and check them against one study, or the reverse.

In all of these cases it is important to track the number of tokens and their effect on total cost.
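
The arithmetic behind such an estimate is simple: tokens divided by 1,000, multiplied by the per-1,000-token price, summed over input and output, then scaled by the number of runs. The sketch below shows this with placeholder token counts and prices. These numbers are assumptions for illustration and are not current Bedrock pricing, and the function `estimate_cost` is hypothetical rather than part of `PrettyPrintModel`.

```python
def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k, runs=1):
    """Estimate the cost in dollars of repeating one prompt `runs` times."""
    per_call = (
        input_tokens / 1000 * price_in_per_1k
        + output_tokens / 1000 * price_out_per_1k
    )
    return per_call * runs

# Placeholder values only; check the Bedrock pricing page for real prices
cost = estimate_cost(
    input_tokens=50_000,
    output_tokens=500,
    price_in_per_1k=0.003,
    price_out_per_1k=0.015,
    runs=1000,
)
print(f"Estimated cost for 1,000 runs: ${cost:,.2f}")
```

Because FHIR records are large, the input term usually dominates, which is the motivation for summarizing the record with a cheaper model first.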

### Chain Together Models

Much like guiding an LLM chatbot through a workflow with successive prompts, we can feed the output of one model to another. In the cell below, we summarize the patient's FHIR record using [Claude Haiku](https://www.anthropic.com/news/claude-3-family). This lowers the number of tokens sent to the next model (Sonnet).

In the next cells, notice the **time to execute the prompt** and **costs**. 
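
The chaining pattern itself is small: call one model, put its text into the next prompt, call the second model. The sketch below separates that flow from the Bedrock call so it can be exercised offline. Here `invoke` is a hypothetical callable that takes a model ID and a request body and returns the model's text, the prompts are shortened stand-ins for the ones in this notebook, and the model names are placeholders.

```python
import json

def build_body(prompt, max_tokens=1024, temperature=0.0):
    """Build a Messages API request body shaped like the ones in this notebook."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "messages": [{"role": "user", "content": [{"type": "text", "text": prompt}]}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    })

def chain(invoke, record, study, summarizer_id, reasoner_id):
    """Summarize with one model, then reason over the summary with another."""
    summary = invoke(
        summarizer_id,
        build_body(f"Summarize the FHIR record.\n<patient>{record}</patient>"),
    )
    # Only the summary, not the full record, reaches the second model
    verdict = invoke(
        reasoner_id,
        build_body(f"<patient>{summary}</patient>\n<study>{study}</study>\nIs the patient eligible?"),
    )
    return summary, verdict

# Stand-in for a real model call so the flow can be tested without AWS access
def fake_invoke(model_id, body):
    prompt = json.loads(body)["messages"][0]["content"][0]["text"]
    return f"[{model_id}] saw {len(prompt)} chars"

summary, verdict = chain(fake_invoke, "record-json", "study-json", "small-model", "large-model")
print(summary)
print(verdict)
```

With a real client, `invoke` could wrap `brt.invoke_model` and return the text of the first content block of the parsed response body.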

In [None]:
# Chain together requests


modelId_chain_1='anthropic.claude-3-haiku-20240307-v1:0'
# Get choice from buttons above
result = patient_choice.get_choice()

input_prompt_chain_1 = f"""
Summarize the following FHIR record in <patient> tags. Focus on extracting information related to the patient, the medication list, and past illnesses with their resolutions.

<patient>{result}</patient>

Here is an example of the output format:

<information>contain patient information</information>
<medication>contain medication list</medication>
<history>contain history with time it took to resolve</history>

"""
accept = 'application/json'
contentType = 'application/json'
print(f'Invoke model: {modelId_chain_1}')
body_chain_1 = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "messages": [
        {
            "role": "user",
            "content":[
                {"type": "text","text": input_prompt_chain_1}
            ]
        }
    ],
    "max_tokens": 4096,  # Claude 3 models generate at most 4096 output tokens
    "temperature": 0.0,
    "top_p": 0.9,
    "top_k": 50
})
print(f"Invoking {modelId_chain_1}.....")
response_chain_1 = brt.invoke_model(body=body_chain_1, modelId=modelId_chain_1, accept=accept, contentType=contentType)
# Print prompt input 
# print(input_prompt_chain_1)
# print("=" * 70)
token_chain_1 =PrettyPrintModel(response_chain_1,modelId_chain_1)
print(token_chain_1)

### Chaining Models Together - Review Haiku

We first invoke the Haiku model to summarize a patient's FHIR record before passing it to the next model. [Haiku](https://www.anthropic.com/news/claude-3-haiku) is great at quickly analyzing large datasets. By using a lower-priced model to produce the summary, we can lower the total cost, because only the summary is sent to the more intelligent model for reasoning.

A patient's medical history can be quite large, especially for patients with serious illnesses, so we need to keep each model's context window in mind. Both Haiku and Sonnet have a large context window of 200K tokens. By using two models, we send the large input (FHIR record) to Haiku (lower price) and the smaller input (summary + study) to Sonnet (complex reasoning).
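
Before sending a large record, it can help to check roughly whether it fits in the context window. The sketch below uses a crude characters-per-token heuristic, which is an assumption and not the model's real tokenizer; dense JSON may tokenize less favorably, so treat the result as a coarse pre-check only. The function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # context window stated above for Haiku and Sonnet

def rough_token_estimate(text, chars_per_token=4):
    """Very rough token count; a heuristic, not the model's tokenizer."""
    return len(text) // chars_per_token + 1

def fits_context(text, reserved_for_output=4096, window=CONTEXT_WINDOW_TOKENS):
    """Return True if the text plus room for the reply likely fits in the window."""
    return rough_token_estimate(text) + reserved_for_output <= window

small_record = "a" * 1_000
large_record = "a" * 1_000_000
print(fits_context(small_record), fits_context(large_record))
```

The token counts reported in an actual model response are the authoritative numbers; this check only helps avoid sending a request that is clearly too large.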

In [None]:
# Chain together requests

model_id_chain_2 = 'anthropic.claude-3-sonnet-20240229-v1:0'
# Get choice from buttons above
trial_chain_2 = study_choice.get_choice()
today = datetime.today().strftime('%Y-%m-%d')
input_prompt_chain_2 = f"""
You are a medical researcher checking if a patient is eligible for a clinical study. You are given summarized healthcare information in <patient> tags.

<patient>{token_chain_1.content}</patient>


Please follow these steps to evaluate if the patient is suitable for the study in <study> tags:

Step 1: Calculate the age of the patient based on today's date of {today}.
Step 2: Validate if the patient's age is eligible for the study. If the patient does not meet the study requirements, then the patient is not eligible; skip the other steps.
Step 3: Validate if the patient meets the study's gender requirements. If the patient does not meet the gender requirements, then the patient is not eligible; skip the other steps.
Step 4: Validate if the patient meets all inclusion requirements.
Step 5: If there is no information about whether the patient meets the criteria, then say the patient is not eligible for the study and skip the other steps.
Step 6: If there is no indication the patient meets the inclusion requirements, then say the patient is not eligible for the study and skip the other steps.
Step 7: If the patient meets any exclusion requirements, then the patient is not eligible; skip the other steps.
Step 8: If the study accepts healthy patients and the patient is healthy, then the patient is eligible for the study. If the patient's health status is unclear, then the patient is not eligible.


<study>{trial_chain_2}</study>


Please provide your reasoning in a <reason> tag. In a <result> tag, answer "true" if the patient is eligible, "false" if the patient is not eligible, and "possible" if the patient may be eligible.



"""
accept = 'application/json'
contentType = 'application/json'
print(f'Invoke model: {model_id_chain_2}')
body_chain_2 = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "messages": [
        {
            "role": "user",
            "content":[
                {"type": "text","text": input_prompt_chain_2}
            ]
        }
    ],
    "max_tokens": 4096,  # Claude 3 models generate at most 4096 output tokens
    "temperature": 0.0,
    "top_p": 0.9,
    "top_k": 50
})
print(f"Invoking {model_id_chain_2}.....")
response_chain_2 = brt.invoke_model(body=body_chain_2, modelId=model_id_chain_2, accept=accept, contentType=contentType)

# Optional
# print(input_prompt_chain_2)
print("=" * 70)
token_chain_2 = PrettyPrintModel(response_chain_2,model_id_chain_2)
print(token_chain_2)


### Chaining Models Together - Review Sonnet

We have completed the chain by taking the summary of the FHIR record, focused on specific conditions, and giving it as input to the Sonnet model.

Sonnet is more intelligent than Haiku and is a better [choice](https://aws.amazon.com/about-aws/whats-new/2024/03/anthropics-claude-3-sonnet-model-amazon-bedrock/) for complex reasoning and analysis.

See the performance and cost results in the following cell!

In [None]:
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
price_input = token_chain_1.raw_input_cost + token_chain_2.raw_input_cost
price_output = token_chain_1.raw_output_cost + token_chain_2.raw_output_cost
print("The estimated price for 1000 similar patients evaluated for the study. All prices are provided as examples, refer to https://aws.amazon.com/bedrock/pricing/ for the most up-to-date pricing\n")
print("Single Model")
print(f"Total Estimated Price\nInput: {single_token.input_cost}\nOutput: {single_token.output_cost}\nTotal {locale.currency(single_token.raw_input_cost+single_token.raw_output_cost)}\nLatency {single_token.latency}ms\n\n")
print("Chaining Model")
print(f"Total Estimated Price\nInput: {locale.currency(price_input)}\nOutput: {locale.currency(price_output)}\nTotal {locale.currency(price_input+price_output)}\nLatency {token_chain_1.latency + token_chain_2.latency}ms")