# Generate clinical plans from patient-physician audio interviews

This notebook demonstrates how to generate clinical plans from patient-physician audio interviews using AWS managed services and the Claude 3 family of general-purpose large language models.

## Prerequisites
- Verify that the account being used has been granted model access to Anthropic's Claude 3 Sonnet and Haiku. See the documentation here: [Amazon Bedrock Model Access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html)

## Instructions
1. The notebook is designed to run on an Amazon SageMaker Notebook Instance. For instructions on how to onboard to SageMaker Notebook Instances, refer to this [link](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html).

2. Update your SageMaker IAM role (created when you initially set up the SageMaker Notebook Instance) to contain the following AWS managed policies:

- [AmazonBedrockFullAccess](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonBedrockFullAccess.html)
- [AmazonTranscribeFullAccess](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonTranscribeFullAccess.html) 

You can find the SageMaker IAM role attached to your Notebook Instance from the **Amazon SageMaker Console** -> **Notebook Instance** in the section **Permissions and encryption**, as shown below:

![IAMROLE](assets/SageMaker-nbi-IAM-role.png)


## Introduction

This notebook shows how to transcribe and diarize pre-recorded conversations between patients and physicians, and how to use the Claude 3 model family to generate structured clinical notes.

As shown in the architecture diagram below, this Jupyter Notebook orchestrates:

1. The retrieval of patient-physician medical interviews from a public location
2. The upload to the default SageMaker S3 bucket
3. The execution of an **Amazon Transcribe** batch job to transcribe and diarize the recordings
4. The preparation of the structured prompt to generate the clinical plan
5. Generation of the clinical plan using the Claude 3 model family

![Architecture](assets/clinicalplans_genai.001.png)


## Environment Setup

Update the boto3 SDK to version **`1.33.0`** or higher.

In [None]:
!pip install botocore boto3 awscli tscribe pandas ipython --upgrade
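
After the upgrade, the installed version can be checked against the `1.33.0` minimum stated above. The following is a minimal sketch, not part of the original notebook: the helper names `version_tuple` and `meets_minimum` are hypothetical, and the comparison assumes plain `major.minor.patch` version strings. Versions are compared as integer tuples because a plain string comparison would rank `1.9.0` above `1.33.0`.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '1.34.2' into (1, 34, 2), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad to three components so '1.33' compares equal to '1.33.0'
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="1.33.0"):
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    installed = metadata.version("boto3")
    print("boto3", installed, "OK" if meets_minimum(installed) else "is older than 1.33.0")
except metadata.PackageNotFoundError:
    print("boto3 is not installed")
```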

## 1. Batch Transcription Using Python SDK

Set up the environment with the AWS clients and libraries:

In [None]:
import os
import time
import boto3
import json
import tscribe
import pandas
import datetime
from IPython.display import display_markdown, Markdown, clear_output
import sagemaker

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
region = boto3.session.Session().region_name

s3 = boto3.client('s3', region)
transcribe = boto3.client('transcribe', region)

#### 1.1. Download the recordings

We will use the sample recordings published as part of the supplemental materials of the following paper: Fareez, F., Parikh, T., Wavell, C. et al. "A dataset of simulated patient-physician medical interviews with a focus on respiratory cases." Sci Data 9, 313 (2022). https://doi.org/10.1038/s41597-022-01423-1

In [None]:
!curl -L --output data.zip https://springernature.figshare.com/ndownloader/files/30598530

In [None]:
!unzip -qq -o data.zip

In [None]:
prefix = "rawdata"
inputs = sagemaker_session.upload_data(path="Data", bucket=bucket, key_prefix=prefix)
print("input spec (in this case, just an S3 path): {}".format(inputs))

In the variable below, indicate the name of the recorded session you want to transcribe and summarise:  
- **`[object_name]`**: file name including the extension (e.g. RES0037.mp3)

In [None]:
object_name = "RES0038.mp3"

We will prefill the value of the `[job_name]` variable with a timestamp so that each Transcribe job gets a unique name.

In [None]:
timestamp = datetime.datetime.now().strftime("%Y-%m-%d-%H%M%S")
media_uri = "s3://%s/%s/%s/%s" % (bucket, prefix, "Audio Recordings", object_name)
job_name = "transcribe-%s-%s" % (object_name.split(".")[0],timestamp)
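
The naming scheme above can be wrapped in a small helper that also guards against file names Transcribe would reject. This is a sketch under stated assumptions: `make_job_name` is a hypothetical helper, and the limits it enforces (only letters, digits, `.`, `_` and `-`, at most 200 characters) reflect our reading of the Transcribe job name constraints and should be checked against the API reference.

```python
import datetime
import re

MAX_JOB_NAME_LENGTH = 200  # assumed Transcribe limit, see the lead-in


def make_job_name(object_name, now=None):
    """Build a unique job name such as 'transcribe-RES0038-2024-03-05-141501'."""
    now = now or datetime.datetime.now()
    timestamp = now.strftime("%Y-%m-%d-%H%M%S")
    # Drop the file extension, then replace characters outside the allowed set
    stem = object_name.rsplit(".", 1)[0]
    safe_stem = re.sub(r"[^0-9a-zA-Z._-]", "-", stem)
    # Trim the stem (never the timestamp) so the full name stays within the limit
    overhead = len("transcribe-") + len("-") + len(timestamp)
    safe_stem = safe_stem[: MAX_JOB_NAME_LENGTH - overhead]
    return "transcribe-%s-%s" % (safe_stem, timestamp)


print(make_job_name("RES0038.mp3"))
```

Passing `now` explicitly makes the helper deterministic, which is convenient when the same name needs to be rebuilt later.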

#### 1.2. Starting an Amazon Transcribe job
Invoke the **`start_transcription_job`** API to start a transcription job:

In [None]:
response = transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    LanguageCode='en-US',
    Media={
        'MediaFileUri': str(media_uri)
    },
    OutputBucketName=bucket,
    Settings={
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 2,
        'ChannelIdentification': False
    }
)
print(response)

#### 1.3. Checking job status

The code below calls the Transcribe **`get_transcription_job`** API to retrieve the status of the job we started in the previous step. If the status is neither `COMPLETED` nor `FAILED`, the code waits 5 seconds and retries until the job reaches a final state.

In [None]:
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)

print("Job status: " + status.get('TranscriptionJob').get('TranscriptionJobStatus'))

start_time = status.get('TranscriptionJob').get('StartTime')
completion_time = status.get('TranscriptionJob').get('CompletionTime')
diff = completion_time - start_time

print("Job duration: " + str(diff))
print("Transcription file: " + status.get('TranscriptionJob').get('Transcript').get('TranscriptFileUri'))
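
The polling loop above runs until the job reaches a final state, however long that takes. A bounded variant could look like the sketch below. `wait_for_job` and its parameters are hypothetical names introduced for illustration; the status lookup is passed in as a callable so that the logic can be exercised without calling AWS.

```python
import time


def wait_for_job(get_status, poll_seconds=5, timeout_seconds=900, sleep=time.sleep):
    """Poll get_status() until it returns COMPLETED or FAILED, or raise on timeout."""
    waited = 0
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if waited >= timeout_seconds:
            raise TimeoutError("job still %s after %d seconds" % (status, waited))
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated run: the job is queued, then in progress, then completed
simulated = iter(["QUEUED", "IN_PROGRESS", "COMPLETED"])
print(wait_for_job(lambda: next(simulated), sleep=lambda seconds: None))

# In the notebook, the callable could wrap the same API call used above:
# wait_for_job(lambda: transcribe.get_transcription_job(
#     TranscriptionJobName=job_name)['TranscriptionJob']['TranscriptionJobStatus'])
```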

#### 1.4. Analysing the transcription results
The code below downloads the JSON file generated by Transcribe (stored in the output bucket as **`<job_name>.json`**), parses it, and extracts the diarised transcription.

In [None]:
transcription_file = job_name + ".json"

transcription = s3.get_object(Bucket=bucket, Key=transcription_file)
body = json.loads(transcription['Body'].read())

s3.download_file(bucket, transcription_file, "output.json")

In [None]:
tscribe.write("output.json", format="csv", save_as="output.csv")

desired_width = 600
pandas.set_option('display.width', desired_width)

transcript = pandas.read_csv("output.csv",  names=["line", "start_time", "end_time", "speaker", "comment"], header=None, skiprows=1)
interaction = ["%s, %s: %s" % (segment[0], segment[1],segment[2]) for segment in transcript[['line','speaker', 'comment']].values.tolist()]
transcript
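
To make the diarisation step less opaque, the sketch below groups words into speaker turns directly from a Transcribe-style JSON document. It is an illustration only, not how `tscribe` is implemented: `speaker_turns` is a hypothetical helper, the `sample` document is hand-written rather than real output, and the field names (`speaker_labels`, `segments`, `items`, `alternatives`) follow our understanding of the Transcribe output format.

```python
def speaker_turns(transcribe_json):
    """Group consecutive words by speaker and return (speaker, text) tuples."""
    results = transcribe_json["results"]

    # Map each word's start time to the speaker label assigned to it
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    turns = []
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp: attach it to the current turn
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]


# Hand-written sample (assumption), shaped like a diarised Transcribe result
sample = {
    "results": {
        "speaker_labels": {
            "segments": [
                {"speaker_label": "spk_0", "items": [
                    {"start_time": "0.0", "speaker_label": "spk_0"}]},
                {"speaker_label": "spk_1", "items": [
                    {"start_time": "1.0", "speaker_label": "spk_1"},
                    {"start_time": "1.5", "speaker_label": "spk_1"}]},
            ]
        },
        "items": [
            {"type": "pronunciation", "start_time": "0.0", "alternatives": [{"content": "Hello"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
            {"type": "pronunciation", "start_time": "1.0", "alternatives": [{"content": "Hi"}]},
            {"type": "pronunciation", "start_time": "1.5", "alternatives": [{"content": "there"}]},
        ],
    }
}

for speaker, text in speaker_turns(sample):
    print("%s: %s" % (speaker, text))
```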

---

## 2. Generate clinical notes using Claude model family

### 2.1. Prompt engineering
Claude is trained to be a helpful, honest, and harmless assistant. It is used to speaking in dialogue, and you can instruct it in regular natural language as if you were making requests of a human. The quality of the instructions you give Claude can have a large effect on the quality of its outputs, especially for complex tasks. See https://docs.anthropic.com/claude/docs/intro-to-prompting to learn more about prompt engineering.

Structured enterprise-grade prompts may contain the following sections: 
1. **Task context**
1. Tone context
1. Background data, documents, and images
1. **Detailed task description & rules**
1. Examples
1. Conversation history
1. Immediate task description or request
1. Thinking step by step / take a deep breath
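
One way to keep such a prompt maintainable is to store each section separately and assemble them in a fixed order. The sketch below assumes hypothetical section keys (`task_context`, `rules`, and so on) that mirror the list above; neither the keys nor the `build_prompt` helper come from the Anthropic documentation.

```python
# Section keys in the order listed above (names are illustrative assumptions)
SECTION_ORDER = [
    "task_context",
    "tone_context",
    "background_data",
    "rules",
    "examples",
    "conversation_history",
    "immediate_task",
    "thinking",
]


def build_prompt(sections):
    """Join the provided sections in canonical order, skipping empty ones."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise ValueError("unknown sections: %s" % sorted(unknown))
    ordered = [sections[name].strip() for name in SECTION_ORDER if sections.get(name)]
    return "\n\n".join(ordered)


print(build_prompt({
    "rules": "Cite the transcript lines that support each claim.",
    "task_context": "You will be reading a transcript of a medical interview.",
}))
```

Sections may be supplied in any order; the helper always emits the task context first and the step-by-step instruction last.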

In our scenario, we will use a simplified prompt (template) that instructs the model to generate a structured summary of the transcribed conversation and to indicate the lines in the transcript that support each claim. This summary is divided into the following sections:

1. Chief complaint
1. History of present illness
1. Review of systems
1. Past medical history
1. Assessment
1. Plan
1. Physical examination

In [None]:
prompt = '''You will be reading a transcript of a recorded conversation between a physician and a patient. You will find the conversation within the transcript XML tags. Your goal is to summarise 
it, capture the most significant insights and propose the appropriate action plan under a section named ‘clinical plan’ that includes the following sections: Chief complaint; History of present 
illness; Review of systems; Past medical history; Assessment; Plan; Physical examination. For each claim you make, you need to indicate which lines of the transcript support it (please indicate 
only the line numbers within the tag <line></line>).
<transcript>
%s
</transcript>
''' % "\n".join(interaction)
print(prompt)
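
Because the prompt asks the model to cite supporting lines inside `<line></line>` tags, those citations can later be pulled out of a completion and matched against the transcript. The sketch below assumes the model writes single numbers, comma-separated lists, or ranges such as `4-6` inside the tags; the actual format may vary, and `cited_lines` is a hypothetical helper.

```python
import re


def cited_lines(completion):
    """Return the sorted transcript line numbers cited inside <line></line> tags."""
    numbers = set()
    for body in re.findall(r"<line>(.*?)</line>", completion, flags=re.DOTALL):
        # Accept single numbers ("9") as well as ranges ("4-6")
        for start, end in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", body):
            first = int(start)
            last = int(end) if end else first
            numbers.update(range(first, last + 1))
    return sorted(numbers)


example = "Cough for three days <line>2, 4-6</line>. No fever <line>9</line>."
print(cited_lines(example))
```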

### 2.2. Payload preparation and model invocation
Claude 3 models only support the Messages API, so we must format the body of our payload as follows:

In [None]:
accept = 'application/json'
contentType = 'application/json'
body = json.dumps(
    {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [
            {
                "role": "user",
                "content": [{
                    "type": "text",
                    "text": prompt,
                }],
            },
        ],
        "temperature": 0
    }
)
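
If the payload needs to be rebuilt with different settings, the same structure can be produced by a small function. This is a sketch: `build_messages_body` is a hypothetical helper, and the optional `system` argument relies on the top-level `system` field of the Messages API, which the cell above does not use.

```python
import json


def build_messages_body(prompt_text, max_tokens=1000, temperature=0, system=None):
    """Serialise a single-turn Messages API request body for Claude on Bedrock."""
    payload = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt_text}]},
        ],
    }
    if system:
        # Optional system prompt, kept separate from the user message
        payload["system"] = system
    return json.dumps(payload)


print(build_messages_body("Summarise the transcript.", max_tokens=2000))
```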

In [None]:
# If you are running this workshop during an AWS Instructor-Led lab, please uncomment the following line:
#region = "us-west-2"
bedrock_runtime = boto3.client('bedrock-runtime', region)

#### 2.2.1 Claude 3 Sonnet

By default, Bedrock returns the entire summary for a given prompt in a single response, which can be slow when the output contains a large number of tokens.

Below we explore how to stream the output so that the user can start consuming it while the model is still generating it. For this, Bedrock provides the `invoke_model_with_response_stream` API, which returns a `ResponseStream` that delivers the output in chunks.

Instead of waiting for the entire output, we receive smaller chunks from the model and display them as they arrive.


In [None]:
def teletype_model_response(stream):
    """Render streamed text chunks as Markdown, refreshing the cell output as they arrive."""
    output = []
    if stream:
        for event in stream:
            chunk = event.get('chunk')
            if chunk:
                chunk_obj = json.loads(chunk.get('bytes').decode())
                # Only content_block_delta events carry generated text
                if chunk_obj['type'] == 'content_block_delta':
                    text = chunk_obj['delta']['text']
                    output.append(text)
                    # Redraw the accumulated text in place
                    clear_output(wait=True)
                    display_markdown(Markdown(''.join(output)))

We will display the response as soon as the first chunk of text is returned, and keep updating it as further chunks arrive.

In [None]:
%%time
modelId = 'anthropic.claude-3-sonnet-20240229-v1:0'

# response = bedrock_runtime.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
# response_body = json.loads(response["body"].read())
# completion = response_body["content"][0]["text"]
# print(completion)
response = bedrock_runtime.invoke_model_with_response_stream(body=body, modelId=modelId, accept=accept, contentType=contentType)
teletype_model_response(response.get('body'))
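
The shape of the streamed events can also be explored without a notebook display or a Bedrock call. The sketch below feeds hand-built events, which imitate the `chunk`/`bytes` structure consumed by `teletype_model_response`, into a function that returns the full text. `collect_stream_text` and `fake_event` are hypothetical helpers, and the fake events are an assumption rather than captured output.

```python
import json


def collect_stream_text(stream):
    """Concatenate the text carried by content_block_delta events."""
    pieces = []
    for event in stream:
        chunk = event.get("chunk")
        if not chunk:
            continue
        payload = json.loads(chunk["bytes"].decode())
        # Other event types (message_start, message_stop, ...) carry no text
        if payload.get("type") == "content_block_delta":
            pieces.append(payload["delta"].get("text", ""))
    return "".join(pieces)


def fake_event(payload):
    """Wrap a dict the way the response stream wraps each chunk."""
    return {"chunk": {"bytes": json.dumps(payload).encode()}}


fake_stream = [
    fake_event({"type": "message_start"}),
    fake_event({"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Chief "}}),
    fake_event({"type": "content_block_delta", "delta": {"type": "text_delta", "text": "complaint"}}),
    fake_event({"type": "message_stop"}),
]
print(collect_stream_text(fake_stream))
```

Returning the text, rather than only displaying it, makes the streamed completion available to later cells.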


#### 2.2.2. Claude 3 Haiku

For comparison, we call `invoke_model` and print the response only once it has been returned in full.

In [None]:
%%time
modelId = 'anthropic.claude-3-haiku-20240307-v1:0'
response = bedrock_runtime.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response["body"].read())
completion = response_body["content"][0]["text"]
display_markdown(Markdown(completion))
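
With `max_tokens` set to 1000, a long clinical plan may be cut short. The non-streaming response body reports why generation stopped and how many tokens were used, so truncation can be detected. The sketch below uses a hand-written `sample_body` (an assumption, not real model output), and `summarise_response` is a hypothetical helper.

```python
def summarise_response(response_body):
    """Extract the text, a truncation flag, and token counts from a response body."""
    usage = response_body.get("usage", {})
    text = "".join(
        block.get("text", "")
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )
    return {
        "text": text,
        # stop_reason is "max_tokens" when the output hit the max_tokens limit
        "truncated": response_body.get("stop_reason") == "max_tokens",
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
    }


# Hand-written sample (assumption), shaped like a Messages API response
sample_body = {
    "content": [{"type": "text", "text": "Chief complaint: cough"}],
    "stop_reason": "max_tokens",
    "usage": {"input_tokens": 1500, "output_tokens": 1000},
}
print(summarise_response(sample_body))
```

When `truncated` is true, increasing `max_tokens` in the payload and invoking the model again is one possible remedy.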