## Using an LLM with Kendra

Before running this notebook, run the notebook titled `deploy_openchatkit_on_sagemaker.ipynb` so that the model is deployed to an endpoint. This notebook uses that endpoint for inference.

We first query Kendra, then pass the retrieved passage to the model as context so that it can give a more precise answer to the query.

In [2]:
!pip install -U sagemaker --quiet

In [3]:
import boto3
import json
import sagemaker

s3=boto3.resource('s3')
region = boto3.session.Session().region_name
role = sagemaker.get_execution_role()

s3Bucket = sagemaker.Session().default_bucket()
smr_client = boto3.client("sagemaker-runtime")

In [4]:
# Define Kendra client
kendra = boto3.client('kendra')
indexId = '884609ed-c06c-452f-9383-b708df745995'  # Replace with your own index ID

In [5]:
query='What is the purpose of SageMaker GroundTruth'

In [6]:
response = kendra.query(
    QueryText=query,
    IndexId=indexId,
)
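
Before extracting text, it can help to see which result types came back. The helper below is a minimal sketch: `summarize_result_types` is a hypothetical name, the sample data is hand-written, and the only assumption about the response is that each entry in `ResultItems` carries a `Type` field, as the extraction cell relies on.

```python
from collections import Counter


def summarize_result_types(result_items):
    """Count how many results of each type a Kendra-style result list contains."""
    return Counter(item.get("Type", "UNKNOWN") for item in result_items)


# Hand-written sample standing in for response['ResultItems'] (not real Kendra output).
sample_items = [
    {"Type": "ANSWER"},
    {"Type": "DOCUMENT"},
    {"Type": "DOCUMENT"},
]
print(dict(summarize_result_types(sample_items)))
```

In the notebook, the same call would be `summarize_result_types(response['ResultItems'])`.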

In [7]:
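# Pick the passage to use as context: prefer a direct answer, otherwise fall back
# to the excerpt of the first (highest-ranked) document.
document_text = None
for query_result in response['ResultItems']:
    if query_result['Type'] == 'QUESTION_ANSWER':
        document_text = query_result['AdditionalAttributes'][1]['Value']['TextWithHighlightsValue']['Text']
        break
    elif query_result['Type'] == 'ANSWER':
        document_text = query_result['AdditionalAttributes'][0]['Value']['TextWithHighlightsValue']['Text']
        break
    elif query_result['Type'] == 'DOCUMENT' and document_text is None:
        document_text = query_result['DocumentExcerpt']['Text']

# Fail early with a clear message if Kendra returned nothing usable.
if document_text is None:
    raise ValueError("Kendra returned no usable results for the query")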
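
The selection logic can also be wrapped in a function, which makes the empty-result case explicit and lets it be tried without calling Kendra. This is a sketch under assumptions: `extract_passage` is a hypothetical name, the attribute indices are copied from the cell above, and the sample items are hand-written stand-ins with only the fields the function reads.

```python
def extract_passage(result_items):
    """Return the best available passage from Kendra-style result items, or None."""
    first_excerpt = None
    for item in result_items:
        result_type = item.get("Type")
        if result_type == "QUESTION_ANSWER":
            # Index 1 mirrors the notebook cell above.
            return item["AdditionalAttributes"][1]["Value"]["TextWithHighlightsValue"]["Text"]
        if result_type == "ANSWER":
            return item["AdditionalAttributes"][0]["Value"]["TextWithHighlightsValue"]["Text"]
        if result_type == "DOCUMENT" and first_excerpt is None:
            # Remember only the first document excerpt as a fallback.
            first_excerpt = item["DocumentExcerpt"]["Text"]
    return first_excerpt


# Hand-written sample items (not real Kendra output).
document_items = [
    {"Type": "DOCUMENT", "DocumentExcerpt": {"Text": "First excerpt"}},
    {"Type": "DOCUMENT", "DocumentExcerpt": {"Text": "Second excerpt"}},
]
answer_item = {
    "Type": "ANSWER",
    "AdditionalAttributes": [
        {"Value": {"TextWithHighlightsValue": {"Text": "Direct answer"}}}
    ],
}

print(extract_passage(document_items))
print(extract_passage(document_items + [answer_item]))
print(extract_passage([]))
```

Returning `None` instead of raising leaves it to the caller to decide whether to skip the LLM call or send the question without context.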

In [8]:
# Replace newlines with spaces so that words on adjacent lines do not run together
context = document_text.replace("\n", " ")
question = query
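
The question and the retrieved passage are combined into a single prompt for the model. The sketch below shows one way to assemble it; `build_prompt` is a hypothetical helper, and the newline-separated `Question` / `Context` / `Answer` layout is an assumption chosen for readability, not a format the model is known to require.

```python
def build_prompt(question, context):
    """Assemble a question-answering prompt from a question and a retrieved passage."""
    # Collapse newlines and repeated whitespace into single spaces.
    clean_context = " ".join(context.split())
    return f"Question: {question}\nContext: {clean_context}\nAnswer: "


# Example with made-up text.
print(build_prompt("What is X?", "X is a\nsample   passage."))
```

Ending the prompt with `Answer: ` gives the model a clear place to continue from and gives the post-processing step a marker to search for.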

In [9]:
# Get the endpoint from the notebook "deploy_openchatkit_on_sagemaker"
endpoint_name = 'gpt-neox-djl20-acc-2023-03-28-03-40-52-368-endpoint'

In [10]:
%%time

response_model = smr_client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=json.dumps(
        {
            # Newlines keep the question, context, and answer marker from running together
            "inputs": ["Question: " + question + "\nContext: " + context + "\nAnswer: "],
            "parameters": {
                "num_return_sequences": 1,
                "max_new_tokens": 100,
                "temperature": 0.6,
                "top_p": 0.8,
            },
        }
    ),
)

response_LLM = json.loads(response_model["Body"].read().decode("utf8"))

CPU times: user 12.7 ms, sys: 414 µs, total: 13.1 ms
Wall time: 22.6 s
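
The request and response handling can be gathered into one function so that it is reusable across queries. This is a sketch, not the deployed container's documented contract: `ask_llm` and `FakeRuntimeClient` are hypothetical names, the `{"outputs": [...]}` response shape is inferred from how this notebook reads the result, and passing `ContentType="application/json"` is an assumption (the original call omits it). The fake client only exists so the example runs without an endpoint.

```python
import io
import json


def ask_llm(client, endpoint_name, prompt, max_new_tokens=100):
    """Send one prompt to the endpoint and return the decoded JSON response."""
    payload = {
        "inputs": [prompt],
        "parameters": {
            "num_return_sequences": 1,
            "max_new_tokens": max_new_tokens,
            "temperature": 0.6,
            "top_p": 0.8,
        },
    }
    result = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumption: not set in the original call
        Body=json.dumps(payload),
    )
    return json.loads(result["Body"].read().decode("utf8"))


class FakeRuntimeClient:
    """Offline stand-in for the SageMaker runtime client; echoes the prompt."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        prompt = json.loads(Body)["inputs"][0]
        body = json.dumps({"outputs": [prompt + "stub answer"]}).encode("utf8")
        return {"Body": io.BytesIO(body)}


reply = ask_llm(FakeRuntimeClient(), "hypothetical-endpoint", "Question: q\nAnswer: ")
print(reply["outputs"][0])
```

In the notebook, `smr_client` and `endpoint_name` would take the place of the fake client and the placeholder endpoint name.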


In [11]:
import re

answer = re.search(r'Answer:\s*(.*)', response_LLM["outputs"][0])
if answer:
    answer_text = answer.group(1)
else:
    answer_text = "No answer found in the output"

print(answer_text)

The purpose of SageMaker GroundTruth is to help you build training data sets quickly and accurately using an active learning model to label data, combining machine learning and human interaction to make the model progressively better.
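
The pattern `Answer:\s*(.*)` stops at the first line break, so an answer that spans several lines would be cut short. The variant below is a sketch that captures everything after the marker; `extract_answer` is a hypothetical helper, and it assumes the endpoint returns the prompt followed by the generated text, as the regular expression above implies.

```python
import re


def extract_answer(generated_text, default="No answer found in the output"):
    """Return the text after the first 'Answer:' marker, including later lines."""
    # re.DOTALL lets '.' match newlines, so multi-line answers are kept whole.
    match = re.search(r"Answer:\s*(.*)", generated_text, flags=re.DOTALL)
    if not match:
        return default
    answer = match.group(1).strip()
    return answer if answer else default


# Example with made-up model output.
sample_output = "Question: q\nContext: c\nAnswer: line one\nline two"
print(extract_answer(sample_output))
```

Because the search takes the first `Answer:` it finds, a retrieved passage that itself contains that marker would shift the match; this sketch does not handle that case.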
