# 02. Prompt Building and Inference test

In this notebook we will test the inference. We will use LLM (Claude 2 on Amazon Bedrock) to convert a profile into suggested requirement. We then use text-to-embedding model (Amazon Titan Embedding on Bedrock) to find the suggested items which are closest to the requirement.

An example:
profile -> user bio or travel preference
requirement -> suggested ideal places to travel to
items -> the actual places that exist, which are the most relevant to the requirement.

So overall, an example flow would be like the following. The application provides user travel profile. It asks LLM on what places might be good for the user to travel to. One/few-shot can be used here. The text-embedding model will then convert that into an embedding. A search will be conducted to find the most relevant actual places in the database which match the suggested place by the LLM. The place's text description will be returned.

You can use this notebook to iteratively develop your prompt template. The next notebook will deploy the prompt template into the application.

### 1. Set up

In [None]:
!pip install psycopg2-binary -q

In [None]:
import boto3
import json, shutil, os, time, uuid
import psycopg2, psycopg2.extras
from helper.bastion import find_instances

bedrock = boto3.client("bedrock-runtime")
ssm = boto3.client("ssm")

### 2. Configure the database connection

Load the output variables from the CDK deployment

In [None]:
deployment_output = json.load(open("./deployment-output.json","r"))
rds_host = deployment_output["RecommenderStack"]["dbreaderendpoint"]
ssm_llm_parameter_name = deployment_output["RecommenderStack"]["ssmllmparametername"]
ssm_recommendation_parameter_name = deployment_output["RecommenderStack"]["ssmrecommendationparametername"]
bastion_asg = deployment_output["RecommenderStack"]["bastionhostasgname"]
bastion_id = find_instances(bastion_asg) if bastion_asg != "" else None
connect_to_db_via_bastion = False # Set to True if you are running this Notebook without VPC connection to the DB.

Configure the Database class to interact with the database

In [None]:
class Database():
    def __init__(self, reader, bastion_id=None, port=5432, database_name="vectordb"):
        self.reader_endpoint = reader
        self.username = None
        self.password = None
        self.port = port
        self.database_name = database_name
        self.bastion_id = bastion_id # Also indicates that DB commands are run via a bastion host with AWS SSM.
        self.conn = None
    
    def fetch_credentials(self):
        secrets_manager = boto3.client("secretsmanager")
        credentials = json.loads(secrets_manager.get_secret_value(
            SecretId='AuroraClusterCredentials'
        )["SecretString"])
        self.username = credentials["username"]
        self.password = credentials["password"]
    
    def connect_for_reading(self):
        if self.username is None or self.password is None: self.fetch_credentials()
        
        conn = psycopg2.connect(host=self.reader_endpoint, port=self.port, user=self.username, password=self.password, database=self.database_name)
        conn.autocommit = True
        self.conn = conn
        return conn
    
    def close_connection(self):
        if self.conn is not None:
            self.conn.close()
            self.conn = None
    
    def search(self, query_template, embedding, num_items=1, additional_query_parameters = []):
        all_query_parameters = [embedding, str(num_items)] + additional_query_parameters
        query_statement = query_template.format(*all_query_parameters)
        return self.query_database(query_statement, tuples_only_and_unaligned=True, verbose=False)
    
    def query_database(self, query, tuples_only_and_unaligned=False, verbose=True):
        if self.username is None or self.password is None: self.fetch_credentials()
        
        #print(query)
        
        if self.bastion_id is None or not connect_to_db_via_bastion:
            if self.conn is None: self.connect_for_reading()
            
            cur = self.conn.cursor(cursor_factory = psycopg2.extras.RealDictCursor)
            cur.execute(query)
            
            try:
                result = cur.fetchall()
            except Exception as e:
                if str(e) != "no results to fetch": print(e)
                result = cur.statusmessage
                
            if verbose: print(result)
            
            cur.close()
            return result
            
        else:
            query_id = str(uuid.uuid4())[:8]
            query_modifier = " -At" if tuples_only_and_unaligned  else ""
                
            query_command = f"""export PGPASSWORD='{self.password}' && echo "{query}" > ./q{query_id}.txt && psql -h {self.reader_endpoint} -p 5432 -U {self.username} -d {self.database_name} -F "=@#@=" -R "===@###@===" -f ./q{query_id}.txt {query_modifier} && rm ./q{query_id}.txt"""

            response = ssm.send_command(
                        InstanceIds=[self.bastion_id],
                        DocumentName="AWS-RunShellScript",
                        Parameters={'commands': [query_command]})

            command_id = response['Command']['CommandId']
            flight_flag = True
            while flight_flag:
                try:
                    output = ssm.get_command_invocation(
                      CommandId=command_id,
                      InstanceId=self.bastion_id
                    )
                    flight_flag = False

                    if output['StandardOutputContent'] != '':
                        records = list(map(lambda r: {"id": r.split("=@#@=")[0], "distance":r.split("=@#@=")[1], "description": r.split("=@#@=")[2]} , output["StandardOutputContent"].split("===@###@===")))
                        return records    
                    
                    output_string = ""
                    if output["StandardOutputUrl"] !=  '': output_string = output["StandardOutputUrl"]
                    if output["StandardErrorContent"] !=  '': output_string = output["StandardErrorContent"]
                    if output["StandardErrorUrl"] !=  '': output_string = output["StandardErrorUrl"]

                    if output["StandardErrorContent"] !=  '' or output["StandardErrorUrl"] !=  '':
                        print(output_string)

                    return output_string
                except:
                    time.sleep(1)
            return output_string

db = Database(reader=rds_host, bastion_id=bastion_id)

### 3. Build the prompt template

This is a prompt template to match with the mock dataset used. Feel free to change this one/few shot prompt example in accordance with your own dataset. Any value in {} are to be replaced during runtime. It is mandatory to have {0} to be the input text (the requirement/profile) and {1} to be the number of suggested item types to be requested. You can have more parameters, like {2}, {3}, and so on. During runtime, the AWS Lambda function will run `.format(*prompt_parameters)` on this prompt template. The `prompt_parameters` is a result of merge between `[input_text, num_types]` list and `additional_prompt_parameters` which defaults to `[]`. You can specify the `additional_prompt_parameters` in the API payload when doing the inference.

Note that the `\n\nHuman:` and `\n\nAssistant:` structure needs to be preserved to make it work with the Claude 2 model on Amazon Bedrock. Also, ***use '###'*** to separate the suggested items since the AWS Lambda function is configured to split the results with this token.

In [None]:
prompt_template = """
Human: You are a travel planner expert. Given a traveler profile, you can suggest best place types the person should visit in Singapore.
See the below example.

Suggest 2 place types to visit in Singapore for the below traveler profile.
===Traveler Profile Example===
Gender: Male
Interest: Making family and friends happy. I am a backpacker who loves to go to new places, especially places with many people.
I have gone to 76 countries and I love to try their local chocolates or biscuits. Sometimes I bring interesting local things for my colleagues.
===2 Place Type Suggestion Example===
###
Place type: Shop
Description: This shop sells very wide array of products including souverniers, chocolates, and biscuits, perfect for a treat for family with Singapore's unique items.
###
Place type: Traditional market
Description: This place has many shops and merchants selling goodies from Singapore, from chocolates to cheap t-shirts. This also has some authentic atmosphere of the old Singapore, perfect for selfies.
###

Now given the example above, suggest {1} place types to visit in Singapore for the below traveler profile. Answer straight with the data WITHOUT added introduction sentence.
===Traveler Profile===
{0}
==={1} Place Type Suggestion===

Assistant:
"""

# Store it on disk
path = "prompt_template.txt" # Do not change the naming of the file
f = open(path, "w")
f.write(prompt_template)
f.close()

### 4. Build the vector search query

Since this solution is customizable, it allows you to customer the vector search query. It will then be uploaded to S3 (in notebook 03) and be used by the Lambda function in the actual inference.

Note that the AWS Lambda that backs the API is set to run `.format(*parameters)` from this template, while `parameters` will be a merged array of `[embedding, num_items]` and any additional parameters you supply during inference time. For example, if you want to add more parameters for the WHERE clause or other part of the query, you can do so by adding {2}, {3}, and so on. You must remember to supply these parameters via `additional_query_parameters` when invoking the inference API. By default the `additional_query_parameters` is and empty list `[]`.

Another restriction is to always have the id and distance outputted and they must be the first and second column in the return result.

In [None]:
query_statement_template = "SELECT id, embedding <-> '{0}' AS distance, description FROM items ORDER BY distance LIMIT {1};"

# Store it on disk
path = "vector_search_query.txt" # Do not change the naming of the file
f = open(path, "w")
f.write(query_statement_template)
f.close()

In [None]:
# Assign the value for additional query parameters (if any) now for testing
additional_query_parameters = []

### 5. Define parameters

Load the default parameters from the AWS SSM Parameter Store, then store them on file to be used by the next notebook. 

The LLM related parameters definition can be found here https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html

The recommendation related parameters are:
* num_types = This is the number of the recommended item types to be returned by the LLM.
* num_items = This is the number of items to be returned by the vectorDB for each vector being searched
* model_id = This is the id of the model to be used in Amazon Bedrock. Please refer here https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html

If you set `num_types=2` and `num_items=3`, this means that given a text input, you request LLM to recommended **2** item types. For each, this solution will convert them into embedding and do vector search to find the top **3** actual items in the database. So in total, you will have 2 x 3 = 6 items to be returned, assuming there is no duplication. This solution will do deduplication so the actual items to be returned can be less than num_types x num_items

Note that `num_items` and `num_types` set on the recommendation parameters are just use as default values. They can be overridden during runtime if you specifcy those parameters in the API payload.

In [None]:
# Load the LLM related parameters
llm_parameters = json.loads(ssm.get_parameter(
    Name=ssm_llm_parameter_name
)['Parameter']['Value'])
print("The LLM parameters are:")
llm_parameters

In [None]:
# Load the recommendation related parameters
recommendation_parameters = json.loads(ssm.get_parameter(
    Name=ssm_recommendation_parameter_name
)['Parameter']['Value'])
print("The recommendation related parameters are:")
recommendation_parameters

### 6. Test inference

Define a test profile.
This test data is meant to be used with the mock dataset used in this solution. Feel free to change as appropriate.

In [None]:
new_input="""Female, 30 years old, married
I really want to celebrate my wedding anniversary with Husband. 
Somewhere where we can have a picnic with our own bento with wide view of sky and city."""

Applying prompt

In [None]:
additional_prompt_parameters = []
prompt_parameters = [new_input, recommendation_parameters['num_types']] + additional_prompt_parameters
prompt = prompt_template.format(*prompt_parameters)
print(prompt)

Define inference test function

In [None]:
def test_inference(prompt, num_items):
    # Get the recommended item text from LLM
    llm_parameters['prompt'] = prompt
    body = json.dumps(llm_parameters)

    # Call the LLM to get the suggested item types
    response = bedrock.invoke_model(body=body, modelId=recommendation_parameters['model_id'])
    recommended_item_types = json.loads(response.get("body").read())["completion"]
    recommended_item_types = recommended_item_types.split("###") if "\n###" in recommended_item_types else [recommended_item_types]
    recommended_item_types = list(filter(lambda x: x != '' and not x.isspace(), recommended_item_types))
    
    print("\nLLM suggested item types:")
    print(recommended_item_types)

    recommended_item_embeddings = []

    # Call the text-to-embedding model to get the embedding for each of the suggested item types.
    for item_type in recommended_item_types:
        # Get the embedding of the recommended item text
        body = json.dumps(
            {
                "inputText": item_type,
            }
        )

        response = bedrock.invoke_model(body=body, modelId="amazon.titan-embed-text-v1")
        recommended_item_embeddings.append(json.loads(response.get("body").read())["embedding"])

    recommended_items = []

    # Do search on vector database
    for embedding in recommended_item_embeddings:
        recommended_items = recommended_items + db.search(query_statement_template, 
                                      embedding, 
                                      num_items=recommendation_parameters['num_items'])

    # Deduplicate
    final_recommended_items = {}
    for item in sorted(recommended_items, key = lambda k: k["distance"]):
        if item['id'] not in final_recommended_items: final_recommended_items[item['id']] = item

    final_recommended_items = list({'id': v[1]['id'], 'distance': v[1]['distance'], 'description': v[1]['description']} for v in final_recommended_items.items())
    return final_recommended_items

Do inference test with `num_items` = 1 and `num_types` = 1

In [None]:
recommendation_parameters['num_items'] = 1
test_inference(prompt, recommendation_parameters['num_items'])

Now let's try with `num_items` = 2 and `num_types` = 1

In [None]:
recommendation_parameters['num_items'] = 2
test_inference(prompt, recommendation_parameters['num_items'])

Now let's try with `num_items` = 2 and `num_types` = 2

In [None]:
recommendation_parameters['num_types'] = 2
prompt_parameters = [new_input, recommendation_parameters['num_types'] ]
prompt = prompt_template.format(*prompt_parameters)

test_inference(prompt, recommendation_parameters['num_items'] )

If all is good, now let's verify and save the parameters to disk

In [None]:
print("Recommendation parameters are:")
recommendation_parameters

In [None]:
del llm_parameters['prompt']
print("LLM parameters are:")
llm_parameters

In [None]:
# Store it on disk
path = "llm_parameters.txt" # Do not change the naming of the file
f = open(path, "w")
f.write(json.dumps(llm_parameters))
f.close()

# Store it on disk
path = "recommendation_parameters.txt" # Do not change the naming of the file
f = open(path, "w")
f.write(json.dumps(recommendation_parameters))
f.close()