# Build Multimodal RAG with Amazon OpenSearch Service

In this notebook, you will build and run different types of search using a sample retail dataset. You will start with a text-only search, follow it with a multimodal semantic search that uses both text and image, and finally use conversational search, all from Amazon OpenSearch Service.

_Note: The retail dataset contains 2,465 retail product samples that belong to different categories such as accessories, home decor, apparel, housewares, books, and instruments. Each product contains metadata including the ID, current stock, name, category, style, description, price, image URL, and gender affinity of the product. You will mainly use the product image and product description fields in this solution; the price is indexed only for display._

Here are the key steps in this notebook:

1. Set up the OpenSearch connection and create connectors to the ML models.
2. Ingest data into the OpenSearch domain.
3. Experiment with different search options: lexical (text only), multimodal (semantic), and conversational search.
4. (Bonus / Optional) Deploy a web application to experiment with a shopping assistant chatbot.

## 1. Lab Pre-requisites


This notebook requires a few libraries. We'll use the Python clients for Amazon OpenSearch Service and Amazon Bedrock, and the OpenSearch ML Client library for generating multimodal embeddings.

### 1.1. Import libraries & initialize resources
The code blocks below will import all the relevant libraries and modules used in this notebook.

In [None]:
!pip install opensearch-py -q
!pip install opensearch_py_ml -q
!pip install deprecated -q
!pip install requests_aws4auth -q
print("Installs completed.")

In [None]:
# Import Python libraries
import boto3
import json
from opensearchpy import OpenSearch, RequestsHttpConnection
import os
import urllib.request
import tarfile
from requests_aws4auth import AWS4Auth
from ruamel.yaml import YAML
from PIL import Image
import base64
import re

# Initialize variables for later use
embedding_connector_id = ""
embedding_model_id = ""
llm_connector_id = ""
llm_model_id = ""

print("Imports and initialization completed.")

### 1.2. Get CloudFormation stack output variables

We have preconfigured a few resources by creating a CloudFormation stack in the account. Names and ARN of these resources will be used within this lab. Load the output variables here to be used in later parts of this notebook.

In [None]:
# Create a Boto3 session
session = boto3.Session()

# Get the account id
account_id = boto3.client('sts').get_caller_identity().get('Account')

# Get the current region
region = session.region_name

cfn = boto3.client('cloudformation')

# Helper to obtain the output variables from the CloudFormation stack.
def get_cfn_outputs(stackname):
    outputs = {}
    for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:
        outputs[output['OutputKey']] = output['OutputValue']
    return outputs

# Set up variables to use for the rest of the demo
cloudformation_stack_name = "multimodal-rag-opensearch"

outputs = get_cfn_outputs(cloudformation_stack_name)
aos_host = outputs['OpenSearchDomainEndpoint']
# s3_bucket = outputs['s3BucketTraining']
bedrock_inf_iam_role = outputs['BedrockBatchInferenceRole']
bedrock_inf_iam_role_arn = outputs['BedrockBatchInferenceRoleArn']
# sagemaker_notebook_url = outputs['SageMakerNotebookURL']
notebook_iam_role_arn = outputs['NotebookRoleArn']

# We will just print all the variables so you can easily copy if needed.
outputs
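
The lookup above can be tried offline. The following is a minimal sketch, assuming the `describe_stacks` response shape that `get_cfn_outputs` reads. The helper names `outputs_to_dict` and `missing_outputs` and the sample values are hypothetical and not part of the lab. Checking for missing keys up front gives a clearer message than a `KeyError` later in the notebook.

```python
def outputs_to_dict(stack_description):
    """Flatten the Outputs list of one CloudFormation stack description into a dict."""
    return {o["OutputKey"]: o["OutputValue"] for o in stack_description.get("Outputs", [])}


def missing_outputs(outputs, required_keys):
    """Return the required keys that are absent, preserving the requested order."""
    return [key for key in required_keys if key not in outputs]


# Sample stack description with made-up values (not real resources)
sample_stack = {
    "Outputs": [
        {"OutputKey": "OpenSearchDomainEndpoint",
         "OutputValue": "search-example.us-east-1.es.amazonaws.com"},
        {"OutputKey": "NotebookRoleArn",
         "OutputValue": "arn:aws:iam::123456789012:role/example-notebook-role"},
    ]
}

sample_outputs = outputs_to_dict(sample_stack)
required = ["OpenSearchDomainEndpoint", "BedrockBatchInferenceRole", "NotebookRoleArn"]
print(missing_outputs(sample_outputs, required))
```

In the notebook, the same check could be run on the real `outputs` dictionary before the variables are assigned.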

### 1.3. Retrieve internal OpenSearch credentials (for this step only)

Create a connection to OpenSearch using username and password.

In [None]:
# Connect to OpenSearch using the internal username and password obtained from AWS Secrets Manager
secrets_client = boto3.client('secretsmanager')
aos_credentials = json.loads(secrets_client.get_secret_value(SecretId=outputs['OpenSearchSecret'])['SecretString'])
auth = (aos_credentials['username'], aos_credentials['password'])

# Create OpenSearch client
aos_client = OpenSearch(
    hosts=[f'https://{aos_host}'],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)
print(f"Connected to OpenSearch endpoint :{aos_client}")

### 1.4 Map this Notebook's IAM role to OpenSearch backend role


In [None]:
# Define the role mapping that grants permissions to the AOS username and notebook_iam_role_arn
role_name = "all_access"

role_mapping = {
    "backend_roles": [notebook_iam_role_arn],
    "users" : [ aos_credentials['username'] ]
}

# Create the role mapping
response = aos_client.security.create_role_mapping(role=role_name, body=role_mapping)
print("Role mapping created:", response)
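
Writing a role mapping replaces the entries the role already had, so it can be safer to merge the new entries into the current mapping first. The sketch below shows only the merge step on plain dictionaries. The helper name `merge_role_mapping` and the sample ARNs are hypothetical, and fetching the current mapping from the domain is left out.

```python
def merge_role_mapping(existing, backend_roles=(), users=()):
    """Return a mapping body that keeps existing entries and adds each new one once."""
    merged = {}
    for key, additions in (("backend_roles", backend_roles), ("users", users)):
        values = list(existing.get(key, []))
        for value in additions:
            if value not in values:
                values.append(value)
        merged[key] = values
    return merged


# Made-up existing mapping and new entries
existing_mapping = {
    "backend_roles": ["arn:aws:iam::123456789012:role/existing-admin"],
    "users": ["master-user"],
}
merged_mapping = merge_role_mapping(
    existing_mapping,
    backend_roles=["arn:aws:iam::123456789012:role/example-notebook-role"],
    users=["master-user"],
)
print(merged_mapping)
```

The merged dictionary has the same shape as `role_mapping` above, so it could be passed as the request body in its place.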

### 1.5 Download the retail dataset to our Notebook instance

In [None]:
os.makedirs('tmp/images', exist_ok=True)
# Download the product metadata and the image archive
urllib.request.urlretrieve('https://aws-blogs-artifacts-public.s3.amazonaws.com/BDB-3144/products-data.yml', 'tmp/images/products.yaml')
img_filename, headers = urllib.request.urlretrieve('https://aws-blogs-artifacts-public.s3.amazonaws.com/BDB-3144/images.tar.gz', 'tmp/images/images.tar.gz')
print(img_filename)
# Extract the images, then remove the archive
with tarfile.open('tmp/images/images.tar.gz') as archive:
    archive.extractall('tmp/images/')
os.remove('tmp/images/images.tar.gz')
print("Data download and extraction completed.")

## 2. Create, register, and deploy the OpenSearch Service ML connector to the Amazon Titan Multimodal Embeddings G1 model


#### NOTE: AUTHENTICATION CELL

If at any point in this lab you get the failure message "The security token included in the request is expired", you can resolve it by running this cell again. The cell refreshes the security credentials that are required for the rest of the lab.

In [None]:
# Connect to OpenSearch using the IAM Role of this Jupyter notebook
# Create AWS4Auth instance
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(
    credentials.access_key,
    credentials.secret_key,
    region,
    'es',
    session_token=credentials.token
)

# Create OpenSearch client
aos_client = OpenSearch(
    hosts=[f'https://{aos_host}'],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=60
)
print("Connection details: ")
aos_client
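
Re-running the cell by hand can also be wrapped in a small retry helper. This is a sketch only: `call_with_refresh` is a hypothetical name, matching on the error message text is an assumption, and the fake client stands in for the real `aos_client`. In the notebook, `refresh_client` would be a function that repeats the client creation shown above.

```python
EXPIRED_TOKEN_TEXT = "security token included in the request is expired"


def call_with_refresh(operation, refresh_client):
    """Run operation(client); on an expired-token error, rebuild the client once and retry."""
    client = refresh_client()
    try:
        return operation(client)
    except Exception as error:
        if EXPIRED_TOKEN_TEXT not in str(error).lower():
            raise  # unrelated failure: do not hide it
        client = refresh_client()
        return operation(client)


# Fake client factory and operation, used only to demonstrate the retry
calls = {"clients": 0, "attempts": 0}


def fake_refresh():
    calls["clients"] += 1
    return "client-%d" % calls["clients"]


def fake_operation(client):
    calls["attempts"] += 1
    if calls["attempts"] == 1:
        raise RuntimeError("The security token included in the request is expired")
    return client + " ok"


result = call_with_refresh(fake_operation, fake_refresh)
print(result)
```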

### 2.1 Create the connector to Amazon Bedrock Multimodal embedding model.

In [None]:
## Create an OpenSearch remote model connector with Amazon Bedrock Titan MM Embedding model.

if not embedding_connector_id:
    payload = {
        "name": "Amazon Bedrock Connector: embedding",
        "description": "The connector to bedrock Titan multimodal embedding model",
        "version": 1,
        "protocol": "aws_sigv4",
        "credential": {
          "roleArn": f"arn:aws:iam::{account_id}:role/{bedrock_inf_iam_role}"
           },
        "parameters": {
        "region": region,
        "service_name": "bedrock",
        "model": "amazon.titan-embed-image-v1"
            },
        "actions": [
            {
          "action_type": "predict",
          "method": "POST",
          "url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
          "headers": {
            "content-type": "application/json",
            "x-amz-content-sha256": "required"
              },
            "request_body": "{ \"inputText\": \"${parameters.inputText:-null}\", \"inputImage\": \"${parameters.inputImage:-null}\" }",
          "pre_process_function": "\n    StringBuilder parametersBuilder = new StringBuilder(\"{\");\n    if (params.text_docs.length > 0 && params.text_docs[0] != null) {\n      parametersBuilder.append(\"\\\"inputText\\\":\");\n      parametersBuilder.append(\"\\\"\");\n      parametersBuilder.append(params.text_docs[0]);\n      parametersBuilder.append(\"\\\"\");\n      \n      if (params.text_docs.length > 1 && params.text_docs[1] != null) {\n        parametersBuilder.append(\",\");\n      }\n    }\n    \n    \n    if (params.text_docs.length > 1 && params.text_docs[1] != null) {\n      parametersBuilder.append(\"\\\"inputImage\\\":\");\n      parametersBuilder.append(\"\\\"\");\n      parametersBuilder.append(params.text_docs[1]);\n      parametersBuilder.append(\"\\\"\");\n    }\n    parametersBuilder.append(\"}\");\n    \n    return  \"{\" +\"\\\"parameters\\\":\" + parametersBuilder + \"}\";",
          "post_process_function": "\n      def name = \"sentence_embedding\";\n      def dataType = \"FLOAT32\";\n      if (params.embedding == null || params.embedding.length == 0) {\n          return null;\n      }\n      def shape = [params.embedding.length];\n      def json = \"{\" +\n                 \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n                 \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n                 \"\\\"shape\\\":\" + shape + \",\" +\n                 \"\\\"data\\\":\" + params.embedding +\n                 \"}\";\n      return json;\n    "
                }
          ] 
        }

    response = aos_client.transport.perform_request(
        'POST',
        '/_plugins/_ml/connectors/_create',
        body=json.dumps(payload),
        headers={"Content-Type": "application/json"}
    )
    
    embedding_connector_id = response['connector_id']
else:
    print(f"Connector already exists - {embedding_connector_id}")
    
print("Embedding connector ID: " + embedding_connector_id)
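
The two Painless scripts in the connector are hard to read as escaped strings. The following Python sketch mirrors their logic for illustration only; it is not code that OpenSearch runs, and the function names `pre_process` and `post_process` are chosen here. The first entry of `text_docs` is treated as the text and the second as the base64 image, as in the script above.

```python
import json


def pre_process(text_docs):
    """Mirror of pre_process_function: build the parameters from [text, image]."""
    parameters = {}
    if len(text_docs) > 0 and text_docs[0] is not None:
        parameters["inputText"] = text_docs[0]
    if len(text_docs) > 1 and text_docs[1] is not None:
        parameters["inputImage"] = text_docs[1]
    return {"parameters": parameters}


def post_process(embedding):
    """Mirror of post_process_function: wrap the raw embedding in a tensor-like dict."""
    if not embedding:
        return None
    return {
        "name": "sentence_embedding",
        "data_type": "FLOAT32",
        "shape": [len(embedding)],
        "data": embedding,
    }


# "aGVsbG8=" is a placeholder base64 string, not a real image
print(json.dumps(pre_process(["black sneakers", "aGVsbG8="])))
print(json.dumps(pre_process([None, "aGVsbG8="])))
print(post_process([0.1, 0.2, 0.3]))
```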

### 2.2 Register the model
Once the connector is created, register and deploy the model using the following 2 cells.

In [None]:
# Register the multimodal embedding model
if not embedding_model_id:
    # Prepare the payload
    payload = {
        "name": "Bedrock Titan mm embeddings model",
        "function_name": "remote",
        "description": "Bedrock Titan mm embeddings model",
        "connector_id": embedding_connector_id
    }
    # Make the request
    response = aos_client.transport.perform_request(
        'POST',
        '/_plugins/_ml/models/_register',
        body=json.dumps(payload),
        headers={"Content-Type": "application/json"}
    )
    embedding_model_id = response['model_id']
else:
    print("skipping model registration - model already exists")
print("Model registered under model_id: "+embedding_model_id)

### 2.3 Deploy the multimodal embedding model

In [None]:
# Deploy the embedding model
response = aos_client.transport.perform_request(
    'POST',
    '/_plugins/_ml/models/'+embedding_model_id+'/_deploy',
    headers={"Content-Type": "application/json"}
)
print("Deployment status of the model, "+embedding_model_id+" : "+response['status'])

### 2.4 Test the integration between the OpenSearch domain and Amazon Bedrock multimodal embedding model.

In [None]:
img = "tmp/images/footwear/2d2d8ec8-4806-42a7-b8ba-ceb15c1c7e84.jpg"
with open(img, "rb") as image_file:
    input_image_binary = base64.b64encode(image_file.read()).decode("utf8")
    
payload = {
"parameters": {
"inputText": "Sleek, stylish black sneakers made for urban exploration. With fashionable looks and comfortable design, these sneakers keep your feet looking great while you walk the city streets in style",
"inputImage":input_image_binary
}
}

response = aos_client.transport.perform_request(
    'POST',
    '/_plugins/_ml/models/'+embedding_model_id+'/_predict',
    body=json.dumps(payload),
    headers={"Content-Type": "application/json"}
)

try:
    embed = response['inference_results'][0]['output'][0]['data'][0:10]
    shape = response['inference_results'][0]['output'][0]['shape'][0]
    print("Embedding test completed.")
    print("First 10 dimensions:")
    print(str(embed))
    print("\n")
    print("Total: " + str(shape) + " dimensions")
except KeyError as e:
    print(f"KeyError: {e}")
    print("The response does not contain the expected data structure.")
except Exception as e:
    print(f"Error: {e}")
    print("An unexpected error occurred.")

## 🎉👏🎈Congrats! You have successfully connected to OpenSearch and created an integration with a multimodal embedding model!! 🎉👏🎈

## 3. Create the OpenSearch ingestion pipeline and index


Create an ingest pipeline that calls the Amazon Bedrock Titan Multimodal Embeddings model to convert the text and image into multimodal vector embeddings. An ingest pipeline is an OpenSearch feature that lets you define actions to be performed automatically at data ingestion time. These can be simple processing steps, such as adding a static field or modifying an existing field, or a call to a remote model whose inference output is stored together with the indexed document. In our case, the inference output is a vector embedding.

This ingest pipeline calls our remote model, converts the `product_description` and `image_binary` fields into a vector embedding, and stores it in a field called `vector_embedding`.
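
To make the data flow concrete, here is a small simulation of what such a processor does to one document. It is a sketch under stated assumptions: `apply_text_image_embedding` and `fake_embed` are hypothetical names, and the fake model only derives two numbers from the input lengths instead of calling Amazon Bedrock.

```python
def apply_text_image_embedding(document, field_map, embedding_field, embed):
    """Return a copy of document with an embedding computed from the mapped fields."""
    text = document.get(field_map.get("text"))
    image = document.get(field_map.get("image"))
    enriched = dict(document)
    if text is not None or image is not None:
        enriched[embedding_field] = embed(text, image)
    return enriched


def fake_embed(text, image):
    """Stand-in for the remote model: two numbers derived from the input lengths."""
    return [float(len(text or "")), float(len(image or ""))]


# "aGVsbG8=" is a placeholder base64 string, not a real image
doc = {"product_description": "black sneakers", "image_binary": "aGVsbG8=", "price": 59.99}
field_map = {"text": "product_description", "image": "image_binary"}

indexed = apply_text_image_embedding(doc, field_map, "vector_embedding", fake_embed)
print(indexed["vector_embedding"])
```

The original fields stay in the document, and the embedding is added next to them, which is why the index mapping later declares both the source fields and the vector field.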

### 3.1 Create the ingestion pipeline

In [None]:
pipeline_id = "bedrock-multimodal-ingest-pipeline"
payload = {
    "description": "A text/image embedding pipeline",
    "processors": [
        {
            "text_image_embedding": {
                "model_id": embedding_model_id,
                "embedding": "vector_embedding",
                "field_map": {
                    "text": "product_description",
                    "image": "image_binary"
                }
            }
        }
    ]
}
response = aos_client.ingest.put_pipeline(id=pipeline_id, body=payload)
response

### 3.2 Create the k-NN index

In [None]:
# Check if the index exists. Delete and recreate if it does. 
if aos_client.indices.exists(index='bedrock-multimodal-rag'):
    print("The index exists. Deleting...")
    response = aos_client.indices.delete(index='bedrock-multimodal-rag')
    
payload = {
  "settings": {
    "index.knn": True,
    "default_pipeline": "bedrock-multimodal-ingest-pipeline"
  },
  "mappings": {
      
    "_source": {
     
    },
    "properties": {
      "vector_embedding": {
        "type": "knn_vector",
        "dimension": shape,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "parameters": {}
        }
      },
      "product_description": {
        "type": "text"
      },
        "image_url": {
        "type": "text"
      },
      "image_binary": {
        "type": "binary"
      },
      "price": {
        "type": "float"
      }
    }
  }
}

print("Creating index...")
response = aos_client.indices.create(index='bedrock-multimodal-rag',body=payload)
response

### 3.3 Index the dataset to the index

In [None]:
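def resize_image(photo, width, height):
    Image.MAX_IMAGE_PIXELS = 100000000

    # verify() invalidates the image object, so the file is reopened before resizing
    with Image.open(photo) as image:
        image.verify()
    with Image.open(photo) as image:
        # Fail early with a clear message instead of returning undefined variables
        if image.format not in ["JPEG", "PNG"]:
            raise ValueError(f"Unsupported image format for {photo}: {image.format}")
        file_type = image.format.lower()
        path = image.filename.rsplit(".", 1)[0]

        image.thumbnail((width, height))
        image.save(f"{path}-resized.{file_type}")
    return file_type, path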

# Load the products from the dataset
yaml = YAML()
with open('tmp/images/products.yaml') as products_file:
    items_ = yaml.load(products_file)

batch = 0
count = 0
body_ = ''
batch_size = 100
action = json.dumps({ 'index': { '_index': 'bedrock-multimodal-rag' } })

for item in items_:
    count+=1
    fileshort = "tmp/images/"+item["category"]+"/"+item["image"]
    payload = {}
    payload['image_url'] = fileshort
    payload['product_description'] = item['description']
    payload['price'] = item['price']
    
    # Resize the image and generate the image binary
    file_type, path = resize_image(fileshort, 2048, 2048)
    resized_file = path+"-resized."+file_type

    with open(resized_file, "rb") as image_file:
        input_image = base64.b64encode(image_file.read()).decode("utf8")
    
    os.remove(resized_file)
    payload['image_binary'] = input_image
    
    body_ = body_ + action + "\n" + json.dumps(payload) + "\n"
    
    if(count == batch_size): # When count reaches batch size, send to bulk for indexing.
        response = aos_client.bulk(
        index = "bedrock-multimodal-rag",
        body = body_
        )
        batch += 1
        count = 0
        # Always reset the body; otherwise the last full batch is indexed a second time
        body_ = ""
        print("batch "+str(batch) + " ingestion done!")
        
            
# Ingest the remaining rows, if any
if body_:
    response = aos_client.bulk(
        index = "bedrock-multimodal-rag",
        body = body_
        )
    batch += 1
        
print("All "+str(batch)+" batches ingested into index")

### 3.4 Check indexing by running an OpenSearch query

In [None]:
res = aos_client.search(index="bedrock-multimodal-rag", body={"query": {"match_all": {}}})
print("Records found: %d." % res['hits']['total']['value'])
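
A document count confirms ingestion only indirectly. The sketch below shows two pieces that can be tested without a cluster: splitting documents into bulk request bodies without sending any batch twice, and listing per-item failures from a bulk response. The helper names `bulk_bodies` and `failed_items` are hypothetical, and the sample response is hand-written in the shape the Bulk API returns.

```python
import json


def bulk_bodies(documents, index_name, batch_size):
    """Yield newline-delimited bulk bodies holding at most batch_size documents each."""
    action = json.dumps({"index": {"_index": index_name}})
    lines = []
    for position, document in enumerate(documents, start=1):
        lines.append(action)
        lines.append(json.dumps(document))
        if position % batch_size == 0:
            yield "\n".join(lines) + "\n"
            lines = []
    if lines:  # remainder smaller than one batch
        yield "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Return the per-item results that carry an error in a bulk API response."""
    failures = []
    for item in bulk_response.get("items", []):
        for result in item.values():
            if "error" in result:
                failures.append(result)
    return failures


# 250 made-up documents split into batches of 100
docs = [{"product_description": "item %d" % i} for i in range(250)]
sizes = [body.count("\n") // 2 for body in bulk_bodies(docs, "bedrock-multimodal-rag", 100)]
print(sizes)

# Hand-written sample response with one failed item
sample_response = {"errors": True, "items": [
    {"index": {"_id": "1", "status": 201}},
    {"index": {"_id": "2", "status": 400, "error": {"type": "mapper_parsing_exception"}}},
]}
print(failed_items(sample_response))
```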

## 🎉👏🎈Congrats! You have successfully ingested sample data into OpenSearch using an ingest pipeline!!! 🎉👏🎈

## 4. Lexical and vector search

### 4.1 Lexical search
Try using different keywords and phrases to see different results.
Replace the `query` string with your search and then run the code block.

In [None]:
# Lexical search: Using query text only
query = "shoes"
index_name = "bedrock-multimodal-rag"
search_body = {
    "_source": {
        "exclude": ["vector_embedding"]
    },
    "query": {
        "match": {
            "product_description": {
                "query": query
            }
        }
    },
    "size": 5
}

# Perform the search
response = aos_client.search(
    body=search_body,
    index=index_name
)

#Output the results
count = 1
for hit in response['hits']['hits']:
    print(str(count) + ". " + hit["_source"]["product_description"])
    print("Price: "+ str(hit["_source"]["price"]))
    image = Image.open(hit["_source"]["image_url"])
    new_size = (300, 200)
    resized_img = image.resize(new_size)
    resized_img.show()
    count+=1
    print('')

### 4.2 Vector search with both image and text as inputs

In [None]:
# Multimodal Search using text and image as inputs
query_text = "travel"
query_image = "./simple_bag.jpg"

img = Image.open(query_image) 
print("Input text query: "+query_text)
print("Input query Image:")
img.show()

# Define the query and search body
with open(query_image, "rb") as image_file:
    query_image_binary = base64.b64encode(image_file.read()).decode("utf8")
search_body = {
    "_source": {
        "exclude": [
            "vector_embedding"
        ]
        },
        "query": {    
            "neural": {
                "vector_embedding": {           
                    "query_image":query_image_binary,
                    "query_text":query_text,     
                    "model_id": embedding_model_id,
                    "k": 5
                }
            }
        },
        "size":5
      }

# Perform the search
response = aos_client.search(
    body=search_body,
    index=index_name
)

#Output the results
print('Search results:')
count = 1
for hit in response['hits']['hits']:
    print(str(count) + ". " + hit["_source"]["product_description"])
    print("Price: "+ str(hit["_source"]["price"]))
    image = Image.open(hit["_source"]["image_url"])
    new_size = (300, 200)
    resized_img = image.resize(new_size)
    resized_img.show()
    count+=1
    print('')
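
The neural query accepts text, an image, or both. A small builder keeps the search body consistent across experiments. This is a sketch: `neural_search_body` is a hypothetical helper, and the field names are the ones used in the cell above.

```python
def neural_search_body(model_id, query_text=None, query_image=None, k=5,
                       excluded_fields=("vector_embedding",)):
    """Build a neural search body from text, a base64 image, or both."""
    if query_text is None and query_image is None:
        raise ValueError("Provide query_text, query_image, or both")
    neural = {"model_id": model_id, "k": k}
    if query_text is not None:
        neural["query_text"] = query_text
    if query_image is not None:
        neural["query_image"] = query_image
    return {
        "_source": {"exclude": list(excluded_fields)},
        "query": {"neural": {"vector_embedding": neural}},
        "size": k,
    }


# "example-model-id" and "aGVsbG8=" are placeholders
text_only = neural_search_body("example-model-id", query_text="travel")
text_and_image = neural_search_body("example-model-id", query_text="travel",
                                    query_image="aGVsbG8=")
print(sorted(text_only["query"]["neural"]["vector_embedding"]))
print(sorted(text_and_image["query"]["neural"]["vector_embedding"]))
```

In the notebook, the result could be passed as `body` to `aos_client.search` with the real `embedding_model_id`.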

## 5. Multimodal conversational search

Conversational search lets you ask questions in natural language, receive a text response based on a provided context, and ask follow-up questions that build on that context.

In this section, you will use OpenSearch Service as the knowledge base to run multimodal retrieval and augment the LLM prompt with the relevant context. You will use Claude v2 as the foundation model to generate responses and provide fashion advice to end users based on the available retail items stored in OpenSearch Service.


### 5.1 Create an OpenSearch Bedrock Claude LLM connector


In [None]:
if not llm_connector_id:
    connector_payload = {
        "name": "Amazon Bedrock Connector: Claude 2",
        "description": "The connector to bedrock Claude V2",
        "version": 1,
        "protocol": "aws_sigv4",
        "credential": {
            "roleArn": f"arn:aws:iam::{account_id}:role/{bedrock_inf_iam_role}"
        },
        "parameters": {
            "region": region,
            "service_name": "bedrock",
            "auth": "Sig_V4",
            "model": "anthropic.claude-v2"
        },
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "headers": {
                    "content-type": "application/json"
                },
                "url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
                "request_body": "{\"prompt\":\"\\n\\nHuman: ${parameters.inputs}\\n\\nAssistant:\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\\\n\\\\nHuman:\"]}"
            }
        ]
    }
    
    # Create the connector
    response = aos_client.transport.perform_request(
        'POST',
        '/_plugins/_ml/connectors/_create',
        body=connector_payload
    )
    # Print the response
    llm_connector_id = response['connector_id']
else:
    print(f"Connector already exists - {llm_connector_id}")
    
print("LLM connector ID: " + llm_connector_id)

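### 5.2 Register and deploy the Claude LLM model

In [None]:
# Register and deploy the LLM model
if not llm_model_id:
    payload = { 
        "name": "Amazon Bedrock Connector: Claude v2",
        "function_name": "remote",
        "description": "The connector to bedrock Claude v2",
        "connector_id": llm_connector_id
    }
    # Register the model
    response = aos_client.transport.perform_request(
        'POST',
        '/_plugins/_ml/models/_register',
        body=json.dumps(payload),
        headers={"Content-Type": "application/json"}
    )
    llm_model_id = response['model_id']
    # Deploy the model, as done for the embedding model in section 2.3
    response = aos_client.transport.perform_request(
        'POST',
        '/_plugins/_ml/models/'+llm_model_id+'/_deploy',
        headers={"Content-Type": "application/json"}
    )
else:
    print("Skipping model registration - model already exists")
print("LLM model ID: " + llm_model_id)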

### 5.3 Test the LLM connector by running an inference 
We will run the test inference without searching the index by passing an input directly to the LLM connector we just deployed.

In [None]:
# Prepare the query string
payload = {
    "parameters": {
        "inputs": "What are the most important features of a travel bag? Be concise."
    }
}
# Make the request
response = aos_client.transport.perform_request(
    'POST',
    '/_plugins/_ml/models/'+llm_model_id+'/_predict',
    body=payload,
    headers={"Content-Type": "application/json"}
)
# Check the response
if response['inference_results'][0]['status_code'] == 200:
    llm_gen = response['inference_results'][0]['output'][0]['dataAsMap']['completion']
    print("Claude generated response without context: \n\n"+llm_gen)
else:
    print("Inference failed: " + str(response['inference_results'][0]))

### 5.4 Enable the memory and RAG pipeline features of OpenSearch

In [None]:
# Set plugin settings using the cluster.put_settings method of the OpenSearch client
response = aos_client.cluster.put_settings(
    body={
        "persistent": {
            "plugins.ml_commons.memory_feature_enabled": "true",
            "plugins.ml_commons.rag_pipeline_feature_enabled": "true"
        }
    }
)
# Print response. Look for 'acknowledged': True
response

### 5.5 Create a search pipeline (shopping assistant)
For conversational search, the LLM needs the context of the entire conversation so that it can answer follow-up questions. Two components provide this:
- Conversational memory, which provides the conversation history
- A retrieval-augmented generation (RAG) pipeline, which provides the search context

The `retrieval_augmented_generation` processor, part of the RAG pipeline, is a search results processor that intercepts query results, retrieves previous messages from the conversation, and sends a prompt to a large language model (LLM). It saves the response in conversational memory and returns both the original OpenSearch query results and the LLM response.
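
The flow described above can be simulated in a few lines. This sketch is illustrative only: the prompt layout, the helper name `answer_with_rag`, and the fake LLM are invented here and do not reproduce the prompt that the OpenSearch processor actually builds.

```python
def answer_with_rag(question, hits, memory, llm, context_field="product_description",
                    context_size=5, message_size=5):
    """Build a prompt from search hits and recent history, call llm, record the exchange."""
    context = [hit["_source"][context_field] for hit in hits[:context_size]]
    history = memory[-message_size:] if message_size > 0 else []
    prompt_lines = ["Context:"] + ["- " + text for text in context]
    for past in history:
        prompt_lines.append("Previous question: " + past["input"])
        prompt_lines.append("Previous answer: " + past["response"])
    prompt_lines.append("Question: " + question)
    answer = llm("\n".join(prompt_lines))
    memory.append({"input": question, "response": answer})  # saved for follow-ups
    return {"hits": hits, "answer": answer}


def fake_llm(prompt):
    """Stand-in for the LLM: reports how much context and history it received."""
    return "Saw %d context items and %d earlier questions" % (
        prompt.count("\n- "), prompt.count("Previous question:"))


# Made-up search hits and an empty conversation memory
hits = [{"_source": {"product_description": "canvas travel bag"}},
        {"_source": {"product_description": "leather backpack"}}]
memory = []

first = answer_with_rag("for a long trip.", hits, memory, fake_llm)
second = answer_with_rag("What is best for air travel?", hits, memory, fake_llm)
print(first["answer"])
print(second["answer"])
```

The second call sees the first exchange because both calls share the same memory list, which is the role the memory ID plays in the search requests later in this section.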

Let's create the search pipeline with the retrieval_augmented_generation processor to use at search time.

In [None]:
# Prepare the response processor attributes
payload = {
    "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "bedrock_rag-pipeline_demo",
        "description": "Search pipeline using Bedrock Claude v2 Connector for RAG",
        "model_id": llm_model_id,
        "context_field_list": ["product_description"],
        "system_prompt": "You are a helpful shopping advisor that uses their vast knowledge of fashion tips to make great recommendations people will enjoy.",
        "user_instructions": "As a shopping advisor, be friendly and approachable. Greet the customer warmly. Evaluate each item provided in the context and provide a concise recommendation on how well each item matches the customer question, using the order and number of the search result related to each item. If there are items in the provided context that do not match the user question, explain that this may be due to insufficient items in the inventory. Finally, thank the customer and let them know you're available if they have any other questions."
      }
    }
  ]
}
# Make the request
response = aos_client.transport.perform_request(
    'PUT',
    "/_search/pipeline/multimodal_rag_pipeline",
    body=payload,
    headers={"Content-Type": "application/json"}
)
# Print response. Look for 'acknowledged': True
response

### 5.6 Create a conversational memory object

#### NOTE: Re-run the cell below to create a new memory ID

In [None]:
# Prepare the memory payload
payload = {
    "name": "Conversation about products"
}
# Make the request
response = aos_client.transport.perform_request(
    'POST',
    "/_plugins/_ml/memory/",
    body=payload,
    headers={"Content-Type": "application/json"}
)
# Persist memory_id
memory_id = response['memory_id']
# Print the 'memory_id'
print("The new memory id is: " +memory_id) 

### 5.7 Start a conversational search

In [None]:
# RAG using multimodal search to provide prompt context
# Text and image as inputs
query_text = "for a long trip."
query_image = "./simple_bag.jpg"

img = Image.open(query_image) 
print("Input text query: "+query_text)
print("Input query Image:")
img.show()

# Define the query and search body
with open(query_image, "rb") as image_file:
    query_image_binary = base64.b64encode(image_file.read()).decode("utf8")

response = aos_client.search(
    index='bedrock-multimodal-rag',
    body={
        "_source": {
            "exclude": ["vector_embedding", "image_binary"]
        },
        "query": {
            "neural": {
                "vector_embedding": {
                    "query_image": query_image_binary,
                    "query_text": query_text,
                    "model_id": embedding_model_id,
                    "k": 5
                }
            }
        },
        "size": 5,
        "ext": {
            "generative_qa_parameters": {
                "llm_model": "bedrock/claude",
                "llm_question": query_text,
                "memory_id": memory_id,
                "context_size": 5,
                "message_size": 5,
                "timeout": 60
            }
        }
    },
    params={
        "search_pipeline": "multimodal_rag_pipeline"
    },
    request_timeout=30
)
# Extract the generated 'shopping assistant' recommendations
# Split the string into lines
# lines = response['ext']['retrieval_augmented_generation']['answer'].split('\n')
recommendations = response['ext']['retrieval_augmented_generation']['answer']
# for line in lines:
#     if re.match(r'[^\s\0]+', line):
#         recommendations.append(line.strip())
    
# Output the search results and shopping assistant recommendations together
print('Search results and Shopping assistant recommendations:')
count = 1
for hit in response['hits']['hits']:
    print("Search result "+str(count) + ": ")
    print("Price: "+ str(hit["_source"]["price"]))
    print(hit["_source"]["product_description"])
    # print("Shopping assistant: ")
    # print(recommendations[count-1])
    image = Image.open(hit["_source"]["image_url"])
    new_size = (300, 200)
    resized_img = image.resize(new_size)
    resized_img.show()
    count+=1
    print('')
print("Shopping assistant: ")
print(recommendations)

### 5.8 Ask a follow up question

In [None]:
query_text = "thanks. What is likely to be best for air travel?"

print("Input text query: " + query_text)

# Perform the search (no new image this time)
response = aos_client.search(
    index='bedrock-multimodal-rag',
    body={
        "_source": {
            "exclude": ["vector_embedding", "image_binary"]
        },
        "query": {
            "neural": {
                "vector_embedding": {
                    #"query_image": query_image_binary, (leaving out the image for now)
                    "query_text": query_text,
                    "model_id": embedding_model_id,
                    "k": 5
                }
            }
        },
        "size": 5,
        "ext": {
            "generative_qa_parameters": {
                "llm_model": "bedrock/claude",
                "llm_question": query_text,
                "memory_id": memory_id,
                "context_size": 5,
                "message_size": 5,
                "timeout": 60
            }
        }
    },
    params={
        "search_pipeline": "multimodal_rag_pipeline"
    },
    request_timeout=30
)
# Extract the recommendations from the response
# Split the string into lines
# lines = response['ext']['retrieval_augmented_generation']['answer'].split('\n')
recommendations = response['ext']['retrieval_augmented_generation']['answer']
# for line in lines:
#     if re.match(r'[^\s\0]+', line):
#         recommendations.append(line.strip())
# Output the search results and shopping assistant recommendations together
print('Search results and Shopping assistant recommendations:')
count = 1
for hit in response['hits']['hits']:
    print("Search result "+str(count) + ": ")
    print(hit["_source"]["product_description"])
    # print("Shopping assistant: ")
    # print(recommendations[count-1])
    image = Image.open(hit["_source"]["image_url"])
    new_size = (300, 200)
    resized_img = image.resize(new_size)
    resized_img.show()
    count+=1
    print('')
print("Shopping assistant: ")
print(recommendations)
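
The first question and the follow-up differ only in the query, while the `ext` block stays the same apart from the question text. A sketch of a helper that attaches that block to any search body follows. The name `with_generative_qa` is hypothetical, and the default values are copied from the cells above.

```python
import copy


def with_generative_qa(search_body, question, memory_id, llm_model="bedrock/claude",
                       context_size=5, message_size=5, timeout=60):
    """Return a copy of search_body with the generative_qa_parameters block attached."""
    body = copy.deepcopy(search_body)
    body.setdefault("ext", {})["generative_qa_parameters"] = {
        "llm_model": llm_model,
        "llm_question": question,
        "memory_id": memory_id,
        "context_size": context_size,
        "message_size": message_size,
        "timeout": timeout,
    }
    return body


# Placeholder query and memory ID, for illustration only
base_body = {"query": {"match": {"product_description": {"query": "bag"}}}, "size": 5}
rag_body = with_generative_qa(base_body, "What is best for air travel?", "example-memory-id")
print(rag_body["ext"]["generative_qa_parameters"]["llm_question"])
```

Reusing the same memory ID across calls is what lets the follow-up question refer back to the earlier answer.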

### 5.9 Check the conversation history

To verify that the messages were added to the memory, provide the memory ID to the Get Messages API:

In [None]:
# Make the request
response = aos_client.transport.perform_request(
    'GET',
    "/_plugins/_ml/memory/"+memory_id +'/messages',
    headers={"Content-Type": "application/json"}
)
# Print the response. You should see a dictionary containing a list of messages.
#print(response) 
for message in response['messages']:
    print(message)
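
Printing raw dictionaries is hard to scan. The sketch below formats the history as numbered question and answer pairs. It assumes each message carries `input` and `response` fields, and falls back to a placeholder if a field is missing. The helper name `format_history` and the sample messages are hypothetical.

```python
def format_history(messages):
    """Format conversation messages as numbered question and answer pairs."""
    lines = []
    for number, message in enumerate(messages, start=1):
        lines.append("%d. Q: %s" % (number, message.get("input", "<no input>")))
        lines.append("   A: %s" % message.get("response", "<no response>"))
    return "\n".join(lines)


# Made-up messages in the assumed shape
sample_messages = [
    {"input": "for a long trip.", "response": "The canvas travel bag is a good fit."},
    {"input": "What is best for air travel?"},
]
history_text = format_history(sample_messages)
print(history_text)
```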

## 🎉👏🎈Congrats! You have successfully experimented with lexical, multimodal, and conversational search in Amazon OpenSearch Service!! 🎉👏🎈

# Bonus: Deploy your web application

To deploy this code as an application we will use [Streamlit](https://streamlit.io/). 

**Step 1:** Export the connector and model IDs you created earlier in this notebook. We will store them to a file so that they can be referenced outside of the notebook and persisted between kernel restarts.

In [None]:
connector_ids = {}
connector_ids['llm_connector_id'] = llm_connector_id
connector_ids['llm_model_id'] = llm_model_id
connector_ids['embedding_connector_id'] = embedding_connector_id
connector_ids['embedding_model_id'] = embedding_model_id
connector_ids['aos_host'] = aos_host

with open('connector_ids.json', 'w') as file:
    json.dump(connector_ids, file)
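
On the reading side, an application would load this file and should fail early if an ID is missing or empty, for example when the export ran before the connectors were created. The following is a sketch under that assumption. `load_connector_ids` is a hypothetical helper and is not taken from the lab's application code; the demo writes made-up values to a temporary folder.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("llm_connector_id", "llm_model_id", "embedding_connector_id",
                 "embedding_model_id", "aos_host")


def load_connector_ids(path):
    """Load the exported IDs and raise if any required value is missing or empty."""
    with open(path) as file:
        ids = json.load(file)
    missing = [key for key in REQUIRED_KEYS if not ids.get(key)]
    if missing:
        raise ValueError("Missing or empty values: " + ", ".join(missing))
    return ids


# Write a file with made-up values, then read it back
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "connector_ids.json")
    with open(path, "w") as file:
        json.dump({key: "example-" + key for key in REQUIRED_KEYS}, file)
    loaded = load_connector_ids(path)

print(loaded["aos_host"])
```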

**Step 2:** Since you will launch the app.py script using the Notebook instance's shell, you need to install the package libraries in the shell environment.

In [None]:
!pip install streamlit -q
print("Install completed.")

**Step 3:** Execute this code to get the URL to access the app.

This is the address of the Jupyter server running on this notebook instance. **NOTE: the URL won't work until you execute the `streamlit run` cell below.**

In [None]:
temp_url = boto3.client('sagemaker').create_presigned_notebook_instance_url(
    NotebookInstanceName='semantic-search-nb',
    SessionExpirationDurationInSeconds=1800
)
app_url = temp_url['AuthorizedUrl'].split('?')[0] + "/proxy/absolute/8501"
print("App URL to use after executing the next code block:")
app_url
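
The URL derivation above drops the query string, which holds the authorization token, and appends the proxy path for the Streamlit port. The same step as a testable function is sketched below. The name `proxy_url` is hypothetical and the sample URL is made up.

```python
def proxy_url(authorized_url, port=8501):
    """Drop the query string and append the Jupyter proxy path for the given port."""
    base = authorized_url.split("?")[0]
    return base.rstrip("/") + "/proxy/absolute/" + str(port)


# Made-up presigned URL in the general shape returned for a notebook instance
sample_url = "https://example-notebook.notebook.us-east-1.sagemaker.aws?authToken=abc123"
print(proxy_url(sample_url))
```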

**Step 4:** Run the Streamlit application from the shell. Then use the App URL above to access it.

In [None]:
!streamlit run app4.py --server.baseUrlPath="/proxy/absolute/8501" --theme.base=dark

## 🎉👏🎈Congrats! You have successfully deployed a shopping assistant chatbot web application and completed the Builder's session!! 🎉👏🎈