# Craft your style with AI: Building a Virtual Stylist Agent Using Amazon Bedrock and LangGraph


## Lab 1 - Populating the Amazon OpenSearch Serverless Vector database


In this lab, we will download our fashion dataset, create vector representations of it using the [Amazon Titan Multimodal Embeddings model](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-multiemb-models.html), and store those representations in [Amazon OpenSearch Serverless (AOSS)](https://aws.amazon.com/opensearch-service/features/serverless/).

This notebook is required if you would like the agent to be able to take the `/image_look_up` action. Otherwise, you can go directly to the `Step2_langgraph_agent.ipynb` notebook.

### Environment setup 
This notebook has been tested with the `conda_python3` Jupyter Notebook kernel on an `ml.t3.medium` instance.

### Install the requirements
Before getting started, let's install the prerequisite packages that make it easier to work with AOSS.

In [None]:
!pip install -q opensearch-py --quiet
!pip install -q requests_aws4auth --quiet

### Download the dataset locally

For our agent, we will use the Fashion Product Images Dataset, specifically the subset of western dresses.
This dataset is available on GitHub. Let's download the images by cloning the repository.

In [None]:
!git clone https://github.com/orbitalsonic/Fashion-Dataset-Images-Western-Dress.git

### Selecting 10 example images
Since this dataset is rather large, let's limit it to a small sample to reduce execution time and cost. We will keep only the first 10 images (sorted by file name) and delete the rest.

In [None]:
# Keep only the first 10 images (sorted by file name) to save time and cost
import os

current_dir = os.getcwd()
image_extensions = ('.jpg', '.jpeg', '.png')

relative_path = "Fashion-Dataset-Images-Western-Dress/WesternDress_Images"
image_folder = os.path.join(current_dir, relative_path)
# Match extensions case-insensitively so files such as "IMG.JPG" are included
image_files = sorted(
    f for f in os.listdir(image_folder) if f.lower().endswith(image_extensions)
)
images_to_keep = image_files[:10]

# Delete every image that is not part of the kept subset
for image in image_files[10:]:
    os.remove(os.path.join(image_folder, image))
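
Because the selection step deletes files in place, it can be useful to preview what would be kept and removed before touching the disk. The following is a minimal sketch of that idea. `split_keep_remove` is a hypothetical helper that is not part of the workshop code. It follows the same sort-then-slice logic as the cell above, with the added assumption that extensions should match case-insensitively, and it works on plain file names only.

```python
def split_keep_remove(filenames, limit=10, extensions=('.jpg', '.jpeg', '.png')):
    """Return (keep, remove) lists of image names without deleting anything.

    Hypothetical helper for illustration; non-image names are ignored.
    """
    images = sorted(f for f in filenames if f.lower().endswith(extensions))
    return images[:limit], images[limit:]


# Example with made-up file names
keep, remove = split_keep_remove(["b.jpg", "a.PNG", "notes.txt", "c.jpeg"], limit=2)
print(keep)    # ['a.PNG', 'b.jpg']
print(remove)  # ['c.jpeg']
```

In the notebook, the input would be `os.listdir(image_folder)`, and only the names in `remove` would be passed to `os.remove`.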

### Add all the dependencies/imports

Next, let's import some supporting libraries and start our boto3 session.

In [None]:
import os
import boto3
from opensearchpy import AWSV4SignerAuth, OpenSearch, RequestsHttpConnection
from dependencies.opensearch_utils import OpensearchIngestion

boto3_session = boto3.Session()
# Control-plane client for OpenSearch Serverless (AOSS)
client = boto3_session.client('opensearchserverless')
# Service name used when signing requests to the AOSS endpoint
service = 'aoss'
region = boto3_session.region_name
credentials = boto3_session.get_credentials()
# SigV4 signer that authenticates each request sent to the collection
AWSAUTH = AWSV4SignerAuth(credentials, region, service)

### Load parameters for AOSS collection and embedding setup
Next, we will load the parameters that describe the AOSS collection. Since we have already created the collection for you, you can read the required parameters from [AWS Systems Manager Parameter Store](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html).

In [None]:
ssm_client = boto3.client('ssm')

response = ssm_client.get_parameters(
    Names=[
        'AOSSCollectionName', 'AOSSEmbeddingSize', 'AOSSHost', 'AOSSIndexName'
    ]
)
param_dict = {}
for parameter in response['Parameters']:
    param_dict[parameter['Name']] = parameter['Value']
param_dict
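
`get_parameters` does not raise when a name is unknown; it lists such names under `InvalidParameters` in the response instead. A small check can therefore catch a missing parameter before it surfaces later as a `KeyError`. The sketch below relies on that behavior. `parameters_to_dict` is a hypothetical helper, and the sample response is made up for illustration.

```python
def parameters_to_dict(response, expected_names):
    """Turn an SSM get_parameters response into a dict, failing fast on gaps."""
    params = {p['Name']: p['Value'] for p in response.get('Parameters', [])}
    # Names SSM could not resolve, plus any expected name that is absent
    invalid = set(response.get('InvalidParameters', []))
    missing = sorted(invalid | (set(expected_names) - set(params)))
    if missing:
        raise KeyError(f"Missing SSM parameters: {missing}")
    return params


# Made-up response in the shape returned by ssm_client.get_parameters(...)
sample_response = {
    'Parameters': [
        {'Name': 'AOSSHost', 'Value': 'example-host'},
        {'Name': 'AOSSIndexName', 'Value': 'example-index'},
    ],
    'InvalidParameters': [],
}
print(parameters_to_dict(sample_response, ['AOSSHost', 'AOSSIndexName']))
```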

### Initialize an OpenSearch client
Next, we will initialize the OpenSearch client that connects to our AOSS host.

In [None]:
# Create the client with SSL/TLS enabled.
OSSclient = OpenSearch(
    hosts=[{'host': param_dict['AOSSHost'], 'port': 443}],
    http_auth=AWSAUTH,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    pool_maxsize=20,
    timeout=3000,
)
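
The `host` entry expects a bare host name. If the stored `AOSSHost` value happened to include the `https://` scheme (an assumption; the parameter may already be a bare host name), it would need to be stripped first. `bare_host` below is a hypothetical helper that sketches one way to do this, and the endpoint shown is a made-up example.

```python
from urllib.parse import urlparse


def bare_host(value):
    """Return only the host name from a URL or a host[:port] string."""
    # urlparse needs a scheme or a leading '//' to recognize the host part
    parsed = urlparse(value if '://' in value else f'//{value}')
    return parsed.hostname


print(bare_host("https://abc123.us-east-1.aoss.amazonaws.com"))
# abc123.us-east-1.aoss.amazonaws.com
```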

### Create an OpensearchIngestion instance
The `OpensearchIngestion` class (defined in `opensearch_utils.py`) contains helper functions for document processing and ingestion into the index.

In [None]:
oss_instance = OpensearchIngestion(
    client=OSSclient,
    session=boto3_session
)
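
The helper code in `opensearch_utils.py` is not shown in this notebook. Judging from how its return values are used during ingestion (`data["inputImage"]` and `embedding["embedding"]`), it presumably base64-encodes each image and sends it to the Titan Multimodal Embeddings model. The sketch below illustrates that idea only. The function names, the model ID `amazon.titan-embed-image-v1`, and the default output length of 1024 are assumptions, not the workshop's actual implementation.

```python
import base64
import json


def build_titan_request(image_bytes, output_embedding_length=1024):
    """Build the JSON-serializable request body for an image embedding."""
    return {
        "inputImage": base64.b64encode(image_bytes).decode("utf-8"),
        "embeddingConfig": {"outputEmbeddingLength": output_embedding_length},
    }


def embed_image(bedrock_runtime, image_path, model_id="amazon.titan-embed-image-v1"):
    """Return (request_body, response_body) for one image file.

    `bedrock_runtime` is assumed to be boto3.client("bedrock-runtime").
    """
    with open(image_path, "rb") as f:
        data = build_titan_request(f.read())
    response = bedrock_runtime.invoke_model(
        modelId=model_id,
        body=json.dumps(data),
        accept="application/json",
        contentType="application/json",
    )
    return data, json.loads(response["body"].read())


# The request builder can be checked offline, without calling Bedrock
print(build_titan_request(b"fake image bytes")["embeddingConfig"])
```

Under these assumptions, the output length would need to match the vector dimension of the index, which is presumably what the `AOSSEmbeddingSize` parameter records.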

### Ingest the images
Now that we have all the clients we need, let's create the embeddings and ingest the data into AOSS.

<div class="alert alert-block alert-warning">
<b>Attention:</b> This step will fail if you have not enabled access to the Amazon Titan Multimodal Embeddings model on Amazon Bedrock.
You can find more information on how to access models on Amazon Bedrock <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html">here</a>.
</div>

In [None]:
dataset_path = "Fashion-Dataset-Images-Western-Dress/WesternDress_Images/"

In [None]:
failed = []
for image_name in os.listdir(dataset_path):
    image = os.path.join(dataset_path, image_name)
    try:
        # Create the embedding; `data` carries the base64-encoded image
        (data, embedding) = oss_instance.create_titan_multimodal_embeddings(image_path=image)
        body = {
            "vector_field": embedding["embedding"],
            "image_b64": data["inputImage"],
        }
    except Exception as e:
        print(f"Exception thrown in image {image}: {e}")
        # Record the failure so it appears in the final summary
        failed.append(image)
        continue
    # Ingest the images one by one.
    status = oss_instance.client.index(
        index=param_dict['AOSSIndexName'],
        body=body,
    )
    if status["result"] != "created":
        failed.append(image)

print(f"Ingestion complete. Failed ingestion for the following: {failed}")
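
Once documents are indexed, a k-NN query against `vector_field` is one way to confirm that they are searchable. The sketch below only builds the query body. `build_knn_query` is a hypothetical helper, and it assumes the index maps `vector_field` as a `knn_vector` field. Newly indexed documents in AOSS may take a short while to become searchable.

```python
def build_knn_query(vector, k=3):
    """Build an OpenSearch k-NN query body for the `vector_field` field."""
    return {
        "size": k,
        # Leave the large base64 image and the raw vector out of the results
        "_source": {"excludes": ["image_b64", "vector_field"]},
        "query": {"knn": {"vector_field": {"vector": list(vector), "k": k}}},
    }


query = build_knn_query([0.1, 0.2, 0.3], k=2)
print(query["query"]["knn"]["vector_field"]["k"])  # 2
```

With the notebook's client, such a body could be passed as `OSSclient.search(index=param_dict['AOSSIndexName'], body=query)`, using a query embedding of the same length as the index dimension.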

### Next Steps
That is it! You have ingested data into AOSS!

Next, run the [Step 2 notebook](Step2_langgraph_agent.ipynb).

Cleanup will be done together with all other agent assets.