# Use Vertex AI Vector Search to Recommend Similar Products

## Overview

This notebook demonstrates how to use Vertex AI Vector Search, a high-scale, low-latency service for finding similar vectors (or, more specifically, "embeddings") in a large dataset. It is a fully managed offering, which reduces operational overhead. It is built upon [Approximate Nearest Neighbor (ANN) technology](https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html) developed by Google Research.

### Terminology

This section reviews some key terminology related to vector search.

- **Vector**: A vector is a mathematical object that has both a magnitude and a direction. It is often represented as an array of numbers. In the context of machine learning, a vector is often used to represent a feature or an embedding.
- **Embedding**: An embedding is a mathematical representation of an object, such as a word, sentence, image, or sound. In the context of machine learning, an embedding is often used to represent a feature. For this lab, product embeddings are used to represent products. Each product embedding is a 768-dimension vector, meaning 768 numbers are used to represent the product.
- **Vector Search**: Vector search is the process of finding similar vectors (or embeddings) for a given query vector. It is often used to find or recommend similar products, images, or documents.
- **Index**: An index is a data structure that is used to store and retrieve embeddings. It is optimized for fast search and retrieval of similar embeddings.
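To make these terms concrete, the following is a minimal sketch of an exhaustive vector search in plain Python. It assumes cosine similarity as the measure of closeness and uses hypothetical 3-dimension toy embeddings (the names `cosine_similarity`, `nearest_neighbors`, and `toy_embeddings` are illustrative only and are not part of the lab). It is not how Vector Search is implemented; an index exists precisely to avoid scanning every embedding like this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, embeddings, k):
    """Return the k (id, similarity) pairs most similar to the query vector."""
    scored = [
        (item_id, cosine_similarity(query, vector))
        for item_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimension embeddings (the lab's embeddings have 768 dimensions)
toy_embeddings = {
    "shirt": [0.9, 0.1, 0.0],
    "blouse": [0.8, 0.2, 0.1],
    "boots": [0.0, 0.1, 0.9],
}

top_two = nearest_neighbors(toy_embeddings["shirt"], toy_embeddings, k=2)
print(top_two)
```

The query item itself ranks first because its similarity to itself is the maximum possible value, which is the same behavior you will see later when querying with a product's own embedding.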

### Objective

In this notebook, you learn how to create an Approximate Nearest Neighbor (ANN) index in Vertex AI (formerly known as AI Platform), query the index, and validate its performance.

The steps performed include:

* Create an Index
* Create an IndexEndpoint
* Deploy an Index to an IndexEndpoint
* Perform a vector search query
* Compare the results with the ground truth

The following diagram illustrates the different services involved:

![](assets/overview.png)

- The product embeddings are stored in Google Cloud Storage. The embeddings were generated from the product names using [Vertex AI embeddings for text](https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/text-embeddings).
- Each Vector Search Index is created in Vertex AI using the embeddings.
- There are two indexes created, one using the Approximate Nearest Neighbor algorithm (ANN) and the other using the Brute Force algorithm.
    - Nearest neighbor refers to finding the closest embedding to a given query embedding.
    - The ANN algorithm is optimized for fast search and retrieval of similar embeddings, although it is not guaranteed to find the true nearest neighbors, especially for large datasets.
    - The Brute Force algorithm will find the true nearest neighbor but is less efficient and not recommended for production. By comparing the results of the ANN and Brute Force algorithms, you can validate the performance of the ANN algorithm.
- To query the indexes, an IndexEndpoint is created and the indexes are deployed to the endpoint.
- IndexEndpoints can be deployed in private (VPC) or public mode. In this lab, the IndexEndpoints are deployed in private mode, which is the most performant and requires VPC peering to query the endpoints. Despite the name, public mode endpoints are not open to everyone; IAM security controls still apply. The lab environment has been set up with VPC peering so the notebook instance can query the endpoints.

*Note*: The index creation and deployment can take up to 30 minutes to complete. Because of this, they have automatically been deployed when you started this lab to save you time waiting. You will understand all of the code involved in creating and deploying the index before performing real-time queries against the index endpoints.

### Understanding the sample data

In this lab, you will use [TheLook dataset](https://console.cloud.google.com/marketplace/product/bigquery-public-data/thelook-ecommerce), which includes a `products` table (`bigquery-public-data.thelook_ecommerce.products`) with about 30,000 rows of synthetic product data for a fictitious e-commerce clothing site.

![](assets/Thelook.png)

From this table, a `product-embeddings.json` file has been prepared for this lab which includes 5000 product embeddings.

This file is in JSONL (JSON lines) format and each row has an `id` for the product id, `name` for the product name, and `embedding` for the embedding vector of the product name in 768 dimensions which was generated using [Vertex AI Embeddings for Text](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings). A sample row from the file is as follows (where the `embedding` vector shows only 3 of the 768 dimensions):

```json
{"id":"19536","name":"original penguin men's pro-bro mock sweater","embedding":[0.015607465989887714,0.0183266568928957,0.080682516098022461,...]}
```

The text embeddings represent the meaning of the clothing product names. In this lab, you will use Vertex AI Vector Search to complete a [semantic search](https://en.wikipedia.org/wiki/Semantic_search) of the items. This sample code can be used as a basis for other simple recommendation systems where you can quickly find other items similar to a given one.

## Installation

Install the following Python packages required to execute this notebook:

- `google-cloud-aiplatform` - The official Python client library for Vertex AI.
- `google-cloud-storage` - The official Python client library for Google Cloud Storage.
- `grpcio-tools` - The gRPC tools for Python. gRPC is a high-performance, open-source universal remote procedure call (RPC) framework.

You can ignore any pip errors about other dependencies as they do not impact this notebook.

In [None]:
# Install the packages
! pip3 install --upgrade google-cloud-aiplatform==1.42.1 \
                         google-cloud-storage \
                         grpcio-tools

## Before you begin
### Getting your project ID and number from `gcloud`

In [None]:
PROJECT_ID = ! gcloud config get project
PROJECT_ID = PROJECT_ID[0]

# Retrieve the project number
PROJECT_NUMBER = !gcloud projects list --filter="PROJECT_ID:'{PROJECT_ID}'" --format='value(PROJECT_NUMBER)'
PROJECT_NUMBER = PROJECT_NUMBER[0]

### Configuring the VPC network variable

To avoid network overhead that would add unnecessary latency, it is best to call the ANN endpoints from your VPC via a direct [VPC Peering](https://cloud.google.com/vertex-ai/docs/general/vpc-peering) connection. The lab's notebook is within a VPC named `ai-net`, which is peered with the Google private services VPC so it can query the private Vertex AI endpoints.

In [None]:
VPC_NETWORK = "ai-net"

### Configuring the Cloud Storage bucket

The lab environment created a storage bucket to store the embeddings used for building the Vector Search indexes.

In [None]:
BUCKET_URI = "{BUCKET_PLACEHOLDER}"

## Preparing the data

Download the product embedding dataset to the notebook environment. You will use the dataset to look up product names for the product IDs returned by the vector search.

In [None]:
! gsutil cp "gs://ca-lab-assets/vector-search/product-embeddings.json" .

Copy the data to the lab's Cloud Storage bucket where it is accessible to the Vertex AI services.

In [None]:
! gsutil cp "gs://ca-lab-assets/vector-search/product-embeddings.json" "$BUCKET_URI"

## Creating Indexes

Now everything is ready to load the embeddings into Vector Search indexes. The Vector Search APIs are available in the [aiplatform](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform) package of the Google SDK (Vertex AI was formerly known as AI Platform).

You will begin by reviewing the code to create the ANN index and then create the brute force index.

In [None]:
# init the aiplatform package
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location='us-central1')

Create a [MatchingEngineIndex](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndex) with its `create_tree_ah_index` function (Matching Engine is the previous name of Vector Search).

Calling the `create_tree_ah_index` function starts building an index. You do not need to run this call in the lab because the index was already created for you when the lab started. An index takes a few minutes to create for small datasets (such as the product embeddings in this lab); for larger datasets, expect an hour or more depending on the data size. You can check the status of the index creation on the Vector Search Console **INDEXES** tab:

- In the Google Cloud Console search bar, enter *vector search* and click the **Vector Search** result to open the Vertex AI Vector Search indexes view:

![](assets/indexes.png)

#### Function parameters

- `contents_delta_uri`: The URI of the Cloud Storage directory where you stored the embedding JSON files.
- `dimensions`: The dimension size of each embedding. In this case, it is 768 because you are using embeddings from the Text Embeddings API.
- `approximate_neighbors_count`: The number of similar items you want to retrieve in typical cases.

See [the documentation](https://cloud.google.com/vertex-ai/docs/vector-search/create-manage-index) for more details on creating the Index and the parameters.
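As a rough illustration of how these parameters fit together, here is a minimal sketch of an ANN index creation call. The display name, the `approximate_neighbors_count` value of 10, the placeholder bucket URI, and the helper names `tree_ah_index_args` and `create_ann_index` are all assumptions made for this example; the lab's actual values may differ. The SDK import is kept inside `create_ann_index` so the cell can be defined without triggering index creation.

```python
def tree_ah_index_args(bucket_uri, dimensions=768, approximate_neighbors_count=10):
    """Collect keyword arguments for MatchingEngineIndex.create_tree_ah_index."""
    return {
        "display_name": "product-index",  # hypothetical display name
        "contents_delta_uri": bucket_uri,  # Cloud Storage directory with the JSON files
        "dimensions": dimensions,  # 768 for the lab's text embeddings
        "approximate_neighbors_count": approximate_neighbors_count,  # assumed value
    }


def create_ann_index(bucket_uri):
    """Start building an ANN index (requires google-cloud-aiplatform and aiplatform.init)."""
    from google.cloud import aiplatform

    return aiplatform.MatchingEngineIndex.create_tree_ah_index(
        **tree_ah_index_args(bucket_uri)
    )


# Inspect the arguments without creating anything; "gs://your-bucket" is a placeholder
ann_args = tree_ah_index_args("gs://your-bucket")
print(ann_args)
```

In the notebook, calling `create_ann_index(BUCKET_URI)` would start the build, which is unnecessary here because the lab has already created the index.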

The brute force index is created in a similar way, using the `create_brute_force_index` function of `MatchingEngineIndex`.
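A minimal sketch of a brute force index creation call might look like the following. The display name, the placeholder bucket URI, and the helper names `brute_force_index_args` and `create_brute_force_index` are assumptions for illustration. Note that no `approximate_neighbors_count` is passed, since a brute force index performs an exhaustive search rather than an approximate one.

```python
def brute_force_index_args(bucket_uri, dimensions=768):
    """Collect keyword arguments for MatchingEngineIndex.create_brute_force_index."""
    return {
        "display_name": "product-index-brute-force",  # hypothetical display name
        "contents_delta_uri": bucket_uri,  # same embedding files as the ANN index
        "dimensions": dimensions,
    }


def create_brute_force_index(bucket_uri):
    """Start building a brute force index (requires google-cloud-aiplatform and aiplatform.init)."""
    from google.cloud import aiplatform

    return aiplatform.MatchingEngineIndex.create_brute_force_index(
        **brute_force_index_args(bucket_uri)
    )


# Inspect the arguments without creating anything; "gs://your-bucket" is a placeholder
brute_force_args = brute_force_index_args("gs://your-bucket")
print(brute_force_args)
```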

### Creating Index Endpoints

To use an Index, you need to create an [Index Endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public). It works as a server instance accepting query requests for your Index.

Note that it is possible to deploy multiple indexes to the same index endpoint. However, two index endpoints are used in this lab to avoid any confusion.
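For reference, a private index endpoint could be created along the following lines. This is a sketch under assumptions: the helper names `vpc_network_resource_name` and `create_private_endpoint` and the example project number are hypothetical, and the lab's own setup code may differ. The `network` argument expects the full network resource name, built from the project number rather than the project ID.

```python
def vpc_network_resource_name(project_number, vpc_network):
    """Build the full resource name of a VPC network from the project number."""
    return f"projects/{project_number}/global/networks/{vpc_network}"


def create_private_endpoint(display_name, project_number, vpc_network):
    """Create a VPC-peered index endpoint (requires google-cloud-aiplatform and aiplatform.init)."""
    from google.cloud import aiplatform

    return aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        network=vpc_network_resource_name(project_number, vpc_network),
    )


# "123456789012" is a made-up project number used only to show the format
network_name = vpc_network_resource_name("123456789012", "ai-net")
print(network_name)
```

In the notebook, the `PROJECT_NUMBER` and `VPC_NETWORK` variables set earlier would be passed to `create_private_endpoint`.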

In the Google Cloud Console, you can see the Index Endpoints you have created by navigating to the Vector Search Console **INDEX ENDPOINTS** tab:

![](assets/indexendpoints.png)

### Deploying the Indexes to the Index Endpoints

With the Index Endpoints available, deploy the Indexes by specifying unique deployed index IDs.
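The deployment step can be sketched as follows. The two deployed index IDs are the ones queried later in this notebook; the helper names `is_valid_deployed_index_id` and `deploy` are hypothetical, and the ID check assumes the documented rule that a deployed index ID starts with a letter and contains only letters, numbers, and underscores.

```python
import re

# Deployed index IDs used when querying later in this notebook
ANN_DEPLOYED_INDEX_ID = "product_deployed_index"
BRUTE_FORCE_DEPLOYED_INDEX_ID = "product_brute_force_deployed_index"


def is_valid_deployed_index_id(deployed_index_id):
    """Check the assumed rule: a letter first, then letters, digits, or underscores."""
    return re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", deployed_index_id) is not None


def deploy(index_endpoint, index, deployed_index_id):
    """Deploy a MatchingEngineIndex to a MatchingEngineIndexEndpoint."""
    if not is_valid_deployed_index_id(deployed_index_id):
        raise ValueError(f"Invalid deployed index ID: {deployed_index_id}")
    return index_endpoint.deploy_index(index=index, deployed_index_id=deployed_index_id)


print(is_valid_deployed_index_id(ANN_DEPLOYED_INDEX_ID))
print(is_valid_deployed_index_id("product-deployed-index"))  # hyphens are rejected
```

Validating the ID before calling `deploy_index` is a design choice for this sketch: it surfaces a naming mistake immediately instead of after a request to the service.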

Deploying the indexes can take <ins>up to 40 minutes</ins> to complete (although it may take as little as 10 minutes) and this is the main reason why the lab deploys them for you. <ins>You must ensure the indexes are deployed before querying them in the next section</ins>.

Return to the Vector Search Console's **INDEX ENDPOINTS** tab and periodically refresh the page to check the status of the index deployment (you must refresh the page to see the status change). When the indexes are deployed you will see green checkmarks in the **Deployed indexes** column:

![](assets/deployedindexes.png)



<div class="alert alert-block alert-warning">
<b>⚠️ The indexes must finish deploying before proceeding. Please wait until the Deployed indexes green check marks appear before continuing. Remember to refresh the page every few minutes. ⚠️</b>
</div>

## Performing a vector search query

Now that the indexes are deployed, you can perform a vector search query. The following code performs a vector search query using the ANN index first and then the brute force index for comparison.

### Getting an embedding to run a query

First, load the embedding JSON file to build two dictionaries, one mapping product IDs to product names and one mapping product IDs to embeddings.

In [None]:
import json

# build dictionaries for product names and embeddings
product_names = {}
product_embeddings = {}
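with open("product-embeddings.json") as f:
    # each line of the JSONL file is one JSON object describing a product
    for line in f:
        product = json.loads(line)
        product_id = product["id"]
        product_names[product_id] = product["name"]
        product_embeddings[product_id] = product["embedding"]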

With the `product_embeddings` dictionary, you can specify a product ID to get an embedding for it.

In [None]:
# get the embedding for ID 12711 "hurley juniors supersuede beachrider boardshort"
# you can also try with other IDs such as 18090, 19536 and 11863
query_embedding = product_embeddings["12711"]

Because the index endpoints were created for you when the lab started, you will need to retrieve them before use.

In [None]:
# list the endpoints created for the lab and tell them apart by display name
index_endpoints = aiplatform.MatchingEngineIndexEndpoint.list()
index_endpoint = None
index_brute_force_endpoint = None
for endpoint in index_endpoints:
    if "brute-force" not in endpoint.display_name:
        index_endpoint = endpoint
    else:
        index_brute_force_endpoint = endpoint

index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name=index_endpoint.resource_name
)
index_brute_force_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name=index_brute_force_endpoint.resource_name
)

Now you can query the index endpoints using the query embedding.

In [None]:
# Test query

NUM_NEIGHBOURS = 10 # The number of nearest neighbors to be retrieved

# Execute the request
response = index_endpoint.find_neighbors(
    deployed_index_id='product_deployed_index',
    queries=[query_embedding],
    num_neighbors=NUM_NEIGHBOURS,
)

response

The output displays the raw response, which includes the product IDs of similar products and their distances from the query. Note that the first product is the same product used for querying (ID 12711). The distances are the cosine similarity between the query embedding and the similar product embeddings. With cosine similarity, a value of 1 means the embeddings point in the same direction and -1 means they point in opposite directions. The `feature_vector` fields are all `None` because the query did not request that they be included in the response.

Using the IDs you can retrieve the product names from the `product_names` dictionary and confirm that the approximate nearest neighbors are indeed similar to the query product.

In [None]:
# show the results
for neighbor in response[0]:
    print(f"{neighbor.distance:.2f} {product_names[neighbor.id]}")

### Comparing to the ground truth

Use the deployed brute force index as the ground truth to assess the recall of the ANN index.

In [None]:
# Execute the request
brute_force_response = index_brute_force_endpoint.find_neighbors(
    deployed_index_id='product_brute_force_deployed_index',
    queries=[query_embedding],
    num_neighbors=NUM_NEIGHBOURS,
)
for neighbor in brute_force_response[0]:
    print(f"{neighbor.distance:.2f} {product_names[neighbor.id]}")

Observe that the more efficient ANN index is able to find the true nearest neighbors for the query product. You may repeat the comparison with other query embeddings if you have time remaining in your lab session.
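To put a number on the comparison, recall can be computed as the fraction of ground-truth neighbors that the ANN index also returned. The following is a minimal sketch; the function name `recall_at_k` and the ID lists are hypothetical and are not results from the lab.

```python
def recall_at_k(ann_ids, ground_truth_ids):
    """Fraction of ground-truth neighbor IDs that also appear in the ANN results."""
    truth = set(ground_truth_ids)
    if not truth:
        return 0.0
    hits = len(truth & set(ann_ids))
    return hits / len(truth)


# Hypothetical neighbor IDs: the ANN result misses one of four true neighbors
example_ann_ids = ["a", "b", "c", "x"]
example_truth_ids = ["a", "b", "c", "d"]
example_recall = recall_at_k(example_ann_ids, example_truth_ids)
print(example_recall)

# In the notebook, the inputs would come from the two responses, for example:
# recall_at_k([n.id for n in response[0]], [n.id for n in brute_force_response[0]])
```

A recall of 1.0 means the ANN index returned exactly the same set of neighbors as the brute force index for that query.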

## Summary

In this notebook, you used Vertex AI Vector Search to build a product similarity search system.
Along the way you learned about the following Vertex AI resources:

- Vector Search Indexes
- Vector Search Index Endpoints
- Vector Search Deployed Indexes

Although not explored in this notebook, the indexes can be updated by providing incremental JSONL files to the `update_embeddings` method. 
This allows for the addition of new embeddings to the index without the need to re-create the index from scratch.
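As a rough sketch of what an incremental update could involve, the following serializes new records to JSONL and wraps the `update_embeddings` call. The helper names `to_jsonl` and `update_index`, the example record, and the assumption that each line needs only `id` and `embedding` fields are illustrative; the file would still need to be uploaded to a Cloud Storage directory before the update is requested.

```python
import json


def to_jsonl(records):
    """Serialize records to JSON lines, one {"id", "embedding"} object per line."""
    lines = [
        json.dumps({"id": str(record["id"]), "embedding": record["embedding"]})
        for record in records
    ]
    return "\n".join(lines) + "\n"


def update_index(index, delta_uri):
    """Request an incremental update of a MatchingEngineIndex from a Cloud Storage directory."""
    return index.update_embeddings(contents_delta_uri=delta_uri)


# Hypothetical new product with a shortened embedding (real ones have 768 dimensions)
new_records = [{"id": 99999, "embedding": [0.1, 0.2, 0.3]}]
jsonl_text = to_jsonl(new_records)
print(jsonl_text)
```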

Return to the Cloud Academy Lab page to complete the lab.