# Embeddings - Vertex AI Vector Search

### Semantic Search Using Embeddings and Vertex AI Vector Search (Formerly Matching Engine)

- Semantic search is a type of search that uses the meaning of words and phrases to find relevant results.
- In this tutorial, we demonstrate how to do semantic search with Vector Search, using embeddings generated from the text dataset.
- Vertex AI Vector Search (formerly known as Matching Engine) is a vector database, which can find the most similar vectors from over a billion vectors. Matching Engine's ANN service can serve similarity-matching queries at high queries per second (QPS).

This pattern is more appropriate for larger datasets and production deployments. To demonstrate and explain the full workflow, this example pre-processes a small dataset locally and uploads it to a storage bucket. In a larger or production deployment, embedding generation and storage can be handled separately, and Vector Search provides index update strategies, as described later in this notebook.
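
To make the idea of semantic search concrete, here is a minimal brute-force sketch: every document is scored against the query by dot product and the top results are returned. The function name `top_k_by_dot_product` and the toy 3-dimensional vectors are assumptions for illustration only; Vector Search replaces this exhaustive scan with an approximate index so that it scales to very large collections.

```python
import numpy as np


def top_k_by_dot_product(query_vec, doc_vecs, k=3):
    """Return (index, score) pairs for the k documents with the highest dot product."""
    scores = doc_vecs @ query_vec          # one score per document
    order = np.argsort(-scores)[:k]        # indices of the k largest scores
    return [(int(i), float(scores[i])) for i in order]


# Toy 3-dimensional "embeddings" (hypothetical values, not real model output).
doc_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
])
query_vec = np.array([0.9, 0.1, 0.0])

results = top_k_by_dot_product(query_vec, doc_vecs, k=2)
print(results)
```

The first two documents point in roughly the same direction as the query, so they rank ahead of the third, which is orthogonal to it.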

## Install Vertex LLM SDK

Install the required libraries and initialise the Vertex AI SDK.

In [4]:
# Install Required Libraries
!pip3 install "google-cloud-aiplatform>=1.25" "shapely<2.0.0"



In [39]:
# Import Vertex AI SDK
PROJECT_ID = !gcloud config get project
PROJECT_ID = PROJECT_ID.n
LOCATION = "europe-west2"
LOCATION_DEPLOY = "europe-west2" #Location to deploy GCP resources

import vertexai
from google.cloud import aiplatform

vertexai.init(project=PROJECT_ID, location=LOCATION)

## Import TextEmbeddingModel

**Available models as of Sep 2023:**
| Models | Description |
| :- | :- |
| textembedding-gecko@001 | stable |
| textembedding-gecko@latest | public preview: an embeddings model with enhanced AI quality |
| textembedding-gecko-multilingual@latest | public preview: an embeddings model designed to use a wide range of non-English languages. |


Further documentation on available models can be found here: https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#generative-ai-get-text-embedding-python

In [40]:
from vertexai.preview.language_models import TextEmbeddingModel

model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")

## Import Required Packages

Any TensorFlow-related output can be ignored. It appears because this notebook runs on CPU only; a GPU is not required for this demonstration.


In [41]:
import json
import time

import numpy as np
import pandas as pd

## Create Embedding Dataset

The dataset is solely to demonstrate the use of the Text Embedding API with a vector database. It is not intended to be used for any other purpose, such as evaluating models. The dataset is small and does not represent a comprehensive sample of all possible text.

The following command copies the data JSON file from a Google Cloud Storage bucket; the data is stored locally alongside the notebook for use.

### Peek at the data

In [42]:
df = pd.read_csv('text_data.csv')
df.head(5)

| Unnamed: 0 | file_name | text |
| :- | :- | :- |
| 0 | whats-new-in-online-for-business.txt | `\nNo\n` |
| 1 | activate-cbo.txt | Activate your Commercial Banking Online accoun... |
| 2 | getting-started-with-the-business-mobile-app.txt | Getting started with the Business Mobile Banki... |
| 3 | account-management.txt | Account ManagementAccess & permissionsAdd or r... |
| 4 | confirmation-of-payee.txt | Make payments with confidenceConfirmation of P... |


In [11]:
cd ../help_guide

/home/jupyter/help_guide


### Get embeddings from the Google Embedding Model

The following code sends a request to the embedding model API to get the embedding vector for each entry in the dataset and stores it in a pandas DataFrame.

The embedding vectors produced by the textembedding-gecko model have 768 dimensions.

In [43]:
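def get_embedding(text):
    """Return the embedding vector for `text`, or an empty list if the request fails."""
    get_embedding.counter += 1
    try:
        # Pause briefly every 100 requests to throttle the request rate.
        if get_embedding.counter % 100 == 0:
            time.sleep(3)
        return model.get_embeddings([text])[0].values  # Send request to embedding model
    except Exception as err:
        # Report the failure instead of silently swallowing it.
        print(f"Embedding request {get_embedding.counter} failed: {err}")
        return []


get_embedding.counter = 0  # Number of requests sent so far

# This may take several minutes to complete.
df["embedding"] = df["text"].apply(get_embedding)

# Convert the embeddings into a Python list
embeddings_list = df["embedding"].values.tolist()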
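
The cell above throttles requests with a fixed pause and falls back to an empty list on failure, which leaves a gap in the dataset. One alternative, shown here only as a sketch, is to retry a failed request with exponentially growing delays. The helper `call_with_backoff` and the stand-in `flaky_embed` are hypothetical names introduced for illustration; in the notebook, the wrapped function would be the real embedding call.

```python
import time


def call_with_backoff(fn, *args, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(*args), retrying on any exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn(*args)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error instead of hiding it.
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


# Stand-in for an embedding request that fails twice before succeeding.
calls = {"n": 0}


def flaky_embed(text):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated transient error")
    return [0.1, 0.2, 0.3]


# Record the delays instead of actually sleeping, so the example runs instantly.
delays = []
vector = call_with_backoff(flaky_embed, "hello", sleep=delays.append)
print(vector, delays)
```

Passing `sleep` as a parameter is a design choice that keeps the helper easy to test; with the default `time.sleep` it waits for real.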

## Using Vertex AI Vector Search Approximate Nearest Neighbour (ANN) Service

Vertex AI Vector Search (formerly known as Matching Engine) is a vector database, which can find the most similar vectors from over a billion vectors. Matching Engine's ANN service can serve similarity-matching queries at high queries per second (QPS).

Further Details and Documentation: https://cloud.google.com/vertex-ai/docs/matching-engine/ann-service-overview

**Terminology**
- **Index:** A collection of vectors deployed together for similarity search. Vectors can be added to an index or removed from an index. Similarity search queries are issued to a specific index and will search over the vectors in that index.
- **Recall:** The percentage of true nearest neighbors returned by the index. For example, if a nearest neighbor query for 20 nearest neighbors returned 19 of the "ground truth" nearest neighbors, the recall is 19/20 × 100 = 95%.
- **Restricts:** Functionality to "restrict" searches to a subset of the index by using Boolean rules.
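
The recall definition above can be checked with a few lines of code. This is a sketch rather than part of the Vector Search API: `recall_at_k` is a hypothetical helper, and the ID lists are made up to reproduce the 19-out-of-20 example.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact nearest neighbours that the approximate result also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


exact_ids = [str(i) for i in range(20)]             # "ground truth" neighbours
approx_ids = [str(i) for i in range(19)] + ["99"]   # the index missed one of them

recall = recall_at_k(approx_ids, exact_ids)
print(f"{recall:.0%}")  # 95%
```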

### Generate Embedding File


The following code writes the embeddings to a JSON file that Vector Search can ingest. The file must meet these requirements:
- Encode the file using UTF-8.
- Make each line a valid JSON object to be interpreted as a record.
- Include in each record a field named id that requires a valid UTF-8 string that is the ID of the vector.
- Include in each record a field named embedding that requires an array of numbers. This is the feature vector.

Other file formats can be found here: https://cloud.google.com/vertex-ai/docs/matching-engine/match-eng-setup/format-structure#json
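
The format rules above can be sketched as a small writer and validator. Both `to_record` and `validate_record` are hypothetical helpers, not part of any Google library, and the 3-dimensional vector is a toy value; for this notebook `expected_dims` would be 768.

```python
import json


def to_record(vector_id, embedding):
    """Serialise one vector as a single-line JSON object with 'id' and 'embedding' fields."""
    return json.dumps({"id": str(vector_id), "embedding": [float(x) for x in embedding]})


def validate_record(line, expected_dims):
    """Check one line against the rules listed above; raise ValueError on a violation."""
    record = json.loads(line)  # invalid JSON raises JSONDecodeError, a ValueError subclass
    if not isinstance(record, dict):
        raise ValueError("each line must be a JSON object")
    if not isinstance(record.get("id"), str):
        raise ValueError("'id' must be a string")
    embedding = record.get("embedding")
    if not isinstance(embedding, list) or len(embedding) != expected_dims:
        raise ValueError(f"'embedding' must be a list of {expected_dims} numbers")
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in embedding):
        raise ValueError("'embedding' must contain only numbers")
    return record


line = to_record(0, [0.1, 0.2, 0.3])
print(line)
print(validate_record(line, expected_dims=3)["id"])
```

A check like this would also catch rows whose embedding request failed, since those carry an empty list rather than a full-length vector.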

In [45]:
ls

E1_Embeddings.ipynb     Load_data.ipynb  output/        web_scrapping.ipynb
E2_Vector_Search.ipynb  datasets/        text_data.csv


In [44]:
with open("./datasets/vector_search_dataset.json", "w", encoding="utf-8") as f:
    for i, embedding in enumerate(embeddings_list):
        # One JSON object per line; the row position is used as the vector id.
        f.write(json.dumps({"id": str(i), "embedding": embedding}) + "\n")

### Copy Dataset to Storage Bucket
The following command copies the data file to a cloud storage bucket. Folder structure documentation: https://cloud.google.com/vertex-ai/docs/matching-engine/match-eng-setup/format-structure#input_directory_structure

Supported Update Methods:
- **Batch:** To update the content of an existing Index, use the IndexService.UpdateIndex method.
- **Streaming:** With Streaming Updates, you can update and query your index within a few seconds. At this time, you can't use Streaming Updates on an existing index; you must create a new index.

Documentation: https://cloud.google.com/vertex-ai/docs/matching-engine/update-rebuild-index#update_index_content_with_batch_updates

In [24]:
!mkdir -p datasets
!gsutil copy ../userguides/datasets/vector_search_dataset.json gs://gen-ai-{PROJECT_ID}-bucket/embeddings/vs_root/vector_search_dataset.json

Copying file://../userguides/datasets/vector_search_dataset.json [Content-Type=application/json]...
/ [1 files][  1.0 MiB/  1.0 MiB]                                                
Operation completed over 1 objects/1.0 MiB.                                      


### Create an Index

Details on configuration parameters can be found here: https://cloud.google.com/vertex-ai/docs/matching-engine/configuring-indexes

**It can take up to 30 minutes to deploy.**


In [48]:
DIMENSIONS = 768
GS_URI = "gs://gen-ai-%s-bucket/embeddings/vs_root/" % PROJECT_ID

gen_ai_index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
    display_name="Gen AI Index",
    contents_delta_uri=GS_URI,
    dimensions=DIMENSIONS,
    approximate_neighbors_count=5,
    distance_measure_type="DOT_PRODUCT_DISTANCE",
    leaf_node_embedding_count=10,
    leaf_nodes_to_search_percent=80,
    description="Example Index for Gen AI Playpen",
    location=LOCATION_DEPLOY
)

Creating MatchingEngineIndex
Create MatchingEngineIndex backing LRO: projects/474327682772/locations/europe-west2/indexes/6617054490002456576/operations/1843185008919969792
MatchingEngineIndex created. Resource name: projects/474327682772/locations/europe-west2/indexes/6617054490002456576
To use this MatchingEngineIndex in another session:
index = aiplatform.MatchingEngineIndex('projects/474327682772/locations/europe-west2/indexes/6617054490002456576')
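
As a rough intuition for two of the parameters used above, the arithmetic below estimates how many leaf nodes the tree holds and how many of them a query visits. This is a simplified back-of-the-envelope model, not a description of how the service actually partitions data; `estimate_leaf_search` is a hypothetical helper and the dataset size of 50 vectors is an assumed value.

```python
import math


def estimate_leaf_search(num_vectors, leaf_node_embedding_count, leaf_nodes_to_search_percent):
    """Rough estimate of how many leaves exist and how many a single query visits."""
    num_leaves = math.ceil(num_vectors / leaf_node_embedding_count)
    leaves_searched = math.ceil(num_leaves * leaf_nodes_to_search_percent / 100)
    return num_leaves, leaves_searched


# 50 vectors is a hypothetical dataset size; 10 and 80 are the values passed above.
print(estimate_leaf_search(50, 10, 80))
```

Under this simplified model, searching a high percentage of leaves favours recall over speed, which is a reasonable trade-off for a dataset this small.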


### Create an Index Endpoint

The following call creates an index endpoint, which allows queries to be sent to the index.

In [49]:
gen_ai_index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name="Gen AI Index Endpoint",
    description="Example Index for Gen AI Playpen",
    public_endpoint_enabled=True,
    location=LOCATION_DEPLOY
)

Creating MatchingEngineIndexEndpoint
Create MatchingEngineIndexEndpoint backing LRO: projects/474327682772/locations/europe-west2/indexEndpoints/5044735270096732160/operations/5013719146588798976
MatchingEngineIndexEndpoint created. Resource name: projects/474327682772/locations/europe-west2/indexEndpoints/5044735270096732160
To use this MatchingEngineIndexEndpoint in another session:
index_endpoint = aiplatform.MatchingEngineIndexEndpoint('projects/474327682772/locations/europe-west2/indexEndpoints/5044735270096732160')


### Deploy the Index to the Index-Endpoint

In [50]:
gen_ai_index_endpoint = gen_ai_index_endpoint.deploy_index(
    index=gen_ai_index, deployed_index_id="gen_ai_deployed_index",
    machine_type="e2-standard-16",
    min_replica_count=1,
    max_replica_count=1
)

gen_ai_index_endpoint.deployed_indexes

Deploying index MatchingEngineIndexEndpoint index_endpoint: projects/474327682772/locations/europe-west2/indexEndpoints/5044735270096732160
Deploy index MatchingEngineIndexEndpoint index_endpoint backing LRO: projects/474327682772/locations/europe-west2/indexEndpoints/5044735270096732160/operations/3266322491169046528
MatchingEngineIndexEndpoint index_endpoint Deployed index. Resource name: projects/474327682772/locations/europe-west2/indexEndpoints/5044735270096732160


[id: "gen_ai_deployed_index"
index: "projects/474327682772/locations/europe-west2/indexes/6617054490002456576"
create_time {
  seconds: 1701432534
  nanos: 15607000
}
index_sync_time {
  seconds: 1701433450
  nanos: 253330000
}
deployment_group: "default"
dedicated_resources {
  machine_spec {
    machine_type: "e2-standard-16"
  }
  min_replica_count: 1
  max_replica_count: 1
}
]

### Additional Functions
Replace the project, region and IDs below to retrieve an existing index and index endpoint. The IDs can be found in the Cloud Console under Vertex AI, then Vector Search in the left-hand bar. Retrieving these objects lets you call functions on them (such as delete or query) after the notebook kernel has been reset.

In [56]:
gen_ai_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="projects/playpen-b57463/locations/europe-west2/indexEndpoints/5044735270096732160"
)

gen_ai_index = aiplatform.MatchingEngineIndex(
    index_name="projects/playpen-b57463/locations/europe-west2/indexes/6617054490002456576"
)

### Create Online Queries

Define a function that generates an embedding for an input prompt using the textembedding-gecko model and then searches the deployed index for its nearest neighbours.

In [51]:
NUM_NEIGHBOURS = 3  # Number of neighbours to return per query


def search(query):
    """Embed `query` and print its nearest neighbours from the deployed index."""
    # Send a request to the embedding model to generate the query vector
    embedding_vec = model.get_embeddings([query])[0].values

    # Find neighbours using Vector Search
    neighbours = gen_ai_index_endpoint.find_neighbors(
        deployed_index_id="gen_ai_deployed_index",
        queries=[embedding_vec],
        num_neighbors=NUM_NEIGHBOURS,
    )[0]

    # The vector id is the row position in df, so it maps back to the source text
    for nb in neighbours:
        text = df.iloc[int(nb.id)]["text"]
        print(f"id: {nb.id} | text: {text} | dist: {nb.distance}")
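
The search function prints the full text of every match, which can be long. A possible variation, sketched here under assumptions, maps neighbour IDs back to their text and truncates each snippet. `Neighbour` is a stand-in dataclass that only mimics the `id` and `distance` attributes used above, `format_neighbours` is a hypothetical helper, and the texts and distances are made-up values.

```python
from dataclasses import dataclass


@dataclass
class Neighbour:
    """Stand-in for a search result; only the id and distance attributes are assumed."""
    id: str
    distance: float


def format_neighbours(neighbours, texts, max_chars=80):
    """Map neighbour ids back to their source text and build one summary line each."""
    lines = []
    for nb in neighbours:
        text = texts[int(nb.id)]  # ids are row positions, as in the notebook
        snippet = text if len(text) <= max_chars else text[:max_chars] + "..."
        lines.append(f"id: {nb.id} | dist: {nb.distance:.4f} | text: {snippet}")
    return lines


texts = ["Activate your account", "Report a lost or stolen card", "Log on securely"]
neighbours = [Neighbour(id="1", distance=0.7557), Neighbour(id="0", distance=0.5123)]

lines = format_neighbours(neighbours, texts, max_chars=20)
for line in lines:
    print(line)
```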

### Example Queries

If you encounter an OpenSSL error, wait up to 5 minutes for the endpoint to finish deploying, then try again.

In [57]:
search("tell me about reporting a stolen card")

id: 10 | text: Report a lost or stolen Corporate CardIf your Corporate Card has been lost, stolen or misused please call us right away. This applies to your PIN or security details too.Payment cards include:Corporate MultiPay cardCorporate Charge CardCorporate Purchasing CardBusiness Travel SolutionePay VirtualePayablesBy PhoneCall us anytime on0800 096 4496If you're outside the UK call+44 1908 544 059.Was this helpful?YesNo
No
 | dist: 0.7556924819946289
id: 27 | text: Security & fraudLost or stolen cardsReport a lost or stolen Business CardReport a lost or stolen Business payment cardReport a lost or stolen Corporate CardReport a lost or stolen Corpoarate cardReport a fraudReport a fraud on our business accountsMake a complaintReport fraudulent use of online bankingStaying safeProtect your business from fraudManage the cyber threat to your business
 | dist: 0.7492184638977051
id: 21 | text: Report a lost or stolen Business CardIf your Business Card has been lost, stolen, misused or g

In [54]:
search("tell me about an important moment or event in your life")

id: 25 | text: Log on to Online for Business – Memorable InformationLog on securely using three characters from your memorable information, or with your Card Reader.The easiest way to log on to Online for Business is to tell us you’re on a trusted device.This means we’ll usually only ask you for three characters from your memorable information, as well as your password, each time you log on.Sometimes we may ask you to use your Card Reader, so it’s a good idea to always have one to hand.How to logonMemorable informationTrusting your deviceUsing your Card ReaderHow to set up memorable informationTo set up memorable information please log on to your account and click on the ‘Settings’ section in the top right hand corner of your Online for Business homepage.Then click on ‘change memorable information’ and follow the instructions.When you log on to your account, you will be asked to enter your user ID and your password.You will then be given the option to use either your memorable informat

### Clean Up Resources

In [None]:
gen_ai_index_endpoint.undeploy_all()
gen_ai_index_endpoint.delete() #index endpoint

In [None]:
gen_ai_index.delete()