## STEP 0: Create Matching Engine Index and Endpoint for Retrieval

[Embeddings](https://cloud.google.com/blog/topics/developers-practitioners/meet-ais-multitool-vector-embeddings) represent data as n-dimensional vectors in a space where the distance between points is semantically meaningful, so they can be used to find similar data points. You can get text embeddings using the Vertex AI Embeddings API. These embeddings are stored and queried using a vector database.
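
To make "semantically meaningful" concrete, here is a minimal sketch of comparing embeddings by cosine similarity. The 3-dimensional vectors are made up for illustration; real text embeddings from the Vertex AI Embeddings API have many more dimensions.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors standing in for real embeddings (hypothetical values).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

sim_close = cosine_similarity(cat, kitten)
sim_far = cosine_similarity(cat, invoice)
print(sim_close > sim_far)
```

Related items end up with a higher similarity score than unrelated ones, which is the property a vector database exploits.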

Vertex AI Matching Engine is a managed vector database on Google Cloud. It stores data as high-dimensional vectors (embeddings) and can find the most similar vectors among over a billion vectors. Matching Engine's Approximate Nearest Neighbors (ANN) service can serve similarity-matching queries at high queries per second (QPS). Unlike vector stores that run locally, Matching Engine is optimized for scale (millions to billions of vectors) and is enterprise ready.
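
For comparison, the sketch below shows the exact, brute-force search that an ANN service approximates: score the query against every stored vector and keep the top k. This is not how Matching Engine works internally. It is only the baseline whose cost grows linearly with the number of vectors, which is why approximate methods matter at scale. The corpus is random data and `exact_top_k` is a name chosen for this illustration.

```python
import numpy as np


def exact_top_k(query, vectors, k=3):
    """Return (indices, scores) of the k vectors with the largest dot product."""
    scores = np.asarray(vectors, dtype=float) @ np.asarray(query, dtype=float)
    order = np.argsort(-scores, kind="stable")[:k]  # best match first
    return order.tolist(), scores[order].tolist()


rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 768))                 # 1000 random 768-dim vectors
query = corpus[42] + 0.01 * rng.normal(size=768)      # a slightly perturbed copy of row 42

ids, scores = exact_top_k(query, corpus, k=3)
print(ids[0])
```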

As part of the environment setup, create an index on Vertex AI Matching Engine and deploy the index to an endpoint. An index endpoint can be public or private. This notebook uses a public endpoint.

## Getting Started
### Install Vertex AI SDK, other packages and their dependencies

In [6]:
# Install Vertex AI LLM SDK
! pip install --user google-cloud-aiplatform==1.27.0 langchain==0.0.201



### Utils for Matching Engine

In [7]:
!pip install github-clone
!ghclone https://github.com/GoogleCloudPlatform/generative-ai/tree/main/language/examples/document-qa/utils

Cloning into 'utils'...
done.


In [5]:
# Utils
import json
import time
import uuid
from typing import List

import numpy as np



# Vertex AI
import vertexai
from google.cloud import aiplatform
print(f"Vertex AI SDK version: {aiplatform.__version__}")



# Langchain
import langchain

print(f"LangChain version: {langchain.__version__}")

from langchain.embeddings import VertexAIEmbeddings
from langchain.llms import VertexAI


# Import custom Matching Engine packages
from utils.matching_engine import MatchingEngine
from utils.matching_engine_utils import MatchingEngineUtils



Vertex AI SDK version: 1.27.0
LangChain version: 0.0.201


## Initialize Configurations

In [6]:
PROJECT_ID = "analytics-ml-ai"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}
ME_DIMENSIONS = 768 # when using Vertex PaLM Embedding
ME_DISPLAY_NAME = "rfpbot_all_products_stage"
ME_DESCRIPTION = "rfpbot across all products stage"
ME_EMBEDDING_DIR   = "gs://rfpbot-stage-me" # @param {type:"string"}


In [7]:
# Initialize Vertex AI SDK
import vertexai
vertexai.init(project=PROJECT_ID, location=LOCATION)

#### Make a Google Cloud Storage bucket for your Matching Engine index

In [24]:
! gsutil mb -l {LOCATION} {ME_EMBEDDING_DIR}

Creating gs://rfpbot-stage-me/...


#### Create a dummy embeddings file to initialize the index

In [25]:
# Dummy all-zeros embedding with a random ID, used only to initialize the index
init_embedding = {"id": str(uuid.uuid4()), "embedding": np.zeros(ME_DIMENSIONS).tolist()}

# dump embedding to a local file
with open("embeddings_0.json", "w") as f:
    json.dump(init_embedding, f)

# write embedding to Cloud Storage
! gsutil cp embeddings_0.json {ME_EMBEDDING_DIR}/init_index/embeddings_0.json

Copying file://embeddings_0.json [Content-Type=application/json]...
/ [1 files][  3.8 KiB/  3.8 KiB]                                                
Operation completed over 1 objects/3.8 KiB.                                      
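
The dummy file above holds a single record. As a sketch of how several records could be written in the same `id`/`embedding` shape, the helper below emits one JSON object per line, assuming the index input format accepts one record per line. `write_embeddings_jsonl` and the toy 4-dimensional vectors are hypothetical and not part of the notebook's utilities.

```python
import json
import os
import tempfile
import uuid


def write_embeddings_jsonl(path, vectors, dimensions):
    """Write one JSON object per line, each with an "id" and an "embedding"."""
    ids = []
    with open(path, "w") as f:
        for vector in vectors:
            if len(vector) != dimensions:
                raise ValueError(f"expected {dimensions} values, got {len(vector)}")
            record = {"id": str(uuid.uuid4()), "embedding": [float(v) for v in vector]}
            f.write(json.dumps(record) + "\n")
            ids.append(record["id"])
    return ids


# Toy 4-dimensional vectors; the notebook itself uses ME_DIMENSIONS = 768.
demo_path = os.path.join(tempfile.mkdtemp(), "embeddings_demo.json")
demo_ids = write_embeddings_jsonl(demo_path, [[0, 0, 0, 0], [1, 2, 3, 4]], dimensions=4)

# Read the file back to confirm each line parses as its own record.
with open(demo_path) as f:
    demo_records = [json.loads(line) for line in f]
```

Checking the vector length before writing catches dimension mismatches locally, before the file is uploaded to Cloud Storage.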


### Create Index
You can create an index on Vertex AI Matching Engine for either batch updates or streaming updates.

This notebook creates a Matching Engine index:

- With streaming updates
- With the default configuration (for example, a small shard size)

You can update the index configuration in the Matching Engine utilities script.
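
As a rough sketch of what such an index configuration might contain, the function below builds a metadata dictionary for a tree-AH index. The field names follow my reading of the Vertex AI index metadata format, and the default values (neighbor count, leaf settings, distance measure) are assumptions for illustration. Check the utilities script and the official documentation for the values actually used.

```python
import json


def build_index_metadata(
    embedding_gcs_uri,
    dimensions,
    approximate_neighbors_count=150,      # assumed default, not taken from the notebook
    shard_size="SHARD_SIZE_SMALL",        # the "small shard size" mentioned above
    distance_measure="DOT_PRODUCT_DISTANCE",
):
    """Assemble a tree-AH index metadata dict (illustrative field values)."""
    return {
        "contentsDeltaUri": embedding_gcs_uri,
        "config": {
            "dimensions": dimensions,
            "approximateNeighborsCount": approximate_neighbors_count,
            "distanceMeasureType": distance_measure,
            "shardSize": shard_size,
            "algorithmConfig": {
                "treeAhConfig": {
                    "leafNodeEmbeddingCount": 500,   # assumed value
                    "leafNodesToSearchPercent": 7,   # assumed value
                }
            },
        },
    }


metadata = build_index_metadata("gs://rfpbot-stage-me/init_index", dimensions=768)
print(json.dumps(metadata, indent=2))
```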


While the index is being created and deployed, you can read more about Matching Engine's ANN service, which uses a type of vector quantization developed by Google Research and described in "Accelerating Large-Scale Inference with Anisotropic Vector Quantization".

For more information about how this works, see [Announcing ScaNN: Efficient Vector Similarity Search](https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html).
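
The core idea of anisotropic quantization can be sketched in a few lines: when a vector is replaced by a quantized approximation, error parallel to the original vector distorts inner products with well-matching queries more than orthogonal error does, so it is penalized more heavily. The code below is a simplified illustration, not the ScaNN implementation. The fixed weight `eta` is a hypothetical stand-in for the weights derived in the paper.

```python
import numpy as np


def anisotropic_loss(x, x_quantized, eta=4.0):
    """Weighted quantization error: parallel error counts eta times more."""
    x = np.asarray(x, dtype=float)
    residual = x - np.asarray(x_quantized, dtype=float)
    direction = x / np.linalg.norm(x)
    parallel = np.dot(residual, direction) * direction   # error along x
    orthogonal = residual - parallel                     # error perpendicular to x
    return float(eta * np.dot(parallel, parallel) + np.dot(orthogonal, orthogonal))


x = np.array([1.0, 0.0])
loss_parallel_error = anisotropic_loss(x, [0.9, 0.0])     # same-size error, along x
loss_orthogonal_error = anisotropic_loss(x, [1.0, 0.1])   # same-size error, across x
print(loss_parallel_error > loss_orthogonal_error)
```

Both approximations are the same Euclidean distance from `x`, yet the one that shortens `x` is scored as worse.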

In [8]:
mengine = MatchingEngineUtils(PROJECT_ID, LOCATION, ME_DISPLAY_NAME)

In [None]:
index = mengine.create_index(
    embedding_gcs_uri=f"{ME_EMBEDDING_DIR}/init_index",
    dimensions=ME_DIMENSIONS,
    index_update_method="streaming",
    index_algorithm="tree-ah",
)
if index:
    print(index.name)

INFO:root:Index rfpbot_all_products_stage does not exists. Creating index ...
INFO:root:Creating index with long running operation projects/184378960328/locations/us-central1/indexes/9057504110734999552/operations/8544991710617272320
INFO:root:Poll the operation to create index ...


...........

### Deploy Index to Endpoint
Deploy the index to an index endpoint on Matching Engine. This notebook deploys the index to a public endpoint. The deployment operation creates a public endpoint that is used to query the index for approximate nearest neighbors.

To deploy the index to a private endpoint, refer to the documentation to set up the prerequisites.

In [9]:
index_endpoint = mengine.deploy_index()
if index_endpoint:
    print(f"Index endpoint resource name: {index_endpoint.name}")
    print(f"Index endpoint public domain name: {index_endpoint.public_endpoint_domain_name}")
    print("Deployed indexes on the index endpoint:")
    for d in index_endpoint.deployed_indexes:
        print(f"    {d.id}")

INFO:root:Index endpoint rfpbot_all_products_stage-endpoint does not exists. Creating index endpoint...
INFO:root:Deploying index to endpoint with long running operation projects/184378960328/locations/us-central1/indexEndpoints/7247057060532060160/operations/7635827535841853440
INFO:root:Poll the operation to create index endpoint ...


.

INFO:root:Index endpoint rfpbot_all_products_stage-endpoint created with resource name as projects/184378960328/locations/us-central1/indexEndpoints/7247057060532060160 and endpoint domain name as 
INFO:root:Deploying index with request = {'id': 'rfpbot_all_products_stage_20230806233435', 'display_name': 'rfpbot_all_products_stage_20230806233435', 'index': 'projects/184378960328/locations/us-central1/indexes/9057504110734999552', 'dedicated_resources': {'machine_spec': {'machine_type': 'e2-standard-2'}, 'min_replica_count': 2, 'max_replica_count': 10}}
INFO:root:Poll the operation to deploy index ...


...............

INFO:root:Deployed index rfpbot_all_products_stage to endpoint rfpbot_all_products_stage-endpoint


.Index endpoint resource name: projects/184378960328/locations/us-central1/indexEndpoints/7247057060532060160
Index endpoint public domain name: 
Deployed indexes on the index endpoint:
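
The rows of dots in the output come from polling a long-running operation until it finishes. A generic version of that pattern might look like the sketch below. `poll_until_done` and its parameters are hypothetical names for illustration, not the utilities script's actual code, and `is_done` stands for whatever completion check the real operation object exposes.

```python
import time


def poll_until_done(is_done, interval_seconds=5.0, timeout_seconds=3600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call is_done() repeatedly until it returns True; return the number of checks."""
    deadline = clock() + timeout_seconds
    checks = 0
    while True:
        checks += 1
        if is_done():
            return checks
        if clock() >= deadline:
            raise TimeoutError(f"operation not done after {timeout_seconds} seconds")
        print(".", end="", flush=True)  # progress marker, like the dots in the output
        sleep(interval_seconds)


# Simulated operation that completes on the third check; no real waiting.
states = iter([False, False, True])
checks_needed = poll_until_done(lambda: next(states), sleep=lambda seconds: None)
```

Passing `sleep` and `clock` as parameters keeps the helper testable without waiting, and the timeout prevents a stuck deployment from blocking the notebook indefinitely.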


In [10]:
ME_INDEX_ID, ME_INDEX_ENDPOINT_ID = mengine.get_index_and_endpoint()
print(f"ME_INDEX_ID={ME_INDEX_ID}")
print(f"ME_INDEX_ENDPOINT_ID={ME_INDEX_ENDPOINT_ID}")


ME_INDEX_ID=projects/184378960328/locations/us-central1/indexes/9057504110734999552
ME_INDEX_ENDPOINT_ID=projects/184378960328/locations/us-central1/indexEndpoints/7247057060532060160
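
The values printed above are full resource names of the form `projects/.../locations/.../indexes/...`. If only the numeric IDs are needed later, a small helper like this one can split them out. `parse_resource_name` is a hypothetical convenience function, not part of the SDK or the utilities script.

```python
def parse_resource_name(resource_name):
    """Split 'collection/id/collection/id/...' into a {collection: id} dict."""
    parts = resource_name.split("/")
    if len(parts) % 2 != 0 or not all(parts):
        raise ValueError(f"unexpected resource name: {resource_name!r}")
    return dict(zip(parts[0::2], parts[1::2]))


index_parts = parse_resource_name(
    "projects/184378960328/locations/us-central1/indexes/9057504110734999552"
)
print(index_parts["indexes"])
```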
