# Vertex AI Matching Engine Example

This notebook demonstrates how to query a Google Vertex AI Matching Engine index. Before running a query, set your Google application credentials, upload the product JSON file, and define your query text.

In [1]:
# Install the necessary library
!pip3 install git+https://github.com/googleapis/python-aiplatform.git

Collecting git+https://github.com/googleapis/python-aiplatform.git
  Cloning https://github.com/googleapis/python-aiplatform.git to /tmp/pip-req-build-973wjqqx
  Running command git clone --filter=blob:none --quiet https://github.com/googleapis/python-aiplatform.git /tmp/pip-req-build-973wjqqx
  Resolved https://github.com/googleapis/python-aiplatform.git to commit 44587ecc6377cc23adc5fb5a792944a2e15276ed
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: google-cloud-aiplatform
  Building wheel for google-cloud-aiplatform (setup.py) ... done
  Created wheel for google-cloud-aiplatform: filename=google_cloud_aiplatform-1.72.0-py2.py3-none-any.whl size=6213190 sha256=418141f82b0b923b06cd528dfa8b80f22fdee637a392e5b296d04875a6053a6f
  Stored in directory: /tmp/pip-ephem-wheel-cache-9eq855yb/wheels/87/e7/9e/2e9cd4cce37ffd28089a017b1a3551afd8c260795589f7005a
Successfully built google-cloud-aiplatform
Installing collected packages: google

## Set Google Application Credentials

Upload your Google application credentials JSON file when prompted. The cell below sets `GOOGLE_APPLICATION_CREDENTIALS` to the uploaded file's path.

In [6]:
# Import dependencies, then upload the credentials file and set its path
import json
import os

from google.cloud import aiplatform
from google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint import (
    HybridQuery,
)
from google.colab import files
from vertexai.preview.language_models import TextEmbeddingModel

# Upload your JSON credentials file
uploaded = files.upload()

# Use the name of the first uploaded file
filename = next(iter(uploaded))

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = filename

Saving rv-hermes-nonprod-3a8067e91511.json to rv-hermes-nonprod-3a8067e91511 (1).json
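
Before pointing `GOOGLE_APPLICATION_CREDENTIALS` at an uploaded file, it can help to confirm the file is what you expect. The following is a minimal sketch, assuming the upload is a service-account key (a JSON object with `"type": "service_account"` and a `project_id` field). The helper name `check_service_account_file` is hypothetical and not part of any Google library; other credential types would need a different check.

```python
import json
import os


def check_service_account_file(path):
    """Return the project_id stored in a service-account key file.

    Hypothetical helper: raises ValueError if the file is missing, is not
    valid JSON, or does not look like a service-account key.
    """
    if not os.path.isfile(path):
        raise ValueError(f"Credentials file not found: {path}")

    with open(path, "r", encoding="utf-8") as fh:
        try:
            info = json.load(fh)
        except json.JSONDecodeError as exc:
            raise ValueError(f"{path} is not valid JSON") from exc

    # Assumption: the notebook is used with a service-account key file.
    if not isinstance(info, dict) or info.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service-account key")
    if "project_id" not in info:
        raise ValueError(f"{path} has no 'project_id' field")

    return info["project_id"]
```

In the cell above, this could be called as `check_service_account_file(filename)` right after the upload, so that a wrong file fails early rather than at the first API call.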


In [7]:

# Replace with your project ID and region
project_id = "rv-hermes-nonprod"
region = "us-central1"

# Initialize Vertex AI
aiplatform.init(project=project_id, location=region)

COMMERCE_DEPLOYED_TOKEN_INDEX_ID = "commerce_product_search_index_poc_3"

COMMERCE_ENDPOINT_ID = "3531785280044400640"

# For commerce, use text-embedding-004
model = TextEmbeddingModel.from_pretrained("text-embedding-004")


# Wrapper: return the dense embedding vector for a single text
def get_dense_embedding(text):
    return model.get_embeddings([text])[0].values


my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name=COMMERCE_ENDPOINT_ID
)

# Upload the JSON data file
uploaded_json = files.upload()
json_filename = next(iter(uploaded_json))

# Load JSON data from a file
with open(json_filename, 'r') as file:
    json_data = json.load(file)

Saving monstro_TechProduct_11_09_2024.json to monstro_TechProduct_11_09_2024.json
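
The query section below joins neighbor IDs against this file through each record's `uuid` field. The following is a minimal, standalone sketch of that join, assuming the file is a list of JSON objects that each carry a `uuid`. The helper names and the `FakeNeighbor` stand-in are hypothetical; real neighbors come from `find_neighbors` and expose `id` and `distance`. Unlike the notebook's dict comprehension, this version rejects duplicate `uuid` values and reports IDs that have no matching record instead of dropping them silently, which is a design choice rather than a requirement.

```python
from collections import namedtuple


def build_uuid_lookup(records):
    """Map uuid -> record, raising on records without a uuid or duplicates."""
    lookup = {}
    for position, record in enumerate(records):
        if not isinstance(record, dict) or "uuid" not in record:
            raise ValueError(f"record {position} has no 'uuid' field")
        key = record["uuid"]
        if key in lookup:
            raise ValueError(f"duplicate uuid {key!r} at record {position}")
        lookup[key] = record
    return lookup


def attach_distances(neighbors, lookup):
    """Pair each neighbor's record with its distance.

    Returns (matched, missing): matched is a list of (record, distance)
    tuples in neighbor order; missing lists neighbor ids not in the lookup.
    """
    matched, missing = [], []
    for neighbor in neighbors:
        record = lookup.get(neighbor.id)
        if record is None:
            missing.append(neighbor.id)
        else:
            matched.append((record, neighbor.distance))
    return matched, missing


# Stand-in for the neighbor objects returned by find_neighbors (illustrative data only)
FakeNeighbor = namedtuple("FakeNeighbor", ["id", "distance"])

sample_records = [{"uuid": "a1", "name": "phone"}, {"uuid": "b2", "name": "tablet"}]
sample_neighbors = [FakeNeighbor("b2", 0.91), FakeNeighbor("zz", 0.40)]

sample_lookup = build_uuid_lookup(sample_records)
matched, missing = attach_distances(sample_neighbors, sample_lookup)
print(matched)
print(missing)
```

With the notebook's data, `build_uuid_lookup(json_data)` would take the place of the dict comprehension, and a non-empty `missing` list would indicate that the index and the uploaded file are out of sync.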


## Query the Matching Engine

Write your query text below and run the cell to see the matching documents and their distances.

In [8]:
# Write your query text here
query_text = "small mobile phones"

query_emb = get_dense_embedding(query_text)
query = HybridQuery(
    dense_embedding=query_emb,
)

# Query the deployed index for the 5 nearest neighbors
response = my_index_endpoint.find_neighbors(
    deployed_index_id=COMMERCE_DEPLOYED_TOKEN_INDEX_ID,
    queries=[query],
    num_neighbors=5,
)

# find_neighbors returns one list of neighbors per query; take the list for the single query
match_neighbors = response[0]

# Create a dictionary for quick lookup of JSON objects by uuid
json_dict = {item['uuid']: item for item in json_data}

# Prepare the output list
output = []

# Match ids with the JSON objects and append the distance
for neighbor in match_neighbors:
    uuid = neighbor.id
    distance = neighbor.distance
    if uuid in json_dict:
        output.append((json_dict[uuid], distance))

# Output the result
for json_object, distance in output:
    print(json_object, distance)

{'_id': {'$oid': '66173b0837bd9527a1588ca6'}, 'attributes': [{'val': ['https://pisces.bbystatic.com/prescaled/500/500/image2/BestBuy_US/images/products/6578/6578332_sd.jpg'], 'valSlug': ['https-pisces-bbystatic-com'], 'name': 'Image File Name/Location', 'id': 100019, 'slug': 'image-file-name-location'}, {'val': ['https://bestbuy.7tiv.net/c/159047/633495/10014?prodsku=6578332&u=https%3A%2F%2Fapi.bestbuy.com%2Fclick%2F-%2F6578332%2Fpdp&intsrc=CATF_4831'], 'valSlug': ['https-bestbuy-7tiv-net'], 'name': 'LinkModel', 'id': 1000044, 'slug': 'linkmodel'}, {'val': ['616960446439'], 'valSlug': ['0616960446439'], 'name': 'UPC', 'id': 100005, 'slug': 'upc'}, {'val': ['TFALT602DCV2PAP5'], 'valSlug': ['tfalt602dcv2pap5'], 'name': 'Manufacturer SKU', 'id': 100007, 'slug': 'manufacturersku'}, {'val': ['6578332'], 'valSlug': ['6578332'], 'name': 'Provider SKU', 'id': 1105526, 'slug': 'asin'}, {'val': ['Tracfone - TCL 30 Z 32GB Prepaid with 1 Year of Service Bundle - Black'], 'valSlug': ['tracfone-tcl-