# Google Vertex AI Vector Search

This notebook shows how to use functionality related to the `Google Cloud Vertex AI Vector Search` vector database.

> [Google Vertex AI Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview), formerly known as Vertex AI Matching Engine, provides the industry's leading high-scale, low-latency vector database. These vector databases are commonly referred to as vector similarity-matching or approximate nearest neighbor (ANN) services.

**Note**: LlamaIndex expects the Vertex AI Vector Search endpoint and the deployed index to already exist. Creating an empty index can take up to a minute, and deploying an index to the endpoint can take up to 30 minutes.

> To see how to create an index, refer to the section [Create Index and deploy it to an Endpoint](#create-index-and-deploy-it-to-an-endpoint).  
If you already have an index deployed, skip to [Create Vector Store from texts](#create-vector-store-from-texts).

## Installation

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [38]:
! pip install llama-index llama-index-vector-stores-vertexaivectorsearch llama-index-llms-vertex llama-index-embeddings-vertex llama-index-embeddings-huggingface






In [39]:
%pip install google-cloud-aiplatform

Note: you may need to restart the kernel to use updated packages.





## Create Index and deploy it to an Endpoint

- This section demonstrates creating a new index and deploying it to an endpoint.

In [1]:
# TODO : Set values as per your requirements

# Project and Storage Constants
PROJECT_ID = "gen-lang-client-0974620078"
REGION = "asia-southeast1"
GCS_BUCKET_NAME = "image-retrieval"
GCS_BUCKET_URI = f"gs://{GCS_BUCKET_NAME}"

# textembedding-gecko@003 produces 768-dimensional embeddings.
# If a different embedding model is used, this value may need to change.
VS_DIMENSIONS = 768

# Vertex AI Vector Search Index configuration
# parameter description here
# https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndex#google_cloud_aiplatform_MatchingEngineIndex_create_tree_ah_index
VS_INDEX_NAME = "image-retrieval-index"  # @param {type:"string"}
VS_INDEX_ENDPOINT_NAME = "image-retrieval-endpoint"  # @param {type:"string"}
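
Because the index is created with a fixed number of dimensions, it can help to check vector lengths before upserting anything. The following is a minimal sketch; `check_dimensions` is a hypothetical helper, not part of LlamaIndex or the Vertex AI SDK.

```python
from typing import Sequence

VS_DIMENSIONS = 768  # same value as in the configuration cell


def check_dimensions(
    embedding: Sequence[float], expected: int = VS_DIMENSIONS
) -> None:
    """Raise ValueError if an embedding does not have the expected length."""
    if len(embedding) != expected:
        raise ValueError(
            f"Embedding has {len(embedding)} dimensions, expected {expected}"
        )


# A vector of the right length passes silently.
check_dimensions([0.0] * 768)
```

Calling this on each embedding before `vector_store.add(...)` is one way to surface a mismatch locally instead of as an API error.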

In [2]:
from google.cloud import aiplatform

In [3]:
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "C:/Users/mt200/OneDrive/Desktop/AI/AI_challenge/software/back-end/service-account.json"
)

In [4]:
aiplatform.init(project=PROJECT_ID, location=REGION, credentials=credentials)

### Create Cloud Storage bucket
```sh
! gsutil mb -l $REGION -p $PROJECT_ID $GCS_BUCKET_URI
```

In [44]:
! gsutil mb -l $REGION -p $PROJECT_ID $GCS_BUCKET_URI

Creating gs://image-retrieval/...
ServiceException: 409 A Cloud Storage bucket named 'image-retrieval' already exists. Try another name. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization.


### Create an empty Index

**Note:** When creating an index, specify an `index_update_method`: either `BATCH_UPDATE` or `STREAM_UPDATE`.

> A batch index is for when you want to update your index in a batch, with data which has been stored over a set amount of time, like systems which are processed weekly or monthly.
>
> A streaming index is when you want index data to be updated as new data is added to your datastore, for instance, if you have a bookstore and want to show new inventory online as soon as possible.
>
> Which type you choose is important, since setup and requirements are different.

Refer to the [official documentation](https://cloud.google.com/vertex-ai/docs/vector-search/create-manage-index) and the [API reference](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndex#google_cloud_aiplatform_MatchingEngineIndex_create_tree_ah_index) for more details on configuring indexes.

In [45]:
# NOTE: This operation can take up to 30 seconds

# check if index exists
index_names = [
    index.resource_name
    for index in aiplatform.MatchingEngineIndex.list(
        filter=f"display_name={VS_INDEX_NAME}"
    )
]

if len(index_names) == 0:
    print(f"Creating Vector Search index {VS_INDEX_NAME} ...")
    vs_index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=VS_INDEX_NAME,
        dimensions=VS_DIMENSIONS,
        distance_measure_type="DOT_PRODUCT_DISTANCE",
        shard_size="SHARD_SIZE_SMALL",
        index_update_method="STREAM_UPDATE",  
        approximate_neighbors_count=200,
    )
    print(
        f"Vector Search index {vs_index.display_name} created with resource name {vs_index.resource_name}"
    )
else:
    vs_index = aiplatform.MatchingEngineIndex(index_name=index_names[0])
    print(
        f"Vector Search index {vs_index.display_name} exists with resource name {vs_index.resource_name}"
    )

Creating Vector Search index image-retrieval-index ...
Creating MatchingEngineIndex
Create MatchingEngineIndex backing LRO: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552/operations/6300847590237798400
MatchingEngineIndex created. Resource name: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
To use this MatchingEngineIndex in another session:
index = aiplatform.MatchingEngineIndex('projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552')
Vector Search index image-retrieval-index created with resource name projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552


### Create an Endpoint

To use the index, you need to create an index endpoint. It works as a server instance accepting query requests for your index. An endpoint can be a [public endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public) or a [private endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-vpc).

Let's create a public endpoint.

In [46]:
endpoint_names = [
    endpoint.resource_name
    for endpoint in aiplatform.MatchingEngineIndexEndpoint.list(
        filter=f"display_name={VS_INDEX_ENDPOINT_NAME}"
    )
]

if len(endpoint_names) == 0:
    print(
        f"Creating Vector Search index endpoint {VS_INDEX_ENDPOINT_NAME} ..."
    )
    vs_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=VS_INDEX_ENDPOINT_NAME, public_endpoint_enabled=True
    )
    print(
        f"Vector Search index endpoint {vs_endpoint.display_name} created with resource name {vs_endpoint.resource_name}"
    )
else:
    vs_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=endpoint_names[0]
    )
    print(
        f"Vector Search index endpoint {vs_endpoint.display_name} exists with resource name {vs_endpoint.resource_name}"
    )

Vector Search index endpoint image-retrieval-endpoint exists with resource name projects/284454080854/locations/asia-southeast1/indexEndpoints/5917149368225366016


### Deploy Index to the Endpoint

With the index endpoint, deploy the index by specifying a unique deployed index ID.

**NOTE: This operation can take up to 30 minutes.**

In [49]:
deployed_index_id = "image_retrieval_deploy_index"
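
The deployed index ID has to follow a naming rule. As a sketch, the check below assumes the rule is: starts with a letter, contains only letters, digits, and underscores, and is at most 128 characters long. `is_valid_deployed_index_id` is a hypothetical helper; confirm the exact rule in the Vertex AI documentation.

```python
import re

# Assumed rule: a letter first, then letters, digits, or underscores,
# for a total length of at most 128 characters.
_DEPLOYED_INDEX_ID_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,127}")


def is_valid_deployed_index_id(candidate: str) -> bool:
    """Return True if the candidate matches the assumed ID pattern."""
    return _DEPLOYED_INDEX_ID_RE.fullmatch(candidate) is not None


print(is_valid_deployed_index_id("image_retrieval_deploy_index"))  # True
print(is_valid_deployed_index_id("image-retrieval-deploy-index"))  # False
```

Under this assumption, the hyphenated style used for the index and endpoint display names is not valid for a deployed index ID.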

In [None]:
# check whether the index is already deployed to an endpoint
index_endpoints = [
    (deployed_index.index_endpoint, deployed_index.deployed_index_id)
    for deployed_index in vs_index.deployed_indexes
]

if len(index_endpoints) == 0:
    print(
        f"Deploying Vector Search index {vs_index.display_name} at endpoint {vs_endpoint.display_name} ..."
    )
    vs_deployed_index = vs_endpoint.deploy_index(
        index=vs_index,
        deployed_index_id=deployed_index_id,  # ✅ valid ID
        display_name=VS_INDEX_NAME,
        machine_type="e2-standard-16",
        min_replica_count=1,
        max_replica_count=1,
    )
    print(
        f"Vector Search index {vs_index.display_name} is deployed at endpoint {vs_deployed_index.display_name}"
    )
else:
    vs_deployed_index = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoints[0][0]
    )
    print(
        f"Vector Search index {vs_index.display_name} is already deployed at endpoint {vs_deployed_index.display_name}"
    )

Deploying Vector Search index image-retrieval-index at endpoint image-retrieval-endpoint ...
Deploying index MatchingEngineIndexEndpoint index_endpoint: projects/284454080854/locations/asia-southeast1/indexEndpoints/5917149368225366016
Deploy index MatchingEngineIndexEndpoint index_endpoint backing LRO: projects/284454080854/locations/asia-southeast1/indexEndpoints/5917149368225366016/operations/8559402803364102144


## Create Vector Store from texts

NOTE: If you already have a Vertex AI Vector Search index and endpoint, you can load them with the following code:

In [5]:
aiplatform.init(project=PROJECT_ID, location=REGION, credentials=credentials)

In [6]:
# Get index by display name
indexes = aiplatform.MatchingEngineIndex.list(filter='display_name="image-retrieval-index"')
if not indexes:
    raise ValueError("Index with display_name='image-retrieval-index' not found.")

# Use resource_name, not display_name
vs_index = aiplatform.MatchingEngineIndex(index_name=indexes[0].resource_name)
print(f"✅ Index loaded: {vs_index.resource_name}")

# Same for endpoint
endpoints = aiplatform.MatchingEngineIndexEndpoint.list(filter='display_name="image-retrieval-endpoint"')
if not endpoints:
    raise ValueError("Endpoint with display_name='image-retrieval-endpoint' not found.")

vs_endpoint = aiplatform.MatchingEngineIndexEndpoint(index_endpoint_name=endpoints[0].resource_name)
print(f"✅ Endpoint loaded: {vs_endpoint.resource_name}")

✅ Index loaded: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
✅ Endpoint loaded: projects/284454080854/locations/asia-southeast1/indexEndpoints/5917149368225366016
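
The resource names printed above follow a `collection/id` path layout, unlike display names, which are free-form labels. As an illustration only, a small hypothetical helper (`parse_resource_name`, not an SDK function) can split such a name into its parts:

```python
from typing import Dict


def parse_resource_name(resource_name: str) -> Dict[str, str]:
    """Split 'projects/<p>/locations/<l>/indexes/<i>' into a dict of parts."""
    parts = resource_name.split("/")
    if len(parts) % 2 != 0:
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    # Even positions are collection names, odd positions are their IDs.
    return dict(zip(parts[0::2], parts[1::2]))


parsed = parse_resource_name(
    "projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552"
)
print(parsed["locations"])  # asia-southeast1
print(parsed["indexes"])  # 5015699367030423552
```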


### Create a simple vector store from plain text without metadata filters

In [7]:
# import modules needed
from llama_index.core import (
    StorageContext,
    Settings,
    VectorStoreIndex,
    SimpleDirectoryReader,
)
from llama_index.core.schema import TextNode, ImageNode
from llama_index.core.vector_stores.types import (
    MetadataFilters,
    MetadataFilter,
    FilterOperator,
)
from llama_index.llms.vertex import Vertex
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.vector_stores.vertexaivectorsearch import VertexAIVectorStore

### Create Image Node

In [9]:
import os
import json
from typing import List

# Create all nodes
base_dir = "C:/Users/mt200/OneDrive/Desktop/AI/AI_challenge/feature_extraction/embedding-image"

image_nodes: List[ImageNode] = []

# Iterate over all files in the directory
for filename in os.listdir(base_dir):
    if filename.endswith(".json"):
        file_path = os.path.join(base_dir, filename)
        with open(file_path, "r", encoding="utf-8") as f:
            try:
                data = json.load(f)
                # If the file contains an array of items
                if isinstance(data, list):
                    for item in data:
                        image_node = ImageNode(
                            id_=item["id"],
                            embedding=item["embedding"],
                            metadata=item["metadata"],
                            image_url=item["metadata"]["image_url"]
                        )
                        image_nodes.append(image_node)
                # If the file contains a single object
                elif isinstance(data, dict):
                    image_node = ImageNode(
                        id_=data["id"],
                        embedding=data["embedding"],
                        metadata=data["metadata"],
                        image_url=data["metadata"]["image_url"]
                    )
                    image_nodes.append(image_node)
            except json.JSONDecodeError as e:
                print(f"❌ Error reading file {filename}: {e}")

print(f"✅ Total ImageNodes: {len(image_nodes)}")


✅ Total ImageNodes: 168096
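
The loader above handles two file shapes: a JSON array of records or a single record. A minimal sketch of one way to fold both cases into a single code path is shown here; `iter_records` and `REQUIRED_KEYS` are hypothetical names, and skipping incomplete records is an assumption about the desired behavior.

```python
import json
from typing import Any, Dict, Iterator

# Keys the loader reads from each record.
REQUIRED_KEYS = ("id", "embedding", "metadata")


def iter_records(data: Any) -> Iterator[Dict[str, Any]]:
    """Yield complete records from parsed JSON (one object or a list of them)."""
    items = data if isinstance(data, list) else [data]
    for item in items:
        # Skip anything that is not a dict or lacks a required key.
        if isinstance(item, dict) and all(key in item for key in REQUIRED_KEYS):
            yield item


sample = json.loads(
    '[{"id": "a", "embedding": [0.1], "metadata": {"image_url": "u"}},'
    ' {"id": "b"}]'
)
print([record["id"] for record in iter_records(sample)])  # ['a']
```

Each yielded record could then be passed to `ImageNode(...)` exactly as in the loop above.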


In [11]:
image_node

ImageNode(id_='L30_V096_F084', embedding=[0.025079943239688873, -0.04452181234955788, -0.01910569705069065, 0.02360299415886402, -0.01714656874537468, 0.022217782214283943, -0.01714448817074299, 0.01847955770790577, -0.025741299614310265, -0.02782304212450981, -0.002923137042671442, -0.06686684489250183, -0.014682747423648834, -0.03318243473768234, -0.00877455249428749, -0.011630440130829811, 0.014482776634395123, -0.002502479823306203, 0.0028956662863492966, -0.042535025626420975, 0.04922705888748169, -0.00548683712258935, -0.02484024688601494, 0.004097541328519583, 0.0023835054598748684, -0.006029616575688124, 0.013425232842564583, -0.0277257040143013, -0.01399318128824234, -0.006962329614907503, -0.0037959017790853977, -0.006188271567225456, -0.009430651552975178, 0.023167526349425316, 0.006705515086650848, -0.028247574344277382, -0.021581493318080902, -0.007011040113866329, 0.008553418330848217, -0.005276231560856104, 0.0035844400990754366, -0.026239732280373573, 0.0256249029189348

### Add ImageNodes to Vertex AI Vector Search

In [12]:
# setup storage
vector_store = VertexAIVectorStore(
    project_id=PROJECT_ID,
    region=REGION,
    index_id=vs_index.resource_name,
    endpoint_id=vs_endpoint.resource_name,
    gcs_bucket_name=GCS_BUCKET_NAME,
    credentials_path="C:/Users/mt200/OneDrive/Desktop/AI/AI_challenge/software/back-end/service-account.json"
)

# set storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [13]:
vector_store.add([image_node])

Upserting datapoints MatchingEngineIndex index: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552




MatchingEngineIndex index Upserted datapoints. Resource name: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552


['L30_V096_F084']

In [None]:
import time

BATCH_SIZE = 500  # reduce the batch size if needed
DELAY_SEC = 60    # delay between batches (depends on quota)

for i in range(122000, len(image_nodes), BATCH_SIZE):
    batch = image_nodes[i:i+BATCH_SIZE]
    vector_store.add(batch)
    print(f"Added batch {i} -> {i+len(batch)}")
    time.sleep(DELAY_SEC)

Upserting datapoints MatchingEngineIndex index: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
MatchingEngineIndex index Upserted datapoints. Resource name: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
Added batch 122000 -> 122500
Upserting datapoints MatchingEngineIndex index: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
MatchingEngineIndex index Upserted datapoints. Resource name: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
Added batch 122500 -> 123000
Upserting datapoints MatchingEngineIndex index: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
MatchingEngineIndex index Upserted datapoints. Resource name: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
Added batch 123000 -> 123500
Upserting datapoints MatchingEngineIndex index: projects/284454080854/locations/asia-southeast1/indexes/5015699367030423552
Mat
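
The upload loop above starts from a hard-coded offset and sleeps between batches. A generic version of that pattern might look like the sketch below; `upload_in_batches` is a hypothetical helper, and `add_batch` stands in for `vector_store.add`.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def upload_in_batches(
    items: Sequence[T],
    add_batch: Callable[[Sequence[T]], object],
    batch_size: int = 500,
    start: int = 0,
    delay_sec: float = 0.0,
) -> int:
    """Send items[start:] to add_batch in fixed-size slices.

    Returns the offset just past the last item that was sent.
    """
    next_offset = start
    for offset in range(start, len(items), batch_size):
        batch = items[offset : offset + batch_size]
        add_batch(batch)
        next_offset = offset + len(batch)
        # Pause between batches, but not after the final one.
        if delay_sec and next_offset < len(items):
            time.sleep(delay_sec)
    return next_offset


# Stand-in for vector_store.add: just record what was sent.
sent: List[Sequence[int]] = []
done = upload_in_batches(list(range(7)), sent.append, batch_size=3, start=1)
print(done, sent)  # 7 [[1, 2, 3], [4, 5, 6]]
```

The `start` parameter plays the role of the hard-coded `122000` in the loop above, so an interrupted upload can be resumed from the last printed offset.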

## Indexing Images from Vertex AI Vector Search

In [None]:
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModel, AutoTokenizer
import requests
import numpy as np

# 1. Load the SigLIP model (vision + text encoder)
# Larger alternative: https://huggingface.co/google/siglip-so400m-patch14-384
# model_id = "google/siglip-so400m-patch14-384"
# Model used here: https://huggingface.co/google/siglip-base-patch16-224
model_id = "google/siglip-base-patch16-224"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Named `model` because `embedding_text` below calls `model.get_text_features`
model = AutoModel.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [None]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Initialize the embedding model
embed_model = HuggingFaceEmbedding(
    model_name="google/siglip-base-patch16-224",
    device="cpu",  
)

In [None]:
# define index from vector store
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)

### Search

In [None]:
# simple similarity search without filter
retriever = index.as_retriever(similarity_top_k=10)
response = retriever.retrieve("pants")

for row in response:
    print(f"Text: {row.get_text()}")
    print(f"   Score: {row.get_score():.3f}")
    print(f"   Metadata: {row.metadata}")

In [None]:
# similarity search with text filter
filters = MetadataFilters(filters=[MetadataFilter(key="color", value="blue")])
retriever = index.as_retriever(filters=filters, similarity_top_k=100)
response = retriever.retrieve("denims")

for row in response:
    print(f"Text: {row.get_text()}")
    print(f"   Score: {row.get_score():.3f}")
    print(f"   Metadata: {row.metadata}")

In [None]:
# similarity search with text and numeric filter
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="color", value="blue"),
        MetadataFilter(key="price", operator=FilterOperator.GT, value=70.0),
    ]
)
retriever = index.as_retriever(filters=filters, similarity_top_k=3)
response = retriever.retrieve("denims")

for row in response:
    print(f"Text: {row.get_text()}")
    print(f"   Score: {row.get_score():.3f}")
    print(f"   Metadata: {row.metadata}")
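
When several `MetadataFilter` entries are combined as above, they are AND-ed together by default. The plain-Python sketch below only illustrates that semantics on in-memory dictionaries; it is not how Vector Search evaluates filters, and `matches` and `OPERATORS` are hypothetical names.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

# Comparison functions keyed by a symbolic operator.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def matches(
    metadata: Dict[str, Any], conditions: List[Tuple[str, str, Any]]
) -> bool:
    """Return True only if every (key, op, value) condition holds."""
    return all(
        key in metadata and OPERATORS[op](metadata[key], value)
        for key, op, value in conditions
    )


items = [
    {"color": "blue", "price": 85.0},
    {"color": "blue", "price": 50.0},
    {"color": "red", "price": 90.0},
]
conditions = [("color", "==", "blue"), ("price", ">", 70.0)]
print([item for item in items if matches(item, conditions)])
# [{'color': 'blue', 'price': 85.0}]
```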

In [None]:
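query_text = "four women"


def embedding_text(texts):
    """Embed text with the SigLIP text encoder and L2-normalize the result."""
    with torch.no_grad():
        # SigLIP was trained with padding="max_length" (see its model card)
        inputs = processor(text=texts, return_tensors="pt", padding="max_length")
        embeds = model.get_text_features(**inputs)
        embeds = embeds / embeds.norm(dim=-1, keepdim=True)
    return embeds.cpu().numpy().tolist()


# List with one embedding per input text
embed_text = embedding_text(query_text)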
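
The embeddings are divided by their L2 norm, and the index was created with `DOT_PRODUCT_DISTANCE`. For unit-length vectors the dot product equals cosine similarity, which the following standard-library sketch checks on a toy pair of vectors (the helper names are illustrative only):

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("Cannot normalize a zero vector")
    return [x / norm for x in vector]


def dot(a: List[float], b: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed from the raw vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
print(round(dot(l2_normalize(a), l2_normalize(b)), 6))  # 0.96
print(round(cosine(a, b), 6))  # 0.96
```

This equivalence only holds if the stored image embeddings were normalized in the same way as the query embedding.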

In [None]:
# Test query against the deployed index through the endpoint
response = vs_endpoint.find_neighbors(
    deployed_index_id=deployed_index_id,
    queries=embed_text,
    num_neighbors=20,
)

In [None]:
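# Map node IDs to the ImageNodes loaded earlier so each neighbor can be resolved
nodes_by_id = {node.id_: node for node in image_nodes}

for idx, neighbor in enumerate(response[0]):
    # Neighbor IDs are the string IDs that were upserted (for example 'L30_V096_F084')
    similar = nodes_by_id.get(neighbor.id)
    if similar is None:
        print(f"{neighbor.id}: not found in image_nodes")
        continue
    print(similar.id_)
    print(similar.metadata)
    print(f"{neighbor.distance:.4f}")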