![tracker](https://us-central1-vertex-ai-mlops-369716.cloudfunctions.net/pixel-tracking?path=statmike%2Fvertex-ai-mlops%2FApplied+GenAI%2FRetrieval&file=Retrieval+-+Vertex+AI+Vector+Search.ipynb)
<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Retrieval/Retrieval%20-%20Vertex%20AI%20Vector%20Search.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2FApplied%2520GenAI%2FRetrieval%2FRetrieval%2520-%2520Vertex%2520AI%2520Vector%2520Search.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Retrieval/Retrieval%20-%20Vertex%20AI%20Vector%20Search.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/Applied%20GenAI/Retrieval/Retrieval%20-%20Vertex%20AI%20Vector%20Search.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

# Retrieval - Vertex AI Vector Search

In prior workflows, a series of documents was [processed into chunks](../Chunking/readme.md), and for each chunk, [embeddings](../Embeddings/readme.md) were created:

- Process: [Large Document Processing - Document AI Layout Parser](../Chunking/Large%20Document%20Processing%20-%20Document%20AI%20Layout%20Parser.ipynb)
- Embed: [Vertex AI Text Embeddings API](../Embeddings/Vertex%20AI%20Text%20Embeddings%20API.ipynb)

Retrieving chunks for a query involves calculating the embedding for the query and then using similarity metrics to find relevant chunks. A thorough review of similarity matching can be found in [The Math of Similarity](../Embeddings/The%20Math%20of%20Similarity.ipynb) - use dot product! As development moves from experiment to application, the process of storing and computing similarity is migrated to a [retrieval](./readme.md) system. This workflow is part of a [series of workflows exploring many retrieval systems](./readme.md).

**Vertex AI Vector Search For Storage, Indexing, And Search**

[Vertex AI Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview) is a vector similarity search solution built for scale, offering enhanced features such as:

- Support for sparse embeddings, including those used for keyword searches.
- Hybrid search capabilities that combine dense and sparse embeddings.
- Batch and streaming indexing options to match update latency requirements.
- Inclusion of vector attributes (metadata) that can be used for filtering searches (allowlisting and denylisting).
- Crowding attributes to limit the number of responses within groups.

**Use Case Data**

Buying a home usually involves borrowing money from a lending institution, typically through a mortgage secured by the home's value. But how do these institutions manage the risks associated with such large loans, and how are lending standards established?

In the United States, two government-sponsored enterprises (GSEs) play a vital role in the housing market:

- Federal National Mortgage Association ([Fannie Mae](https://www.fanniemae.com/))
- Federal Home Loan Mortgage Corporation ([Freddie Mac](https://www.freddiemac.com/))

These GSEs purchase mortgages from lenders, enabling those lenders to offer more loans. This process also allows Fannie Mae and Freddie Mac to set standards for mortgages, ensuring they are responsible and borrowers are more likely to repay them. This system makes homeownership more affordable and stabilizes the housing market by maintaining a steady flow of liquidity for lenders and keeping interest rates controlled.

However, navigating the complexities of these GSEs and their extensive servicing guides can be challenging.

**Approaches**

[This series](../readme.md) covers many generative AI workflows. These documents are used directly as long context for Gemini in the workflow [Long Context Retrieval With The Vertex AI Gemini API](../Generate/Long%20Context%20Retrieval%20With%20The%20Vertex%20AI%20Gemini%20API.ipynb). The workflow below uses a [retrieval](./readme.md) approach with the already generated chunks and embeddings.

---
## Colab Setup

When running this notebook in [Colab](https://colab.google/) or [Colab Enterprise](https://cloud.google.com/colab/docs/introduction), this section will authenticate to GCP (follow prompts in the popup) and set the current project for the session.

In [1]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [2]:
try:
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
except Exception:
    pass

---
## Installs and API Enablement

The clients packages may need installing in this environment. 

### Installs (If Needed)

In [3]:
# tuples of (import name, install name, min_version)
packages = [
    ('google.cloud.aiplatform', 'google-cloud-aiplatform', '1.69.0'),
    ('google.cloud.storage', 'google-cloud-storage')
]

import importlib
install = False
for package in packages:
    if not importlib.util.find_spec(package[0]):
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user
    elif len(package) == 3:
        if importlib.metadata.version(package[0]) < package[2]:
            print(f'updating package {package[1]}')
            install = True
            !pip install {package[1]} -U -q --user

### API Enablement

In [4]:
!gcloud services enable aiplatform.googleapis.com

### Restart Kernel (If Installs Occured)

After a kernel restart the code submission can start with the next cell after this one.

In [5]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    IPython.display.display(IPython.display.Markdown("""<div class=\"alert alert-block alert-warning\">
        <b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. The previous cells do not need to be run again⚠️</b>
        </div>"""))

---
## Setup

Inputs

In [6]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [7]:
REGION = 'us-central1'
SERIES = 'applied-genai'
EXPERIMENT = 'retrieval-vertex-vector-search'

# GCS storage bucket name
GCS_BUCKET = PROJECT_ID

# Vertex AI Vector Search Names
VS_INDEX_NAME = f"{SERIES}-{EXPERIMENT}"
VS_ENDPOINT_NAME = PROJECT_ID

Packages

In [8]:
import os, json, time, glob

import numpy as np

# Vertex AI
from google.cloud import aiplatform
import vertexai.language_models # for embeddings API
import vertexai.generative_models # for Gemini Models

# gcs client
from google.cloud import storage

In [9]:
aiplatform.__version__

'1.69.0'

Clients

In [10]:
# vertex ai clients
vertexai.init(project = PROJECT_ID, location = REGION)

# gcs client
gcs = storage.Client(project = PROJECT_ID)
bucket = gcs.bucket(GCS_BUCKET)

---
## Text & Embeddings For Examples

This repository contains a [section for document processing (chunking)](../Chunking/readme.md) that includes an example of processing mulitple large pdfs (over 1000 pages) into chunks: [Large Document Processing - Document AI Layout Parser](../Chunking/Large%20Document%20Processing%20-%20Document%20AI%20Layout%20Parser.ipynb).  The chunks of text from that workflow are stored with this repository and loaded by another companion workflow that augments the chunks with text embeddings: [Vertex AI Text Embeddings API](../Embeddings/Vertex%20AI%20Text%20Embeddings%20API.ipynb).

The following code will load the version of the chunks that includes text embeddings and prepare it for a local example of retrival augmented generation.

### Get The Documents

If you are working from a clone of this notebooks [repository](https://github.com/statmike/vertex-ai-mlops) then the documents are already present. The following cell checks for the documents folder and if it is missing gets it (`git clone`):

In [11]:
local_dir = '../Embeddings/files/embeddings-api'

In [12]:
if not os.path.exists(local_dir):
    print('Retrieving documents...')
    parent_dir = os.path.dirname(local_dir)
    temp_dir = os.path.join(parent_dir, 'temp')
    if not os.path.exists(temp_dir):
        os.makedirs(temp_dir)
    !git clone https://www.github.com/statmike/vertex-ai-mlops {temp_dir}/vertex-ai-mlops
    shutil.copytree(f'{temp_dir}/vertex-ai-mlops/Applied GenAI/Embeddings/files/embeddings-api', local_dir)
    shutil.rmtree(temp_dir)
    print(f'Documents are now in folder `{local_dir}`')
else:
    print(f'Documents Found in folder `{local_dir}`')             

Documents Found in folder `../Embeddings/files/embeddings-api`


### Load The Chunks

In [13]:
jsonl_files = glob.glob(f"{local_dir}/large-files*.jsonl")
jsonl_files.sort()
jsonl_files

['../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0000.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0001.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0002.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0003.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0004.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0005.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0006.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0007.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0008.jsonl',
 '../Embeddings/files/embeddings-api/large-files-chunk-embeddings-0009.jsonl']

In [14]:
chunks = []
for file in jsonl_files:
    with open(file, 'r') as f:
        chunks.extend([json.loads(line) for line in f])
len(chunks)

9040

### Review A Chunk

In [15]:
chunks[0].keys()

dict_keys(['instance', 'predictions', 'status'])

In [16]:
chunks[0]['instance']['chunk_id']

'fannie_part_0_c17'

In [17]:
print(chunks[0]['instance']['content'])

# Selling Guide Fannie Mae Single Family

## Fannie Mae Copyright Notice

### Fannie Mae Copyright Notice

|-|
| Section B3-4.2, Verification of Depository Assets 402 |
| B3-4.2-01, Verification of Deposits and Assets (05/04/2022) 403 |
| B3-4.2-02, Depository Accounts (12/14/2022) 405 |
| B3-4.2-03, Individual Development Accounts (02/06/2019) 408 |
| B3-4.2-04, Pooled Savings (Community Savings Funds) (04/01/2009) 411 |
| B3-4.2-05, Foreign Assets (05/04/2022) 411 |
| Section B3-4.3, Verification of Non-Depository Assets 412 |
| B3-4.3-01, Stocks, Stock Options, Bonds, and Mutual Funds (06/30/2015) 412 |
| B3-4.3-02, Trust Accounts (04/01/2009) 413 |
| B3-4.3-03, Retirement Accounts (06/30/2015) 414 |
| B3-4.3-04, Personal Gifts (09/06/2023) 415 |
| B3-4.3-05, Gifts of Equity (10/07/2020) 418 |
| B3-4.3-06, Grants and Lender Contributions (12/14/2022) 419 |
| B3-4.3-07, Disaster Relief Grants or Loans (04/01/2009) 423 |
| B3-4.3-08, Employer Assistance (09/29/2015) 423 |
| B3-4.3-09,

In [18]:
chunks[0]['predictions'][0]['embeddings']['values'][0:10]

[0.031277116388082504,
 0.03056905046105385,
 0.010865348391234875,
 0.0623614676296711,
 0.03228681534528732,
 0.05066155269742012,
 0.046544693410396576,
 0.05509665608406067,
 -0.014074751175940037,
 0.008380400016903877]

### Prepare Chunk Structure

Make a list of dictionaries with information for each chunk:

In [19]:
content_chunks = [
    dict(
        gse = chunk['instance']['gse'],
        chunk_id = chunk['instance']['chunk_id'],
        content = chunk['instance']['content'],
        embedding = chunk['predictions'][0]['embeddings']['values']
    ) for chunk in chunks
]

### Query Embedding

Create a query, or prompt, and get the embedding for it:

Connect to models for text embeddings. Learn more about the model API:
- [Vertex AI Text Embeddings API](../Embeddings/Vertex%20AI%20Text%20Embeddings%20API.ipynb)

In [20]:
question = "Does a lender have to perform servicing functions directly?"

In [21]:
embedder = vertexai.language_models.TextEmbeddingModel.from_pretrained('text-embedding-004')

In [22]:
question_embedding = embedder.get_embeddings([question])[0].values
question_embedding[0:10]

[-0.0005117303808219731,
 0.009651427157223225,
 0.01768726110458374,
 0.014538003131747246,
 -0.01829824410378933,
 0.027877431362867355,
 -0.021124685183167458,
 0.008830446749925613,
 -0.02669006586074829,
 0.06414774805307388]

---
## Retrieval With Vertex AI Vector Search

[Vertex AI Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview) is a vector similarity search solution built for scale, offering many enhanced functions, like streaming inserts of new embeddings.

### Prepare Input Data In GCS

A batch input for Vertex AI Vector Search are sourced from GCS directory with structure:
```
batch_root/
├── features_1.csv
├── features_2.csv
└── delete/
    └── deletes_1.txt
```
Where each `features*` files is `.csv`, `.json`, or `.avro` file of input feature data.  The `delete` folder has `.txt` files of record IDs to remove from the the index.  Each batch job will have a batch root folder like this.

The `features` files have structs of input information for each input and requires a value for `id` and for `embedding` and/or `sparse_embedding`.  The `sparse_embedding` can be great for keyword search and hybrid search.  The example below focuses on embeddings which are also called dense embeddings.

The `features` files can also have optional `restricts` with  `namespace` and `allow` tokens for use in filtering and crowding during search.  These will be used in the example below.

**Reference:**
- [Input data format and structure](https://cloud.google.com/vertex-ai/docs/vector-search/setup/format-structure)
- [Filter vector matches](https://cloud.google.com/vertex-ai/docs/vector-search/filtering)

In [23]:
inside_vs_data = [
    dict(
        id = chunk['instance']['chunk_id'],
        embedding = chunk['predictions'][0]['embeddings']['values'],
        restricts = [
            dict(
                namespace = 'gse',
                allow = [chunk['instance']['gse']]
            )
        ]
    ) for chunk in chunks
]

In [24]:
outside_vs_data = {}
for chunk in chunks:
    outside_vs_data[chunk['instance']['chunk_id']] = chunk['instance']['content']

#### Save To GCS

In [25]:
blob = bucket.blob(f'{SERIES}/{EXPERIMENT}/batches/initial/feature.json')
jsonl_data = '\n'.join(json.dumps(row) for row in inside_vs_data)
blob.upload_from_string(jsonl_data, content_type = 'application/json')
list(bucket.list_blobs(prefix = f'{SERIES}/{EXPERIMENT}'))

[<Blob: statmike-mlops-349915, applied-genai/retrieval-vertex-vector-search/batches/initial/feature.json, 1729265490867222>]

### Create/Retrieve An Index

Before deploying an index for use on an endpoint with Vertex AI Vector Search, you first create the index and load the data.  The workflow here will create and load the data to two different indexes: one for treeAH approximate nearest neighbors search, and one for brute force full search.  Indexes can be created for batch updates or streaming updates.

**Reference:**
- [Create and managed your index](https://cloud.google.com/vertex-ai/docs/vector-search/create-manage-index)
- [Index configuration parameters](https://cloud.google.com/vertex-ai/docs/vector-search/configuring-indexes)
- [Python SDK: `aiplatform.MatchingEngineIndex`](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndex#google_cloud_aiplatform_MatchingEngineIndex_name)

#### Create Empty Indexes

Check for the index and if missing create it:
- a tree ah index for approximation nearest neightbors search
- a brute force index for verifying results in testing

In [58]:
check = aiplatform.MatchingEngineIndex.list(filter=f'display_name="{VS_INDEX_NAME}-tree-ah"')
if len(check) > 0:
    print('Retrieved existing index with same name.')
    vs_index_tree_ah = check[0]
else:
    print('Creating index ...')
    vs_index_tree_ah = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name = VS_INDEX_NAME + '-tree-ah',
        dimensions = len(question_embedding),
        approximate_neighbors_count = 20,
        distance_measure_type = 'DOT_PRODUCT_DISTANCE',
        leaf_node_embedding_count = 250,
        leaf_nodes_to_search_percent = 10,
        index_update_method = 'BATCH_METHOD',
        shard_size = 'SHARD_SIZE_SMALL'
    )    

Retrieved existing index with same name.


In [59]:
check = aiplatform.MatchingEngineIndex.list(filter=f'display_name="{VS_INDEX_NAME}-brute-force"')
if len(check) > 0:
    print('Retrieved existing index with same name.')
    vs_index_brute_force = check[0]
else:
    print('Creating index ...')
    vs_index_brute_force = aiplatform.MatchingEngineIndex.create_brute_force_index(
        display_name = VS_INDEX_NAME + '-brute-force',
        dimensions = len(question_embedding),
        distance_measure_type = 'DOT_PRODUCT_DISTANCE',
        index_update_method = 'BATCH_METHOD',
        shard_size = 'SHARD_SIZE_SMALL'
    )  

Retrieved existing index with same name.


#### Load Data To Index

Check to see if data is already loaded and if missing then load as a complete overwrite:

In [44]:
vs_index_tree_ah.to_dict()['indexStats']

{'vectorsCount': '9040', 'shardsCount': 1}

In [45]:
vs_index_brute_force.to_dict()['indexStats']

{'vectorsCount': '9040', 'shardsCount': 1}

In [60]:
if 'vectorsCount' not in vs_index_tree_ah.to_dict()['indexStats']:
    print('Loading Embeddings...')
    vs_index_tree_ah.update_embeddings(
        contents_delta_uri = f'gs://{bucket.name}/{SERIES}/{EXPERIMENT}/batches/initial',
        is_complete_overwrite = True
    )
else:
    print('Embeddings already loaded.')

Embeddings already loaded.


In [61]:
if 'vectorsCount' not in vs_index_brute_force.to_dict()['indexStats']:
    print('Loading Embeddings...')
    vs_index_brute_force.update_embeddings(
        contents_delta_uri = f'gs://{bucket.name}/{SERIES}/{EXPERIMENT}/batches/initial',
        is_complete_overwrite = True
    )
else:
    print('Embeddings already loaded.')

Embeddings already loaded.


### Manage An Index

Similar to loading data to the indexes in the previous section, batch update can be carried out with incremental data or complete overwrites.  If the indexes are set up for streaming updates the upserts can be streamed to the indexes.

**Reference:**
- [Update and rebuild index](https://cloud.google.com/vertex-ai/docs/vector-search/update-rebuild-index)

### Create/Retrieve An Index Endpoint

To make indexes available for serving nearest neighbors matches they need to be deployed to a Vertex AI Vector Search endpoints.  This section will create an endpoint (or retrieve it).  The work below creates a public endpoint but the endpoint can also be created with VPC peering or private service connect.  

**Reference:**
- [Deploy - Public Endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public)
- [Deploy - Private services access(VPC peering)](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-vpc)
- [Deploy - Private Services Connect (PSC)](https://cloud.google.com/vertex-ai/docs/vector-search/setup/private-service-connect)
- [Python SDK: `aiplatform.MatchingEngineIndexEndpoint`](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndexEndpoint)

In [62]:
check = aiplatform.MatchingEngineIndexEndpoint.list(filter=f'display_name="{VS_ENDPOINT_NAME}"')
if len(check) > 0:
    print('Retreived exiting endpoint with same name.')
    vs_endpoint = check[0]
else:
    print('Creating endpoing...')
    vs_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name = VS_ENDPOINT_NAME,
        public_endpoint_enabled = True
    )

Retreived exiting endpoint with same name.


### Deploy Indexes To Index Endpoint

Check for indexes on the endpoint and if missing deploy them

This is the point where computing resources are started to hosts the endpoint. Choices here and the up time of the endpoint are important considerations for performance and [costs](https://cloud.google.com/vertex-ai/pricing#vectorsearch).

**Notes**
- Machine types should chosen to support the choosen shard size (see first reference below)
- It is recommended to have two replicas per shard
- min_replica_count and max_replica_count default to 2 (no autoscaling)
- if min_replica_count is not set then it defaults to 2
- if max_replica_count is not set then it defaults to the same value as min_replica_count

**Reference:**
- [Machine type options](https://cloud.google.com/vertex-ai/docs/vector-search/create-manage-index#create-index)
- [Enable Autoscaling](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public#autoscaling)
    - [Change autoscaling parameters for and endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public#mutate-deployed-index)
- [Deployment settings that impact performance](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public#performance)


In [67]:
for index in [vs_index_tree_ah, vs_index_brute_force]:
    if index.display_name.replace('-', '_') not in [i.id for i in vs_endpoint.deployed_indexes]:
        print(f'Deploying index: {index.display_name}')
        vs_endpoint = vs_endpoint.deploy_index(
            index = index.resource_name,
            deployed_index_id = index.display_name.replace('-', '_'),
            machine_type = 'e2-standard-2',
            min_replica_count = 2,
            max_replica_count = 2, 
        )
    else:
        print(f'Found index already deployed to endpoint: {index.display_name}')

Found index already deployed to endpoint: applied-genai-retrieval-vertex-vector-search-tree-ah
Found index already deployed to endpoint: applied-genai-retrieval-vertex-vector-search-brute-force


In [73]:
vs_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name = vs_endpoint.resource_name
)

In [74]:
vs_endpoint.deployed_indexes

[id: "applied_genai_retrieval_vertex_vector_search_tree_ah"
index: "projects/1026793852137/locations/us-central1/indexes/2418270272177045504"
create_time {
  seconds: 1729254606
  nanos: 813361000
}
index_sync_time {
  seconds: 1729266827
  nanos: 283083000
}
deployment_group: "default"
dedicated_resources {
  machine_spec {
    machine_type: "e2-standard-2"
  }
  min_replica_count: 2
  max_replica_count: 2
}
, id: "applied_genai_retrieval_vertex_vector_search_brute_force"
index: "projects/1026793852137/locations/us-central1/indexes/1855742531220799488"
create_time {
  seconds: 1729255453
  nanos: 24613000
}
index_sync_time {
  seconds: 1729266771
  nanos: 750378000
}
deployment_group: "default"
dedicated_resources {
  machine_spec {
    machine_type: "e2-standard-2"
  }
  min_replica_count: 2
  max_replica_count: 2
}
]

### Matches:

In [72]:
test.read_index_datapoints(
    deployed_index_id = vs_index_tree_ah.display_name.replace('-', '_'),
    ids = ['fannie_part_0_c40']
)

[datapoint_id: "fannie_part_0_c40"
feature_vector: 0.027678849175572395
feature_vector: -0.007303171791136265
feature_vector: 0.03341332823038101
feature_vector: 0.06296615302562714
feature_vector: -0.014549952000379562
feature_vector: 0.027599383145570755
feature_vector: 0.04117162525653839
feature_vector: 0.0039031493943184614
feature_vector: -0.02049889974296093
feature_vector: -0.0017076944932341576
feature_vector: -0.025188840925693512
feature_vector: 0.06535536050796509
feature_vector: 0.06576049327850342
feature_vector: -0.039182983338832855
feature_vector: -0.017826547846198082
feature_vector: -0.057337116450071335
feature_vector: 0.035221464931964874
feature_vector: 0.044845353811979294
feature_vector: -0.02948317490518093
feature_vector: -0.0037089877296239138
feature_vector: -0.019938185811042786
feature_vector: -0.013213587924838066
feature_vector: -0.01755746267735958
feature_vector: -0.03500330448150635
feature_vector: -0.024951962754130363
feature_vector: -0.061737138777

In [80]:
vs_endpoint.find_neighbors(
    deployed_index_id = vs_index_tree_ah.display_name.replace('-', '_'),
    queries = [question_embedding[:]]
)

AttributeError: 'MatchingEngineIndexEndpoint' object has no attribute '_public_match_client'

---
## Simple RAG Using Vertex AI Vector Search For Retrieval

---
## Remove Resources

-undeploy index
- delete endpoint
- delete index
- remove gcs

gcs and fs

In [49]:
# undeploy indexes created above:
#vs_endpoint.undeploy_index(deployed_index_id = vs_index_tree_ah.display_name.replace('-', '_'))
#vs_endpoint.undeploy_index(deployed_index_id = vs_index_brute_force.display_name.replace('-', '_'))

# delete the endpoint
#vs_endpoint.delete(force = True)

# delete the indexes
#vs_index_tree_ah.delete()
#vs_index_brute_force.delete()

# delete the gcs content
