<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Grounding%20Overview%20-%20RAG%20With%20Vertex%20AI%20Feature%20Store.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2FApplied%2520GenAI%2FGrounding%2520Overview%2520-%2520RAG%2520With%2520Vertex%2520AI%2520Feature%2520Store.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Grounding%20Overview%20-%20RAG%20With%20Vertex%20AI%20Feature%20Store.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/Applied%20GenAI/Grounding%20Overview%20-%20RAG%20With%20Vertex%20AI%20Feature%20Store.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

---
This is part of a [series of notebook-based workflows](./readme.md) for Applied GenAI using Vertex AI.
Specifically, these are related to grounding methods for LLMs:

||Notebook Workflow|Description|
|---|---|---|
||[Grounding Overview](./Grounding%20Overview.ipynb)|Overview of grounding methods with comparison and evaluation|
||[Grounding Overview - Vertex AI Search](./Grounding%20Overview%20-%20Vertex%20AI%20Search.ipynb)|Setting up and using Vertex AI Search.|
||[Grounding Overview - RAG With BigQuery](./Grounding%20Overview%20-%20RAG%20With%20BigQuery.ipynb)|A complete workflow (process, parse, embed, index, retrieve, generate) with BigQuery Vector Search.|
|**This Notebook**|[Grounding Overview - RAG With Vertex AI Feature Store](./Grounding%20Overview%20-%20RAG%20With%20Vertex%20AI%20Feature%20Store.ipynb)|A complete workflow (process, parse, embed, index, retrieve, generate) with Vertex AI Feature Store as an online retrieval system over BigQuery data.|
||Grounding Overview - RAG With Vertex AI Vector Search|A complete workflow (process, parse, embed, index, retrieve, generate) with Vertex AI Vector Search as an online retrieval system.|
||Grounding Overview - RAG With LlamaIndex ON Vertex AI|A complete retrieval workflow (process, parse, embed, index, retrieve, generate) with LlamaIndex on Vertex AI as a retrieval system.|

---

# Grounding Overview - RAG With Vertex AI Feature Store

Harnessing an LLM to respond accurately means supplying correct and relevant information in the prompt.  This notebook creates a custom retrieval augmented generation (RAG) application to retrieve grounded context for prompts.  The workflow takes a document through these steps:

- Process document(s) into chunks with the [Layout Parser](https://cloud.google.com/document-ai/docs/layout-parse-chunk) (for example)
- Create embeddings for chunks with the [Vertex AI Embeddings APIs](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings)
- Store chunks with embeddings in BigQuery
- Create online serving for the data on [Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/overview)
- Retrieve chunks related to input queries
- Generate grounded responses from [Gemini on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models)

To see this and many more grounding approaches side-by-side check out the accompanying workflow in this repository that uses the RAG application created here:
- [Grounding Overview](./Grounding%20Overview.ipynb)


---
## Colab Setup

When running this notebook in [Colab](https://colab.google/) or [Colab Enterprise](https://cloud.google.com/colab/docs/introduction), this section will authenticate to GCP (follow prompts in the popup) and set the current project for the session.

In [5]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [6]:
try:
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
except Exception:
    pass

---
## Installs and API Enablement

The client packages may need to be installed in this environment.

### Installs (If Needed)

In [7]:
# tuples of (import name, install name, min_version)
packages = [
    ('google.cloud.aiplatform', 'google-cloud-aiplatform', '1.62.0'),
    ('google.cloud.documentai', 'google-cloud-documentai', '2.31.0'),
    ('google.cloud.bigquery', 'google-cloud-bigquery'),
    ('google.cloud.storage', 'google-cloud-storage')
]

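import importlib.util
import importlib.metadata
# assumes the packaging library is available, as it is in most pip-based environments
from packaging.version import Version

install = False
for package in packages:
    if not importlib.util.find_spec(package[0]):
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user
    elif len(package) == 3:
        # compare parsed versions: plain string comparison would rank '1.9.0' above '1.62.0'
        if Version(importlib.metadata.version(package[1])) < Version(package[2]):
            print(f'updating package {package[1]}')
            install = True
            !pip install {package[1]} -U -q --user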

### API Enablement

In [8]:
!gcloud services enable aiplatform.googleapis.com
!gcloud services enable documentai.googleapis.com

### Restart Kernel (If Installs Occurred)

After the kernel restarts, continue running from the cell after this one.

In [9]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    IPython.display.display(IPython.display.Markdown("""<div class=\"alert alert-block alert-warning\">
        <b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. The previous cells do not need to be run again⚠️</b>
        </div>"""))

---
## Setup

Inputs

In [10]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [11]:
REGION = 'us-central1'
SERIES = 'applied-genai'
EXPERIMENT = 'grounding-overview'

# make this the gcs bucket for storing files
GCS_BUCKET = PROJECT_ID 

# make this the BQ Project / Dataset / Table prefix to store results
BQ_PROJECT = PROJECT_ID
BQ_DATASET = SERIES.replace('-', '_')
BQ_TABLE = EXPERIMENT
BQ_REGION = REGION[0:2]

Packages

In [71]:
import os
import re
import io
import json
import base64
import requests
import concurrent.futures
import time
import asyncio
from IPython.display import Markdown as display

from google.cloud import aiplatform
import vertexai.language_models
import vertexai.generative_models # for Gemini Models
from google.cloud import documentai
from google.cloud import storage
from google.cloud import bigquery

In [13]:
aiplatform.__version__

'1.62.0'

Clients

In [14]:
# vertex ai clients
vertexai.init(project = PROJECT_ID, location = REGION)

# document AI client
LOCATION = REGION.split('-')[0]
docai_client = documentai.DocumentProcessorServiceClient(
    client_options = dict(api_endpoint = f"{LOCATION}-documentai.googleapis.com")
)
docai_async_client = documentai.DocumentProcessorServiceAsyncClient(
    client_options = dict(api_endpoint = f"{LOCATION}-documentai.googleapis.com")
)

# bigquery client
bq = bigquery.Client(project = PROJECT_ID)

# gcs client: assumes bucket already exists
gcs = storage.Client(project = PROJECT_ID)
bucket = gcs.bucket(GCS_BUCKET)

models: [Google Models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#models)

In [15]:
models = dict(
    gemini_pro = vertexai.generative_models.GenerativeModel("gemini-1.5-pro-001"),
    gemini_flash = vertexai.generative_models.GenerativeModel("gemini-1.5-flash-001"),
    embedding = vertexai.language_models.TextEmbeddingModel.from_pretrained('text-embedding-004')
)

generation configs:

In [16]:
grounding_config = vertexai.generative_models.GenerationConfig(
    temperature = 0.0
)

---
## Prompt And Context

The context document is the [official rules of baseball](https://img.mlbstatic.com/mlb-images/image/upload/mlb/wqn5ah4c3qtivwx3jatm.pdf), a PDF published by MLB and updated annually with the latest changes to the game.


In [36]:
prompt = "what are the dimensions of first base in baseball?"

In [37]:
url = 'https://img.mlbstatic.com/mlb-images/image/upload/mlb/wqn5ah4c3qtivwx3jatm.pdf'
# get the pdf
context_bytes = requests.get(url).content
context_base64 = base64.b64encode(context_bytes).decode('utf-8')

---
## Get/Create Document AI Processors

Document AI is made up of multiple processors.  In this case the Layout Parser is used for its ability to detect and extract paragraphs, tables, titles, headings, page headers, and page footers.  For a more thorough review of Document AI processors, including customized parsers, see the [Working With/Document AI](../Working%20With/Document%20AI/readme.md) section of this repository.  That section includes examples of processing documents at larger scales and storing the data for processing and retrieval.

Using the [Layout Parser](https://cloud.google.com/document-ai/docs/layout-parse-chunk).

In [38]:
PARSER_DISPLAY_NAME = 'my_layout_processor'
PARSER_TYPE = 'LAYOUT_PARSER_PROCESSOR'
PARSER_VERSION = 'pretrained-layout-parser-v1.0-2024-06-03'

# look for an existing processor with the expected display name
parser = None
for p in docai_client.list_processors(parent = f'projects/{PROJECT_ID}/locations/{LOCATION}'):
    if p.display_name == PARSER_DISPLAY_NAME:
        parser = p
        break

# create the processor only when none was found
if parser is not None:
    print('Retrieved existing parser: ', parser.name)
else:
    parser = docai_client.create_processor(
        parent = f'projects/{PROJECT_ID}/locations/{LOCATION}',
        processor = dict(display_name = PARSER_DISPLAY_NAME, type_ = PARSER_TYPE, default_processor_version = PARSER_VERSION)
    )
    print('Created New Parser: ', parser.name)

Retrieved existing parser:  projects/1026793852137/locations/us/processors/3779bd3a8f535977


---
## Process Document

Document AI has online and batch processing.  These methods are subject to [limits](https://cloud.google.com/document-ai/limits#content_limits) and [quotas](https://cloud.google.com/document-ai/quotas).  In this case online processing is limited to 15 pages and batch processing is limited to 500 pages.  The document is more than 100 pages, so it either has to be split into smaller sections (such as individual pages) for online processing or sent through batch processing.  Batch processing works on documents stored in GCS.

> NOTE: The code below could be extended to many documents in many locations.
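
If online processing were used instead of batch, the document would first need to be split into sections that fit the page limit. A minimal sketch of computing those page ranges, assuming the 15 page online limit quoted above; `page_ranges` is a hypothetical helper, not part of the Document AI client, and the actual splitting of the PDF is left out:

```python
def page_ranges(total_pages, max_pages = 15):
    """Return 1-based, inclusive (start, end) page ranges of at most max_pages each."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError('total_pages must be >= 0 and max_pages must be >= 1')
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]

# example: a hypothetical 32 page document
ranges = page_ranges(32)
print(ranges)
```

Each range could then be sent as its own online request, at the cost of managing many small requests instead of a single batch job.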

### Move Document To GCS

In [39]:
blob = bucket.blob(f'{SERIES}/{EXPERIMENT}/mlb_rules.pdf')
blob.upload_from_string(context_bytes, content_type = 'application/pdf')

### Batch Process Document

In [40]:
from google.api_core.exceptions import InternalServerError
from google.api_core.exceptions import RetryError

batch_job = docai_client.batch_process_documents(
    request = documentai.BatchProcessRequest(
        name = parser.name,
        input_documents = documentai.BatchDocumentsInputConfig(
            gcs_documents = documentai.GcsDocuments(
                documents = [
                        documentai.GcsDocument(
                            gcs_uri = f'gs://{GCS_BUCKET}/{SERIES}/{EXPERIMENT}/mlb_rules.pdf',
                            mime_type = 'application/pdf'
                    )
                ]
            )
        ),
        document_output_config = documentai.DocumentOutputConfig(
            gcs_output_config = documentai.DocumentOutputConfig.GcsOutputConfig(
                gcs_uri = f'gs://{GCS_BUCKET}/{SERIES}/{EXPERIMENT}/parsing'
            )
        ),
        process_options = documentai.ProcessOptions(
            layout_config = documentai.ProcessOptions.LayoutConfig(
                chunking_config = documentai.ProcessOptions.LayoutConfig.ChunkingConfig(
                    chunk_size = 100,
                    include_ancestor_headings = True,
                )
            )
        )
    )
)
print(f'Waiting on batch job to complete: {batch_job.operation.name}')
batch_job.result()
        
print(documentai.BatchProcessMetadata(batch_job.metadata).state)

Waiting on batch job to complete: projects/1026793852137/locations/us/operations/11528564131844070679
State.SUCCEEDED


### Retrieve Document Parsing Results

In [41]:
documents = []
for process in documentai.BatchProcessMetadata(batch_job.metadata).individual_process_statuses:
    matches = re.match(r"gs://(.*?)/(.*)", process.output_gcs_destination)
    output_bucket, output_prefix = matches.groups()
    output_blobs = bucket.list_blobs(prefix = output_prefix)
    for blob in output_blobs:
        document = documentai.Document.from_json(blob.download_as_bytes(), ignore_unknown_fields = True)
        documents.append(document)

In [42]:
len(documents)

1

In [43]:
parsed_document = documentai.Document.to_dict(documents[0])

In [44]:
parsed_document.keys()

dict_keys(['shard_info', 'document_layout', 'chunked_document', 'mime_type', 'text', 'text_styles', 'pages', 'entities', 'entity_relations', 'text_changes', 'revisions'])

In [45]:
parsed_document['chunked_document'].keys()

dict_keys(['chunks'])

### Parse Chunks

Create a list of dictionaries for each chunk

In [46]:
len(parsed_document['chunked_document']['chunks'])

867

In [47]:
parsed_document['chunked_document']['chunks'][0].keys()

dict_keys(['chunk_id', 'content', 'page_span', 'page_footers', 'source_block_ids', 'page_headers'])

In [48]:
parsed_document['chunked_document']['chunks'][0]

{'chunk_id': 'c1',
 'content': '# OFFICIAL BASEBALL RULES\n\n2023 Edition TM TM',
 'page_span': {'page_start': 1, 'page_end': 7},
 'page_footers': [{'text': 'V1',
   'page_span': {'page_start': 6, 'page_end': 6}},
  {'text': 'vii', 'page_span': {'page_start': 7, 'page_end': 7}}],
 'source_block_ids': [],
 'page_headers': []}

In [49]:
chunks = [
    dict(
        chunk_id = chunk['chunk_id'],
        content = chunk['content'],
    ) for chunk in parsed_document['chunked_document']['chunks']
]

In [50]:
chunks[0]

{'chunk_id': 'c1',
 'content': '# OFFICIAL BASEBALL RULES\n\n2023 Edition TM TM'}

### Generate Embeddings For Each Chunk

Add the embeddings to the chunks dictionary by using the [Vertex AI Embeddings APIs](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings).

In [28]:
async def embedding_runner(chunks, limit_concur_requests = 1500):
    limit = asyncio.Semaphore(limit_concur_requests)
    results = [None] * len(chunks)
    
    # make requests - async
    async def make_request(p):
        
        async with limit:
            if limit.locked():
                await asyncio.sleep(0.01)
                
            ########### manual Error Handling ############################################
            fail_count = 0
            while fail_count <= 20:
                try:
                    result = await models['embedding'].get_embeddings_async([chunks[p]['content']])
                    if fail_count > 0:
                        print(f'Item {p} succeeded after fail count = {fail_count}')
                    break
                except Exception:
                    fail_count += 1
                    #print(f'Item {p} failed: current fail count = {fail_count}')
                    # exponential backoff: 1, 2, 4, ... up to 32 seconds (** is power, ^ is bitwise XOR)
                    await asyncio.sleep(2**(min(fail_count, 6) - 1))
            ##############################################################################
            
        results[p] = result[0].values
    
    # manage tasks
    tasks = [asyncio.create_task(make_request(p)) for p in range(len(chunks))]
    responses = await asyncio.gather(*tasks)
    
    # add embeddings to input list of dictionaries for all the chunks
    for c, content in enumerate(chunks):
        content['embedding'] = results[c]
    
    await asyncio.sleep(60)
    
    return

In [29]:
await embedding_runner(chunks)
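
The retry loop in the embedding runner is meant to wait longer after each failed request, up to a cap. A small sketch of that schedule, assuming delays that double from 1 second up to 32 seconds; `backoff_delay` is a hypothetical helper name. Note that Python uses `**` for exponentiation, while `^` is bitwise XOR:

```python
def backoff_delay(fail_count, max_exponent = 5):
    """Seconds to wait after the fail_count-th failure: 1, 2, 4, ... capped at 2**max_exponent."""
    if fail_count < 1:
        return 0
    return 2 ** min(fail_count - 1, max_exponent)

# delays for the first eight consecutive failures
delays = [backoff_delay(n) for n in range(1, 9)]
print(delays)
```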

---
## Results To BigQuery

BigQuery is the offline store for Vertex AI Feature Store, as covered in the next section.  This section shows how to load the chunks and embeddings into a BigQuery table for use in Feature Store.

BigQuery also has vector indexing and search functions as well as built-in connections to LLMs on Vertex AI.  For a review of doing RAG from BigQuery, and even inside of BigQuery, check out the companion workflow in this series:
- [Grounding Overview - RAG With BigQuery](./Grounding%20Overview%20-%20RAG%20With%20BigQuery.ipynb)
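
Before loading, it can help to confirm that every chunk received an embedding of the same length, since the feature view index created later is configured with a single embedding dimension. A minimal sketch, assuming each chunk is a dictionary with `chunk_id`, `content`, and `embedding` keys as built above; `check_embeddings` is a hypothetical helper and the sample rows are made up for illustration:

```python
def check_embeddings(chunks):
    """Return (dimension, bad_ids): the common embedding length and ids of chunks that do not match it."""
    lengths = [len(c['embedding']) for c in chunks if c.get('embedding')]
    if not lengths:
        return 0, [c['chunk_id'] for c in chunks]
    # treat the most common length as the expected dimension
    dimension = max(set(lengths), key = lengths.count)
    bad_ids = [
        c['chunk_id'] for c in chunks
        if not c.get('embedding') or len(c['embedding']) != dimension
    ]
    return dimension, bad_ids

sample = [
    dict(chunk_id = 'a', content = 'one', embedding = [0.1, 0.2, 0.3]),
    dict(chunk_id = 'b', content = 'two', embedding = [0.4, 0.5, 0.6]),
    dict(chunk_id = 'c', content = 'three', embedding = None),
]
dimension, bad_ids = check_embeddings(sample)
print(dimension, bad_ids)
```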


### Create/Recall Dataset

In [30]:
dataset = bigquery.Dataset(f"{BQ_PROJECT}.{BQ_DATASET}")
dataset.location = BQ_REGION
bq_dataset = bq.create_dataset(dataset, exists_ok = True)

### Load JSON To BigQuery Table

In [31]:
bq_table = bq_dataset.table(BQ_TABLE)

In [32]:
job_config = bigquery.LoadJobConfig(
    source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE,
    autodetect = True
)

In [33]:
load_job = bq.load_table_from_json(
    json_rows = chunks,
    destination = bq_table,
    job_config = job_config
)
load_job.result()

LoadJob<project=statmike-mlops-349915, location=US, id=25a1b145-e1df-4a52-b867-ec079464aa00>

In [34]:
bq.query(f"SELECT * FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` LIMIT 5").to_dataframe()

||embedding|content|chunk_id|
|---|---|---|---|
|0|[-0.0014270133106037974, 0.0252088475972414, -...|# OFFICIAL BASEBALL RULES\n\n2023 Edition TM TM|c1|
|1|[0.008681542240083218, 0.06999468058347702, 0....|# OFFICIAL BASEBALL RULES\n\n## Official Baseb...|c2|
|2|[0.006522744428366423, 0.07181957364082336, 0....|# OFFICIAL BASEBALL RULES\n\n## Official Baseb...|c3|
|3|[0.0074585191905498505, 0.0359831228852272, 0....|# OFFICIAL BASEBALL RULES\n\n## FOREWORD\n\nTh...|c4|
|4|[0.008693071082234383, 0.048620305955410004, 0...|# OFFICIAL BASEBALL RULES\n\n## FOREWORD\n\nMo...|c5|


---
## Vertex AI Feature Store

[Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/overview) creates online feature views from BigQuery tables/views, either directly or through the Feature Registry.  The online store has a low-latency API that does [vector matching](https://cloud.google.com/vertex-ai/docs/featurestore/latest/embeddings-search) on input vectors or input entities.  The response includes all features, which means the actual text content is also returned since it is in the table with the embeddings.

Read more about Feature Store [here in this repository](../MLOps/Feature%20Store/readme.md).

### Feature Online Store Admin Client

Used to create online stores and feature views

In [19]:
online_admin_client = aiplatform.gapic.FeatureOnlineStoreAdminServiceClient(client_options = dict(api_endpoint = f'{REGION}-aiplatform.googleapis.com'))

### Create/Retrieve Online Store

**NOTE:** This can take around 10 minutes if creating a new feature store instance

**Reference:**
- [Online Serving Types](https://cloud.google.com/vertex-ai/docs/featurestore/latest/online-serving-types)
- [Create an Online Store Instance](https://cloud.google.com/vertex-ai/docs/featurestore/latest/create-onlinestore)

In [20]:
FEATURE_ONLINE_STORE_NAME = SERIES.replace('-', '_')

In [21]:
try:
    online_store = online_admin_client.get_feature_online_store(name = f'projects/{PROJECT_ID}/locations/{REGION}/featureOnlineStores/{FEATURE_ONLINE_STORE_NAME}')
except Exception:
    create_online_store = online_admin_client.create_feature_online_store(
        request = aiplatform.gapic.CreateFeatureOnlineStoreRequest(
            parent = f'projects/{PROJECT_ID}/locations/{REGION}',
            feature_online_store_id = FEATURE_ONLINE_STORE_NAME,
            feature_online_store = aiplatform.gapic.FeatureOnlineStore(
                optimized = aiplatform.gapic.FeatureOnlineStore.Optimized()
            )
        )
    )
    online_store = create_online_store.result()
    
online_store.name

'projects/1026793852137/locations/us-central1/featureOnlineStores/applied_genai'

In [22]:
print(f'Review in the console:\n\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/online-stores/{FEATURE_ONLINE_STORE_NAME}?project={PROJECT_ID}')

Review in the console:

https://console.cloud.google.com/vertex-ai/locations/us-central1/online-stores/applied_genai?project=statmike-mlops-349915


<p align="center">
    <img src="./resources/images/screenshots/grounding/fs_onlinestore.png" width="75%">
</p>

### Important Notes About Setting Up An Index

Embeddings are vectors of numbers (lists of floating point values), and storing and searching them is the core job of vector database solutions.  The considerations below apply regardless of the solution being used.  Vertex AI Feature Store, used here, has many configurable options to aid with them.

- **Storage**
    - filter attributes: values that can be used to limit a search
    - crowding attributes: limit the number of matches with these attributes
    - additional columns inline, like the text chunk an embedding represents to prevent the additional step of retrieving data for matches
- **Indexing**
    - a brute force configuration to force search across all embeddings, good for benchmarking and ground truth retrieval
    - a method of segmenting embeddings, primarily:
        - inverted file index (IVF), or k-means clustering of embeddings
        - TreeAH, or [ScaNN](https://research.google/blog/announcing-scann-efficient-vector-similarity-search/), for compressing embeddings
    - a setting to configure the size of a cluster (IVF) or leaf nodes (TreeAH)
    - distance type to set how matches are computed: dot product, Euclidean, Manhattan, cosine
- **Retrieval**
    - a brute force override to retrieve ground truth across the full index
    - ability to control the number of neighbors retrieved at query time
    - option to set distance calculation type at query time
    - usage of filtering and crowding attributes to tailor neighbors list
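
To make the indexing and retrieval options concrete, here is a brute force nearest neighbor search in plain Python. It is a sketch for illustration only, not how Feature Store implements search; the function names and the tiny 2-dimensional vectors are made up. Distances follow the convention that smaller means closer, so the dot product is negated:

```python
import math

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def dot_product_distance(a, b):
    # negative dot product so that smaller values mean closer matches
    return -sum(x * y for x, y in zip(a, b))

def brute_force_search(query, index, distance, neighbor_count = 2):
    """Score every entity in the index and return the closest (entity_id, distance) pairs."""
    scored = [(entity_id, distance(query, vector)) for entity_id, vector in index.items()]
    return sorted(scored, key = lambda pair: pair[1])[:neighbor_count]

# a made-up index of three entities
index = {'c1': [1.0, 0.0], 'c2': [0.0, 1.0], 'c3': [0.6, 0.8]}
query = [1.0, 0.0]
neighbors = brute_force_search(query, index, dot_product_distance)
print(neighbors)
```

An approximate index such as TreeAH trades some recall for speed; comparing its results against a brute force pass like this one is a common way to measure that recall.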

### Create Feature View: From BigQuery Source

Create a feature view directly from a BigQuery table/view - the table created above.

**Reference:**
- [Create a feature view from a BigQuery source](https://cloud.google.com/vertex-ai/docs/featurestore/latest/create-featureview#create_from_bq)
- API Link for [`aiplatform.gapic.FeatureView.IndexConfig()`](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.FeatureView.IndexConfig)


In [23]:
BQ_FEATURE_VIEW_NAME = BQ_TABLE.replace('-','_')

In [24]:
try:
    bq_view = online_admin_client.get_feature_view(name = f'{online_store.name}/featureViews/{BQ_FEATURE_VIEW_NAME}')
except Exception:
    create_bq_view = online_admin_client.create_feature_view(
        request = aiplatform.gapic.CreateFeatureViewRequest(
            parent = online_store.name,
            feature_view_id = BQ_FEATURE_VIEW_NAME,
            feature_view = aiplatform.gapic.FeatureView(
                big_query_source = aiplatform.gapic.FeatureView.BigQuerySource(
                    uri = f'bq://{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}',
                    entity_id_columns = ['chunk_id']
                ),
                sync_config = aiplatform.gapic.FeatureView.SyncConfig(cron = 'TZ=America/New_York 10 * * * *'),
                index_config = aiplatform.gapic.FeatureView.IndexConfig(
                    embedding_column = 'embedding',
                    embedding_dimension = len(chunks[0]['embedding']),
                    #filter_columns = ['splits'],
                    #crowding_column = 'Class',
                    tree_ah_config = aiplatform.gapic.FeatureView.IndexConfig.TreeAHConfig(
                        leaf_node_embedding_count = 500
                    ),
                    # 0 = unspecified, 1 = Euclidean (squared L2), 2 = Cosine, 3 = Dot Product (negative of dot product)
                    distance_measure_type = aiplatform.gapic.FeatureView.IndexConfig.DistanceMeasureType(3)
                )
            ),
            run_sync_immediately = True
        )
    )
    bq_view = create_bq_view.result()
    
bq_view.name

'projects/1026793852137/locations/us-central1/featureOnlineStores/applied_genai/featureViews/grounding_overview'

In [25]:
print(f'Review in the console:\n\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/online-stores/{FEATURE_ONLINE_STORE_NAME}/feature-views/{BQ_FEATURE_VIEW_NAME}/details?project={PROJECT_ID}')

Review in the console:

https://console.cloud.google.com/vertex-ai/locations/us-central1/online-stores/applied_genai/feature-views/grounding_overview/details?project=statmike-mlops-349915


### Start/Get Sync Manually: BQ View

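Check on the sync for the feature view created from the BigQuery source.  The feature view above was created with `run_sync_immediately = True`, so a sync is already listed; the cells below retrieve it and wait for it to finish.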

**References:**
- [Sync feature data to online store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/sync-data)
- [List sync operations](https://cloud.google.com/vertex-ai/docs/featurestore/latest/list-data-syncs)

In [26]:
bq_sync = list(online_admin_client.list_feature_view_syncs(parent = bq_view.name))[-1]

In [27]:
bq_sync.name

'projects/1026793852137/locations/us-central1/featureOnlineStores/applied_genai/featureViews/grounding_overview/featureViewSyncs/3471515395549036544'

In [28]:
waited = 0
while True:
    feature_view_sync = online_admin_client.get_feature_view_sync(name = bq_sync.name)
    if feature_view_sync.run_time.end_time.seconds > 0:
        status = feature_view_sync.final_status.code
        break
    else:
        print(f'Waited {waited} seconds. Update in 30 seconds...')
    time.sleep(30)
    waited += 30
    
if status == 0: print('Succeeded!')
else: print('Failed!')

Succeeded!
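
The loop above polls until the sync reports an end time, with no upper bound on the wait. A sketch of the same pattern with a timeout, assuming a caller-supplied `is_done` function; `wait_until`, its parameters, and the fake check in the example are hypothetical and stand in for the `get_feature_view_sync` call:

```python
import time

def wait_until(is_done, interval = 30, timeout = 1800, sleep = time.sleep):
    """Call is_done() every interval seconds; return seconds waited, or raise TimeoutError."""
    waited = 0
    while not is_done():
        if waited >= timeout:
            raise TimeoutError(f'not finished after {waited} seconds')
        sleep(interval)
        waited += interval
    return waited

# example with a fake check that finishes on the third call and a no-op sleep
calls = iter([False, False, True])
waited = wait_until(lambda: next(calls), interval = 30, sleep = lambda seconds: None)
print(f'Waited {waited} seconds')
```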


In [29]:
online_admin_client.list_feature_view_syncs(
    request = dict(
        parent = bq_view.name,
        page_size = 1,
        #filter = f'create_time > "{(datetime.now() - timedelta(hours = 9)).strftime("%Y-%m-%dT%X")}"'
    )
)

ListFeatureViewSyncsPager<feature_view_syncs {
  name: "projects/1026793852137/locations/us-central1/featureOnlineStores/applied_genai/featureViews/grounding_overview/featureViewSyncs/3564824350328619008"
  create_time {
    seconds: 1724353801
    nanos: 193815000
  }
  run_time {
    start_time {
      seconds: 1724353801
      nanos: 193815000
    }
    end_time {
      seconds: 1724353987
      nanos: 590589000
    }
  }
  final_status {
  }
}
next_page_token: "AMEw9yP4rLyfWoQuA-YARRWa2vXgeCNyMxEQy3r5hRokQKXNQ3G6odbZMiP1L0JuWqA2bpN2"
>

In [30]:
print(f'Review in the console:\n\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/online-stores/{FEATURE_ONLINE_STORE_NAME}/feature-views/{BQ_FEATURE_VIEW_NAME}/details?project={PROJECT_ID}')

Review in the console:

https://console.cloud.google.com/vertex-ai/locations/us-central1/online-stores/applied_genai/feature-views/grounding_overview/details?project=statmike-mlops-349915


<p align="center">
    <img src="./resources/images/screenshots/grounding/fs_featureview.png" width="75%">
</p>

---
## Online Retrieval With Vertex AI Feature Store 

**Note:** It might take a few minutes for the feature view created above to respond here.  If you get an error, wait a minute and retry.

### Setup New SDK

In [31]:
try:
    from vertexai.resources.preview.feature_store.feature_view import FeatureView
except Exception as err:
    print('Retry')

In [32]:
fsv = FeatureView(name = bq_view.name)

In [33]:
bq_view.name

'projects/1026793852137/locations/us-central1/featureOnlineStores/applied_genai/featureViews/grounding_overview'

### Retrieve Features For An Entity

**NOTE:** The embedding is also retrieved.

In [51]:
chunks[0]['chunk_id']

'c1'

In [52]:
results = fsv.read(key = ['c1']).to_dict()['features']

In [53]:
# drop the embedding feature; build a new list rather than popping while iterating
results = [feature for feature in results if feature['name'] != 'embedding']

In [54]:
results

[{'name': 'content',
  'value': {'string_value': '# OFFICIAL BASEBALL RULES\n\n2023 Edition TM TM'}}]

### Search For Matches Based On Entity

In [55]:
fsv.search(
    entity_id = chunks[0]['chunk_id'],
    neighbor_count = 5,
    return_full_entity = False
)

SearchNearestEntitiesResponse(_response=nearest_neighbors {
  neighbors {
    entity_id: "c3"
    distance: -0.83912760019302368
  }
  neighbors {
    entity_id: "c21"
    distance: -0.77282094955444336
  }
  neighbors {
    entity_id: "c12"
    distance: -0.77026277780532837
  }
  neighbors {
    entity_id: "c20"
    distance: -0.76806008815765381
  }
  neighbors {
    entity_id: "c14"
    distance: -0.76674884557724
  }
}
)
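
The distances above are negative because the feature view was created with the dot product distance measure, which reports the negative of the dot product so that smaller values mean closer matches. A small sketch that flips the sign back to a similarity score, using the first neighbors from the output above (values rounded); `to_similarity` is a hypothetical helper:

```python
def to_similarity(neighbors):
    """Convert dot product distances (negative dot products) into similarity scores, highest first."""
    scored = [dict(entity_id = n['entity_id'], similarity = -n['distance']) for n in neighbors]
    return sorted(scored, key = lambda n: n['similarity'], reverse = True)

neighbors = [
    dict(entity_id = 'c3', distance = -0.8391),
    dict(entity_id = 'c21', distance = -0.7728),
    dict(entity_id = 'c12', distance = -0.7703),
]
similarities = to_similarity(neighbors)
for n in similarities:
    print(n['entity_id'], n['similarity'])
```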

### Search For Matches: Return All Features

In [56]:
results = fsv.search(
    entity_id = chunks[0]['chunk_id'],
    neighbor_count = 3,
    return_full_entity = True
).to_dict()['neighbors']

In [57]:
matches = []
for result in results:
    for feature in result['entity_key_values']['key_values']['features']:
        if feature['name'] == 'content':
            matches.append(dict(
                chunk_id = result['entity_id'],
                content = feature['value']['string_value']
            ))

In [58]:
matches

[{'chunk_id': 'c3',
  'content': '# OFFICIAL BASEBALL RULES\n\n## Official Baseball Rules 2023 Edition\n\n### All rights reserved.\n\nNo part of the Official Baseball Rules may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system now known or to be invented, without permission in writing from the Office of the Commissioner of Baseball. The Major League Baseball silhouetted batter logo is a registered trademark of Major League Baseball Properties, Inc. Cover photo by MLB Photos. ISBN 978-1-63727-284-8 Printed in the United States of America'},
 {'chunk_id': 'c21',
  'content': '# Table of Contents 2023 Official Baseball Rules\n\n## 5.00-PLAYING THE GAME\n\n|-|-|-|\n| 9.01 | Official Scorer (General Rules) | 105 |\n| 9.02 | Official Scorer Report | 108 |\n| 9.03 | Official Scorer Report (Additional Rules) | 111 |\n| 9.04 | Runs Batted In | 114 |\n| 9.05 | Base Hits | 114 |\n| 9.

### Search For Matches Based On Embedding

In [59]:
results = fsv.search(
    embedding_value = models['embedding'].get_embeddings([prompt])[0].values,
    neighbor_count = 5,
    return_full_entity = True
).to_dict()['neighbors']

In [60]:
matches = []
for result in results:
    for feature in result['entity_key_values']['key_values']['features']:
        if feature['name'] == 'content':
            matches.append(dict(
                chunk_id = result['entity_id'],
                content = feature['value']['string_value']
            ))

In [61]:
matches

[{'chunk_id': 'c32',
  'content': "# 2.00-THE PLAYING FIELD\n\n## 2.01 Layout of the Field\n\nThe distance between first base and third base is 127 feet, 3\\frac{3}{8} inches. All measurements from home base shall be taken from the point where the first and third base lines intersect.The catcher's box, the batters' boxes, the coaches' boxes, the three- foot first base lines and the next batter's boxes shall be laid out as shown in the diagrams in Appendices 1 and 2. 2"},
 {'chunk_id': 'c39',
  'content': "# 2.00-THE PLAYING FIELD\n\n## 2.02 Home Base\n\nIt shall be set in the ground with the point at the intersection of the lines extending from home base to first base and to third base; with the 17-inch edge facing the pitcher's plate, and the two 12-inch edges coinciding with the first and third base lines. The top edges of home base shall be beveled and the base shall be fixed in the ground level with the ground surface. (See drawing D in Appendix 2.)3"},
 {'chunk_id': 'c31',
  'cont

---
## RAG With Feature Store

Based on a user query, retrieve content from Feature Store and then provide it as context to an LLM, in this case [Gemini On Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models).

For a detailed overview of this approach, extensions such as reranking, and comparisons to other grounding methods, see the companion workflow in this repository:
- [Grounding Overview](./Grounding%20Overview.ipynb)
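
The retrieve-then-generate flow described above can be sketched as a small function. This is a minimal illustration, not the notebook's implementation: `build_context`, `answer_with_rag`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the two stand-in callables replace the Feature Store search and the Gemini call so the sketch runs without cloud resources. The context format mirrors the one used in the Retrieval cells of this section.

```python
from typing import Callable, Dict, List


def build_context(matches: List[Dict[str, str]]) -> str:
    """Format retrieved chunks as a single context string."""
    context = "Context presented as chunks of text extracted from source documents:\n"
    for match in matches:
        context += f"\n- Chunk {match['chunk_id']}: {match['content']}"
    return context


def answer_with_rag(
    question: str,
    retrieve: Callable[[str], List[Dict[str, str]]],
    generate: Callable[[List[str]], str],
) -> str:
    """Retrieve chunks for the question, then pass context and question to the LLM."""
    matches = retrieve(question)
    return generate([build_context(matches), question])


# Hypothetical stand-ins for the retriever and the LLM
def fake_retrieve(question: str) -> List[Dict[str, str]]:
    return [{"chunk_id": "c40", "content": "The bags shall be 18 inches square."}]


def fake_generate(parts: List[str]) -> str:
    return f"received {len(parts)} parts"


print(answer_with_rag(
    "what are the dimensions of first base in baseball?",
    fake_retrieve,
    fake_generate,
))
```

Passing the retriever and generator as callables keeps the flow testable; in this notebook they would wrap `fsv.search(...)` and `generate_content(...)` respectively.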

In [78]:
prompt

'what are the dimensions of first base in baseball?'

### Retrieval

In [62]:
results = fsv.search(
    embedding_value = models['embedding'].get_embeddings([prompt])[0].values,
    neighbor_count = 20,
    return_full_entity = True
).to_dict()['neighbors']

In [63]:
matches = []
for result in results:
    for feature in result['entity_key_values']['key_values']['features']:
        if feature['name'] == 'content':
            matches.append(dict(
                chunk_id = result['entity_id'],
                distance = result['distance'],
                content = feature['value']['string_value']
            ))
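
The nested loop above appears twice in this notebook, so it may be convenient to wrap it in a helper. The following is a sketch only: `extract_matches` is a hypothetical name, and the `sample` input is hand-written to mimic the dictionary shape that the loop above reads from `results` (its values are illustrative, not real search output).

```python
from typing import Any, Dict, List


def extract_matches(
    neighbors: List[Dict[str, Any]], feature_name: str = "content"
) -> List[Dict[str, Any]]:
    """Flatten search neighbors into records with chunk_id, distance, and content."""
    matches = []
    for neighbor in neighbors:
        features = neighbor["entity_key_values"]["key_values"]["features"]
        for feature in features:
            if feature["name"] == feature_name:
                matches.append({
                    "chunk_id": neighbor["entity_id"],
                    "distance": neighbor.get("distance"),  # None if absent
                    "content": feature["value"]["string_value"],
                })
    return matches


# Illustrative input in the same shape as one entry of `results`
sample = [
    {
        "entity_id": "c40",
        "distance": -0.64,
        "entity_key_values": {"key_values": {"features": [
            {"name": "content", "value": {"string_value": "## 2.03 The Bases"}},
            {"name": "other", "value": {"string_value": "ignored"}},
        ]}},
    }
]
print(extract_matches(sample))
```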

In [64]:
for m, match in enumerate(matches):
    print(f"{m+1} - chunk_id: {match['chunk_id']} with distance: {match['distance']}")

1 - chunk_id: c32 with distance: -0.7050017714500427
2 - chunk_id: c39 with distance: -0.6960663795471191
3 - chunk_id: c31 with distance: -0.691490650177002
4 - chunk_id: c24 with distance: -0.6890405416488647
5 - chunk_id: c60 with distance: -0.6760091781616211
6 - chunk_id: c29 with distance: -0.6737022399902344
7 - chunk_id: c838 with distance: -0.6642963886260986
8 - chunk_id: c30 with distance: -0.6638877391815186
9 - chunk_id: c38 with distance: -0.661100447177887
10 - chunk_id: c28 with distance: -0.6602258086204529
11 - chunk_id: c59 with distance: -0.6543205976486206
12 - chunk_id: c37 with distance: -0.6491357088088989
13 - chunk_id: c41 with distance: -0.6488881707191467
14 - chunk_id: c23 with distance: -0.6433005332946777
15 - chunk_id: c40 with distance: -0.6430721879005432
16 - chunk_id: c25 with distance: -0.6380656361579895
17 - chunk_id: c61 with distance: -0.6375458836555481
18 - chunk_id: c36 with distance: -0.6370067000389099
19 - chunk_id: c837 with distance: -0.
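
The distances above are all negative, and the best match has the most negative value. That pattern is consistent with a dot-product distance reported as the negated dot product, although this section does not state which distance measure the index was configured with. A minimal sketch, assuming that convention and unit-length embeddings, of how such values rank neighbors (the vectors and the names `normalize`, `negative_dot`, and `rank` are illustrative):

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def negative_dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distance convention: smaller (more negative) means more similar."""
    return -sum(x * y for x, y in zip(a, b))


def rank(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (id, distance) pairs sorted from closest to farthest."""
    q = normalize(query)
    scored = [(cid, negative_dot(q, normalize(vec))) for cid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy two-dimensional embeddings, not real model output
ranked = rank([1.0, 0.0], {"far": [0.1, 0.9], "near": [0.9, 0.1]})
for chunk_id, distance in ranked:
    print(f"chunk_id: {chunk_id} with distance: {distance:.4f}")
```

Under this assumption, sorting ascending by distance is the same as sorting descending by cosine similarity.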

In [65]:
retrieved_context = "Context presented as chunks of text extracted from source documents:\n"
for match in matches:
    retrieved_context += f"\n- Chunk {match['chunk_id']}: {match['content']}"

In [66]:
#display(retrieved_context)

In [68]:
# print the chunk id and content of the same match (index 16, the 17th result above)
print(matches[16]['chunk_id'], '\n\n', matches[16]['content'])

c61 

 # Rule 3.03(g) to 3.05

## 3.05 First Baseman's Glove

The web of the mitt shall measure not more than five inches from its top to the base of the thumb crotch. The web may be either a lacing, lacing through leather tunnels, or a center piece of leather which may be an extension of the palm connected to the mitt with lacing and constructed so that it will not exceed the above mentioned measurements. The webbing shall not be constructed of wound or wrapped lacing or deepened to make a net type of trap. The glove may be of any weight.


### Generation

In [72]:
grounded_with_rag_vertexfs = models['gemini_pro'].generate_content([retrieved_context, prompt], generation_config = grounding_config)
display(grounded_with_rag_vertexfs.text)

The provided text snippets describe the layout and markings of a baseball field but do not specify the dimensions of first base itself. 

However, the text *does* say that first base is marked by a bag with these characteristics:

* **Shape:** Square
* **Size:** 18 inches by 18 inches
* **Thickness:** Not less than 3 inches, not more than 5 inches
* **Material:** White canvas or rubber-covered 
* **Filling:** Soft material

Therefore, while a specific dimension for first base isn't given, you know it's marked by an 18-inch square bag. 


In [77]:
grounded_with_rag_vertexfs = models['gemini_flash'].generate_content([retrieved_context, prompt, 'State which chunk helped answer the question.'], generation_config = grounding_config)
display(grounded_with_rag_vertexfs.text)

First base in baseball is marked by a white canvas or rubber-covered bag that is 18 inches square. 

This information is found in chunk c40:

## 2.03 The Bases

First, second and third bases shall be marked by white canvas or rubber-covered bags, securely attached to the ground as indicated in Diagram 2. The first and third base bags shall be entirely within the infield. The second base bag shall be centered on second base. The bags shall be 18 inches square, not less than three nor more than five inches thick, and filled with soft material. 
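
Because the model is asked to state which chunk it used, the cited id can be checked against the retrieved set. The following is a sketch under assumptions: `cited_chunk_ids` and `unsupported_citations` are hypothetical helpers, and the regular expression assumes chunk ids look like `c` followed by digits, as they do in this notebook.

```python
import re
from typing import Iterable, List


def cited_chunk_ids(answer: str) -> List[str]:
    """Return chunk ids such as 'c40' mentioned in the answer, in order, without repeats."""
    seen: List[str] = []
    for chunk_id in re.findall(r"\bc\d+\b", answer):
        if chunk_id not in seen:
            seen.append(chunk_id)
    return seen


def unsupported_citations(answer: str, retrieved_ids: Iterable[str]) -> List[str]:
    """Return cited chunk ids that were not part of the retrieved context."""
    retrieved = set(retrieved_ids)
    return [cid for cid in cited_chunk_ids(answer) if cid not in retrieved]


answer = "This information is found in chunk c40:"
retrieved_ids = ["c32", "c39", "c31", "c40"]  # a subset of the ids listed earlier

print(cited_chunk_ids(answer))                       # ['c40']
print(unsupported_citations(answer, retrieved_ids))  # []
```

An empty list from `unsupported_citations` only shows that the cited chunk was in the context; it does not confirm that the chunk actually supports the answer.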


---
## Cleanup

Vertex AI Feature Store online store instances continue to run and incur cost until they are deleted. When you are done with this demonstration, the following code can be used to remove the online store while leaving the BigQuery table in place for future use.

Review the feature views and online store in the console, and complete their deletion there if needed, using the following link:

In [264]:
print(f'Review in the console:\n\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/online-stores/{FEATURE_ONLINE_STORE_NAME}?project={PROJECT_ID}')

Review in the console:

https://console.cloud.google.com/vertex-ai/locations/us-central1/online-stores/applied_genai?project=statmike-mlops-349915


In [269]:
# change to True to force deletion of the online store created above
remove_online_store = False #True

In [268]:
if remove_online_store:
    online_admin_client.delete_feature_online_store(
        name = online_store.name,
        force = True
    )