In [None]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Building a Gen AI RAG application with Vertex AI Feature Store and BigQuery

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag_qna_with_bq_and_featurestore.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fuse-cases%2Fretrieval-augmented-generation%2Frag_qna_with_bq_and_featurestore.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/use-cases/retrieval-augmented-generation/rag_qna_with_bq_and_featurestore.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/bigquery/import?url=https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag_qna_with_bq_and_featurestore.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/bigquery/v1/32px.svg" alt="BigQuery Studio logo"><br> Open in BigQuery Studio
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag_qna_with_bq_and_featurestore.ipynb">
      <img width="32px" src="https://upload.wikimedia.org/wikipedia/commons/9/91/Octicons-mark-github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>

| | |
|-|-|
| Authors | [Elia Secchi](https://github.com/eliasecchig) [Lorenzo Spataro](https://github.com/lspataroG) |

## Overview

This notebook guides you through building a low-latency vector search system for your Gen AI application using [BigQuery Vector Search](https://cloud.google.com/bigquery/docs/vector-search-intro) and [Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/overview). We'll use the [`BigQueryVectorStore` LangChain integration](https://github.com/langchain-ai/langchain-google/blob/main/libs/community/langchain_google_community/bq_storage_vectorstores/bigquery.py#L26) and the [`VertexFSVectorStore` LangChain integration](https://github.com/langchain-ai/langchain-google/blob/main/libs/community/langchain_google_community/bq_storage_vectorstores/featurestore.py#L33) to streamline this process.

Vertex AI Feature Store integrates with BigQuery, providing unified data storage and two vector search options:

- **BigQuery Vector Search**: used through the **`BigQueryVectorStore`** LangChain class. Ideal for batch retrieval and prototyping, as it requires no infrastructure setup.
- **Feature Store Online Store**: used through the **`VertexFSVectorStore`** LangChain class. Enables low-latency retrieval with manual or scheduled data sync, which suits production, user-facing Gen AI applications.

As part of this notebook you will learn how to:
1. Ingest data and embeddings using BigQuery Vector Search with the class `BigQueryVectorStore`
2. Perform retrieval leveraging BigQuery Vector Search with the class `BigQueryVectorStore`
3. Transition to Vertex AI Feature Store with the class `VertexFSVectorStore` for low-latency retrieval
4. Understand pros and cons of both options through a performance deep dive

![bq_fs_diagram_journey.png](https://storage.googleapis.com/github-repo/generative-ai/gemini/use-cases/retrieval-augmented-generation/bq_fs_diagram_journey.png)

## Get started

### Install Vertex AI SDK and other required packages

In [None]:
%pip install --upgrade --user --quiet google-cloud-aiplatform "langchain-google-vertexai" "langchain-google-community[featurestore]" pypdf==4.2.0

### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>

### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}


import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

## Getting started

### Import libraries

In [None]:
from langchain.chains import RetrievalQA
from langchain.globals import set_debug
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_google_community import BigQueryVectorStore, VertexFSVectorStore
from langchain_google_vertexai import VertexAI, VertexAIEmbeddings

In [None]:
DATASET = "sample_app"  # @param {type:"string"}
TABLE = "fixmycar"  # @param {type:"string"}

## Add documents to `BigQueryVectorStore`

This step ingests and parses PDF documents, splits them into chunks, generates embeddings, and adds the embeddings to the vector store. The document corpus used as the dataset is a collection of car owner's manuals.

**Summary steps**
- Create text embeddings: LangChain `VertexAIEmbeddings`
- Ingest PDF files: LangChain `PyPDFLoader`
- Chunk documents: LangChain `TextSplitter`
- Create Vector Store: LangChain `BigQueryVectorStore`

### Create the Vertex AI Embedding model

In [None]:
embedding_model = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest", project=PROJECT_ID
)

### Ingest PDF file

The document is hosted in a Cloud Storage bucket (at `gs://github-repo/generative-ai/sample-apps/fixmycar/cymbal-starlight-2024.pdf`), and LangChain provides a convenient document loader, [`PyPDFLoader`](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/), to load documents from PDFs.

In [None]:
GCS_BUCKET_DOCS = "github-repo/generative-ai/sample-apps/fixmycar"

# Copy the file to the current path
!gsutil cp "gs://$GCS_BUCKET_DOCS/*.pdf" .

In [None]:
# Ingest PDF files
loader = PyPDFLoader("cymbal-starlight-2024.pdf")
documents = loader.load()

# Add document name and source to the metadata
for document in documents:
    # PyPDFLoader stores the path it was given in metadata["source"]
    document_name = document.metadata["source"].split("/")[-1]
    # The file was copied from Cloud Storage, so record its original GCS location
    source = f"gs://{GCS_BUCKET_DOCS}/{document_name}"
    document.metadata = {"source": source, "document_name": document_name}

print(f"# of documents loaded (pre-chunking) = {len(documents)}")

Verify document metadata

In [None]:
documents[0].metadata

## Chunk documents - `TextSplitter`

Split the documents into smaller chunks. When choosing the chunk size, make sure that several chunks can fit within the context length of the LLM.
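
To build intuition for how `chunk_size` and `chunk_overlap` interact, here is a minimal, illustrative sketch of a fixed-size sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class also tries to break on the configured separators), and the function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, which is the role `chunk_overlap` plays: context at a chunk boundary is not lost entirely.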

In [None]:
# split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
    separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""],
)
doc_splits = text_splitter.split_documents(documents)

# Add chunk number to metadata
for idx, split in enumerate(doc_splits):
    split.metadata["chunk"] = idx

print(f"# of documents = {len(doc_splits)}")

In [None]:
doc_splits[0].metadata

## Configure `BigQueryVectorStore` as Vector Store

We will start the journey with `BigQueryVectorStore`, which requires no startup time and is therefore well suited to prototyping.

You can initialize the class by providing:
- `project_id`
- `location`
- `dataset_name`
- `table_name`

The table will be used to store embeddings and metadata. You can also point to an existing table. The class will use [BigQuery Vector Search](https://cloud.google.com/bigquery/docs/vector-search-intro) to perform vector search.

See [here](https://github.com/langchain-ai/langchain-google/blob/main/libs/community/langchain_google_community/bq_storage_vectorstores/bigquery.py#L26) for the full list of parameters of the class.
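
As a conceptual aid, the sketch below shows what a vector search computes: score every stored embedding against the query embedding and return the closest rows. It is a brute-force, pure-Python illustration using cosine similarity. The toy texts, the three-dimensional vectors, and the helper names are all made up for this example, and the distance metric and indexing that BigQuery actually uses are not covered here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, embedding), text) for text, embedding in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "table" of (text, embedding) rows; real embeddings have hundreds of dimensions.
toy_rows = [
    ("Roadside assistance", [1.0, 0.0, 0.0]),
    ("Tire pressure", [0.0, 1.0, 0.0]),
    ("Emergency towing", [0.8, 0.0, 0.6]),
]
toy_results = top_k([1.0, 0.0, 0.0], toy_rows, k=2)
for score, text in toy_results:
    print(f"{score:.2f}  {text}")
```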

In [None]:
bq_store = BigQueryVectorStore(
    project_id=PROJECT_ID,
    location=LOCATION,
    dataset_name=DATASET,
    table_name=TABLE,
    embedding=embedding_model,
)

### Add documents to the store

Note: If you have precomputed embeddings, you can add texts, embeddings, and optional metadata with the `add_texts_with_embeddings` method.

In [None]:
doc_ids = bq_store.add_documents(doc_splits)

Verify the `BigQueryVectorStore` with a similarity search

In [None]:
bq_store.similarity_search(
    "What should I do when I call the emergency roadside assistance?"
)

### Get a LangChain retriever
The retriever will be used in a LangChain Chain to find the most similar documents for a given query.

In [None]:
langchain_retriever = bq_store.as_retriever()

### Compose a LangChain Chain

We are going to use the [`RetrievalQA` chain](https://python.langchain.com/docs/modules/chains/popular/vector_db_qa).
Several other chain types are available; they are listed [here](https://docs.langchain.com/docs/components/chains/index_related_chains).
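
The `RetrievalQA` cell in this section uses `chain_type="stuff"`, which places all retrieved chunks into a single prompt. The sketch below illustrates that idea in plain Python. The prompt wording and the helper `build_stuff_prompt` are assumptions made for illustration, not the template LangChain uses.

```python
def build_stuff_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Concatenate every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up chunks standing in for what a retriever would return.
example_chunks = [
    "Call the roadside assistance number in your glove box.",
    "Stay in the vehicle with your seat belt fastened.",
]
example_prompt = build_stuff_prompt("What should I do after a breakdown?", example_chunks)
print(example_prompt)
```

Because every chunk ends up in the same prompt, the number and size of retrieved chunks must fit within the model's context length.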

In [None]:
# Set high verbosity
set_debug(True)

llm = VertexAI(model_name="gemini-pro")

search_query = "What should I do when calling the emergency roadside assistance?"  # @param {type:"string"}

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=langchain_retriever
)
response = retrieval_qa.invoke(search_query)
print("\n################ Final Answer ################\n")
print(response["result"])

## Low-latency vector search with Feature Store using `VertexFSVectorStore`

We are now ready to perform low latency serving with Feature Store!

To do that, you can simply use the method `.to_vertex_fs_vector_store()`, to get a `VertexFSVectorStore` object.

See the [class definition](https://github.com/langchain-ai/langchain-google/blob/main/libs/community/langchain_google_community/bq_storage_vectorstores/featurestore.py#L33) for all the parameters you can use.
Moving back to `BigQueryVectorStore` is equally easy with the `.to_bq_vector_store()` method.

Note: Any method we ran earlier can be called on both `BigQueryVectorStore` and `VertexFSVectorStore`. For instance, you can add new documents to an instance of `VertexFSVectorStore`, because both stores share the same underlying BigQuery source.

In [None]:
vertex_fs = bq_store.to_vertex_fs_vector_store()  # pass optional parameters here

You can also initialize the `VertexFSVectorStore` class directly

In [None]:
vertex_fs = VertexFSVectorStore(
    project_id=PROJECT_ID,
    location=LOCATION,
    dataset_name=DATASET,
    table_name=TABLE,
    embedding=embedding_model,
    # pass optional parameters here
)

#### Kick off a synchronization process

We use the `sync_data` method to synchronize the data from BigQuery to the Feature Online Store, to achieve low latency serving.

> Note: The first synchronization will take around 20 minutes because the Feature Online Store has to be created.

In [None]:
vertex_fs.sync_data()

In a production environment, you can also use the `cron_schedule` class parameter to set up an automatic scheduled synchronization. For example:
```python
store = VertexFSVectorStore(cron_schedule="TZ=America/Los_Angeles 00 13 11 8 *", ...)
```
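
To read a schedule string like the one above, it can help to break it into its parts. The sketch below assumes the standard five-field cron order (minute, hour, day of month, month, day of week) with an optional `TZ=` or `CRON_TZ=` time zone prefix. The helper `parse_cron_schedule` is hypothetical and only splits the string; it does not validate field values.

```python
def parse_cron_schedule(schedule: str) -> dict:
    """Split a cron string with an optional time zone prefix into named fields."""
    parts = schedule.split()
    timezone = None
    if parts and parts[0].startswith(("TZ=", "CRON_TZ=")):
        timezone = parts.pop(0).split("=", 1)[1]
    if len(parts) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(parts)}")
    names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    fields = dict(zip(names, parts))
    fields["timezone"] = timezone
    return fields


parsed_schedule = parse_cron_schedule("TZ=America/Los_Angeles 00 13 11 8 *")
print(parsed_schedule)
```

Read this way, the example schedule runs at 13:00 on August 11, Los Angeles time.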

In [None]:
vertex_fs.similarity_search("Hello world")

You can monitor the synchronization process from the Google Cloud Console: [Vertex AI Feature Store Tab](https://console.cloud.google.com/vertex-ai/feature-store/online-stores)

#### Serve with Feature Online Store

You are now ready to use Vertex AI Feature Store as part of your chain through a retriever object!

In [None]:
langchain_retriever = vertex_fs.as_retriever()

In [None]:
%%time
results = langchain_retriever.invoke("Leaks under the vehicle")
results

### Filtering by metadata

You can post-filter results by metadata by passing the `filter` parameter to any search method.

`VertexFSVectorStore` also supports metadata filtering during the search itself. For this to work:
- the `filter_columns` parameter must be passed to `VertexFSVectorStore` when the Feature Online Store feature view is created (that is, the first time the class is initialized with a given online store name and feature view name).

- the `string_filters` parameter must be passed to the search method. Note that only string fields are supported at the moment. See [here](https://github.com/googleapis/python-aiplatform/blob/8a4a41afe47aaff2f69a73e5011b34bcba5cd2e9/google/cloud/aiplatform_v1beta1/types/feature_online_store_service.py#L345)
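
To make the idea of post-filtering concrete, here is a minimal sketch in plain Python: the search runs first, and only afterwards are results dropped if their metadata does not match. The data and the helper `post_filter` are made up for illustration, and this is not the library's internal implementation.

```python
def post_filter(results: list[dict], metadata_filter: dict) -> list[dict]:
    """Keep only results whose metadata matches every key/value pair in metadata_filter."""
    return [
        result
        for result in results
        if all(result["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]


# Toy search results, already ranked by similarity (made-up data).
toy_hits = [
    {"content": "Check the oil level.", "metadata": {"chunk": 27}},
    {"content": "Look for leaks under the vehicle.", "metadata": {"chunk": 28}},
    {"content": "Inspect the brake fluid.", "metadata": {"chunk": 29}},
]
filtered_hits = post_filter(toy_hits, {"chunk": 28})
print(filtered_hits)
```

One consequence of filtering after the search is that fewer results than requested may remain, since non-matching hits are removed from an already truncated list.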

In [None]:
vertex_fs.similarity_search(search_query, filter={"chunk": 28})

# When should I use which class? A performance deep dive

We precompute the query embedding so that embedding latency is excluded from the comparison.

In [None]:
# embed_query takes a single string and returns one embedding vector
my_embedding = embedding_model.embed_query(search_query)

## Batch search with `BigQueryVectorStore`

Some use cases require batch searches (e.g. when running a retrieval evaluation).

`BigQueryVectorStore` offers a specialized `batch_search` method that takes advantage of BigQuery's scale to run many searches efficiently in a single call.

In [None]:
results = bq_store.batch_search(
    embeddings=None,  # pass precomputed embeddings here, or
    queries=[search_query, search_query],  # pass the query strings instead
)

### Batch search with 10,000 embeddings

We can run 10,000 batched searches with `BigQueryVectorStore` in about 20 seconds!

In [None]:
%%time
fake_embeddings = [my_embedding] * 10000
results = bq_store.batch_search(embeddings=fake_embeddings)
results[:2]

## Low latency serving with Feature Store
If you are instead powering an online application, Vertex AI Feature Store might be a good solution, as it offers low-latency serving.

We run a small load test consisting of 10 requests to demonstrate the latency reduction.
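
If you prefer plain Python over the `%%timeit` magic, a small helper like the one below can time repeated calls and report summary statistics. This is a sketch: the helper name `measure_latency_ms` is hypothetical, and the lambda is a stand-in workload so that the snippet runs without any cloud resources.

```python
import statistics
import time


def measure_latency_ms(fn, n_requests=10):
    """Call fn n_requests times and summarize client-side latency in milliseconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    return {
        "mean_ms": statistics.mean(latencies),
        "median_ms": statistics.median(latencies),
        "max_ms": max(latencies),
    }


# Stand-in workload; swap in a real search call to measure a vector store.
latency_stats = measure_latency_ms(lambda: sum(range(1000)), n_requests=10)
print(latency_stats)
```

To measure a store, replace the lambda with, for example, `lambda: vertex_fs.similarity_search_by_vector(my_embedding)`. The numbers include network round trips, so they describe client-side rather than server-side latency.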

### BigQuery single request

In [None]:
%%timeit -r10
bq_store.similarity_search_by_vector(my_embedding)

### Feature Store single request

In [None]:
%%timeit -r10
vertex_fs.similarity_search_by_vector(my_embedding)

Feature Store is faster than BigQuery for single requests!


> Note: For a server-side latency estimate, we suggest using the Feature Store dashboards. To do so:
1. Visit the [Feature Store console](https://console.cloud.google.com/vertex-ai/feature-store/online-stores)
2. Click on your newly created Feature Online Store
3. Scroll down to the "Serving Latency" dashboard

# Appendix

### Get documents by ID

For both vector stores, you can use the `get_documents` method to retrieve a set of documents by their IDs:

In [None]:
vertex_fs.get_documents(ids=["my_id1"])

### Remove documents by ID

You can also use the `delete` method to remove a set of documents by their IDs:

In [None]:
vertex_fs.delete(ids=["my_id1", "my_id2"])

### Maximal marginal relevance search

You can also use [maximal marginal relevance search](https://python.langchain.com/v0.1/docs/modules/model_io/prompts/example_selectors/mmr/) for both Vector Stores.
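
For readers new to the technique, the sketch below shows the usual greedy formulation of maximal marginal relevance: each step picks the candidate with the best trade-off between relevance to the query and redundancy with what is already selected. The toy vectors and helper names are made up, and how either vector store implements MMR internally is not shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Greedily pick k candidate indices, balancing relevance against redundancy."""
    selected = []
    remaining = list(range(len(candidates)))

    def mmr_score(i):
        relevance = cosine(query, candidates[i])
        # Redundancy is the similarity to the closest already-selected candidate.
        redundancy = max((cosine(candidates[i], candidates[j]) for j in selected), default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 0.0]
candidate_vecs = [
    [0.9, 0.1],   # index 0: most relevant
    [0.9, 0.12],  # index 1: near-duplicate of index 0
    [0.7, -0.7],  # index 2: less relevant, but different
]
mmr_order = mmr_select(query_vec, candidate_vecs, k=2, lambda_mult=0.5)
print(mmr_order)
```

A plain similarity ranking would return the two near-duplicates (indices 0 and 1); MMR instead selects indices 0 and 2, trading a little relevance for diversity.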

In [None]:
mmr_retriever = vertex_fs.as_retriever(search_type="mmr")
mmr_retriever.invoke("Lane departure warning?")[1]

## Cleaning up

In [None]:
# Delete the BigQuery dataset. Uncomment and run the lines below only if the
# dataset was created for this demo.
# from google.cloud import bigquery
# client = bigquery.Client(project=PROJECT_ID)
# dataset = f"{PROJECT_ID}.{DATASET}"
# client.delete_dataset(dataset, delete_contents=True, not_found_ok=True)

vertex_fs.feature_view.delete()
vertex_fs.online_store.delete()