# Building a GenAI RAG application with Feature Store and BigQuery

## Overview
This notebook guides you through building a low-latency vector search system for your GenAI application using Vertex AI Feature Store. It uses the [Vertex Feature Store LangChain integration]([link to integration]) to streamline the process.

Feature Store integrates with BigQuery, providing unified data storage and flexible vector search options:

- **BigQuery Vector Search**: Ideal for batch retrieval and prototyping, as it requires no infrastructure setup.
- **Feature Store Online Store**: Enables low-latency retrieval with manual or scheduled data sync. Perfect for production-ready user-facing GenAI applications.

![Image notebook journey](diagram_journey.png)


# Setup


### Install libraries

In [37]:
!pip install langchain-google-vertexai==1.0.3 pypdf==4.2.0 langchain pyarrow==16.0.0 db-dtypes==1.2.0

Collecting db-dtypes
  Downloading db_dtypes-1.2.0-py2.py3-none-any.whl.metadata (3.0 kB)
Downloading db_dtypes-1.2.0-py2.py3-none-any.whl (14 kB)
Installing collected packages: db-dtypes
Successfully installed db-dtypes-1.2.0


### Authenticating your notebook environment
* If you are using **Colab** to run this notebook, uncomment the cell below and continue.
* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env).

In [9]:
# from google.colab import auth

# auth.authenticate_user()

### Import libraries

In [1]:
%load_ext autoreload
%autoreload 2
from lib.integration import VertexFeatureStore
import logging
logging.basicConfig(level=logging.INFO)

### Define environment variables

In [2]:
PROJECT_ID = "cloud-llm-preview2"
DATASET = "vertex_documentation"
TABLE = "mytest4"
REGION = "europe-west4"

# Add documents to `VertexFeatureStore`

This step ingests and parses PDF documents, splits them into chunks, generates embeddings, and adds the embeddings to the vector store. The document corpus used as the dataset is a collection of car owner's manuals.

**Summary steps**
- Create text embeddings: LangChain `VertexAIEmbeddings`
- Ingest PDF files: LangChain `PyPDFLoader`
- Chunk documents: LangChain `TextSplitter`
- Create Vector Store: LangChain `VertexFeatureStore`

### Create the VertexAI Embedding model

In [3]:
from langchain_google_vertexai import VertexAIEmbeddings
from langchain_community.vectorstores import BigQueryVectorSearch

embedding_model = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest", project=PROJECT_ID
)

### Ingest PDF file

The document is hosted in a Cloud Storage bucket (at `gs://github-repo/generative-ai/sample-apps/fixmycar/cymbal-starlight-2024.pdf`), and LangChain provides a convenient document loader, [`PyPDFLoader`](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/), to load documents from PDFs.

Create a Cloud Storage bucket in your Google Cloud project to copy the document files into, or copy the file directly to the current path.

In [4]:
GCS_BUCKET_DOCS = (
    "github-repo/generative-ai/sample-apps/fixmycar"  # @param {type: "string"}
)

# Copy the file to the current path
!gsutil cp "gs://$GCS_BUCKET_DOCS/*.pdf" .

Copying gs://github-repo/generative-ai/sample-apps/fixmycar/cymbal-starlight-2024.pdf...
- [1 files][328.9 KiB/328.9 KiB]                                                
Operation completed over 1 objects/328.9 KiB.                                    


In [5]:
# Ingest PDF files
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("cymbal-starlight-2024.pdf")
documents = loader.load()


# Add document name and source to the metadata
for document in documents:
    doc_md = document.metadata
    document_name = doc_md["source"].split("/")[-1]
    # derive doc source from Document loader
    doc_source_prefix = "/".join(GCS_BUCKET_DOCS.split("/")[:3])
    doc_source_suffix = "/".join(doc_md["source"].split("/")[4:-1])
    source = f"{doc_source_prefix}/{doc_source_suffix}"
    document.metadata = {"source": source, "document_name": document_name}

print(f"# of documents loaded (pre-chunking) = {len(documents)}")

# of documents loaded (pre-chunking) = 22


Verify document metadata

In [6]:
documents[0].metadata

{'source': 'github-repo/generative-ai/sample-apps/',
 'document_name': 'cymbal-starlight-2024.pdf'}

## Chunk documents - TextSplitter

Split the documents into smaller chunks. When choosing the chunk size, make sure that several chunks fit within the context length of the LLM, because the retrieved chunks are passed to the model together with the question.

In [7]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
    separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""],
)
doc_splits = text_splitter.split_documents(documents)

# Add chunk number to metadata
for idx, split in enumerate(doc_splits):
    split.metadata["chunk"] = idx

print(f"# of documents = {len(doc_splits)}")

# of documents = 60


In [8]:
doc_splits[0].metadata

{'source': 'github-repo/generative-ai/sample-apps/',
 'document_name': 'cymbal-starlight-2024.pdf',
 'chunk': 0}
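
To make the roles of `chunk_size` and `chunk_overlap` more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a deliberate simplification and not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries the listed separators in order before falling back to smaller pieces). The helper `sliding_window_chunks` is hypothetical and exists only for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. Illustration only:
    unlike RecursiveCharacterTextSplitter, this ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy example: windows of 4 characters that overlap by 1 character
example_chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
```

In this toy example each chunk repeats the last character of the previous one. With `chunk_size=1000` as in the cell above, a prompt that carries k retrieved chunks holds at most roughly k × 1000 characters of context, which is the quantity to compare against the model's context length.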

## Configure `VertexFeatureStore` as Vector Store

You are now ready to start using Vertex Feature Store!
You can initialize the class by providing `project_id`, `location`, and the BigQuery `dataset_name` and `table_name` used to store the embeddings.
You can also point to an existing table. By default, the class uses [BigQuery Vector Search](https://cloud.google.com/bigquery/docs/vector-search-intro) to perform vector search.

See [here](TODO) for the full list of parameters of the class.

In [9]:
vertex_fs = VertexFeatureStore(
    project_id=PROJECT_ID,
    location=REGION,
    dataset_name=DATASET,
    table_name=TABLE,
    embedding=embedding_model,
)

INFO:lib.integration:BigQuery table cloud-llm-preview2.vertex_documentation.mytest4 initialized/validated as persistent storage. Access via BigQuery console:
 https://console.cloud.google.com/bigquery?project=cloud-llm-preview2&ws=!1m5!1m4!4m3!1scloud-llm-preview2!2svertex_documentation!3smytest4


In [11]:
vertex_fs.add_documents(doc_splits)

['6420aaaa1fdd4bae9fed083714fb6a10',
 'f90361ee2581437fb476a97899bab304',
 '2ba7b6b4e71b473eba7ff5802702baa0',
 'cb92c490b058456691250f1b5afd91a3',
 '1364467c299b44a7910aafc70ca49717',
 '1afb323ceed54a9db42eb851fd8edffd',
 '2c3c233495a942e3abc9057143596e78',
 '2b24b7ea1850427ab1a51f4bad4d3cef',
 'fa3d47ef3a1c403a89ba99f6eb809756',
 'd22896d9dd944d0da0b1038fcde11719',
 '49fa3cb1e5cc422d8c2fd967f355eeba',
 'e6704fba5cab435db08e76c8f936583b',
 '02efe35a166942f398dc59a1d6df7125',
 '5a2e3062d4d84b498ad79e82ee089888',
 'd6a048818ad54fdd9c765f139069a6fc',
 '2325fef7bc664db2b9f8e02d96487f46',
 '7d7c0c1a70764bf0aab3c9d515e2d961',
 'bc6d9592127943af975c54e6f34a5428',
 'da9dee9d88a048418939ff1b68f91c7f',
 '4eb748e6ad7543baaad4c1b99e544e0c',
 '83564fbd80b545f4bb886800e75be522',
 '8cb72976f3294bd798fe8da89e02148a',
 'b4dde69c38434075811b4c6cc81b8279',
 '838df84cb6164a32b9ff3bbf1716f9a4',
 '6c0eb96a605f4f1ca27e1cdc9544591b',
 '9e2d6feaaa9740e2b0bec3486e9304b2',
 '44fb9eba381f4647b8296df7da525c5f',
 

Verify that BigQuery Vector Search works by running a similarity search

In [12]:
vertex_fs.similarity_search(
    "What should I do when I call the emergency roadside assistance?"
)

[Document(page_content="manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assistance\nIf you experience a roadside emergency, such as a flat tire or a dead battery, you can call roadside\nassistance for help. Roadside assistance is available 24 hours a day, 7 days a week.\nTo call roadside assistance, dial the following number:\n1-800-555-1212\nWhen you call roadside assistance, be prepared to provide the following information:\nYour name and contact information\nYour vehicle's make, model, and year\nYour vehicle's location\nThe nature of the emergency\nFlat Tire", metadata={'doc_id': '

### Get a LangChain retriever
The retriever will be used in a LangChain chain to find the documents most similar to a given query.

In [14]:
langchain_retriever = vertex_fs.as_retriever()

### Compose a LangChain chain

We are going to use the [`RetrievalQA` chain](https://python.langchain.com/docs/modules/chains/popular/vector_db_qa).
There are several different chain types available, listed [here](https://docs.langchain.com/docs/components/chains/index_related_chains).
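
As a rough illustration of what the `stuff` chain type used in the next cell does: the retrieved chunks are concatenated into a single context string, which is placed in one prompt together with the question. The sketch below is an assumption-laden simplification; the helper name `build_stuff_prompt`, the template wording, and the separator are illustrative and are not LangChain's actual prompt.

```python
def build_stuff_prompt(question: str, chunks: list[str], separator: str = "\n\n") -> str:
    """Concatenate retrieved chunks into one context block and build a prompt.

    Illustrative only: the template text here is an assumption, not the
    prompt that RetrievalQA really sends to the model.
    """
    context = separator.join(chunks)
    return (
        "Use the following pieces of context to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Toy usage with two made-up chunks
prompt = build_stuff_prompt(
    "Which number do I call for roadside assistance?",
    ["Roadside assistance is available 24 hours a day.", "Dial 1-800-555-1212."],
)
```

Because everything is sent in one call, the combined length of the retrieved chunks has to fit in the model's context window, which is why the chunk size chosen earlier matters.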

In [15]:
%%time
from langchain_google_vertexai import VertexAI
from langchain.chains import RetrievalQA
from langchain.globals import set_debug

# Set high verbosity
set_debug(True)

llm = VertexAI(model_name="gemini-pro")

search_query = "What should I do when call the emergency roadside assistance?"  # @param {type:"string"}

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=langchain_retriever
)
response = retrieval_qa.invoke(search_query)
print("\n################ Final Answer ################\n")
print(response["result"])

[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA] Entering Chain run with input:
[0m{
  "query": "What should I do when call the emergency roadside assistance?"
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
[0m{
  "question": "What should I do when call the emergency roadside assistance?",
  "context": "manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assis

## Low latency Vector Search with FeatureStore

We are now ready to perform low latency serving with Feature Store! 

To do that, call the `set_executor` method with the executor type set to `feature_online_store`.

See the [function definition](TODO) for all the parameters you can use.

In [22]:
vertex_fs.set_executor({"type": "feature_online_store", "view_name": "mytest1"})

#### Kick off a synchronization process

You can use the `sync` method to synchronize the data from BigQuery to the Feature Online Store and achieve low-latency serving.
In a production environment, you can also use `cron_schedule` to set up automatic scheduled synchronization.

The synchronization process takes about 20 minutes.

In [25]:
vertex_fs.sync()

Sync ongoing, waiting for 30 seconds.

You can also monitor the synchronization process from GCP Console: [Vertex AI Feature Store Tab](https://console.cloud.google.com/vertex-ai/feature-store/online-stores)
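
The "Sync ongoing, waiting for 30 seconds." messages printed by `sync()` suggest a polling loop. The sketch below shows that general pattern with the standard library only. It is not the integration's actual implementation: `wait_until_done`, its parameters, and the `is_done` callback are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def wait_until_done(
    is_done: Callable[[], bool],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll is_done() until it returns True; return how many times we waited.

    Raises TimeoutError once the accumulated wait reaches timeout_seconds.
    The sleep function is injectable so the loop can be exercised without
    really waiting.
    """
    waited = 0.0
    polls = 0
    while not is_done():
        if waited >= timeout_seconds:
            raise TimeoutError(f"Sync did not finish within {timeout_seconds:g} seconds")
        print(f"Sync ongoing, waiting for {poll_seconds:g} seconds.")
        sleep(poll_seconds)
        waited += poll_seconds
        polls += 1
    return polls


# Toy usage: a fake status check that reports completion on the third call
_states = iter([False, False, True])
polls_needed = wait_until_done(lambda: next(_states), sleep=lambda seconds: None)
```

A timeout is worth including in any loop of this kind so that a stuck synchronization surfaces as an error instead of blocking the notebook indefinitely.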

#### Serve with Feature Online Store

You are now ready to serve with Feature Store! You can re-use the same retriever to perform low-latency Vector Search.

In [21]:
results = langchain_retriever.invoke(search_query)
results[0]

Document(page_content="manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assistance\nIf you experience a roadside emergency, such as a flat tire or a dead battery, you can call roadside\nassistance for help. Roadside assistance is available 24 hours a day, 7 days a week.\nTo call roadside assistance, dial the following number:\n1-800-555-1212\nWhen you call roadside assistance, be prepared to provide the following information:\nYour name and contact information\nYour vehicle's make, model, and year\nYour vehicle's location\nThe nature of the emergency\nFlat Tire", metadata={'doc_id': '7

In [24]:
%%time
results = langchain_retriever.invoke("Leaks under the vehicle")

CPU times: user 15.7 ms, sys: 3.57 ms, total: 19.2 ms
Wall time: 438 ms
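
A single `%%time` reading can be noisy because it includes network round trips. One way to get a steadier picture is to repeat the call and look at the median. The following is a minimal sketch using only the standard library; `measure_latency_ms` is a hypothetical helper, and the commented usage line assumes the `langchain_retriever` object defined earlier in this notebook.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(fn: Callable[[], object], repeats: int = 5) -> dict:
    """Call fn() repeatedly and summarize wall-clock latency in milliseconds."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "samples": samples,
    }


# Stand-in workload so the sketch runs on its own
stats = measure_latency_ms(lambda: sum(range(1000)), repeats=3)

# In this notebook, the call would look like (assumes langchain_retriever exists):
# stats = measure_latency_ms(lambda: langchain_retriever.invoke("Leaks under the vehicle"))
```

Comparing the median before and after switching executors gives a fairer comparison than two single measurements.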


In [25]:
%%time
response = retrieval_qa.invoke(search_query)
print("\n################ Final Answer ################\n")
print(response["result"])

[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA] Entering Chain run with input:
[0m{
  "query": "What should I do when call the emergency roadside assistance?"
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
[0m{
  "question": "What should I do when call the emergency roadside assistance?",
  "context": "manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assis

### Filtering by metadata


# Appendix

This appendix collects further examples of working with the `VertexFeatureStore` LangChain integration.

### Local brute force

You can also prototype with a local brute-force executor. During initialization, the data is downloaded from BigQuery into local memory.

Because every query is compared against every stored embedding, this executor is best suited to prototyping with a small number of documents.

In [26]:
vertex_fs.set_executor({"type": "brute_force"})

Reading data from cloud-llm-preview2.vertex_documentation.mytest4. It might take a few minutes...


In [27]:
%%time
langchain_retriever.invoke("Lane departure warning?")[1]

CPU times: user 47.4 ms, sys: 20.7 ms, total: 68.1 ms
Wall time: 300 ms


Document(page_content='and a visual alert. If you do not take corrective action, the lane keeping assist system will gently steer the\nvehicle back into the lane.\nIf the automatic support system detects that you are approaching another vehicle too quickly, it will warn\nyou with a chime and a visual alert. If you do not take corrective action, the automatic emergency braking\nsystem can automatically apply the brakes to avoid a collision.\nTips for Using Cruise Control and the Automatic Support System\nCruise control is not a substitute for attentive driving. Always be aware of your surroundings and be\nprepared to take control of the vehicle at any time.\nThe automatic support system is designed to assist you in driving, but it does not replace the need\nfor you to be attentive and in control of the vehicle.', metadata={'doc_id': '9e2d6feaaa9740e2b0bec3486e9304b2', 'content': 'and a visual alert. If you do not take corrective action, the lane keeping assist system will gently steer t
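
For intuition, brute-force search amounts to scoring the query embedding against every stored embedding and keeping the best matches. The sketch below shows that idea with cosine similarity on toy vectors. It is not the executor's actual code: the names `cosine_similarity` and `brute_force_top_k` are hypothetical, and the distance metric the integration uses may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_top_k(
    query: list[float], vectors: dict[str, list[float]], k: int = 2
) -> list[tuple[str, float]]:
    """Score the query against every stored vector and return the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy in-memory "index" with made-up two-dimensional embeddings
toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = brute_force_top_k([1.0, 0.0], toy_vectors, k=2)
```

The cost grows linearly with the number of stored vectors, which is why this approach fits small corpora and prototyping rather than production serving.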

### Maximal Marginal Relevance

In [28]:
mmr_retriever = vertex_fs.as_retriever(search_type="mmr")

In [29]:
mmr_retriever.invoke("Lane departure warning?")[1]
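
Maximal marginal relevance (MMR) re-ranks candidates to balance relevance to the query against redundancy with results that were already selected. The sketch below illustrates the general technique on toy vectors; it is not the integration's implementation. The function name `mmr_select` is hypothetical, and `lambda_mult` mirrors the parameter name LangChain uses for this trade-off (1.0 means pure relevance, lower values favor diversity).

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: list[float],
    candidates: dict[str, list[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> list[str]:
    """Greedily pick k ids, trading query relevance against redundancy."""
    selected: list[str] = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:

        def score(doc_id: str) -> float:
            relevance = cosine(query, remaining[doc_id])
            # Similarity to the closest already-selected item (0.0 if none yet)
            redundancy = max(
                (cosine(remaining[doc_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# d1 and d2 are near-duplicates; d3 points in a different direction
toy = {"d1": [1.0, 0.9], "d2": [1.0, 0.8], "d3": [0.0, 1.0]}
diverse = mmr_select([1.0, 1.0], toy, k=2, lambda_mult=0.5)
relevance_only = mmr_select([1.0, 1.0], toy, k=2, lambda_mult=1.0)
```

With `lambda_mult=0.5` the second pick is `d3`, because `d2` is almost identical to the already-selected `d1`; with `lambda_mult=1.0` the ranking falls back to plain similarity and `d2` is chosen instead.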



### Get documents by ID

You can also use the `get_documents` method to retrieve documents by their IDs:


In [22]:
vertex_fs.get_documents(ids=["Your document ID"])

[Document(page_content="manual.md 2024-03-23\n1 / 22\nCymbal Starlight 2024: Owner's Manual\nChapter 1: Safety\nIntroduction\nYour safety and the safety of others is paramount. This chapter provides important information to help you\noperate your Cymbal Starlight 2024 safely and responsibly. Please read and understand this information\nthoroughly before operating your vehicle.\nSeat Belts\nAll occupants must wear seat belts at all times.\nAdjust the seat belt to fit snugly around your hips and across your chest.\nNever wear a seat belt under your arm or behind your back.\nReplace any seat belt that has been damaged or frayed.\nAirbags\nAirbags are supplemental restraints and work in conjunction with seat belts to provide additional\nprotection in the event of a collision.\nDo not place objects on or near the airbag deployment areas (e.g., dashboard, steering wheel, seat\nbacks).\nChildren under the age of 12 should never ride in the front seat.", metadata={'source': 'github-repo/genera