# Building a GenAI RAG application with Feature Store and BigQuery

## Overview
This notebook guides you through building a low-latency vector search system for your GenAI application using Vertex AI Feature Store. It uses the [Vertex Feature Store LangChain integration]([link to integration]) to streamline the process.

Feature Store integrates with BigQuery, providing unified data storage and two vector search options:

- **BigQuery Vector Search**: Ideal for batch retrieval and prototyping, as it requires no infrastructure setup.
- **Feature Store Online Store**: Enables low-latency retrieval with manual or scheduled data sync. Perfect for production-ready user-facing GenAI applications.

![Image notebook journey](diagram_journey.png)


# Setup


### Install libraries

In [3]:
!pip install langchain-google-vertexai langchain-community pypdf==4.2.0 langchain pyarrow==16.0.0 db-dtypes==1.2.0 --upgrade


### Authenticating your notebook environment
* If you are using **Colab** to run this notebook, uncomment the cell below and continue.
* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env).

In [9]:
# from google.colab import auth

# auth.authenticate_user()

### Import libraries

In [17]:
# import sys
# sys.path.append("../")

In [1]:
%load_ext autoreload
%autoreload 2
from langchain_google_vertexai.vectorstores.feature_store.bq_vectorstore import BigQueryVectorStore
# import logging
# logging.basicConfig(level=logging.INFO)

### Define environment variables

In [3]:
PROJECT_ID = "cloud-llm-preview2"
DATASET = "vertex_documentation2"
TABLE = "mytest4"
REGION = "europe-west4"

# Add documents to `VertexAIFeatureStore`

This step ingests and parses the PDF documents, splits them into chunks, generates embeddings, and adds the embeddings to the vector store. The corpus used as the dataset is a car owner's manual.

**Summary steps**
- Create text embeddings: LangChain `VertexAIEmbeddings`
- Ingest PDF files: LangChain `PyPDFLoader`
- Chunk documents: LangChain `TextSplitter`
- Create Vector Store: LangChain  `VertexAIFeatureStore` 

### Create the VertexAI Embedding model

In [4]:
from langchain_google_vertexai import VertexAIEmbeddings

embedding_model = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest", project=PROJECT_ID
)

### Ingest PDF file

The document is hosted in a Cloud Storage bucket (at `gs://github-repo/generative-ai/sample-apps/fixmycar/cymbal-starlight-2024.pdf`). LangChain provides a convenient document loader, [`PyPDFLoader`](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/), to load documents from PDFs.


In [5]:
GCS_BUCKET_DOCS = (
    "github-repo/generative-ai/sample-apps/fixmycar"  # @param {type: "string"}
)

# Copy the file to the current path
!gsutil cp "gs://$GCS_BUCKET_DOCS/*.pdf" .



Copying gs://github-repo/generative-ai/sample-apps/fixmycar/cymbal-starlight-2024.pdf...
- [1 files][328.9 KiB/328.9 KiB]                                                
Operation completed over 1 objects/328.9 KiB.                                    


In [6]:
# Ingest PDF files
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("cymbal-starlight-2024.pdf")
documents = loader.load()


# Add document name and source to the metadata
for document in documents:
    doc_md = document.metadata
    document_name = doc_md["source"].split("/")[-1]
    # derive doc source from Document loader
    doc_source_prefix = "/".join(GCS_BUCKET_DOCS.split("/")[:3])
    doc_source_suffix = "/".join(doc_md["source"].split("/")[4:-1])
    source = f"{doc_source_prefix}/{doc_source_suffix}"
    document.metadata = {"source": source, "document_name": document_name}

print(f"# of documents loaded (pre-chunking) = {len(documents)}")

# of documents loaded (pre-chunking) = 22


Verify document metadata

In [7]:
documents[0].metadata

{'source': 'github-repo/generative-ai/sample-apps/',
 'document_name': 'cymbal-starlight-2024.pdf'}
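
The trailing slash in `source` above follows from how the metadata cell slices the paths. The sketch below replays that slicing logic in isolation; `derive_metadata` is a hypothetical helper introduced only for illustration, and it assumes the loader was given a bare local file name, as in this notebook.

```python
def derive_metadata(local_path: str, bucket_path: str) -> dict:
    """Replay the path slicing used in the metadata cell (illustrative helper)."""
    document_name = local_path.split("/")[-1]
    # Keep the first three components of the bucket path.
    prefix = "/".join(bucket_path.split("/")[:3])
    # Components 4..-1 of the local path; empty for a bare file name.
    suffix = "/".join(local_path.split("/")[4:-1])
    return {"source": f"{prefix}/{suffix}", "document_name": document_name}


meta = derive_metadata(
    "cymbal-starlight-2024.pdf",
    "github-repo/generative-ai/sample-apps/fixmycar",
)
print(meta)
```

Because the local path has a single component, the suffix is empty and `source` ends with a slash. If you load files from nested directories, you may want to adjust the slice indices to match your layout.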

## Chunk documents - TextSplitter

Split the documents into smaller chunks. Choose a chunk size small enough that several chunks fit within the LLM's context window.

In [8]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
    separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""],
)
doc_splits = text_splitter.split_documents(documents)

# Add chunk number to metadata
for idx, split in enumerate(doc_splits):
    split.metadata["chunk"] = idx

print(f"# of documents = {len(doc_splits)}")

# of documents = 60


In [9]:
doc_splits[0].metadata

{'source': 'github-repo/generative-ai/sample-apps/',
 'document_name': 'cymbal-starlight-2024.pdf',
 'chunk': 0}
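
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is only a sketch: `chunk_text` is a hypothetical function, and it is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on the configured separators). It only shows how consecutive chunks share `chunk_overlap` characters.

```python
import string


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


chunks = chunk_text(string.ascii_lowercase, chunk_size=10, chunk_overlap=2)
print(chunks)
```

Each chunk starts with the last two characters of the previous one, which is what preserves context across chunk boundaries.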

## Configure `VertexFeatureStore` as Vector Store

You are now ready to start using Vertex Feature Store!
Initialize the class by providing `project_id`, `location`, and the BigQuery `dataset_name` and `table_name` used to store the embeddings.
You can also point to an existing table. By default, the class uses [BigQuery Vector Search](https://cloud.google.com/bigquery/docs/vector-search-intro) to perform vector search.

See [here](TODO) for the full list of parameters of the class. 

In [11]:
# %load_ext autoreload
# %autoreload 2
# from langchain_google_vertexai import VertexFeatureStore
# PROJECT_ID = "cloud-llm-preview2"
# DATASET = "vertex_test"
# TABLE = "mytest5"
# REGION = "europe-west4"
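# NOTE: `FeatureStore` is not imported in the cells above; import the vector
# store class from the LangChain integration before running this cell.
vertex_fs = FeatureStore(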
    project_id=PROJECT_ID,
    location="us-central1",
    dataset_name=DATASET,
    table_name=TABLE,
    embedding=embedding_model,
    # executor={"type": "feature_online_store", "online_store_name": "langchain_test_fos", "location": "us-central1"}
 )

BigQuery table cloud-llm-preview2.vertex_documentation2.mytest4 initialized/validated as persistent storage. Access via BigQuery console:
 https://console.cloud.google.com/bigquery?project=cloud-llm-preview2&ws=!1m5!1m4!4m3!1scloud-llm-preview2!2svertex_documentation2!3smytest4


In [52]:
vertex_fs.add_documents(doc_splits)

Creating FeatureView
Create FeatureView backing LRO: projects/323656405210/locations/us-central1/featureOnlineStores/langchain_test_fos/featureViews/mytest5/operations/8520828396408668160
FeatureView created. Resource name: projects/323656405210/locations/us-central1/featureOnlineStores/langchain_test_fos/featureViews/mytest5
To use this FeatureView in another session:
feature_view = aiplatform.FeatureView('projects/323656405210/locations/us-central1/featureOnlineStores/langchain_test_fos/featureViews/mytest5')
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync Succeed for projects/cloud-llm-preview2/locations/us-central1/featureOnlineStores/langchain_test_fos/featureViews/mytest5/featureViewSyncs/8954412199707672576.

['c97a77467fb7467c9842b3b7b94ff31f',
 '1a5ceb2a0fca4080b65e6b5581cd1599',
 '03e85c1aee3840d3b33b781424ea52ec',
 '04124559cba446bdab4dd13b3b4f75ec',
 '82c32d44abf6460596f515a6492ccd94',
 'f7eaaa9308544f478d87adcbdd92f01b',
 '0e0b26538c1f426f91f786e101c1fe18',
 '8f868ea52a9e4aaaa70c278695720f01',
 '191d58942f404ff29822e34a0b7baabc',
 '081e4857b73846bbbe6bdcde64616ce6',
 'e2a84cf20679437a87015c71110ef025',
 '26479685af8b4ffc98ac796dc28c29c0',
 '2d1d5f14f17a4654858eed073a1ff8bf',
 'f7f8dccc93ac4e7abfcc9438dcf26919',
 '725886fcf783452288b598da030ee82e',
 'fa23e1951962442b93c94dedbc9f69d0',
 'cf059dac763c44c2a97342f5ee519449',
 '4b2b0abca36644a78972dd29086ab5a8',
 '124ec6fbdfe6467984a354f01bb7bc1e',
 '912de19def6f4a999ecb55cf431fbbcb',
 'e3d76b6bcc4f4ad48b5d376e673efd50',
 '858032d38e054a64b9f4c24729c2bb6a',
 '9d5fdcd1fd774790aa1828b171aaeaa1',
 '6b763689c8e748c0be83c4191bc9a5f3',
 '5e60236a31ef487fb7c45b93d6b1fd08',
 '2ce2ae851b3342c785728ab4f7a91e84',
 'fe0bb63c02d744268b50fcfb111e275f',
 

In [133]:
%%time
vertex_fs.executor.search_neighbors_by_ids(ids=["8f868ea52a9e4aaaa70c278695720f01"])

CPU times: user 8.49 ms, sys: 1.92 ms, total: 10.4 ms
Wall time: 183 ms


[[Document(page_content="manual.md 2024-03-23\n4 / 22The hazard lights are used to signal to other drivers that your vehicle is disabled or in an emergency\nsituation.\nTo activate the hazard lights, press the hazard light button located on the dashboard.\nThe hazard lights will flash until you press the button again to turn them off.\nHorn\nThe horn is used to alert other drivers and pedestrians of your presence.\nTo sound the horn, press the horn button located on the steering wheel.\nEmergency Roadside Assistance\nYour Cymbal Starlight 2024 comes with 24/7 emergency roadside assistance.\nTo contact emergency roadside assistance, call the following number: 1-800-555-1212.\nWhen you call, be prepared to provide the following information:\nYour name and contact information\nYour vehicle's make, model, and year\nYour vehicle's location\nThe nature of the emergency\nTire Repair Kit\nYour Cymbal Starlight 2024 comes with a tire repair kit that can be used to temporarily repair a flat\ntir

In [140]:
%%time
vertex_fs.get_documents(ids=["8f868ea52a9e4aaaa70c278695720f01"])

CPU times: user 6.56 ms, sys: 6.38 ms, total: 12.9 ms
Wall time: 503 ms


[Document(page_content='the safety features in your Cymbal Starlight 2024. This chapter provides important information on how to\nuse the following emergency assistance features:\nHazard lights\nHorn\nEmergency roadside assistance\nTire repair kit\nHazard Lights', metadata={'source': 'github-repo/generative-ai/sample-apps/', 'document_name': 'cymbal-starlight-2024.pdf', 'chunk': '7'})]

In [19]:
from google.cloud.aiplatform_v1beta1 import NearestNeighborQuery




In [161]:
%%time

vertex_fs.similarity_search(
    "treat",
    k=6,
    # string_filters=[
    #     NearestNeighborQuery.StringFilter({"name": "kind", "deny_tokens": ["treat"]})
    # ]
)

CPU times: user 27.8 ms, sys: 3.82 ms, total: 31.6 ms
Wall time: 496 ms


[Document(page_content="manual.md 2024-03-23\n2 / 22VSC can be turned off by pressing the VSC OFF button on the dashboard. However, it is\nrecommended to leave VSC on for optimal safety.\nAnti-Lock Braking System (ABS)\nABS prevents the wheels from locking during braking, allowing you to maintain control of the vehicle.\nABS can be felt as a pulsation in the brake pedal during braking. Do not release the brake pedal;\ncontinue applying steady pressure until the vehicle comes to a stop.\nTire Safety\nMaintain proper tire pressure at all times (see the Tire Pressure Information label on the driver's door\njamb).\nCheck tire tread depth regularly and replace tires when they reach the minimum tread depth of 2/32\ninches.\nAvoid sudden starts, stops, and turns that can cause excessive tire wear.\nVehicle Inspection\nInspect your vehicle regularly for any signs of damage or malfunction, including:\nLeaks under the vehicle\nUnusual noises or vibrations\nDim or flickering lights\nWorn or damag

Verify the BigQueryVectorSearch with similarity search

### Get a LangChain retriever
The retriever is used in a LangChain chain to find the documents most similar to a given query.

In [12]:
langchain_retriever = vertex_fs.as_retriever()

### Compose a LangChain chain

We are going to use the [`RetrievalQA` chain](https://python.langchain.com/docs/modules/chains/popular/vector_db_qa).
There are several different chain types available, listed [here](https://docs.langchain.com/docs/components/chains/index_related_chains).
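
The cell below uses `chain_type="stuff"`. Conceptually, a "stuff" chain places all retrieved chunks into a single prompt and sends it to the LLM in one call. The following sketch illustrates the idea without calling any model; the prompt wording and the `build_stuff_prompt` helper are assumptions made for illustration, not the template LangChain ships.

```python
# Hypothetical prompt template, used only to illustrate the "stuff" strategy.
PROMPT_TEMPLATE = (
    "Use the following context to answer the question.\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_stuff_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate every retrieved chunk into one context block."""
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_stuff_prompt(
    "What number do I call for roadside assistance?",
    [
        "Roadside assistance is available 24 hours a day, 7 days a week.",
        "To call roadside assistance, dial the following number: 1-800-555-1212",
    ],
)
print(prompt)
```

Because every chunk ends up in the same prompt, the number of retrieved documents multiplied by the chunk size has to stay within the model's context window.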

In [13]:
%%time
from langchain_google_vertexai import VertexAI
from langchain.chains import RetrievalQA
from langchain.globals import set_debug

# Set high verbosity
set_debug(True)

llm = VertexAI(model_name="gemini-pro")

search_query = "What should I do when call the emergency roadside assistance?"  # @param {type:"string"}

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=langchain_retriever
)
response = retrieval_qa.invoke(search_query)
print("\n################ Final Answer ################\n")
print(response["result"])

[chain/start] [1:chain:RetrievalQA] Entering Chain run with input:
{
  "query": "What should I do when call the emergency roadside assistance?"
}
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[inputs]
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
{
  "question": "What should I do when call the emergency roadside assistance?",
  "context": "manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assis

## Low latency Vector Search with FeatureStore

We are now ready to perform low latency serving with Feature Store! 

To do that, call the `set_executor` method with the executor type set to `feature_online_store`.

See the [function definition](TODO) for all the parameters you can use.

In [14]:
vertex_fs.set_executor({"type": "feature_online_store"})

#### Kick off a synchronization process

Use the `sync` method to synchronize the data from BigQuery to the Feature Online Store and enable low-latency serving.
In a production environment, you can also use `cron_schedule` to set up automatic scheduled synchronization.

The synchronization process takes approximately 20 minutes.

In [15]:
vertex_fs.sync()

Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync ongoing, waiting for 30 seconds.
Sync Succeed for projects/cloud-llm-preview2/locations/europe-west4/featureOnlineStores/vertex_documentation2/featureViews/mytest4/featureViewSyncs/6767276228220026880.


You can also monitor the synchronization process from the Google Cloud console: [Vertex AI Feature Store Tab](https://console.cloud.google.com/vertex-ai/feature-store/online-stores)

#### Serve with Feature Online Store

You are now ready to serve with Feature Store! You can re-use the same retriever to perform low-latency Vector Search.

In [18]:
results = langchain_retriever.invoke(search_query)
results[0]

Document(page_content="manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assistance\nIf you experience a roadside emergency, such as a flat tire or a dead battery, you can call roadside\nassistance for help. Roadside assistance is available 24 hours a day, 7 days a week.\nTo call roadside assistance, dial the following number:\n1-800-555-1212\nWhen you call roadside assistance, be prepared to provide the following information:\nYour name and contact information\nYour vehicle's make, model, and year\nYour vehicle's location\nThe nature of the emergency\nFlat Tire", metadata={'content': "

In [19]:
%%time
results = langchain_retriever.invoke("Leaks under the vehicle")

CPU times: user 16.1 ms, sys: 3.77 ms, total: 19.9 ms
Wall time: 432 ms
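
A single `%%time` reading can be dominated by network jitter or a cold connection. To compare executors more fairly, you could repeat the call and look at the spread. The helper below is a minimal sketch: `time_calls` is a hypothetical function, and the workload shown is a stand-in so that the example runs without cloud access.

```python
import statistics
import time
from typing import Callable


def time_calls(fn: Callable[[], object], runs: int = 5) -> dict:
    """Call fn `runs` times and return wall-clock latency statistics in milliseconds."""
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000)
    return {
        "min_ms": min(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }


# Stand-in workload. In this notebook you would pass something like:
#   lambda: langchain_retriever.invoke("Leaks under the vehicle")
stats = time_calls(lambda: sum(range(10_000)), runs=5)
print(stats)
```

The median is usually a more stable figure to report than a single run, while the maximum hints at tail latency.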


In [25]:
%%time
response = retrieval_qa.invoke(search_query)
print("\n################ Final Answer ################\n")
print(response["result"])

[chain/start] [1:chain:RetrievalQA] Entering Chain run with input:
{
  "query": "What should I do when call the emergency roadside assistance?"
}
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[inputs]
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
{
  "question": "What should I do when call the emergency roadside assistance?",
  "context": "manual.md 2024-03-23\n21 / 22Wash your vehicle regularly to remove dirt and grime.\nWax your vehicle twice a year to protect the paint.\nCheck the tire pressure regularly and adjust it as needed.\nInspect the brakes regularly for wear and tear.\nKeep the interior of your vehicle clean and free of debris.\nBy following these tips, you can help to keep your Cymbal Starlight 2024 in top condition for many years to\ncome.\nChapter 18: Emergencies\nRoadside Assis

### Filtering by metadata


# Appendix

This appendix collects other useful examples of working with the `VertexFeatureStore` LangChain integration.

### Local brute force

You can also prototype with a local brute-force executor. During initialization, the data is downloaded from BigQuery into memory.

This is suitable for prototyping when the number of documents is small.

In [26]:
vertex_fs.set_executor({"type":"brute_force"})

Reading data from cloud-llm-preview2.vertex_documentation.mytest4. It might take a few minutes...


In [27]:
%%time
langchain_retriever.invoke("Lane departure warning?")[1]

CPU times: user 47.4 ms, sys: 20.7 ms, total: 68.1 ms
Wall time: 300 ms


Document(page_content='and a visual alert. If you do not take corrective action, the lane keeping assist system will gently steer the\nvehicle back into the lane.\nIf the automatic support system detects that you are approaching another vehicle too quickly, it will warn\nyou with a chime and a visual alert. If you do not take corrective action, the automatic emergency braking\nsystem can automatically apply the brakes to avoid a collision.\nTips for Using Cruise Control and the Automatic Support System\nCruise control is not a substitute for attentive driving. Always be aware of your surroundings and be\nprepared to take control of the vehicle at any time.\nThe automatic support system is designed to assist you in driving, but it does not replace the need\nfor you to be attentive and in control of the vehicle.', metadata={'doc_id': '9e2d6feaaa9740e2b0bec3486e9304b2', 'content': 'and a visual alert. If you do not take corrective action, the lane keeping assist system will gently steer t
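
For intuition, brute-force search simply scores the query against every stored vector and keeps the best matches. The sketch below shows that idea in pure Python on toy two-dimensional vectors. It assumes cosine similarity as the metric and uses hypothetical function names; it is not the integration's implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_search(query: list[float], vectors: dict, k: int = 2) -> list:
    """Score every stored vector against the query and return the top k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
top = brute_force_search([1.0, 0.0], toy_vectors, k=2)
print(top)
```

The cost grows linearly with the number of stored vectors, which is why this approach fits small corpora and prototyping rather than production serving.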

### Max Marginal Relevance

In [28]:
mmr_retriever = vertex_fs.as_retriever(search_type="mmr")

In [29]:
mmr_retriever.invoke("Lane departure warning?")[1]
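
Max marginal relevance (MMR) re-ranks candidates so that results are both relevant to the query and different from each other. The sketch below implements a common formulation on toy vectors: at each step it picks the candidate maximizing `lambda_mult * relevance - (1 - lambda_mult) * redundancy`. The function names and the toy data are assumptions for illustration, and the retriever's actual parameters may differ.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query: list[float], candidates: dict, k: int = 2, lambda_mult: float = 0.5) -> list:
    """Greedy MMR: trade off similarity to the query against similarity to picks so far."""
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        best_id, best_score = None, -math.inf
        for doc_id, vec in remaining.items():
            relevance = cosine(query, vec)
            # Highest similarity to anything already selected (0 if nothing is selected yet).
            redundancy = max((cosine(vec, candidates[s]) for s in selected), default=0.0)
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_id, best_score = doc_id, score
        selected.append(best_id)
        del remaining[best_id]
    return selected


toy = {"a": [1.0, 0.0], "b": [0.99, 0.14], "c": [0.6, 0.8]}
diverse = mmr([1.0, 0.0], toy, k=2, lambda_mult=0.3)
relevant_only = mmr([1.0, 0.0], toy, k=2, lambda_mult=1.0)
print(diverse, relevant_only)
```

With `lambda_mult=1.0` the ranking reduces to plain similarity search and returns the near-duplicate `b` second. A lower value penalizes `b` for being close to `a` and selects `c` instead.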



### Get documents by ID

You can also use the `get_documents` method to retrieve documents by their IDs:


In [22]:
vertex_fs.get_documents(ids=["Your document ID"])

[Document(page_content="manual.md 2024-03-23\n1 / 22\nCymbal Starlight 2024: Owner's Manual\nChapter 1: Safety\nIntroduction\nYour safety and the safety of others is paramount. This chapter provides important information to help you\noperate your Cymbal Starlight 2024 safely and responsibly. Please read and understand this information\nthoroughly before operating your vehicle.\nSeat Belts\nAll occupants must wear seat belts at all times.\nAdjust the seat belt to fit snugly around your hips and across your chest.\nNever wear a seat belt under your arm or behind your back.\nReplace any seat belt that has been damaged or frayed.\nAirbags\nAirbags are supplemental restraints and work in conjunction with seat belts to provide additional\nprotection in the event of a collision.\nDo not place objects on or near the airbag deployment areas (e.g., dashboard, steering wheel, seat\nbacks).\nChildren under the age of 12 should never ride in the front seat.", metadata={'source': 'github-repo/genera