In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Build with RAG Engine in Vertex AI

| | |
|-|-|
| Author(s) | [Laxmi Harikumar](https://github.com/laxmih-genai) |

## Overview

**Retrieval Augmented Generation (RAG)** improves large language model responses by letting the model access and process external information sources during generation. Grounding responses in retrieved data helps reduce hallucinations.
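As a rough illustration of the retrieve-then-augment idea (not of how RAG Engine works internally), the sketch below ranks a few toy sentences against a query using word-count cosine similarity and places the best match into a prompt. The documents, the `tokenize`, `retrieve`, and `build_prompt` helpers, and the prompt wording are all assumptions made for this example; a real system would use an embedding model and a vector database.

```python
import math
import re
from collections import Counter

# Toy "external knowledge" standing in for documents in a RAG corpus (illustrative only).
DOCUMENTS = [
    "Veo is a video generation model from Google DeepMind.",
    "Imagen 3 is an image generation model.",
    "The office supplies contract lists a unit price of 15 dollars.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its words (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vec = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Augment the user query with the retrieved contexts."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}"
    )


query = "Which model generates video?"
contexts = retrieve(query, DOCUMENTS)
prompt = build_prompt(query, contexts)
print(prompt)
```

The augmented prompt would then be sent to the language model in place of the bare question.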

Vertex AI RAG Engine is a data framework for developing context-augmented large language model (LLM) applications. For more information, refer to the public documentation for [Vertex AI RAG Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview).

This notebook is a hands-on tutorial for the RAG Engine API. It covers the following steps:

Part 1: Use the default managed vector database
- Create a RAG corpus by specifying an embedding model and vector database
- Upload a local PDF file to the corpus
- Import a scanned PDF (this requires creating a Document AI Layout Parser processor)
- Set up a retrieval tool
- Use your RAG retrieval tool to add context to Gemini's responses to user queries

Part 2: Use Vertex AI Feature Store
- Set up [Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/overview) as the vector database to use with RAG Engine.
- To use the Vertex AI Feature Store, do the following:
  - Create a BigQuery table schema
  - Provision a FeatureOnlineStore instance
  - Create a FeatureView resource
  - Upload data and serve it online
- Create a RAG corpus with Vertex AI Feature Store
- Import files into the BigQuery table using the RAG API
- Run a synchronization process to construct a FeatureOnlineStore index
- Set up a retrieval tool
- Use your RAG retrieval tool to add context to Gemini's responses to user queries
- Clean up
  - Delete the corpus


## Get started

### Install Vertex AI SDK and other required packages


In [3]:
%pip install --upgrade --user --quiet google-cloud-aiplatform google-cloud-documentai google-cloud-discoveryengine

### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [4]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [3]:
# Use the environment variable if the user doesn't provide Project ID.
import os

import vertexai

PROJECT_ID = "demos-vertex"  # @param {type:"string", isTemplate: true}
if PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

vertexai.init(project=PROJECT_ID, location=LOCATION)
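The fallback logic above can be made a little stricter. The sketch below is one possible variant, not part of the original notebook: `resolve_project_id` and `PLACEHOLDER` are hypothetical names, and the function raises an error when no project ID is available, because wrapping a missing environment variable in `str()` would otherwise produce the literal string `"None"`.

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "[your-project-id]"  # same placeholder value the cell above checks for


def resolve_project_id(configured: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured project ID, or fall back to GOOGLE_CLOUD_PROJECT."""
    env = os.environ if env is None else env
    if configured and configured != PLACEHOLDER:
        return configured
    project_id = env.get("GOOGLE_CLOUD_PROJECT")
    if not project_id:
        raise ValueError(
            "Set PROJECT_ID or the GOOGLE_CLOUD_PROJECT environment variable."
        )
    return project_id


# An explicit value wins; the placeholder falls back to the environment mapping.
print(resolve_project_id("demos-vertex", env={}))
print(resolve_project_id(PLACEHOLDER, env={"GOOGLE_CLOUD_PROJECT": "env-project"}))
```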

### Import libraries

In [2]:
from IPython.display import Markdown

from google.api_core.client_options import ClientOptions
from google.api_core.exceptions import FailedPrecondition
from google.cloud import bigquery, documentai

from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
from vertexai.resources.preview import feature_store

## PART 1

In this part, you create a RAG corpus, upload a local file, and import a scanned document from a Cloud Storage bucket. You then set up a retrieval tool and use it to get responses to queries.

### Helper functions

In [4]:
def list_files_in_corpus(rag_corpus_name):
    # List the files
    files = rag.list_files(corpus_name=rag_corpus_name)
    for file in files:
        print(file.display_name)
        print(file.name)

### Create a RAG Corpus

Configure the Embedding model

In [5]:
EMBEDDING_MODEL = "text-embedding-004"  # @param {type:"string", isTemplate: true}
embedding_model_config = rag.EmbeddingModelConfig(publisher_model=f"""publishers/google/models/{EMBEDDING_MODEL}""")

In [6]:
embedding_model_config

EmbeddingModelConfig(publisher_model='publishers/google/models/text-embedding-004', endpoint=None, model=None, model_version_id=None)

In [7]:
CORPUS_DISPLAY_NAME = "rag-corpus-for-demo"

In [8]:
rag_corpus = rag.create_corpus(
    display_name=CORPUS_DISPLAY_NAME,
    embedding_model_config=embedding_model_config
)

### Check the corpus just created

In [9]:
rag.list_corpora()

ListRagCorporaPager<rag_corpora {
  name: "projects/demos-vertex/locations/us-central1/ragCorpora/4467570830351532032"
  display_name: "Feature Store Corpus"
  create_time {
    seconds: 1734470172
    nanos: 103948000
  }
  update_time {
    seconds: 1734470172
    nanos: 103948000
  }
  rag_embedding_model_config {
    vertex_prediction_endpoint {
      endpoint: "projects/756696270058/locations/us-central1/publishers/google/models/text-embedding-004"
    }
  }
  rag_vector_db_config {
    vertex_feature_store {
      feature_view_resource_name: "projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/featureViews/feature_view_id"
    }
    rag_embedding_model_config {
      vertex_prediction_endpoint {
        endpoint: "projects/756696270058/locations/us-central1/publishers/google/models/text-embedding-004"
      }
    }
  }
  corpus_status {
    state: ACTIVE
  }
  vector_db_config {
    vertex_feature_store {
      feature_view_resource_name: "projects/

### Upload a local file to the corpus

In [10]:
rag_file = rag.upload_file(
    corpus_name=rag_corpus.name,
    path="/content/contents/veo-imagen-blog.pdf",
    display_name="veo-imagen-blog.pdf",
    description="Veo and Imagen3 announcement",
)

In [11]:
# Check if file is in corpus
list_files_in_corpus(rag_corpus.name)

veo-imagen-blog.pdf
projects/756696270058/locations/us-central1/ragCorpora/4683743612465315840/ragFiles/5327143838195301619


### Enable the DocumentAI API

[RAG Engine supports Document AI's layout parser](https://cloud.google.com/vertex-ai/generative-ai/docs/layout-parser-integration), which extracts content elements (text, tables, lists) from documents for better information retrieval.

To use the layout parser, the Document AI API must be enabled.



In [12]:
!gcloud config set project demos-vertex
!gcloud services enable documentai.googleapis.com discoveryengine.googleapis.com

Updated property [core/project].
Operation "operations/acat.p2-756696270058-87551bc0-8b3c-4d4f-bd70-7ae70203fb48" finished successfully.


### Create a Document AI Layout Parser

In [13]:
def create_parser_processor(
    project_id: str, location: str, processor_display_name: str
) -> documentai.Processor:
    """Return the Layout Parser processor with this display name, creating it if needed."""
    # Set the `api_endpoint` for the processor's location
    client_options = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")

    client = documentai.DocumentProcessorServiceClient(client_options=client_options)

    # The full resource name of the location
    # e.g.: projects/project_id/locations/location
    parent = client.common_location_path(project_id, location)

    # Reuse an existing processor with the same display name, if there is one
    for processor in client.list_processors(parent=parent):
        if processor.display_name == processor_display_name:
            return processor

    # No match found (or no processors yet): create a new Layout Parser processor
    return client.create_processor(
        parent=parent,
        processor=documentai.Processor(
            display_name=processor_display_name, type_="LAYOUT_PARSER_PROCESSOR"
        ),
    )

In [14]:
# See https://cloud.google.com/document-ai/docs/regions for all options.
parser_location = "us"

# Must be unique per project
parser_display_name = "rag-engine-demo-processor"

processor = create_parser_processor(PROJECT_ID, parser_location, parser_display_name)

In [15]:
print(processor.name)

projects/756696270058/locations/us/processors/24886c09a2f841e2


In [16]:
layout_parser_processor_name = processor.name

### Import a scanned document

In [17]:
INPUT_GCS_BUCKET = "gs://rag-agent-demo/"

response = rag.import_files(
    corpus_name=rag_corpus.name,
    paths=[INPUT_GCS_BUCKET],
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
    max_embedding_requests_per_min=900,  # Optional

    layout_parser=rag.LayoutParserConfig(
        processor_name=layout_parser_processor_name,
        max_parsing_requests_per_min=120,  # Optional
    )
)
print(f"Imported {response.imported_rag_files_count} files.")

Imported 1 files.
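To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sliding-window sketch. It is only an approximation of the idea: how RAG Engine actually tokenizes and splits files is not shown in this notebook, and `chunk_tokens` is a hypothetical helper that treats each list item as one token.

```python
def chunk_tokens(tokens: list[str], chunk_size: int, chunk_overlap: int) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


# 1,200 placeholder tokens split with the same settings as the import call above.
tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_tokens(tokens, chunk_size=512, chunk_overlap=100)
print([len(c) for c in chunks])  # → [512, 512, 376]
```

Each chunk repeats the last 100 tokens of the previous one, so a sentence that straddles a chunk boundary still appears whole in at least one chunk.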


In [18]:
# List the files in the corpus
list_files_in_corpus(rag_corpus.name)

veo-imagen-blog.pdf
projects/756696270058/locations/us-central1/ragCorpora/4683743612465315840/ragFiles/5327143838195301619
contract_1.pdf
projects/756696270058/locations/us-central1/ragCorpora/4683743612465315840/ragFiles/5327144029487778441
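The file names printed above are full resource names, and the last path segment is the file ID. If you need the individual IDs (for example, to refer to a specific file later), a small parser like the following could be used. `parse_rag_file_name` is a hypothetical helper written for this example; it assumes the `projects/.../locations/.../ragCorpora/.../ragFiles/...` layout seen in the output above.

```python
import re

# Pattern based on the resource names printed by list_files_in_corpus above.
RAG_FILE_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<corpus_id>[^/]+)/ragFiles/(?P<file_id>[^/]+)$"
)


def parse_rag_file_name(name: str) -> dict[str, str]:
    """Split a RagFile resource name into project, location, corpus_id, and file_id."""
    match = RAG_FILE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Not a RagFile resource name: {name!r}")
    return match.groupdict()


parts = parse_rag_file_name(
    "projects/756696270058/locations/us-central1"
    "/ragCorpora/4683743612465315840/ragFiles/5327143838195301619"
)
print(parts["corpus_id"], parts["file_id"])
```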


### Create RAG Retrieval Tool

In [19]:
# Create a tool for the RAG Corpus
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_corpora=[rag_corpus.name],
            similarity_top_k=2,
            vector_distance_threshold=0.5,
        ),
    )
)
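The two retrieval parameters above work together: `vector_distance_threshold` filters out chunks that are too far from the query, and `similarity_top_k` caps how many of the remaining chunks are returned. The sketch below mimics that behavior on made-up distances. It assumes that a smaller distance means a closer match and that the threshold is an exclusive upper bound; `Candidate` and `select_contexts` are hypothetical names, not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    distance: float  # assumed: smaller means closer to the query


def select_contexts(
    candidates: list[Candidate],
    similarity_top_k: int,
    vector_distance_threshold: float,
) -> list[Candidate]:
    """Keep candidates under the distance threshold, then return the closest top_k."""
    close_enough = [c for c in candidates if c.distance < vector_distance_threshold]
    close_enough.sort(key=lambda c: c.distance)
    return close_enough[:similarity_top_k]


# Made-up distances for illustration.
candidates = [
    Candidate("chunk A", 0.62),
    Candidate("chunk B", 0.31),
    Candidate("chunk C", 0.48),
    Candidate("chunk D", 0.12),
]
selected = select_contexts(candidates, similarity_top_k=2, vector_distance_threshold=0.5)
print([c.text for c in selected])  # → ['chunk D', 'chunk B']
```

Lowering the threshold trades recall for precision: fewer chunks reach the model, but those that do are closer to the query.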

### Generate Content with Gemini using RAG Retrieval Tool

In [21]:
# Load tool into Gemini model
rag_gemini_model = GenerativeModel(
    "gemini-2.0-flash-exp",
    tools=[rag_retrieval_tool]
)

Question from the uploaded file

In [22]:
response = rag_gemini_model.generate_content("What is Google's video generation model?")

display(Markdown(response.text))

Google's video generation model is called Veo, which is developed by Google DeepMind. It generates high-quality, high-definition videos based on text or image prompts. Veo is available on Vertex AI in private preview.


Question from the scanned PDF

In [23]:
response = rag_gemini_model.generate_content("What is the price per unit of the office supplies")
display(Markdown(response.text))

The price per unit of the office supplies is $15.00.


### Clean up - Delete the corpus

In [24]:
rag.delete_corpus(rag_corpus.name)

Successfully deleted the RagCorpus.


# Part 2: Using Vertex AI RAG Engine with Feature Store as Vector DB


This part covers the following steps:
- Set up [Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/overview) as the vector database to use with RAG Engine.
- To use the Vertex AI Feature Store, do the following:
  - Create a BigQuery table schema
  - Provision a FeatureOnlineStore instance
  - Create a FeatureView resource
- Create a RAG corpus with Vertex AI Feature Store
- Import files into the BigQuery table using the RAG API
- Run a synchronization process to construct a FeatureOnlineStore index
- Set up a retrieval tool
- Use your RAG retrieval tool to add context to Gemini's responses to user queries
- Retrieve contexts directly and rerank them
- Clean up
  - Delete the corpus

### Helper Functions

In [25]:
## Create a BigQuery table schema
from google.api_core.exceptions import NotFound


def create_bq_table_schema(dataset_id, table_id):
    client = bigquery.Client(project=PROJECT_ID)

    # Columns that RAG Engine writes for each chunk
    schema = [
        bigquery.SchemaField("corpus_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("file_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("chunk_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("chunk_data_type", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("chunk_data", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("file_original_uri", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("embeddings", "FLOAT64", mode="REPEATED"),
    ]

    dataset_ref = bigquery.DatasetReference(PROJECT_ID, dataset_id)

    # Create the dataset only if it does not exist yet
    try:
        client.get_dataset(dataset_ref)
        print(f"Dataset {dataset_id} already exists.")
    except NotFound:
        dataset = bigquery.Dataset(dataset_ref)
        dataset.location = "US"  # Set the location (optional, adjust if needed)
        dataset = client.create_dataset(dataset)
        print(f"Created dataset {dataset.dataset_id}")

    table_ref = dataset_ref.table(table_id)
    table = client.create_table(bigquery.Table(table_ref, schema=schema))
    print(f"Created table {PROJECT_ID}.{dataset_id}.{table_id}")

    return table

To enable online serving of features, use the CreateFeatureOnlineStore API to set up a FeatureOnlineStore instance.

Note: If you are provisioning a FeatureOnlineStore for the first time, the operation might take approximately five minutes to complete.



In [26]:
## Create Feature Online Store
def create_feature_online_store():
  fos = feature_store.FeatureOnlineStore.create_optimized_store(FEATURE_ONLINE_STORE_ID)
  return fos

To connect the BigQuery table, which stores the feature data source, to the FeatureOnlineStore instance, call the CreateFeatureView API to create a FeatureView resource.

In [27]:
## Create a FeatureView
def create_feature_view(fos):
  fv = fos.create_feature_view(
      name=FEATURE_VIEW_ID,
      source=feature_store.utils.FeatureViewVertexRagSource(uri=BIGQUERY_TABLE),
  )
  return fv

### Setup Vertex AI Feature Store

Create the BigQuery table schema, the FeatureOnlineStore instance, and the FeatureView.

In [28]:
# Define dataset and table name
dataset_id = "rag_engine_demo_ds_id"  # @param {type:"string"}
table_id = "rag_engine_demo_table_id"  # @param {type:"string"}
table = create_bq_table_schema(dataset_id, table_id)

BIGQUERY_TABLE = f'bq://{table.full_table_id.replace(":", ".")}'


Created dataset rag_engine_demo_ds_id
Created table demos-vertex.rag_engine_demo_ds_id.rag_engine_demo_table_id
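The `BIGQUERY_TABLE` value is built by turning the table's `project:dataset.table` ID into a `bq://project.dataset.table` URI. The same conversion can be written as a small helper with a basic format check. `to_bq_uri` is a hypothetical name for this sketch, and it assumes a plain project ID without a domain prefix.

```python
def to_bq_uri(full_table_id: str) -> str:
    """Convert 'project:dataset.table' into 'bq://project.dataset.table'."""
    project, sep, dataset_and_table = full_table_id.partition(":")
    if not sep or "." not in dataset_and_table:
        raise ValueError(f"Expected 'project:dataset.table', got {full_table_id!r}")
    return f"bq://{project}.{dataset_and_table}"


uri = to_bq_uri("demos-vertex:rag_engine_demo_ds_id.rag_engine_demo_table_id")
print(uri)  # → bq://demos-vertex.rag_engine_demo_ds_id.rag_engine_demo_table_id
```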


In [34]:
FEATURE_ONLINE_STORE_ID = "fs_online_store_id"  # @param {type: "string"}
fos = create_feature_online_store()

INFO:vertexai.resources.preview.feature_store.feature_online_store:Creating FeatureOnlineStore
INFO:vertexai.resources.preview.feature_store.feature_online_store:Create FeatureOnlineStore backing LRO: projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/operations/5193384191290507264
INFO:vertexai.resources.preview.feature_store.feature_online_store:FeatureOnlineStore created. Resource name: projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id
INFO:vertexai.resources.preview.feature_store.feature_online_store:To use this FeatureOnlineStore in another session:
INFO:vertexai.resources.preview.feature_store.feature_online_store:feature_online_store = aiplatform.FeatureOnlineStore('projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id')


In [35]:
FEATURE_VIEW_ID = "feature_view_id"  # @param {type: "string"}
fv = create_feature_view(fos)

INFO:vertexai.resources.preview.feature_store.feature_online_store:Creating FeatureView
INFO:vertexai.resources.preview.feature_store.feature_online_store:Create FeatureView backing LRO: projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/featureViews/feature_view_id/operations/701606512941858816
INFO:vertexai.resources.preview.feature_store.feature_online_store:FeatureView created. Resource name: projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/featureViews/feature_view_id
INFO:vertexai.resources.preview.feature_store.feature_online_store:To use this FeatureView in another session:
INFO:vertexai.resources.preview.feature_store.feature_online_store:feature_view = aiplatform.FeatureView('projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/featureViews/feature_view_id')


Create a RAG corpus that uses the Vertex AI Feature Store instance as its vector database.

In [42]:
vector_db = rag.VertexFeatureStore(resource_name=fv.resource_name)

# Name your corpus
CORPUS_DISPLAY_NAME = "Feature Store Corpus"

# Create RAG Corpus
rag_corpus = rag.create_corpus(display_name=CORPUS_DISPLAY_NAME, vector_db=vector_db)
print(f"Created RAG Corpus resource: {rag_corpus.name}")

Created RAG Corpus resource: projects/756696270058/locations/us-central1/ragCorpora/8142508126285856768


In [43]:
# Check the newly created corpus
rag.list_corpora()

ListRagCorporaPager<rag_corpora {
  name: "projects/demos-vertex/locations/us-central1/ragCorpora/8142508126285856768"
  display_name: "Feature Store Corpus"
  create_time {
    seconds: 1734557930
    nanos: 159743000
  }
  update_time {
    seconds: 1734557930
    nanos: 159743000
  }
  rag_embedding_model_config {
    vertex_prediction_endpoint {
      endpoint: "projects/756696270058/locations/us-central1/publishers/google/models/text-embedding-004"
    }
  }
  rag_vector_db_config {
    vertex_feature_store {
      feature_view_resource_name: "projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/featureViews/feature_view_id"
    }
    rag_embedding_model_config {
      vertex_prediction_endpoint {
        endpoint: "projects/756696270058/locations/us-central1/publishers/google/models/text-embedding-004"
      }
    }
  }
  corpus_status {
    state: ACTIVE
  }
  vector_db_config {
    vertex_feature_store {
      feature_view_resource_name: "projects/

### Import files into the BigQuery table using the RAG API

Use the ImportRagFiles API to import files from Google Cloud Storage or Google Drive into the BigQuery table of the Vertex AI Feature Store instance. The files are embedded and stored in the BigQuery table.

In [44]:
GCS_BUCKET = "gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee"

response = rag.import_files(
    corpus_name=rag_corpus.name,
    paths=[GCS_BUCKET],
    chunk_size=512,
    chunk_overlap=50,
)


### Construct a FeatureOnlineStore index
After importing your data into the BigQuery table, run a synchronization process to make the data available for online serving.
The synchronization generates a FeatureOnlineStore index through the FeatureView and might take 20 minutes to complete.



In [45]:
feature_view_sync = fv.sync()
feature_view_sync

<vertexai.resources.preview.feature_store.feature_view.FeatureView.FeatureViewSync object at 0x7d60e6d72e90> 
resource name: projects/756696270058/locations/us-central1/featureOnlineStores/fs_online_store_id/featureViews/feature_view_id/featureViewSyncs/287290920775188480
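Because the synchronization can take a while, queries issued before it finishes may not find any data. The notebook does not show how to check the sync status, so the sketch below only illustrates a generic polling loop with a timeout. `wait_until` and `fake_sync_finished` are hypothetical; in practice you would replace the stand-in check with a real status lookup from the Feature Store documentation.

```python
import time
from typing import Callable


def wait_until(
    is_done: Callable[[], bool],
    timeout_s: float,
    poll_interval_s: float,
) -> bool:
    """Poll is_done() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Stand-in for a real status check: reports success on the third poll.
attempts = {"count": 0}


def fake_sync_finished() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


finished = wait_until(fake_sync_finished, timeout_s=5, poll_interval_s=0.01)
print(finished, attempts["count"])  # → True 3
```

For a sync that might take 20 minutes, a poll interval of 30 to 60 seconds and a generous timeout would be more appropriate than the short values used here.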

### Create RAG Retrieval Tool

Call the Vertex AI GenerateContent API to generate content with Gemini models, and specify the RAG corpus resource name in the request so that data is retrieved from the FeatureOnlineStore index.

In [46]:
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=rag_corpus.name,  # Currently only 1 corpus is allowed.
                )
            ],
            similarity_top_k=3,
            vector_distance_threshold=0.4,
        ),
    )
)

### Generate Content with Gemini using RAG Retrieval Tool

In [52]:
MODEL_NAME = "gemini-2.0-flash-exp"

In [53]:
rag_fs_model = GenerativeModel(MODEL_NAME, tools=[rag_retrieval_tool])

Question from the Cymbal Bank documents in the Cloud Storage bucket

In [54]:
response = rag_fs_model.generate_content("What are the five key principles for employees at Cymbal Bank?")
display(Markdown(response.text))

The five key principles for employees at Cymbal Bank are to be open-minded, respectful, inclusive, diverse, and to be yourself.


### Perform direct context retrieval

Use retrieved contexts with your preferred SDK or API for final output.

In [63]:
RETRIEVAL_QUERY = "What are the five key principles for employees at Cymbal Bank?"

rag_resource = rag.RagResource(
    rag_corpus=rag_corpus.name,
    # Need to manually get the ids from rag.list_files.
    # rag_file_ids=[],
)

response = rag.retrieval_query(
    rag_resources=[rag_resource],  # Currently only 1 corpus is allowed.
    text=RETRIEVAL_QUERY,
    similarity_top_k=10,
)

# The retrieved context can be passed to any SDK or model generation API to generate final results.
print(response)


contexts {
  contexts {
    source_uri: "gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee/Cymbal Bank New Employee Guide.pdf"
    text: "Cymbal Bank New Employe Guide\r\nUpdated August 11, 2023\r\nAuthored by Lewis Cymbal, CEO\r\n1. Welcome to Cymbal Bank\r\nCongratulations on joining Cymbal Bank! We are excited to have you on our team.\r\n2. Your first day\r\nYour first day at Cymbal Bank will be a busy one. You will meet with your manager, HR department, and team\r\nmembers. You will also be given a tour of the office and introduced to your new colleagues.\r\n3. Your first week\r\nIn your first week at Cymbal Bank, you will be given a lot of information to take in. You will learn about the company\r\nculture, the products and services that we offer, and the expectations that we have for our employees. You will\r\nalso be given a lot of paperwork to fill out.\r\n4. Your first month\r\nIn your first month at Cymbal Bank, you will start to get into the swing of things
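One way to use the retrieved contexts with another SDK or API is to assemble them into a grounded prompt yourself. The sketch below shows the idea on hand-written records shaped like the `source_uri` and `text` fields in the output above. `RetrievedContext`, `build_grounded_prompt`, the prompt wording, and the character budget are assumptions for this example rather than part of the RAG API.

```python
from dataclasses import dataclass


@dataclass
class RetrievedContext:
    source_uri: str
    text: str


def build_grounded_prompt(
    question: str,
    contexts: list[RetrievedContext],
    max_context_chars: int = 2000,
) -> str:
    """Number the contexts, cite their sources, and stay within a character budget."""
    sections = []
    used = 0
    for index, context in enumerate(contexts, start=1):
        remaining = max_context_chars - used
        if remaining <= 0:
            break
        # Collapse the \r\n line breaks seen in the raw retrieval output.
        snippet = " ".join(context.text.split())[:remaining]
        used += len(snippet)
        sections.append(f"[{index}] {snippet}\n(source: {context.source_uri})")
    joined = "\n\n".join(sections)
    return (
        "Use only the numbered context below to answer.\n\n"
        f"{joined}\n\nQuestion: {question}"
    )


# Hand-written sample record (illustrative, not real API output).
sample = [
    RetrievedContext(
        source_uri="gs://example-bucket/guide.pdf",
        text="Cymbal Bank New Employee Guide\r\nUpdated August 11, 2023",
    )
]
prompt = build_grounded_prompt("When was the guide updated?", sample)
print(prompt)
```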

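Optionally, rerank the retrieved contexts with an LLM ranker by passing a `RagRetrievalConfig` to the retrieval query.

In [62]: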
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=10,
    ranking=rag.Ranking(
        llm_ranker=rag.LlmRanker(
            model_name=MODEL_NAME
        ),
    )
)

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus.name,
        )
    ],
    text=RETRIEVAL_QUERY,
    rag_retrieval_config=rag_retrieval_config,  # top_k is already set in the config
)
print(response)

contexts {
  contexts {
    source_uri: "gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee/Cymbal Bank Company Culture.pdf"
    text: "Cymbal Bank Company Culture\r\nUpdated August 11, 2023\r\nAuthored by Lewis Cymbal, CEO\r\n1. Be open-minded\r\nOne of the most important things you can do to be a productive member of an inclusive and diverse workforce is\r\nto be open-minded. This means being willing to learn about different cultures and perspectives, and being willing\r\nto change your own views when necessary. It also means being respectful of others, even if you disagree with\r\nthem.\r\n2. Be respectful\r\nRespect is essential for any productive workplace, but it is especially important in an inclusive and diverse\r\nworkforce. This means being respectful of other people\'s cultures, religions, and sexual orientations. It also means\r\nbeing respectful of other people\'s opinions, even if you disagree with them.\r\n3. Be inclusive\r\nInclusion is another importan
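Reranking is a second pass over the retrieved contexts: each candidate is rescored against the query, and the list is reordered so the most relevant chunks come first. The sketch below shows that control flow with a simple keyword-overlap score standing in for the LLM ranker. The scorer, the `rerank` helper, and the sample texts are assumptions for illustration and do not reflect how the LLM ranker scores contexts.

```python
import re
from typing import Callable, Optional


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of distinct query words that also appear in the text."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    text_terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def rerank(
    query: str,
    contexts: list[str],
    score_fn: Callable[[str, str], float] = keyword_overlap,
    top_n: Optional[int] = None,
) -> list[str]:
    """Reorder contexts by descending relevance score and optionally truncate."""
    ranked = sorted(contexts, key=lambda c: score_fn(query, c), reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Sample texts in their original retrieval order (illustrative).
retrieved = [
    "Your first day at Cymbal Bank will be a busy one.",
    "Be open-minded. Be respectful. Be inclusive.",
    "Key principles for employees: be open-minded, respectful, and inclusive.",
]
query = "What are the key principles for employees?"
reranked = rerank(query, retrieved, top_n=2)
print(reranked[0])
```

Swapping `score_fn` for a function that calls a model would give the same reordering flow with a stronger relevance signal.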

### Cleanup

In [40]:
rag.delete_corpus(rag_corpus.name)

Successfully deleted the RagCorpus.
