## Overview

Retrieval-augmented generation (RAG) improves large language models (LLMs) by letting them access and process external information sources during generation. Grounding the model's responses in retrieved data helps reduce hallucinations.

A common problem with LLMs is that they have no knowledge of private data, that
is, your organization's data. With RAG Engine, you can enrich the
LLM context with additional private information, so that the model can reduce
hallucinations and answer questions more accurately.

Combining additional knowledge sources with the knowledge that LLMs already
have provides a better context. The improved context, along with the query,
enhances the quality of the LLM's response.

The following concepts are key to understanding Vertex AI RAG Engine. These concepts are listed in the order of the
retrieval-augmented generation (RAG) process.

1. **Data ingestion**: Intake data from different data sources. For example,
  local files, Google Cloud Storage, and Google Drive.

1. **Data transformation**: Conversion of the data in preparation for indexing. For example, data is split into chunks.

1. **Embedding**: Numerical representations of words or pieces of text. These numbers capture the
   semantic meaning and context of the text. Similar or related words or text
   tend to have similar embeddings, which means they are closer together in the
   high-dimensional vector space.

1. **Data indexing**: RAG Engine creates an index called a corpus.
   The index structures the knowledge base so it's optimized for searching. The
   index works like a detailed table of contents for a massive reference book.

1. **Retrieval**: When a user asks a question or provides a prompt, the retrieval
  component in RAG Engine searches through its knowledge
  base to find information that is relevant to the query.

1. **Generation**: The retrieved information becomes the context added to the
  original user query as a guide for the generative AI model to generate
  factually grounded and relevant responses.

For more information, refer to the public documentation for [Vertex AI RAG Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview).
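
The steps above can be illustrated with a toy, in-memory sketch. It is not how RAG Engine is implemented: the bag-of-words `embed` function stands in for a learned embedding model, the list returned by `build_index` stands in for a corpus, and every function name and sample sentence here is an assumption made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system uses a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: store each chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, top_k: int = 2) -> list[str]:
    """Retrieval: rank chunks by similarity to the query, keep the best top_k."""
    query_vector = embed(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


def augment_prompt(query: str, contexts: list[str]) -> str:
    """Generation input: prepend the retrieved context to the user query."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"


# Hypothetical, already-chunked documents (ingestion and transformation done).
chunks = [
    "Orders ship within two business days.",
    "Contact support by email at support@example.com.",
    "Returns are accepted within thirty days of delivery.",
]
index = build_index(chunks)
question = "What is the support email?"
top = retrieve(index, question, top_k=1)
prompt = augment_prompt(question, top)
print(prompt)
```

The printed prompt is what would be sent to a generative model in the final step. In the rest of this notebook, RAG Engine and Gemini handle these steps as managed services.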

## Get started

### Install Vertex AI SDK and Google Gen AI SDK


In [1]:
%pip install --upgrade --quiet google-cloud-aiplatform google-genai

Note: you may need to restart the kernel to use updated packages.




### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
import os
from google.cloud import storage
from google import genai
import vertexai

PROJECT_ID = "oa-bta-learning-dv"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
# Fall back to the environment variable if the user doesn't provide a project ID.
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = "europe-west4"

vertexai.init(project=PROJECT_ID, location=LOCATION)
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

### Import libraries

In [6]:
from IPython.display import Markdown, display
from google.genai.types import GenerateContentConfig, Retrieval, Tool, VertexRagStore
from vertexai import rag

### Create a RAG Corpus

In [7]:
# Currently supports Google first-party embedding models
EMBEDDING_MODEL = "publishers/google/models/text-embedding-005"  # @param {type:"string", isTemplate: true}

rag_corpus = rag.create_corpus(
    display_name="my-rag-corpus",
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=rag.RagEmbeddingModelConfig(
            vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
                publisher_model=EMBEDDING_MODEL
            )
        )
    ),
)

### Check the corpus just created

In [8]:
rag.list_corpora()

ListRagCorporaPager<rag_corpora {
  name: "projects/oa-bta-learning-dv/locations/europe-west4/ragCorpora/6917529027641081856"
  display_name: "my-rag-corpus"
  create_time {
    seconds: 1750681983
    nanos: 580964000
  }
  update_time {
    seconds: 1750681983
    nanos: 580964000
  }
  corpus_status {
    state: ACTIVE
  }
  vector_db_config {
    rag_managed_db {
      knn {
      }
    }
    rag_embedding_model_config {
      vertex_prediction_endpoint {
        endpoint: "projects/oa-bta-learning-dv/locations/europe-west4/publishers/google/models/text-embedding-005"
      }
    }
  }
}
>

### Import files from Google Cloud Storage

This section lists the objects in the Cloud Storage bucket, then imports the PDF files under the `Noli FAQ PDFs/` folder into the RAG corpus.

In [12]:
def list_blobs(bucket_name):
    """Prints the name of every blob in the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blobs = bucket.list_blobs()
    for blob in blobs:
        print(blob.name)


list_blobs("hackathon-team-1-rag-data")

Hackathon_user_profile_synthethic_data.xlsx
Noli Article PDFs/Cleanse.pdf
Noli Article PDFs/Moisturise.pdf
Noli Article PDFs/Prepare.pdf
Noli Article PDFs/Protect.pdf
Noli Article PDFs/Skin_Layering.pdf
Noli Article PDFs/The_truth_about_Adenosine.pdf
Noli Article PDFs/The_truth_about_Caffeine.pdf
Noli Article PDFs/The_truth_about_Citric_Acid.pdf
Noli Article PDFs/The_truth_about_Glycerin.pdf
Noli Article PDFs/The_truth_about_Glycolic_Acid.pdf
Noli Article PDFs/The_truth_about_Hyaluronic_Acid.pdf
Noli Article PDFs/The_truth_about_Niacinamide.pdf
Noli Article PDFs/The_truth_about_Panthenol.pdf
Noli Article PDFs/The_truth_about_Propylene_Glycol.pdf
Noli Article PDFs/The_truth_about_Vitamin_C.pdf
Noli Article PDFs/The_truth_about_Vitamin_E.pdf
Noli Article PDFs/The_truth_about_balanced_skin.pdf
Noli Article PDFs/The_truth_about_balms.pdf
Noli Article PDFs/The_truth_about_blemish_prone_skin.pdf
Noli Article PDFs/The_truth_about_ceramides.pdf
Noli Article PDFs/The_truth_about_cleansers.pdf
N

In [17]:
INPUT_GCS_BUCKET = (
    "gs://hackathon-team-1-rag-data/Noli FAQ PDFs/"
)

response = rag.import_files(
    corpus_name=rag_corpus.name,
    paths=[INPUT_GCS_BUCKET],
    # Optional
    transformation_config=rag.TransformationConfig(
        chunking_config=rag.ChunkingConfig(chunk_size=1024, chunk_overlap=100)
    ),
    max_embedding_requests_per_min=900,  # Optional
)
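
To give a feel for what `chunk_size` and `chunk_overlap` control, here is a minimal sliding-window sketch. It is an assumption-laden illustration, not the service's chunker: it counts characters, whereas RAG Engine likely measures chunks in tokens and may handle boundaries differently. The `chunk_text` helper is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for piece in pieces:
    print(piece)
```

Each chunk repeats the last three characters of the previous one. The overlap is meant to keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing and embedding some text twice.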

### Optional: Perform direct context retrieval

In [18]:
# Direct context retrieval
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus.name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    rag_retrieval_config=rag.RagRetrievalConfig(
        top_k=10,  # Optional
        filter=rag.Filter(
            vector_distance_threshold=0.5,  # Optional
        ),
    ),
    text="What is the support email in the Noli FAQ PDFs",
)
print(response)

# Optional: The retrieved context can be passed to any SDK or model generation API to generate final results.
# context = " ".join([context.text for context in response.contexts.contexts]).replace("\n", "")

contexts {
  contexts {
    source_uri: "gs://hackathon-team-1-rag-data/Noli FAQ PDFs/Contact_Us.pdf"
    text: "FAQs - Contact Us\r\nQ: Email\r\nEmail us at: help@support.noli.com"
    source_display_name: "Contact_Us.pdf"
    score: 0.18890072023445559
    chunk {
      text: "FAQs - Contact Us\r\nQ: Email\r\nEmail us at: help@support.noli.com"
      page_span {
        first_page: 1
        last_page: 1
      }
    }
  }
  contexts {
    source_uri: "gs://hackathon-team-1-rag-data/Noli FAQ PDFs/My_Account.pdf"
    text: "FAQs - My Account\r\nQ: How do I create an account?\r\nCreating your account is easy. Click here and follow the simple steps.\r\nQ: How do I re-set my password?\r\nForget your password or just want to change it? No stress! Please log out or go directly to the\r\nsign-in screen and click ‘Forgotten password?’ and follow the instructions to reset your\r\npassword.\r\nQ: How do I delete my account?\r\nAlthough it\'s sad to see you go, you can email us at help@support.n
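
If you pass the retrieved context to a model yourself, it can help to filter and tidy it first. The sketch below is a hypothetical post-processing step: `RetrievedContext` is a stand-in dataclass that mirrors only the `source_display_name`, `text`, and `score` fields seen in the output above, and it assumes that `score` is a distance where lower means closer, which matches the `vector_distance_threshold` naming but should be checked against the documentation for your vector database. The second sample entry is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RetrievedContext:
    """Hypothetical stand-in for one entry of response.contexts.contexts."""
    source_display_name: str
    text: str
    score: float  # assumed to be a distance: lower means more similar


def build_context(contexts, max_distance: float = 0.5, max_chars: int = 2000) -> str:
    """Keep close matches, best first, and join them into one context string."""
    kept = sorted(
        (c for c in contexts if c.score <= max_distance),
        key=lambda c: c.score,
    )
    parts = []
    used = 0
    for c in kept:
        cleaned = " ".join(c.text.split())  # collapse \r\n and repeated spaces
        entry = f"[{c.source_display_name}] {cleaned}"
        if used + len(entry) > max_chars:
            break  # stay within a rough context budget
        parts.append(entry)
        used += len(entry)
    return "\n".join(parts)


retrieved = [
    RetrievedContext(
        "Contact_Us.pdf",
        "FAQs - Contact Us\r\nQ: Email\r\nEmail us at: help@support.noli.com",
        0.18890072023445559,
    ),
    # Invented example of a weak match that the threshold should drop.
    RetrievedContext("Unrelated.pdf", "Some unrelated text.", 0.7),
]
context = build_context(retrieved)
print(context)
```

With a real response, the same function could be called on `response.contexts.contexts`, since it only reads the three attributes named above.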

### Create RAG Retrieval Tool

In [19]:
# Create a tool for the RAG Corpus
rag_retrieval_tool = Tool(
    retrieval=Retrieval(
        vertex_rag_store=VertexRagStore(
            rag_corpora=[rag_corpus.name],
            similarity_top_k=10,
            vector_distance_threshold=0.5,
        )
    )
)

### Generate Content with Gemini using RAG Retrieval Tool

In [21]:
MODEL_ID = "gemini-2.0-flash-001"

In [22]:
response = client.models.generate_content(
    model=MODEL_ID,
    contents="What is the support email in the Noli FAQ PDFs",
    config=GenerateContentConfig(tools=[rag_retrieval_tool]),
)

display(Markdown(response.text))

The support email is help@support.noli.com.


### Optional: Serve the RAG agent with Gradio

In [24]:
%pip install gradio --quiet

Note: you may need to restart the kernel to use updated packages.




In [25]:
import gradio as gr

def ask_agent(question):
    # RAG retrieval and Gemini generation
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=question,
        config=GenerateContentConfig(tools=[rag_retrieval_tool]),
    )
    return response.text

demo = gr.Interface(
    fn=ask_agent,
    inputs=gr.Textbox(lines=2, label="Ask a question"),
    outputs=gr.Textbox(label="Answer"),
    title="Vertex AI RAG Q&A",
    description="Ask a question and get an answer from your RAG-powered agent."
)

demo.launch()  # This will run on localhost by default



* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.


