In [None]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

## Overview

[Grounding in Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini) lets you use generative text models to generate content grounded in your own documents and data. This capability lets the model access information at runtime that goes beyond its training data. When model responses are grounded in Google Search results or in data stores within [Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/enterprise-search-introduction), LLMs can produce more accurate, up-to-date, and relevant responses.

Grounding provides the following benefits:

- Reduces model hallucinations (instances where the model generates content that isn't factual)
- Anchors model responses to specific information, documents, and data sources
- Enhances the trustworthiness, accuracy, and applicability of the generated content

In the context of grounding in Vertex AI, you can configure two different sources of grounding:

1. Google Search results for data that is publicly available and indexed
2. [Data stores in Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/create-datastore-ingest), which can include your own data in the form of website data, unstructured data, or structured data

### Allowlisting

Some features in this sample notebook require allowlist access. [Grounding with Vertex AI Search](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini) is available in Public Preview, whereas Grounding with Google Search results is generally available.

If you use this service in a production application, you will also need to [use a Google Search entry point](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/grounding-search-entry-points).

### Objective

In this tutorial, you learn how to:

- Generate LLM text and chat model responses grounded in Google Search results
- Compare the results of ungrounded LLM responses with grounded LLM responses
- Create and use a data store in Vertex AI Search to ground responses in custom documents and data
- Generate LLM text and chat model responses grounded in Vertex AI Search results

This tutorial uses the following Google Cloud AI services and resources:

- Vertex AI

The steps performed include:

- Configuring the LLM and prompt for various examples
- Sending example prompts to generative text and chat models in Vertex AI
- Sending example prompts with various levels of grounding (no grounding, web grounding, data store grounding)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.
1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

If you installed or upgraded packages in this notebook environment, restart the kernel before continuing.

### Set Google Cloud project information and initialize Vertex AI SDK

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

**If you don't know your project ID**, try the following:
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
PROJECT_ID = "your-project-id"  # @param {type:"string"}
REGION = "us-central1"  # @param {type: "string"}

In [None]:
import vertexai

vertexai.init(project=PROJECT_ID, location=REGION)
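
A small guard can catch the case where the placeholder project ID was left unchanged before `vertexai.init()` is called. The following is a minimal sketch: the helper name `require_project_id` is hypothetical and is not part of the Vertex AI SDK.

```python
def require_project_id(project_id: str, placeholder: str = "your-project-id") -> str:
    """Return the project ID, or raise if it is empty or still the placeholder.

    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    cleaned = (project_id or "").strip()
    if not cleaned or cleaned == placeholder:
        raise ValueError(
            "Set PROJECT_ID to your Google Cloud project ID "
            "before initializing Vertex AI."
        )
    return cleaned
```

With this helper defined, the initialization call could be written as `vertexai.init(project=require_project_id(PROJECT_ID), location=REGION)`, so that a forgotten placeholder fails early with a clear message.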

### Import libraries

In [None]:
from IPython.display import Markdown, display
from vertexai.generative_models import (
    GenerationResponse,
    GenerativeModel,
    Tool,
    grounding,
)
from vertexai.preview.generative_models import grounding as preview_grounding

Initialize the Gemini model from Vertex AI:

In [None]:
model = GenerativeModel("gemini-1.5-pro")

## Helper functions

In [None]:
def print_grounding_response(response: GenerationResponse):
    """Prints Gemini response with grounding citations."""
    grounding_metadata = response.candidates[0].grounding_metadata

    # Citation indices are in byte units
    ENCODING = "utf-8"
    text_bytes = response.text.encode(ENCODING)

    prev_index = 0
    markdown_text = ""

    for grounding_support in grounding_metadata.grounding_supports:
        text_segment = text_bytes[
            prev_index : grounding_support.segment.end_index
        ].decode(ENCODING)

        footnotes_text = ""
        for grounding_chunk_index in grounding_support.grounding_chunk_indices:
            footnotes_text += f"[{grounding_chunk_index + 1}]"

        markdown_text += f"{text_segment} {footnotes_text}\n"
        prev_index = grounding_support.segment.end_index

    if prev_index < len(text_bytes):
        markdown_text += str(text_bytes[prev_index:], encoding=ENCODING)

    markdown_text += "\n----\n## Grounding Sources\n"

    if grounding_metadata.web_search_queries:
        markdown_text += (
            f"\n**Web Search Queries:** {grounding_metadata.web_search_queries}\n"
        )
        if grounding_metadata.search_entry_point:
            markdown_text += f"\n**Search Entry Point:**\n {grounding_metadata.search_entry_point.rendered_content}\n"
    elif grounding_metadata.retrieval_queries:
        markdown_text += (
            f"\n**Retrieval Queries:** {grounding_metadata.retrieval_queries}\n"
        )

    markdown_text += "### Grounding Chunks\n"

    for index, grounding_chunk in enumerate(
        grounding_metadata.grounding_chunks, start=1
    ):
        context = grounding_chunk.web or grounding_chunk.retrieved_context
        if not context:
            print(f"Skipping Grounding Chunk {grounding_chunk}")
            continue

        markdown_text += f"{index}. [{context.title}]({context.uri})\n"

    display(Markdown(markdown_text))
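
The helper encodes the response text to UTF-8 before slicing because, as its comment notes, the citation indices are byte offsets rather than character offsets. The sketch below illustrates the difference with a hand-written string and an offset computed locally; none of the values come from a real API response.

```python
# Hand-written sample text; the offset below is computed here,
# not taken from a real grounding response.
text = "Café opens at 9. It closes at 5."
first_sentence = "Café opens at 9."

# "é" is one character but two bytes in UTF-8, so the two lengths differ.
end_index_chars = len(first_sentence)
end_index_bytes = len(first_sentence.encode("utf-8"))

text_bytes = text.encode("utf-8")

# Slicing the encoded bytes with the byte offset recovers the exact segment.
segment_from_bytes = text_bytes[:end_index_bytes].decode("utf-8")

# Applying the same byte offset to the str overshoots by one character.
segment_from_str = text[:end_index_bytes]

print(end_index_chars, end_index_bytes)
print(repr(segment_from_bytes))
print(repr(segment_from_str))
```

For ASCII-only text the two offsets coincide, which is why slicing the string directly can appear to work until a response contains accented characters, emoji, or non-Latin scripts.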

## Example: Grounding with Google Search results

In this example, you'll compare LLM responses with no grounding with responses that are grounded in the results of a Google Search. You'll ask a question about the next solar eclipse in the US.

In [None]:
PROMPT = "You are an expert in astronomy. When is the next solar eclipse in the US?"

### Text generation without grounding

Make a prediction request to the LLM with no grounding:

In [None]:
response = model.generate_content(PROMPT)

display(Markdown(response.text))

### Text generation grounded in Google Search results

Now add the `tools` keyword argument with a grounding tool built from `grounding.GoogleSearchRetrieval()` to instruct the LLM to first perform a Google Search with the prompt, then construct an answer based on the web search results.

The search queries and [Search Entry Point](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/grounding-search-entry-points) are available for each `Candidate` in the response. The helper function `print_grounding_response()` prints the response text with citations.

In [None]:
tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

response = model.generate_content(PROMPT, tools=[tool])

print_grounding_response(response)

Note that the response without grounding contains only the limited information about solar eclipses that the LLM learned during training. In contrast, the response grounded in web search results draws on the most up-to-date information, which is retrieved as part of the grounded request.
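
To work with the cited sources programmatically rather than as rendered Markdown, the same metadata fields that `print_grounding_response()` reads can be collected into plain Python data. The sketch below assumes only the attribute names already used in that helper (`grounding_chunks`, `web`, `retrieved_context`, `title`, `uri`). The function name `collect_sources` is hypothetical, and the demo uses `SimpleNamespace` stand-ins rather than a real response.

```python
from types import SimpleNamespace


def collect_sources(grounding_metadata) -> list[dict]:
    """Return one {"index", "title", "uri"} dict per usable grounding chunk.

    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    sources = []
    for index, chunk in enumerate(grounding_metadata.grounding_chunks, start=1):
        # Same fallback as the helper: prefer web results, else retrieved context.
        context = getattr(chunk, "web", None) or getattr(
            chunk, "retrieved_context", None
        )
        if not context:
            continue
        sources.append({"index": index, "title": context.title, "uri": context.uri})
    return sources


# Stand-in objects for illustration only; a real call would pass
# response.candidates[0].grounding_metadata instead.
fake_metadata = SimpleNamespace(
    grounding_chunks=[
        SimpleNamespace(
            web=SimpleNamespace(title="Example page", uri="https://example.com/a"),
            retrieved_context=None,
        ),
        SimpleNamespace(web=None, retrieved_context=None),
    ]
)

sources = collect_sources(fake_metadata)
print(sources)
```

The `index` values match the footnote numbers printed by the helper, so a skipped chunk leaves a gap in the numbering instead of shifting later citations.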

## Example: Grounded chat responses

You can also use grounding when working with chat models in Vertex AI. In this example, you'll compare LLM responses with no grounding with responses that are grounded in the results of a Google Search and a data store in Vertex AI Search.

You'll ask a question about managed datasets in Vertex AI and a follow-up question about the types of data you can use:

In [None]:
PROMPT = "What are managed datasets in Vertex AI?"
PROMPT_FOLLOWUP = "What types of data can I use?"

### Chat session without grounding

Start a chat session and send messages to the LLM with no grounding:

In [None]:
chat = model.start_chat()

response = chat.send_message(PROMPT)
display(Markdown(response.text))

response = chat.send_message(PROMPT_FOLLOWUP)
display(Markdown(response.text))

### Chat session grounded in Google Search results

Now add the `tools` keyword argument with a grounding tool built from `grounding.GoogleSearchRetrieval()` to instruct the chat model to first perform a Google Search with the prompt, then construct an answer based on the web search results:

In [None]:
chat = model.start_chat()
tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

response = chat.send_message(PROMPT, tools=[tool])
print_grounding_response(response)

response = chat.send_message(PROMPT_FOLLOWUP, tools=[tool])
print_grounding_response(response)
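
Both chat cells above repeat the same send-then-display pattern for each turn. A small loop can express that pattern once. The sketch below assumes only the `send_message(prompt, tools=...)` call shape used in this notebook; `run_chat_turns` and `EchoChat` are hypothetical names, and `EchoChat` is a stand-in so the example runs without credentials.

```python
from types import SimpleNamespace


def run_chat_turns(chat, prompts, tools=None):
    """Send each prompt in order on one chat session and return the responses."""
    responses = []
    for prompt in prompts:
        if tools:
            responses.append(chat.send_message(prompt, tools=tools))
        else:
            responses.append(chat.send_message(prompt))
    return responses


class EchoChat:
    """Stand-in for a chat session; records history and echoes the prompt."""

    def __init__(self):
        self.history = []

    def send_message(self, prompt, tools=None):
        self.history.append(prompt)
        label = "grounded" if tools else "ungrounded"
        return SimpleNamespace(text=f"[{label}] turn {len(self.history)}: {prompt}")


demo_prompts = [
    "What are managed datasets in Vertex AI?",
    "What types of data can I use?",
]
demo_responses = run_chat_turns(EchoChat(), demo_prompts, tools=["placeholder-tool"])
for response in demo_responses:
    print(response.text)
```

With a real session, the equivalent call would be `run_chat_turns(model.start_chat(), [PROMPT, PROMPT_FOLLOWUP], tools=[tool])`, and each returned response could then be passed to `print_grounding_response()`.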