In [1]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Getting Started with Grounding in Vertex AI

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/language/grounding/intro-grounding.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Run in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/grounding/intro-grounding.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/blob/main/language/grounding/intro-grounding.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
</table>

**_NOTE_**: This notebook has been tested in the following environment:

* Python version = 3.10

## Overview

[Grounding in Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/grounding/overview) lets you use language models (e.g., [`text-bison` and `chat-bison`](https://cloud.google.com/vertex-ai/docs/generative-ai/language-model-overview)) to generate content grounded in your own documents and data. This capability lets the model access information at runtime that goes beyond its training data. By grounding model responses in Google Search results or data stores within [Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/enterprise-search-introduction), LLMs that are grounded in data can produce more accurate, up-to-date, and relevant responses.

Grounding provides the following benefits:

- Reduces model hallucinations (instances where the model generates content that isn't factual)
- Anchors model responses to specific information, documents, and data sources
- Enhances the trustworthiness, accuracy, and applicability of the generated content

In the context of grounding in Vertex AI, you can configure two different sources of grounding:

1. Google Search results for data that is publicly available and indexed
1. [Data stores in Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/create-datastore-ingest), which can include your own data in the form of website data, unstructured data, or structured data

### Objective

In this tutorial, you learn how to:

- Generate LLM text and chat model responses grounded in Google Search results
- Compare the results of ungrounded LLM responses with grounded LLM responses
- Create and use a data store in Vertex AI Search to ground responses in custom documents and data
- Generate LLM text and chat model responses grounded in Vertex AI Search results
- Use the asynchronous text and chat models APIs with grounding

This tutorial uses the following Google Cloud AI services and resources:

- Vertex AI
- Vertex AI Search and Conversation

The steps performed include:

- Configuring the LLM and prompt for various examples
- Sending example prompts to generative text and chat models in Vertex AI
- Setting up a data store in Vertex AI Search with your own data
- Sending example prompts with various levels of grounding (no grounding, web grounding, data store grounding)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.
1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
1. Enable the [Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com) and [Vertex AI Search and Conversation API](https://console.cloud.google.com/flows/enableapi?apiid=discoveryengine.googleapis.com).
1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

### Installation

Install the following packages required to execute this notebook.

In [2]:
!pip install --upgrade --quiet google-cloud-aiplatform==1.36.4

Restart the kernel after installing packages:

In [3]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

### Configure your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [4]:
PROJECT_ID = "your-project-id"  # @param {type:"string"}

# Set the project ID
!gcloud config set project {PROJECT_ID}

Updated property [core/project].


### Configure your region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [5]:
REGION = "us-central1"  # @param {type: "string"}

### Authenticate your Google Cloud account

If you are running this notebook on Google Colab, you will need to authenticate your environment. To do this, run the new cell below. This step is not required if you are using Vertex AI Workbench.

In [6]:
import sys

if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth
    auth.authenticate_user()

### Import libraries

In [7]:
import vertexai
from google.cloud import aiplatform
from vertexai.language_models import TextGenerationModel, ChatModel, GroundingSource

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project:

In [8]:
aiplatform.init(project=PROJECT_ID, location=REGION)

Initialize the generative text and chat models from Vertex AI:

In [9]:
text_model = TextGenerationModel.from_pretrained("text-bison@001")
chat_model = ChatModel.from_pretrained("chat-bison@001")

## Example: Grounding with Google Search results

In this example, you'll compare LLM responses with no grounding with responses that are grounded in the results of a Google Search. You'll ask a question about a recent hardware release from the Google Store.

In [10]:
PROMPT = "What's the release date and price of a Pixel Tablet in USD?"

### Text generation without grounding

Make a prediction request to the LLM with no grounding:

In [11]:
response = text_model.predict(PROMPT)

response, response.grounding_metadata

(The Pixel Tablet is expected to be released in 2023. The price has not yet been announced, but it is expected to be in the same price range as other Pixel devices.,
 GroundingMetadata(citations=[], search_queries=[]))

### Text generation grounded in Google Search results

Now you can add the `grounding_source` keyword arg with a grounding source of `GroundingSource.WebSearch()` to instruct the LLM to first perform a Google Search with the prompt, then construct an answer based on the web search results:

In [12]:
grounding_source = GroundingSource.WebSearch()

response = text_model.predict(
    PROMPT,
    grounding_source=grounding_source,
)

response, response.grounding_metadata

(The Google Pixel Tablet was released in June 2023. The price of the Pixel Tablet starts at $499 for the 128GB version.,
 GroundingMetadata(citations=[GroundingCitation(start_index=0, end_index=50, url='https://www.androidauthority.com/google-pixel-tablet-3163922/', title=None, license=None, publication_date=None), GroundingCitation(start_index=51, end_index=118, url='https://www.phonearena.com/google-pixel-tablet-release-date-price-features-and-news', title=None, license=None, publication_date=None)], search_queries=["What's the release date and price of a Pixel Tablet in USD?"]))

Note that the response without grounding only has outdated information from the LLM about the potential release date of the Pixel tablet. Whereas the response that was grounded in web search results contains the most up to date information from web search results that are returned as part of the LLM with grounding request.

## Example: Grounding with custom documents and data

In this example, you'll compare LLM responses with no grounding with responses that are grounded in the [results of a data store in Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/create-datastore-ingest). You'll ask a question about a GoogleSQL query to create an [object table in BigQuery](https://cloud.google.com/bigquery/docs/object-table-introduction).

### Creating a data store in Vertex AI Search

Follow the steps in the [Vertex AI Search getting started documentation](https://cloud.google.com/generative-ai-app-builder/docs/try-enterprise-search#create_a_search_app_for_website_data) to create a data store in Vertex AI Search with sample data. In this example, you'll use a website-based data store that contains content from the Google Cloud website, including documentation.

Once you've created a data store, obtain the Data Store ID and input it below.

In [13]:
DATA_STORE_ID = "your-data-store-id_1701141840334"  # Replace this with your data store ID from Vertex AI Search
DATA_STORE_REGION = "global"

In [14]:
PROMPT = "What is the SQL command to create an object table in BigQuery with caching enabled?"

### Text generation without grounding

Make a prediction request to the LLM with no grounding:

In [15]:
response = text_model.predict(PROMPT)

response.grounding_metadata, response

(GroundingMetadata(citations=[], search_queries=[]),
 ```sql
CREATE TABLE my_table (
  id INT64 NOT NULL,
  name STRING(MAX) NOT NULL,
  data BLOB,
)
OPTIONS (
  gcs_staging_dir = 'gs://my-bucket/my-table',
  caching = 'ALL'
);
```

This command will create an object table in BigQuery with the following properties:

* The table will be named `my_table`.
* The table will have three columns: `id`, `name`, and `data`.
* The `)

### Text generation grounded in Vertex AI Search results

Now we can add the `grounding_source` keyword arg with a grounding source of `GroundingSource.VertexAISearch()` to instruct the LLM to first perform a search within your custom data store, then construct an answer based on the relevant documents:

In [16]:
grounding_souce = GroundingSource.VertexAISearch(
    data_store_id=DATA_STORE_ID, location=DATA_STORE_REGION
)

response = text_model.predict(
    PROMPT,
    grounding_source=grounding_souce,
)

response.grounding_metadata, response

(GroundingMetadata(citations=[], search_queries=['How to create an object table in BigQuery with caching enabled?']),
 The SQL command to create an object table in BigQuery with caching enabled is:

```
CREATE TABLE my_table
OPTIONS (
  object_metadata="DIRECTORY",
  uris = ['gs://bq-object-tables-sports-bucket/*' ],
  max_staleness=INTERVAL 30 MINUTES
);
```

This command creates a table named `my_table` that is stored in the Google Cloud Storage bucket `bq-object-tables-sports-bucket`. The table is configured to use metadata caching, which means that the table's metadata will)

Note that the response without grounding only has limited information from the LLM about GoogleSQL syntax that might not be completely valid. Whereas the response that was grounded in Vertex AI Search results contains the most up to date information about the correct GoogleSQL query syntax based on documents returned from the Google Cloud documentation about BigQuery.

## Example: Grounded chat responses

You can also use grounding when working with chat models in Vertex AI. In this example, you'll compare LLM responses with no grounding with responses that are grounded in the results of a Google Search and a data store in Vertex AI Search.

You'll ask a question about Vertex AI and a follow up question about how to get started with Vertex AI using the C# client library.

In [17]:
PROMPT = "What is Vertex AI?"

### Chat session without grounding

Start a chat session and send messages to the LLM with no grounding:

In [18]:
chat = chat_model.start_chat()

response = chat.send_message(PROMPT)
print(response.grounding_metadata)
print(response.text)

response = chat.send_message(
    "Is there a simple tutorial available for the C# client library?"
)
print(response.grounding_metadata)
print(response.text)

GroundingMetadata(citations=[], search_queries=[])
 Vertex AI is a unified machine learning platform that helps you build, deploy, and manage machine learning models at scale. It provides a single environment for all your machine learning needs, from data preparation to model training to model deployment. Vertex AI is built on the Google Cloud Platform, and it integrates with other Google Cloud services such as BigQuery, Cloud Storage, and Cloud Dataproc.
GroundingMetadata(citations=[], search_queries=[])
Yes, there is a simple tutorial available for the C# client library. You can find it here: https://cloud.google.com/vertex-ai/docs/reference/libraries/csharp/quickstart


### Chat session grounded in Google Search results

Now you can add the `grounding_source` keyword arg with a grounding source of `GroundingSource.WebSearch()` to instruct the chat model to first perform a Google Search with the prompt, then construct an answer based on the web search results:

In [19]:
chat = chat_model.start_chat()
grounding_source = GroundingSource.WebSearch()

response = chat.send_message(
    PROMPT,
    grounding_source=grounding_source,
)

response = chat.send_message(PROMPT)
print(response.grounding_metadata)
print(response.text)

response = chat.send_message(
    "Is there a simple tutorial available for the C# client library?"
)
print(response.grounding_metadata)
print(response.text)

GroundingMetadata(citations=[], search_queries=[])
Vertex AI is a machine learning (ML) platform that lets you train and deploy ML models and AI applications, and customize large language models (LLMs) for use in production.
GroundingMetadata(citations=[], search_queries=[])
Yes, there is a simple tutorial available for the C# client library.


### Chat session grounded in Vertex AI Search results

Now you can add the `grounding_source` keyword arg with a grounding source of `GroundingSource.VertexAISearch()` to instruct the chat model to first perform a search within your custom data store, then construct an answer based on the relevant documents:

In [20]:
chat = chat_model.start_chat()
grounding_source = GroundingSource.VertexAISearch(
    data_store_id=DATA_STORE_ID, location=DATA_STORE_REGION
)

response = chat.send_message(
    PROMPT,
    grounding_source=grounding_source,
)

response = chat.send_message(PROMPT)
print(response.grounding_metadata)
print(response.text)

response = chat.send_message(
    "Is there a simple tutorial available for the C# client library?"
)
print(response.grounding_metadata)
print(response.text)

GroundingMetadata(citations=[], search_queries=[])
Vertex AI is a machine learning (ML) platform that lets you train and deploy ML models and AI applications, and customize large language models (LLMs) for use in production.
GroundingMetadata(citations=[], search_queries=[])
Yes, there is a simple tutorial available for the C# client library.


## Example: Grounded async text and chat responses

You can also use grounding in Vertex AI when working with the asynchronous APIs for the text and chat models. In this example, you'll compare LLM responses with no grounding with responses that are grounded in the results of a data store in Vertex AI Search.

You'll ask a question about different services available in Google Cloud.

In [21]:
PROMPT = "What are the different types of databases available in Google Cloud?"

### Async text generation grounded in Google Search results

In [22]:
grounding_souce = GroundingSource.WebSearch()

response = await text_model.predict_async(
    PROMPT,
    grounding_source=grounding_souce,
)

response.grounding_metadata, response

(GroundingMetadata(citations=[GroundingCitation(start_index=134, end_index=172, url='https://cloud.google.com/blog/topics/developers-practitioners/your-google-cloud-database-options-explained', title=None, license=None, publication_date=None)], search_queries=['What are the different types of databases available in Google Cloud?']),
 Google Cloud Database offers fully managed solutions both for relational and non relational databases with high security features. · Relational : Cloud SQL and AlloyDB. · Non-relational : Cloud Firestore, Cloud Bigtable, Cloud Spanner.)

### Async text generation grounded in Vertex AI Search results

In [24]:
grounding_souce = GroundingSource.VertexAISearch(
    data_store_id=DATA_STORE_ID, location=DATA_STORE_REGION
)

response = await text_model.predict_async(
    PROMPT,
    grounding_source=grounding_souce,
)

response.grounding_metadata, response

(GroundingMetadata(citations=[GroundingCitation(start_index=134, end_index=172, url='https://developers.google.com/learn/topics/databases', title=None, license=None, publication_date=None)], search_queries=['What are the different types of databases available in Google Cloud?']),
 Google Cloud Database offers fully managed solutions both for relational and non relational databases with high security features. · Relational : Cloud SQL and AlloyDB. · Non-relational : Cloud Firestore, Cloud Bigtable, Cloud Spanner.)

### Async chat session grounded in Google Search results

In [26]:
chat = chat_model.start_chat()

grounding_source = GroundingSource.WebSearch()
response = await chat.send_message_async(
    PROMPT,
    grounding_source=grounding_source,
)

response.grounding_metadata, response

(GroundingMetadata(citations=[], search_queries=[]),
 Google Cloud offers a wide range of databases to meet the needs of your business. These include relational databases, NoSQL databases, and in-memory databases.

Relational databases are the most common type of database and are used for storing structured data. Google Cloud offers a variety of relational databases, including Cloud SQL, Cloud Spanner, and BigQuery.

NoSQL databases are a type of database that is not based on the relational model. They are typically used for storing unstructured data or data that does not fit well into a relational model. Google Cloud offers a variety of NoSQL databases, including Cloud Firestore, Cloud Bigtable, and Cloud Datastore.

In-memory databases are a type of database that stores data in memory rather than on disk. This makes them very fast, but they also have a limited capacity. Google Cloud offers a single in-memory database, Cloud Memorystore.)

### Async chat session grounded in Vertex AI Search results

In [27]:
chat = chat_model.start_chat()

grounding_source = GroundingSource.VertexAISearch(
    data_store_id=DATA_STORE_ID, location=DATA_STORE_REGION
)
response = await chat.send_message_async(
    PROMPT,
    grounding_source=grounding_source,
)

response.grounding_metadata, response

(GroundingMetadata(citations=[], search_queries=[]),
 Google Cloud offers a wide range of databases to meet the needs of your business. These include relational databases, NoSQL databases, and in-memory databases.

Relational databases are the most common type of database and are used for storing structured data. Google Cloud offers a variety of relational databases, including Cloud SQL, Cloud Spanner, and BigQuery.

NoSQL databases are a type of database that is not based on the relational model. They are typically used for storing unstructured data or data that does not fit well into a relational model. Google Cloud offers a variety of NoSQL databases, including Cloud Firestore, Cloud Bigtable, and Cloud Datastore.

In-memory databases are a type of database that stores data in memory rather than on disk. This makes them very fast, but they also have a limited capacity. Google Cloud offers a single in-memory database, Cloud Memorystore.)

## Cleaning up

To avoid incurring charges to your Google Cloud account for the resources used in this notebook, follow these steps:

1. To avoid unnecessary Google Cloud charges, use the [Google Cloud console](https://console.cloud.google.com/) to delete your project if you do not need it. Learn more in the Google Cloud documentation for [managing and deleting your project](https://cloud.google.com/resource-manager/docs/creating-managing-projects).
1. If you used an existing Google Cloud project, delete the resources you created to avoid incurring charges to your account. For more information, refer to the documentation to [Delete data from a data store in Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/delete-datastores), then delete your data store.
1. Disable the [Vertex AI Search and Conversation API](https://pantheon.corp.google.com/apis/api/discoveryengine.googleapis.com) and [Vertex AI API](https://pantheon.corp.google.com/apis/api/aiplatform.googleapis.com) in the Google Cloud Console.