In [1]:
#@title LICENSE

# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Use Vertex AI Extensions to query data in Vertex AI Search

## Overview
Vertex AI Extensions is a platform for creating and managing extensions that connect large language models to external systems via APIs. These external systems can provide LLMs with real-time data and perform data processing actions on their behalf. You can use pre-built or third-party extensions in Vertex AI Extensions.

Learn more about [Vertex AI Extensions](https://cloud.google.com/vertex-ai/docs/generative-ai/extensions/private/overview).

The Vertex AI Search extension (defined by an OpenAPI YAML spec) provides access to data indexed in [Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/try-enterprise-search). This extension can handle user queries and send summaries back to users. It lets users take advantage of an LLM agent to perform tasks such as answering users' questions over documents and data.

The steps performed include:

- Creating a pre-built extension in your project
- Getting detailed information about the extension
- Setting up various queries to the extension
- Running the extension and working with data store and results

### Additional information

This tutorial uses the following Google Cloud services and resources:
- Vertex AI Extensions
- Vertex AI Search

**NOTE**: This notebook has been tested in the following environment:

- Python version = 3.11

### Authenticate your Google Cloud account

You must authenticate to Google Cloud to access the pre-release version of the Python SDK and the Vertex AI Extensions feature.

In [7]:
import sys

if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth
    auth.authenticate_user()

### Installation

This tutorial requires a pre-release version of the Python SDK for Vertex AI. You must be logged in with credentials that are registered for the Vertex AI Extensions Private Preview.

Run the following command to download the library as a wheel from a Cloud Storage bucket:

In [2]:
!gsutil cp gs://vertex_sdk_private_releases/llm_extension/google_cloud_aiplatform-1.39.dev20231219+llm.extension-py2.py3-none-any.whl .

Then install the downloaded wheel, which provides the SDK required to run this notebook:

In [3]:
!pip install --force-reinstall --quiet google_cloud_aiplatform-1.39.dev20231219+llm.extension-py2.py3-none-any.whl

Restart the kernel after installing packages:

In [4]:
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required**, regardless of your notebook environment.

- [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.
- [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
- [Enable the Vertex AI API](https://console.cloud.google.com/apis/enableflow?apiid=aiplatform.googleapis.com).
- If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk?hl=en).
- Your project must also be allowlisted for the Vertex AI Extensions Private Preview.

This notebook requires that you have the following IAM role on your Google Cloud project: `roles/aiplatform.user`

### Set your project ID

**If you don't know your project ID**, try the following:

- Run `gcloud config list`.
- Run `gcloud projects list`.
- See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [5]:
PROJECT_ID = "your-project-id"  # @param {type:"string"}

# Set the project ID
!gcloud config set project {PROJECT_ID}

Updated property [core/project].
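
Before going further, it can help to confirm that `PROJECT_ID` no longer holds the placeholder value and looks like a well-formed project ID. The following is a minimal sketch, not part of the SDK: `check_project_id` is a hypothetical helper, and the pattern encodes the documented format for project IDs (6 to 30 characters; lowercase letters, digits, and hyphens; starts with a letter; does not end with a hyphen).

```python
import re

# Documented project ID format: 6-30 characters, lowercase letters, digits,
# and hyphens; must start with a letter and must not end with a hyphen.
_PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def check_project_id(project_id: str, placeholder: str = "your-project-id") -> str:
    """Return project_id unchanged, or raise ValueError if it looks unusable.

    Hypothetical helper for this notebook; it only checks the string's shape
    and does not verify that the project exists.
    """
    if project_id == placeholder:
        raise ValueError("PROJECT_ID still holds the placeholder value.")
    if not _PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError(f"{project_id!r} is not a well-formed project ID.")
    return project_id
```

Calling `check_project_id(PROJECT_ID)` right after the cell above would fail fast on a forgotten placeholder instead of surfacing later as a permissions error.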


### Region

You can also change the REGION variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [6]:
# Set the region
REGION = "us-central1"  # @param {type: "string"}

### Import libraries

In [8]:
# Only these two imports are used in the rest of this notebook
import vertexai
from google.cloud.aiplatform.private_preview import llm_extension

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project.

In [9]:
vertexai.init(project=PROJECT_ID, location=REGION)

## Working with Vertex AI Extensions

### Create the extension

Now you can create the extension itself. The following cell uses the Python SDK to create the extension in Vertex AI Extensions.

In [10]:
extension_vertex_ai_search = llm_extension.Extension.create(
    display_name = "Vertex AI Search",
    description = "This extension executes queries against indexed data in Vertex AI Search",
    manifest = {
        "name": "vertex_ai_search",
        "description": "Vertex AI Search Extension",
        "api_spec": {
            "open_api_gcs_uri": "gs://vertex-extension-dev/discovery_search.yaml"
        },
        "auth_config": {
            "auth_type": "GOOGLE_SERVICE_ACCOUNT_AUTH",
            "google_service_account_config": {},
        },
    },
)
extension_vertex_ai_search

Creating Extension
Create Extension backing LRO: projects/964731510884/locations/us-central1/extensions/4349632815109767168/operations/2828402334269177856
Extension created. Resource name: projects/964731510884/locations/us-central1/extensions/4349632815109767168
To use this Extension in another session:
extension = aiplatform.Extension('projects/964731510884/locations/us-central1/extensions/4349632815109767168')


<google.cloud.aiplatform.private_preview.llm_extension.extensions.Extension object at 0x161337990> 
resource name: projects/964731510884/locations/us-central1/extensions/4349632815109767168
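
If you expect to register the extension more than once (for example, against a different OpenAPI spec location), the manifest can be factored into a small function. This is only a sketch: `build_search_manifest` is a hypothetical helper, and its keys and values are copied from the manifest in the cell above rather than from any additional SDK documentation.

```python
def build_search_manifest(open_api_gcs_uri: str) -> dict:
    """Return a manifest dict with the same shape as the one used above.

    Hypothetical helper; the only thing it varies is the Cloud Storage URI
    of the OpenAPI spec.
    """
    if not open_api_gcs_uri.startswith("gs://"):
        raise ValueError("open_api_gcs_uri must be a gs:// Cloud Storage URI.")
    return {
        "name": "vertex_ai_search",
        "description": "Vertex AI Search Extension",
        "api_spec": {"open_api_gcs_uri": open_api_gcs_uri},
        "auth_config": {
            "auth_type": "GOOGLE_SERVICE_ACCOUNT_AUTH",
            "google_service_account_config": {},
        },
    }


manifest = build_search_manifest("gs://vertex-extension-dev/discovery_search.yaml")
print(manifest["api_spec"]["open_api_gcs_uri"])
```

The resulting dictionary could then be passed as the `manifest` argument of the create call shown above.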

Now that you've created the Vertex AI Search extension, confirm that it's registered:

In [11]:
print("Name:", extension_vertex_ai_search.gca_resource.name)
print("Display Name:", extension_vertex_ai_search.display_name)
print("Description:", extension_vertex_ai_search.gca_resource.description)

Name: projects/964731510884/locations/us-central1/extensions/4349632815109767168
Display Name: Vertex AI Search
Description: This extension executes queries against indexed data in Vertex AI Search
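
The resource name printed above follows the pattern `projects/<project>/locations/<location>/extensions/<id>`. If you want to store or log its parts separately, a small parser is enough. The sketch below uses a hypothetical `parse_extension_name` helper and assumes only the pattern visible in the output above.

```python
import re

# Pattern inferred from the resource name shown in the output above.
_EXTENSION_NAME = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/extensions/(?P<extension_id>[^/]+)"
)


def parse_extension_name(resource_name: str) -> dict:
    """Split an extension resource name into project, location, and ID."""
    match = _EXTENSION_NAME.fullmatch(resource_name)
    if match is None:
        raise ValueError(f"Unexpected extension resource name: {resource_name!r}")
    return match.groupdict()


parts = parse_extension_name(
    "projects/964731510884/locations/us-central1/extensions/4349632815109767168"
)
print(parts["location"], parts["extension_id"])
```

In the notebook, you would pass `extension_vertex_ai_search.gca_resource.name` instead of the literal string.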


### Create a search app and data store in Vertex AI Search

Follow the steps in the Vertex AI Search documentation to [Create a search app for website data](https://cloud.google.com/generative-ai-app-builder/docs/try-enterprise-search#create_a_search_app_for_website_data). In this example, you'll use a website-based data store that contains content from the Google Cloud website, including documentation.

Once you've created a data store, obtain the Data Store ID and input it below:

In [12]:
DATA_STORE_ID = "your-data-store-id_1234567890123"  # Replace this with your data store ID from Vertex AI Search
DATA_STORE_REGION = "global"  # Replace this with the region that your data store is located in

# Build the resource name of the data store's default serving config
DATASTORE = (
    f"projects/{PROJECT_ID}/locations/{DATA_STORE_REGION}"
    f"/collections/default_collection/dataStores/{DATA_STORE_ID}"
    "/servingConfigs/default_search"
)
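
Because the serving config path is easy to get subtly wrong (an empty ID or a stray slash still yields a string that looks plausible), you may prefer to build it through a function that checks each piece. This is a minimal sketch: `build_serving_config` is a hypothetical helper, and the path layout is taken directly from the cell above.

```python
def build_serving_config(
    project_id: str,
    data_store_id: str,
    location: str = "global",
    serving_config: str = "default_search",
) -> str:
    """Return the serving config resource name used as `serving_config` above."""
    pieces = {
        "project_id": project_id,
        "data_store_id": data_store_id,
        "location": location,
        "serving_config": serving_config,
    }
    for label, value in pieces.items():
        # Each piece is a single path segment: non-empty and without slashes.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty string without '/'.")
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/default_collection/dataStores/{data_store_id}"
        f"/servingConfigs/{serving_config}"
    )


print(build_serving_config("your-project-id", "your-data-store-id_1234567890123"))
```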

### Search across Google Cloud documentation

In this example, we'll send a simple prompt asking about BigQuery:

In [13]:
QUERY = "What is BigQuery" # @param {type:"string"}

In [14]:
response = extension_vertex_ai_search.execute(
    "Search",
    operation_params = {
        "query": QUERY,
        "serving_config": DATASTORE
    },
)

Now you can extract the documents and web pages that are relevant to your query:

In [15]:
for result in response["results"]:
    # Each result wraps a document; the website fields live under derivedStructData
    data = result["document"]["derivedStructData"]
    print("Title:", "\t", data["title"])
    print("Snippet:", data["snippets"][0]["snippet"])
    print("Link:", "\t", data["link"])
    print()

Title: 	 BigQuery overview | Google Cloud
Snippet: BigQuery stores data using a columnar storage format that is optimized for analytical queries. BigQuery presents data in tables, rows, and columns and provides ...
Link: 	 https://cloud.google.com/bigquery/docs/introduction

Title: 	 BigQuery Enterprise Data Warehouse
Snippet: BigQuery is Google Cloud's fully managed and completely serverless enterprise data warehouse. BigQuery supports all data types, works across clouds, and has ...
Link: 	 https://cloud.google.com/bigquery

Title: 	 Pricing | BigQuery: Cloud Data Warehouse | Google Cloud
Snippet: BigQuery is a serverless data analytics platform. You don't need to provision individual instances or virtual machines to use BigQuery. Instead, BigQuery ...
Link: 	 https://cloud.google.com/bigquery/pricing

Title: 	 BigQuery documentation | Google Cloud
Snippet: Use cases. Explore use cases, reference architectures, whitepapers, best practices, and industry solutions. ... Learn patterns a

Calling the extension returns a response object from Vertex AI Search that contains the relevant documents along with other metadata.
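
Direct indexing into the response raises `KeyError` or `IndexError` as soon as one result lacks a field such as `snippets`. A more tolerant way to read the same fields is sketched below. `summarize_results` is a hypothetical helper, and `sample_response` is a hand-built dictionary that only mimics the response shape used in this notebook: its first entry reuses values from the output above, and its second entry is made up to show the fallback behavior.

```python
def summarize_results(response: dict) -> list[dict]:
    """Collect title, first snippet, and link for each result, tolerating gaps."""
    rows = []
    for result in response.get("results", []):
        data = result.get("document", {}).get("derivedStructData", {})
        # Fall back to one empty snippet when the list is missing or empty.
        snippets = data.get("snippets") or [{}]
        rows.append(
            {
                "title": data.get("title", ""),
                "snippet": snippets[0].get("snippet", ""),
                "link": data.get("link", ""),
            }
        )
    return rows


# Hand-built sample, not real API output.
sample_response = {
    "results": [
        {
            "document": {
                "derivedStructData": {
                    "title": "BigQuery overview | Google Cloud",
                    "snippets": [{"snippet": "BigQuery stores data using a columnar storage format ..."}],
                    "link": "https://cloud.google.com/bigquery/docs/introduction",
                }
            }
        },
        # Illustrative entry with no snippets and no link.
        {"document": {"derivedStructData": {"title": "Untitled page"}}},
    ]
}

for row in summarize_results(sample_response):
    print(row["title"], "|", row["link"] or "(no link)")
```

Returning plain dictionaries keeps the rows easy to pass on to other code, for example when assembling context for a follow-up LLM prompt.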

## Clean up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources that you created in this tutorial:

In [16]:
# Delete the extension
extension_vertex_ai_search.delete()

Deleting Extension : projects/964731510884/locations/us-central1/extensions/4349632815109767168
Delete Extension  backing LRO: projects/964731510884/locations/us-central1/operations/3736581346626109440
Extension deleted. . Resource name: projects/964731510884/locations/us-central1/extensions/4349632815109767168
