In [None]:
#@title LICENSE

# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Use Vertex AI Extensions with a Custom Extension

## Overview


Vertex AI Extensions is a platform for creating and managing extensions that connect large language models to external systems via APIs. These external systems can provide LLMs with real-time data and perform data processing actions on their behalf. You can use pre-built or third-party extensions in Vertex AI Extensions.

Learn more about [Vertex AI Extensions](https://cloud.google.com/vertex-ai/docs/generative-ai/extensions/private/overview).

This notebook provides a simple getting started experience for the Vertex AI Extensions framework. This guide assumes that you are familiar with the Vertex AI Python SDK, [LangChain](https://python.langchain.com/docs/get_started/introduction), [OpenAPI specification](https://swagger.io/specification/), and [Cloud Run](https://cloud.google.com/run/docs).

### Objective

In this tutorial, you learn how to create an extension service backend on Cloud Run, register the extension with Vertex, and then use the extension in an application.

The steps performed include:

- Creating a simple service running on Cloud Run
- Creating an OpenAPI 3.1 YAML file for the Cloud Run service
- Registering the service as an extension with Vertex AI
- Using the extension to respond to user queries
- Integrating LangChain into the reasoning for an extension

### Additional Information

This tutorial uses the following Google Cloud services and resources:

- Vertex AI Extensions
- Cloud Run

**_NOTE_**: This notebook has been tested in the following environment:

* Python version = 3.11

### Authenticate your Google Cloud account

You must authenticate to Google Cloud to access the pre-release version of the Python SDK and the Vertex AI Extensions feature.

In [None]:
import sys

if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth
    auth.authenticate_user()

### Installation

This tutorial requires a pre-release version of the Python SDK for Vertex AI. You must be logged in with credentials that are registered for the Vertex AI Extensions Private Preview.

Run the following command to download the library as a wheel from a Cloud Storage bucket:

In [None]:
!gsutil cp gs://vertex_sdk_private_releases/llm_extension/google_cloud_aiplatform-1.39.dev20231219+llm.extension-py2.py3-none-any.whl .

Then, install the following packages required to execute this notebook:

In [None]:
!pip install --force-reinstall --quiet google_cloud_aiplatform-1.39.dev20231219+llm.extension-py2.py3-none-any.whl
!pip install --upgrade --quiet "langchain==0.0.331" \
"openapi-schema-pydantic==1.2.4" \
"openapi-pydantic==0.3.2" \
"google-cloud-storage" \
"shapely<2"

Restart the kernel after installing packages:

In [None]:
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.
1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).
1. Your project must also be allowlisted for the Vertex AI Extension Private Preview.
1. You need the following IAM role on your Google Cloud project:
    - `roles/aiplatform.user`

### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
PROJECT_ID = "your-project-id"  # @param {type:"string"}

# Set the project id
!gcloud config set project {PROJECT_ID}

Updated property [core/project].


### Region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "us-central1"  # @param {type: "string"}

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_NAME = "your-bucket-name"  # @param {type:"string"}
BUCKET_URI = f"gs://{BUCKET_NAME}"
extensions_prefix = "extension"

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
!gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

### Import libraries



In [None]:
import os

import vertexai
from google.cloud.aiplatform.private_preview import llm_extension
from google.cloud import storage

from langchain import PromptTemplate, LLMChain
from langchain.llms import VertexAI
from langchain.tools import OpenAPISpec, APIOperation
from langchain.chains import OpenAPIEndpointChain
from langchain.requests import Requests

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project.

In [None]:
vertexai.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

## Creating an API backend service

In this tutorial, you create a simple "hello world" service that runs on Cloud Run. This service returns "hello" in one of several languages, depending on the prompt sent from your extension (more on that later).
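The lookup that the service performs can be sketched without Flask. This is a minimal illustration of the behavior described above, not the deployed code; `GREETINGS` and `greeting_for` are hypothetical names introduced here.

```python
# Stand-alone sketch of the service's lookup logic (no Flask required).
# GREETINGS and greeting_for are hypothetical names used only for illustration.
GREETINGS = {"French": "bonjour", "Spanish": "hola"}


def greeting_for(prompt):
    """Return the JSON-style payload the service sends for a given prompt."""
    # Any prompt not in the table, including None, falls back to English.
    return {"output": GREETINGS.get(prompt, "hello")}


print(greeting_for("Spanish"))
print(greeting_for("English"))
```

Matching is case-sensitive in this sketch, so `"spanish"` would fall back to `"hello"`.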

This simple example does not demonstrate best practices for authentication. Authenticating to your service is covered later.

**Note**: Your backend API service does not need to be hosted on Cloud Run.

### Deploy the API service to Cloud Run

In [None]:
if not os.path.exists("extension"):
    os.mkdir("extension")

In [None]:
%%writefile extension/Dockerfile

FROM python:3.11-slim

ENV PYTHONUNBUFFERED True

ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./

RUN pip install --no-cache-dir -r requirements.txt

CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 extension:app

Overwriting extension/Dockerfile


In [None]:
%%writefile extension/extension.py
import os

from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/hello", methods=["GET"])
def hello_world():
    args = request.args
    prompt = args.get("prompt")
    data = {
      "output": "hello"
    }
    if prompt == "French":
        data["output"] = "bonjour"
    elif prompt == "Spanish":
        data["output"] = "hola"

    return jsonify(data)


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))

Overwriting extension/extension.py


In [None]:
%%writefile extension/requirements.txt
Flask==2.3.3
gunicorn==21.2.0

Overwriting extension/requirements.txt


In [None]:
%%writefile extension/.dockerignore
Dockerfile
README.md
*.pyc
*.pyo
*.pyd
__pycache__
.pytest_cache

Overwriting extension/.dockerignore


Next, deploy the service to Cloud Run. You might need to log in again before deploying.

In [None]:
!gcloud auth login

In [None]:
!gcloud run deploy extension --region={REGION} --allow-unauthenticated --source extension --no-user-output-enabled

Building and deploying...                                                      
  . Uploading sources...                                                       
  . Building Container...                                                      
  . Creating Revision...                                                       
  . Routing traffic...                                                         
  . Setting IAM Policy...                                                      


List your deployed Cloud Run services, then copy the URL of the `extension` service into the next cell:

In [None]:
!gcloud run services list | sort -k 3 | head -2

✔  extension                  us-central1  https://extension-r5gdynozbq-uc.a.run.app                  koverholt@cloudadvocacyorg.joonix.net  2023-12-19T23:24:57.635822Z


In [None]:
# @title Copy paste the output from the previous command here
service_url = "https://your-extension.run.app"  # @param {type:"string"}
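Because `service_url` is later embedded in the OpenAPI spec and concatenated with `/hello`, a stray trailing slash or a missing scheme can cause confusing failures. The following is a minimal sketch of a sanity check; `normalize_service_url` is a hypothetical helper, and it assumes the service is reachable at the root of an `https://` host, as Cloud Run URLs are.

```python
from urllib.parse import urlparse


def normalize_service_url(url):
    """Validate a base service URL and strip any trailing slash or path."""
    parsed = urlparse(url.strip())
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Expected an https:// URL, got: {url!r}")
    # Keep only scheme and host so that appending "/hello" never yields "//hello".
    return f"{parsed.scheme}://{parsed.netloc}"


print(normalize_service_url("https://your-extension.run.app/"))
```

You could apply it once, right after pasting the URL: `service_url = normalize_service_url(service_url)`.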

### Create an OpenAPI spec

Your Vertex Extension requires an OpenAPI 3.1 YAML file that defines routes, URL, HTTP methods, requests, and responses from your "backend" service. The following code creates a YAML file that you need to upload to your Cloud Storage bucket.

In [None]:
if not os.path.exists("extension-api"):
    os.mkdir("extension-api")

openapi_yaml = f"""
openapi: "3.1.0"
info:
  version: 1.0.0
  title: hello_extensions
  description: Learn to build Vertex AI extensions
servers:
  - url: {service_url}
paths:
  /hello:
    get:
      operationId: say_hello
      description: Prints 'hello' in the prompted language.
      parameters:
        - name: prompt
          in: query
          description: Any of the following strings--French, Spanish, English
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Returns 'Hello' in the specified language.
          content:
            application/json:
              schema:
                type: object
                properties:
                  output:
                    type: string
"""

print(openapi_yaml)


openapi: "3.1.0"
info:
  version: 1.0.0
  title: hello_extensions
  description: Learn to build Vertex AI extensions
servers:
  - url: https://extension-r5gdynozbq-uc.a.run.app
paths:
  /hello:
    get:
      operationId: say_hello
      description: Prints 'hello' in the prompted language.
      parameters:
        - name: prompt
          in: query
          description: Any of the following strings--French, Spanish, English
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Returns 'Hello' in the specified language.
          content:
            application/json:
              schema:
                type: object
                properties:
                  output:
                    type: string



In [None]:
%store openapi_yaml >extension-api/extension.yaml

Writing 'openapi_yaml' (str) to file 'extension-api/extension.yaml'.


Upload the OpenAPI YAML to your Cloud Storage bucket.

In [None]:
storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)
blob_name = f"{extensions_prefix}/extension.yaml"
blob = bucket.blob(blob_name)
blob.upload_from_filename("extension-api/extension.yaml")
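The object name used for the upload determines the `gs://` URI that the extension manifest refers to later in this notebook. As a small sketch, the mapping can be written as a helper; `gcs_uri` is a hypothetical name, not part of the Cloud Storage client library.

```python
def gcs_uri(bucket_name, prefix, filename):
    """Build the gs:// URI for an object stored under a prefix."""
    # Strip stray slashes so the prefix and filename join with exactly one "/".
    object_name = f"{prefix.strip('/')}/{filename}"
    return f"gs://{bucket_name}/{object_name}"


print(gcs_uri("your-bucket-name", "extension", "extension.yaml"))
```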

### Test the service locally using LangChain

First, check that your service can accept simple HTTP `GET` requests:

In [None]:
url = f'{service_url}/hello?prompt=Spanish'
print(url)

https://extension-r5gdynozbq-uc.a.run.app/hello?prompt=Spanish
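String formatting works for a simple value such as `Spanish`, but a prompt containing spaces or punctuation needs URL encoding. A minimal sketch using the standard library follows; `hello_url` is a hypothetical helper and the base URL is a placeholder.

```python
from urllib.parse import urlencode


def hello_url(base_url, prompt):
    """Build the /hello request URL with a safely encoded query string."""
    return f"{base_url}/hello?{urlencode({'prompt': prompt})}"


print(hello_url("https://your-extension.run.app", "Spanish"))
```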


In [None]:
import requests

r = requests.get(url,
                 headers={
                    'Accept': 'application/json'
                 })

print(f"Status Code: {r.status_code}, Content: {r.text}")


Status Code: 200, Content: {"output":"hola"}
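The response body is JSON, so the greeting can be pulled out of it rather than read as raw text. The sketch below parses the body shown in the output above; `extract_greeting` is a hypothetical helper. With a live `requests` response, `r.json()["output"]` would do the same thing.

```python
import json


def extract_greeting(body):
    """Parse the service's JSON body and return the greeting string."""
    payload = json.loads(body)
    return payload["output"]


print(extract_greeting('{"output":"hola"}'))
```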



Next, instantiate the Vertex AI LLM with LangChain. Try a simple, multi-step reasoning prompt first to ensure that it has loaded correctly.

In [None]:
template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])
llm = VertexAI()
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

llm_chain.run(question)

' Justin Bieber was born on March 1, 1994. The Super Bowl is held in February, so the Super Bowl that happened in the year Justin Bieber was born would have been Super Bowl XXVIII, which was held on January 30, 1994. The Dallas Cowboys won Super Bowl XXVIII.\n\nThe final answer is Dallas Cowboys'

Now, create the OpenAPI chain.

In [None]:
spec = OpenAPISpec.from_file("extension-api/extension.yaml")
operation = APIOperation.from_openapi_spec(spec, "/hello", "get")
chain = OpenAPIEndpointChain.from_api_operation(
    operation,
    llm,
    requests=Requests(),
    verbose=True,
    return_intermediate_steps=True,  # Return request and response text
)

In [None]:
output = chain("Question: How do you say 'hello' in Spanish?")



> Entering new OpenAPIEndpointChain chain...


> Entering new APIRequesterChain chain...
Prompt after formatting:
You are a helpful AI Assistant. Please provide JSON arguments to agentFunc() based on the user's instructions.

API_SCHEMA: ```typescript
/* Prints 'hello' in the prompted language. */
type say_hello = (_: {
/* Any of the following strings--French, Spanish, English */
		prompt: string,
}) => any;
```

USER_INSTRUCTIONS: "Question: How do you say 'hello' in Spanish?"

Your arguments must be plain json provided in a markdown block:

ARGS: ```json
{valid json conforming to API_SCHEMA}
```

Example
-----

ARGS: ```json
{"foo": "bar", "baz": {"qux": "quux"}}
```

The block must be no more than 1 line long, and all arguments must be valid JSON. All string arguments must be wrapped in double quotes.
You MUST strictly comply to the types indicated by the provided schema, including all required args.

If you don't have sufficient information to call th




> Finished chain.
{"prompt": "Spanish"}
{"output":"hola"}


> Entering new APIResponderChain chain...
Prompt after formatting:
You are a helpful AI assistant trained to answer user queries from API responses.
You attempted to call an API, which resulted in:
API_RESPONSE: {"output":"hola"}


USER_COMMENT: "Question: How do you say 'hello' in Spanish?"


If the API_RESPONSE can answer the USER_COMMENT respond with the following markdown json block:
Response: ```json
{"response": "Human-understandable synthesis of the API_RESPONSE"}
```

Otherwise respond with the following markdown json block:
Response Error: ```json
{"response": "What you did and a concise statement of the resulting error. If it can be easily fixed, provide a suggestion."}
```

You MUST respond as a markdown json code block. The person you are responding to CANNOT see the API_RESPONSE, so if there is any relevant information there you must include it in yo




> Finished chain.
The Spanish word for 'hello' is 'hola'.

> Finished chain.


## Creating and using a custom extension

### Create the extension

Now that you've set up the service to fulfill extension requests, you can create the extension itself.

First, you'll define selection, invocation, and response examples:

In [None]:
# Include multiple selection, invocation, and response examples for best results.
extension_selection_examples = [{
    "query": "I want to see 'hello' in Spanish",
    "multi_steps": [{
        "thought": "I should call translate_tool for this",
        "extension_execution": {
          "operation_id": "say_hello",
          "extension_instruction": "return 'hola' from the prompt 'Spanish'",
          "observation": "In Spanish, 'hello' is 'hola'"
        }
      },
      {
        "thought": "Since the observation was successful, I should respond back to the user with results",
        "respond_to_user": {}
      }],
}]

extension_invocation_examples = [{
      "extension_instruction": "say 'hello' in Spanish",
      "operation_id": "say_hello",
      "thought": "Issue a say_hello operation request on the translate_tool tool",
      "operation_param": "{\"prompt\": \"Spanish\"}",
      "parameters_mentioned": ["prompt"]
}]

extension_response_examples = [{
  "operation_id": "say_hello",
  "response_template": "{{ response }}",
}]
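In the invocation example, `operation_param` is a JSON-encoded string, which is why the quotes are escaped by hand. When adding more examples, it may be less error-prone to generate that string with `json.dumps`. The sketch below assumes the same dictionary keys as the cell above; `make_invocation_example` is a hypothetical helper, not part of the SDK.

```python
import json


def make_invocation_example(operation_id, instruction, thought, params):
    """Build one invocation example with a JSON-encoded operation_param."""
    return {
        "extension_instruction": instruction,
        "operation_id": operation_id,
        "thought": thought,
        # json.dumps handles the quoting that is otherwise escaped by hand.
        "operation_param": json.dumps(params),
        "parameters_mentioned": sorted(params),
    }


example = make_invocation_example(
    "say_hello",
    "say 'hello' in French",
    "Issue a say_hello operation request on translate_tool",
    {"prompt": "French"},
)
print(example["operation_param"])
```

An example built this way could be appended to `extension_invocation_examples` before creating the extension.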

Then, you'll create your extension and include the examples from the previous cell:

In [None]:
extension_translate = llm_extension.Extension.create(
    display_name = "Hello Extensions",
    description = "Prints and translates hello in different languages",
    manifest = {
        "name": "translate_tool",
        "description": "Prints and translates hello in different languages",
        "api_spec": {
            "open_api_gcs_uri": f"gs://{BUCKET_NAME}/{extensions_prefix}/extension.yaml"
        },
        "auth_config": {
            "auth_type": "NO_AUTH",
        },
        "extension_selection_examples": extension_selection_examples,
        "extension_invocation_examples": extension_invocation_examples,
        "extension_response_examples": extension_response_examples,
    },
)
extension_translate

Creating Extension
Create Extension backing LRO: projects/964731510884/locations/us-central1/extensions/3698299719001309184/operations/2522298297096339456
Extension created. Resource name: projects/964731510884/locations/us-central1/extensions/3698299719001309184
To use this Extension in another session:
extension = aiplatform.Extension('projects/964731510884/locations/us-central1/extensions/3698299719001309184')


<google.cloud.aiplatform.private_preview.llm_extension.extensions.Extension object at 0x14be81f10> 
resource name: projects/964731510884/locations/us-central1/extensions/3698299719001309184
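The resource name printed above follows the pattern `projects/…/locations/…/extensions/…`. If you need the individual parts, for example to confirm the region, a small parser is enough. This is a sketch using a regular expression; `parse_extension_name` is a hypothetical helper, and the pattern is inferred from the output shown here rather than taken from the SDK.

```python
import re

# Pattern inferred from the resource name printed by Extension.create().
_EXTENSION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/extensions/(?P<extension_id>[^/]+)$"
)


def parse_extension_name(resource_name):
    """Split an extension resource name into project, location, and ID."""
    match = _EXTENSION_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"Not an extension resource name: {resource_name!r}")
    return match.groupdict()


parts = parse_extension_name(
    "projects/964731510884/locations/us-central1/extensions/3698299719001309184"
)
print(parts["location"])
```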

Now that you've created your extension, confirm that it's registered:

In [None]:
print("Name:", extension_translate.gca_resource.name)
print("Display Name:", extension_translate.display_name)
print("Description:", extension_translate.gca_resource.description)

Name: projects/964731510884/locations/us-central1/extensions/3698299719001309184
Display Name: Hello Extensions
Description: Prints and translates hello in different languages


You can also test the extension by executing it directly:

In [None]:
extension_translate.execute("say_hello",
    operation_params = {
        "prompt": "Spanish",
    },
)

{'output': 'hola'}

### Create a controller

The extension controller allows an application developer to specify which extensions to use.

You'll create an extension controller that refers to the extension/tool that you created in the previous section:

In [None]:
# Define the extensions controller service client
client_options = {"api_endpoint": f"{REGION}-aiplatform.googleapis.com"}
controller_client = llm_extension.extensions.services.extension_controller_service.client.ExtensionControllerServiceClient(
    client_options=client_options)

controller_spec = llm_extension.gapic.types.ExtensionControllerSpec()

controller_req = llm_extension.gapic.types.ExtensionController()
controller_req.display_name = "Translate Extension Controller"
controller_req.description = "Prints and translates hello in different languages"
controller_req.extension_controller_spec.extensions = [{"extension": extension_translate.resource_name}]

parent = f"projects/{PROJECT_ID}/locations/{REGION}"

controller_op = controller_client.create_extension_controller(
    parent=parent,
    extension_controller=controller_req
)
controller = controller_op.result(timeout=300)
print(controller.name)

projects/964731510884/locations/us-central1/extensionControllers/6418192418956378112


### Use the controller in a query

Now that you have an extension and an extension controller, you can start using the controller to answer queries.

In [None]:
execution_client = llm_extension.extensions.services.extension_controller_execution_service.client.ExtensionControllerExecutionServiceClient(
    client_options=client_options
)

req = {
    "query": {
        "query": "Question: how do I say 'hello' in French?",
    },
    "name": controller.name,
}

response = execution_client.query(req)

print(response)

response: "Bonjour "
metadata {
  steps {
    thought: "I should call translate_tool for this"
    extension_invoked: "translate_tool"
    extension_instruction: "return \'bonjour\' from the prompt \'French\'"
    response: "{\"output\":\"bonjour\"}"
    success: true
    error: ""
  }
  use_creativity: false
}
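The printed response contains the final answer plus one entry per reasoning step. The sketch below shows one way to summarize those steps. It assumes the response object exposes attributes named after the printed fields (`response`, `metadata.steps`, `extension_invoked`, `success`), and it runs against a stand-in object rather than a real API response; `summarize_steps` is a hypothetical helper.

```python
from types import SimpleNamespace


def summarize_steps(response):
    """Return one (extension, success) pair per reasoning step."""
    return [
        (step.extension_invoked, step.success)
        for step in response.metadata.steps
    ]


# Stand-in object shaped like the printed output; not a real API response.
fake_response = SimpleNamespace(
    response="Bonjour ",
    metadata=SimpleNamespace(
        steps=[SimpleNamespace(extension_invoked="translate_tool", success=True)]
    ),
)

print(fake_response.response.strip(), summarize_steps(fake_response))
```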



## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
# Delete the controller
op = controller_client.delete_extension_controller(name=controller.name)
op.result()

# Delete the extension
extension_translate.delete()

# Delete the Cloud Storage bucket and its objects (set delete_bucket to True to enable)
delete_bucket = False
if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil -m rm -r $BUCKET_URI

Deleting Extension : projects/964731510884/locations/us-central1/extensions/3698299719001309184
Delete Extension  backing LRO: projects/964731510884/locations/us-central1/operations/5234591172680220672
Extension deleted. . Resource name: projects/964731510884/locations/us-central1/extensions/3698299719001309184
