In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Controlled Generation with the Gemini API



## Overview

In the past, getting a large language model (LLM) to produce output in a specific format, such as JSON that follows a schema, often required supervised fine-tuning. That process is time-consuming: you need to prepare a labeled dataset, set up a training pipeline, and secure accelerator compute resources.


Gemini 1.5 Pro changes this. With the controlled output schema configuration, Gemini can produce valid JSON that follows your schema and loads with the `json` library out of the box.

Check this out!


## Get started

### Install the Vertex AI SDK and other required packages


In [2]:
%pip install --upgrade --user --quiet google-cloud-aiplatform

Note: you may need to restart the kernel to use updated packages.


### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [3]:
PROJECT_ID = "[your-project-id]"
LOCATION = "us-central1"

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

## Code Examples

### Import libraries

In [4]:
import json

from vertexai import generative_models
from vertexai.generative_models import GenerationConfig, GenerativeModel, Part

### Gemini 1.5 Flash models

You can set the output format with the `response_mime_type` option in `generation_config` and describe the structure you want in the prompt.

In [5]:
model = GenerativeModel(
    model_name="gemini-1.5-flash",
    generation_config={"response_mime_type": "application/json"},
)

prompt = """
Based on the context below, identify the intent in two words and return the result as JSON that follows the schema below:

# Context
Customer: "Hello, I need to file a claim for the accident I was involved in last week. My car was severely damaged, and I have already reported the incident to the police. Can you guide me on the next steps to process my insurance claim?"

Agent: "I'm sorry to hear about the accident. To assist you with your claim, I'll need some information. Could you please provide your policy number and a brief description of the incident?"

Customer: "Sure, my policy number is 12345678. The incident occurred on the 5th of June. I was driving through an intersection when another car ran a red light and hit my vehicle from the side. Fortunately, no one was injured."

# Return Schema
Intent = {"intent_name": STRING}
Return: list[Intent]
"""

Generate the content and parse the response string as JSON.

In [6]:
response = model.generate_content(prompt)

json_response = json.loads(response.text)
print(json_response)

[{'intent_name': 'File Claim'}]
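
With only `response_mime_type` set, the shape of the JSON is steered by the prompt rather than enforced, so it can be worth parsing defensively. The following is a minimal sketch under that assumption; `parse_intents` is a hypothetical helper (not part of the Vertex AI SDK) and it expects the list-of-intents shape requested in the prompt above.

```python
import json


def parse_intents(text: str) -> list[str]:
    """Return intent names from a JSON response string, or [] if malformed.

    Hypothetical helper: assumes the model was asked for a list of
    {"intent_name": STRING} objects, as in the prompt above.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return []
    # Tolerate a single object as well as the requested list.
    if isinstance(data, dict):
        data = [data]
    if not isinstance(data, list):
        return []
    return [
        item["intent_name"]
        for item in data
        if isinstance(item, dict) and isinstance(item.get("intent_name"), str)
    ]


print(parse_intents('[{"intent_name": "File Claim"}]'))
```

In the notebook you would call `parse_intents(response.text)` instead of the literal string used here.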


### Gemini 1.5 Pro models

With Gemini 1.5 Pro models, you can set the `response_schema` parameter in `generation_config`, and the model output will strictly follow that schema.

Note that when `response_schema` is specified, `response_mime_type` must be set to `application/json`.

In [7]:
model = GenerativeModel("gemini-1.5-pro")


Following the previous example, define the data structure for the model output. Note that all of the fields in the JSON are optional by default unless specified in the `required` field.

In [8]:
response_schema = {
    "type": "OBJECT",
    "properties": {
        "intent_name": {
            "type": "STRING",
        },
    },
    "required": ["intent_name"],
}
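
If you want the list-of-intents shape from the Flash example instead of a single object, the same notation can be nested. The sketch below assumes that `response_schema` accepts the OpenAPI-style `ARRAY` type with an `items` sub-schema; `intent_list_schema` is an illustrative name, not something defined elsewhere in this notebook.

```python
# Illustrative schema for a list of intents. It would be passed as
# `response_schema` in the same way as the single-object schema above.
intent_list_schema = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {
            "intent_name": {"type": "STRING"},
        },
        # Each list element must carry an intent name.
        "required": ["intent_name"],
    },
}
```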


When prompting the model to generate the content, pass the schema to the `response_schema` field of the `generation_config`.

In [9]:
prompt2 = """
Based on the context below, identify the intent in two words and return the result as JSON:

# Context
Customer: "Hello, I need to file a claim for the accident I was involved in last week. My car was severely damaged, and I have already reported the incident to the police. Can you guide me on the next steps to process my insurance claim?"

Agent: "I'm sorry to hear about the accident. To assist you with your claim, I'll need some information. Could you please provide your policy number and a brief description of the incident?"

Customer: "Sure, my policy number is 12345678. The incident occurred on the 5th of June. I was driving through an intersection when another car ran a red light and hit my vehicle from the side. Fortunately, no one was injured."
"""


response = model.generate_content(
    prompt2,
    generation_config=GenerationConfig(
        response_mime_type="application/json", response_schema=response_schema
    ),
)

print(response.text)

{"intent_name":"file claim"
} 


You can parse the response string as JSON.

In [10]:
json_response = json.loads(response.text)
print(json_response)

{'intent_name': 'file claim'}
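
Once the response is parsed, you may still want a quick sanity check before passing it downstream. The following is a small sketch, not an SDK feature: `missing_required` is a hypothetical helper that only compares the top-level keys of the parsed object against the schema's `required` list, and the JSON string stands in for `response.text`.

```python
import json


def missing_required(data: dict, schema: dict) -> list[str]:
    """List the schema's `required` keys that are absent from `data`."""
    return [key for key in schema.get("required", []) if key not in data]


# Same structure as the `response_schema` defined earlier.
schema = {
    "type": "OBJECT",
    "properties": {"intent_name": {"type": "STRING"}},
    "required": ["intent_name"],
}

# Stand-in for `response.text` from the model call above.
parsed = json.loads('{"intent_name": "file claim"}')
print(missing_required(parsed, schema))  # prints []
```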
