# How to access Google's Gemini LLM

## Prerequisites

1. You need an API key to access the Gemini API.
    - See the documentation, "Get a Gemini API key", at: https://ai.google.dev/gemini-api/docs/api-key
2. Choose a Gemini model to use.
    - See the documentation, "Gemini models", at: https://ai.google.dev/gemini-api/docs/models/gemini
    - They are free (as of 2024-12-29), but they have different rate limits,
        token limits, and supported data types.
3. Choose an API.
    - `google-generativeai`
        - We use this API in this notebook.
        - https://pypi.org/project/google-generativeai/
        - https://ai.google.dev/gemini-api/docs/quickstart?lang=python
        - This API provides more fine-grained control over model behavior, so
            you have greater ability to customize the LLM usage.
    - `google-genai`
        - We are _not_ using this API in this notebook.
        - https://pypi.org/project/google-genai/
        - https://googleapis.github.io/python-genai/
        - This API is more concise and simplified.
        - The documentation says, "Please do not use this SDK in production."

In [1]:
import dotenv
import os
import google.generativeai as generativeai
import json

In [2]:
# List of possible models: https://ai.google.dev/gemini-api/docs/models/gemini
# We experiment with a few different models in this notebook:
GEMINI_1_5_FLASH = "gemini-1.5-flash"
GEMINI_1_5_PRO = "gemini-1.5-pro"
GEMINI_2_0_FLASH_EXP = "gemini-2.0-flash-exp"

# The name of the key, which you have defined in one of:
#   - an environment variable;
#   - a .env file; or
#   - a Secret in Google Colab.
GEMINI_API_KEY_NAME = "GEMINI_API_KEY"

## Load API Key

In [3]:
# Loads variables defined in '.env' file as environment variables.
#   override=False means that the value of the variable in the environment
#     takes precedence over the value of the variable in the '.env' file.
_ = dotenv.load_dotenv(override=False)
GEMINI_API_KEY_VALUE = os.environ.get(GEMINI_API_KEY_NAME)

In [4]:
# In Google Colab, you have two options:
# (1) Prompt the user for the key value:
# import getpass
# GEMINI_API_KEY_VALUE = getpass.getpass()
# -or-
# (2) Save your API Key in the Secrets, then read that value:
# from google.colab import userdata
# GEMINI_API_KEY_VALUE = userdata.get(GEMINI_API_KEY_NAME)
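
Whichever route supplies the key, `os.environ.get` returns `None` when the variable is missing, and the failure would otherwise surface only at the first API call. A minimal sketch of an early check follows; the helper name `require_api_key` is hypothetical and not part of this notebook or of any SDK.

```python
import os


def require_api_key(name: str = "GEMINI_API_KEY") -> str:
    """Return the value of environment variable `name`, or raise if unusable.

    `require_api_key` is a hypothetical helper written for illustration.
    """
    value = os.environ.get(name)
    if not value:
        # Covers both an unset variable and one set to an empty string.
        raise RuntimeError(f"Environment variable {name!r} is not set or is empty.")
    return value
```

In this notebook it could be called as `GEMINI_API_KEY_VALUE = require_api_key(GEMINI_API_KEY_NAME)` after `dotenv.load_dotenv` has run.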

## Get a model that returns JSON

See:

- https://ai.google.dev/gemini-api/docs/quickstart?lang=python#make-first-request
- https://github.com/google-gemini/cookbook/blob/main/quickstarts/JSON_mode.ipynb
- https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#python
- https://cloud.google.com/vertex-ai/generative-ai/docs/reference/python/latest/vertexai.generative_models.GenerationConfig

In [5]:
generativeai.configure(api_key=GEMINI_API_KEY_VALUE)
generation_configuration = generativeai.GenerationConfig(
    temperature=1.0,
    candidate_count=1,
    max_output_tokens=1024,
    response_mime_type="application/json",
)
model = generativeai.GenerativeModel(
    model_name=GEMINI_2_0_FLASH_EXP, generation_config=generation_configuration
)

## Submit a prompt

In [6]:
prompt = """
You are a creative writer.
I want you to provide a few paragraphs describing a city from the United States in the 1920s,
using a detective noir style.
The city has a corrupt government.
Gangsters on the South Side are making a lot of money
from bootleg liquor during Prohibition.
Detective Harston Cooper is trying to help Mrs. Lucille Robinson
to find her missing husband, Roland Robinson.
Provide the response as JSON in this format:
{
  "city": {
    "name": ,
    "description": ,
    "atmosphere": ,
    "mood":
  }
}
"""

In [7]:
raw_response = model.generate_content(prompt)
raw_response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "{\n  \"city\": {\n    \"name\": \"Steel City\",\n    \"description\": \"Steel City was a beast of iron and grime, its towering factories belching smoke into the perpetually gray sky. The air hung thick with the stench of coal and desperation, a grim symphony of clanging metal and the low rumble of discontent. The grand facades of downtown hid a rot that festered beneath the surface, where backroom deals and whispered promises greased the wheels of a corrupt government. On the South Side, the streets were a labyrinth of speakeasies and darkened alleys, where the scent of cheap whiskey and danger clung to the brick walls.\",\n    \"atmosphere\": \"The city was a pressure cooker, simmering with secrets and violence. Rain slicked the cobblestone streets, reflect

In [8]:
response = json.loads(raw_response.text)
response

{'city': {'name': 'Steel City',
  'description': 'Steel City was a beast of iron and grime, its towering factories belching smoke into the perpetually gray sky. The air hung thick with the stench of coal and desperation, a grim symphony of clanging metal and the low rumble of discontent. The grand facades of downtown hid a rot that festered beneath the surface, where backroom deals and whispered promises greased the wheels of a corrupt government. On the South Side, the streets were a labyrinth of speakeasies and darkened alleys, where the scent of cheap whiskey and danger clung to the brick walls.',
  'atmosphere': 'The city was a pressure cooker, simmering with secrets and violence. Rain slicked the cobblestone streets, reflecting the neon glow of illicit bars and the flickering gas lamps, casting long, distorted shadows that seemed to writhe with a life of their own. Every corner held a potential threat, every smile a possible deception. The rhythmic pulse of jazz music from hidden 

#### Analysis

The response above is good: `json.loads` returns a `dict` with the structure that was requested in the prompt.

In [9]:
type(response)

dict

In [10]:
response["city"]["name"]

'Steel City'

In [11]:
response["city"]["description"]

'Steel City was a beast of iron and grime, its towering factories belching smoke into the perpetually gray sky. The air hung thick with the stench of coal and desperation, a grim symphony of clanging metal and the low rumble of discontent. The grand facades of downtown hid a rot that festered beneath the surface, where backroom deals and whispered promises greased the wheels of a corrupt government. On the South Side, the streets were a labyrinth of speakeasies and darkened alleys, where the scent of cheap whiskey and danger clung to the brick walls.'
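
`json.loads` raises `json.JSONDecodeError` when the text is not valid JSON, which can happen if the model output is malformed or cut off. A minimal sketch of a more defensive parse is below; the helper name `parse_json_text` is an assumption made for illustration.

```python
import json
from typing import Optional


def parse_json_text(text: str) -> Optional[dict]:
    """Parse model output as a JSON object.

    Returns None if the text is not valid JSON or is not a JSON object.
    `parse_json_text` is a hypothetical helper, not part of any SDK.
    """
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    # The prompt asks for an object, so reject arrays, strings, numbers, etc.
    return parsed if isinstance(parsed, dict) else None
```

With this helper, `parse_json_text(raw_response.text)` yields either a `dict` or `None`, so the caller can decide whether to retry instead of handling an exception.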

## response_schema

Let's try another way to get the structure that we want in the response: the `GenerationConfig` constructor accepts a `response_schema` argument. According to https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output,
as of this writing (2024-12-29), `response_schema` only works with the following models:

- "gemini-1.5-flash"
- "gemini-1.5-pro"

### Option 1

- Remove the instructions about JSON from the prompt,
- but provide a `response_schema` in the `generation_config` argument
  to the `GenerativeModel` constructor.

In [12]:
prompt = """
You are a creative writer.
I want you to provide a few paragraphs describing a city from the United States in the 1920s,
using a detective noir style.
The city has a corrupt government.
Gangsters on the South Side are making a lot of money
from bootleg liquor during Prohibition.
Mrs. Lucille Robinson has hired Detective Harston Cooper 
to find her missing husband, Roland Robinson.
"""

In [13]:
generation_configuration_with_schema = generativeai.GenerationConfig(
    temperature=0.0,
    candidate_count=1,
    max_output_tokens=1024,
    response_mime_type="application/json",
    response_schema={
        "type": "OBJECT",
        "properties": {
            "city": {
                "type": "OBJECT",
                "properties": {
                    "name": {"type": "STRING"},
                    "description": {"type": "STRING"},
                    "atmosphere": {"type": "STRING"},
                    "mood": {"type": "STRING"},
                },
            }
        },
    },
)
raw_response = generativeai.GenerativeModel(
    model_name=GEMINI_1_5_PRO,
    generation_config=generation_configuration_with_schema,
).generate_content(prompt)
raw_response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "{\"city\": {\"atmosphere\": \"A thick fog hung over the city, clinging to the cold, damp streets like a shroud.  The air was heavy with the scent of coal smoke and cheap perfume, a grim symphony of despair and fleeting pleasure.\", \"description\": \"Chicago in the 1920s. A city of stark contrasts, where gleaming skyscrapers cast long shadows over the grimy alleys below.  A place where fortunes were made and lost in the blink of an eye, where the pursuit of pleasure masked a deep-seated corruption. The South Side, a labyrinth of speakeasies and gambling dens, pulsed with a dangerous energy, fueled by the illicit fortunes of bootlegging gangs.\", \"mood\": \"The city's mood was as unpredictable as a loaded dice roll. One moment, a burst of raucous laughter fr

In [14]:
response = json.loads(raw_response.text)
response

{'city': {'atmosphere': 'A thick fog hung over the city, clinging to the cold, damp streets like a shroud.  The air was heavy with the scent of coal smoke and cheap perfume, a grim symphony of despair and fleeting pleasure.',
  'description': 'Chicago in the 1920s. A city of stark contrasts, where gleaming skyscrapers cast long shadows over the grimy alleys below.  A place where fortunes were made and lost in the blink of an eye, where the pursuit of pleasure masked a deep-seated corruption. The South Side, a labyrinth of speakeasies and gambling dens, pulsed with a dangerous energy, fueled by the illicit fortunes of bootlegging gangs.',
  'mood': "The city's mood was as unpredictable as a loaded dice roll. One moment, a burst of raucous laughter from a hidden speakeasy, the next, the chilling silence of a dark alley where secrets were buried.  A sense of unease permeated every corner, a constant reminder that beneath the veneer of prosperity, something sinister lurked.",
  'name': 'Ch

#### Analysis

The response above is good: it is well-formed JSON.
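
The schema used in Option 1 declares the four properties but does not mark any of them as mandatory. The sketch below shows the same schema with `required` lists added. It assumes that the `required` field of the Gemini API schema format is honored when the schema is passed as a plain `dict`, which is worth verifying against your SDK version. The names `CITY_FIELDS` and `city_schema_with_required` are hypothetical.

```python
# Property names taken from the schema in Option 1.
CITY_FIELDS = ["name", "description", "atmosphere", "mood"]

# Same shape as the Option 1 schema, plus "required" lists.
# Assumption: the API accepts "required" in a dict-style schema.
city_schema_with_required = {
    "type": "OBJECT",
    "properties": {
        "city": {
            "type": "OBJECT",
            "properties": {field: {"type": "STRING"} for field in CITY_FIELDS},
            "required": CITY_FIELDS,
        }
    },
    "required": ["city"],
}
```

This dictionary could be passed as `response_schema=city_schema_with_required` in place of the inline schema.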

### Option 2a

Describe the JSON properties in the prompt and also specify
the `response_schema`.

In [15]:
prompt = """
You are a creative writer.
I want you to provide a few paragraphs describing a city from the United States in the 1920s,
using a detective noir style.
The city has a corrupt government.
Gangsters on the South Side are making a lot of money
from bootleg liquor during Prohibition.
Mrs. Lucille Robinson has hired Detective Harston Cooper 
to find her missing husband, Roland Robinson.

In your response, provide 4 properties of the "city":
"name": "name of the city",
"description": "description of the city",
"atmosphere": "the feeling of the city when you walk around",
"mood": "describe how people feel"

The 'description', 'atmosphere', and 'mood' should only be a couple of sentences each.
"""
raw_response = generativeai.GenerativeModel(
    model_name=GEMINI_1_5_PRO,
    generation_config=generation_configuration_with_schema,
).generate_content(prompt)
raw_response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "{\"city\": {\"name\": \"New Corinthiana, IL (fictional city name) during the Roaring Twenties era of the United States of America in the 1920s decade of the 20th century A.D. Gregorian calendar year time period and epoch of human history and civilization and development and progress and advancement and growth and evolution and improvement and betterment and prosperity and success and achievement and accomplishment and attainment and realization and fulfillment and consummation and perfection and completion and culmination and climax and apex and summit and pinnacle and zenith and acme and peak and crest and crown and capstone and keystone and cornerstone and foundation and groundwork and infrastructure and superstructure and edifice and monument and memorial

#### Analysis

This is clearly a bad response. The content degenerates into a runaway chain of near-synonyms, and the output is not even valid JSON.

Note: `"finish_reason": "MAX_TOKENS"` in the response identifies this as a truncated response.
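
The truncation check can also be done in code before calling `json.loads`. The sketch below assumes the response exposes `candidates[0].finish_reason` as an enum with a `.name` attribute, and falls back to `str()` if it does not. The helper name `is_truncated` is hypothetical, and the stand-in object at the end exists only so that the example runs without an API call.

```python
from types import SimpleNamespace


def is_truncated(response) -> bool:
    """Return True if the first candidate stopped at the output token limit."""
    candidates = getattr(response, "candidates", None)
    if not candidates:
        return False
    reason = candidates[0].finish_reason
    # Assumption: finish_reason is an enum with .name; otherwise compare its str().
    return getattr(reason, "name", str(reason)) == "MAX_TOKENS"


# Stand-in for a real response, for illustration only.
fake_response = SimpleNamespace(
    candidates=[SimpleNamespace(finish_reason=SimpleNamespace(name="MAX_TOKENS"))]
)
print(is_truncated(fake_response))  # prints True
```

When `is_truncated(raw_response)` is true, parsing the text as JSON is likely to fail, so raising `max_output_tokens` or changing the prompt is the more useful next step.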

### Option 2b

Modify the prompt slightly from (2a), keeping the `response_schema` in the configuration.

In [16]:
prompt = """
You are a creative writer.
I want you to provide a few paragraphs describing a city from the United States in the 1920s,
using a detective noir style.
The city has a corrupt government.
Gangsters on the South Side are making a lot of money
from bootleg liquor during Prohibition.
Mrs. Lucille Robinson has hired Detective Harston Cooper 
to find her missing husband, Roland Robinson.

In your response, provide 4 properties of the "city":
"name": "name of the city",
"description": "description of the city",
"atmosphere": "the feeling of the city when you walk around",
"mood": "describe how people feel"
"""
raw_response = generativeai.GenerativeModel(
    model_name=GEMINI_1_5_PRO,
    generation_config=generation_configuration_with_schema,
).generate_content(prompt)
raw_response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "{\"city\": {\"name\": \"New Corinth, IL (fictional city name) during the Roaring Twenties era of the United States of America in the 1920s decade of the 20th century A.D. Gregorian calendar year time period and epoch of human history and civilization and development and progress and advancement and growth and evolution and prosperity and success and achievement and accomplishment and greatness and glory and grandeur and splendor and magnificence and brilliance and wonder and amazement and awe and inspiration and motivation and encouragement and hope and optimism and positivity and happiness and joy and love and peace and harmony and unity and togetherness and cooperation and collaboration and teamwork and synergy and innovation and creativity and imagination

### Option 2c

Remove more of the instructions from (2b), keeping the `response_schema` in the configuration.

In [17]:
prompt = """
You are a creative writer.
I want you to provide a few paragraphs describing a city from the United States in the 1920s,
using a detective noir style.
The city has a corrupt government.
Gangsters on the South Side are making a lot of money
from bootleg liquor during Prohibition.
Mrs. Lucille Robinson has hired Detective Harston Cooper 
to find her missing husband, Roland Robinson.
"""
raw_response = generativeai.GenerativeModel(
    model_name=GEMINI_1_5_PRO,
    generation_config=generation_configuration_with_schema,
).generate_content(prompt)
raw_response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "{\"city\": {\"atmosphere\": \"A thick fog hung over the city, clinging to the cold, damp streets like a shroud.  The air was heavy with the scent of coal smoke and desperation.\", \"description\": \"Chicago in the 1920s. A city of stark contrasts. Opulence and poverty danced a dangerous tango in the flickering gaslight.  Skyscrapers clawed at the heavens, casting long shadows over the grimy alleys below.  The rattle of streetcars and the blare of jazz music from dimly lit speakeasies filled the night.\", \"mood\": \"The city was a symphony of shadows and whispers.  Corruption had seeped into every corner, from the highest echelons of power down to the beat cops on the street.  A sense of unease permeated the air, a premonition of violence lurking just around

In [18]:
response = json.loads(raw_response.text)
response

{'city': {'atmosphere': 'A thick fog hung over the city, clinging to the cold, damp streets like a shroud.  The air was heavy with the scent of coal smoke and desperation.',
  'description': 'Chicago in the 1920s. A city of stark contrasts. Opulence and poverty danced a dangerous tango in the flickering gaslight.  Skyscrapers clawed at the heavens, casting long shadows over the grimy alleys below.  The rattle of streetcars and the blare of jazz music from dimly lit speakeasies filled the night.',
  'mood': 'The city was a symphony of shadows and whispers.  Corruption had seeped into every corner, from the highest echelons of power down to the beat cops on the street.  A sense of unease permeated the air, a premonition of violence lurking just around the corner.',
  'name': 'Chicago'}}

#### Analysis

The response above is good:

- The `city` object has all four of the requested properties.
- The structure is well-formed.
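
The two checks above can be sketched as a small function over the parsed `dict`. The names `REQUESTED_PROPERTIES` and `missing_city_properties` are hypothetical, and the rule that each property must be a string is an assumption taken from the `STRING` types in the schema.

```python
REQUESTED_PROPERTIES = ("name", "description", "atmosphere", "mood")


def missing_city_properties(response: dict) -> list:
    """List the requested `city` properties that are absent or not strings."""
    city = response.get("city")
    if not isinstance(city, dict):
        # Without a `city` object, every requested property is missing.
        return list(REQUESTED_PROPERTIES)
    return [key for key in REQUESTED_PROPERTIES if not isinstance(city.get(key), str)]
```

An empty list from `missing_city_properties(response)` means that the structure matches what was requested.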