<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/llm/gemini.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Google GenAI

In this notebook, we show how to use the `google-genai` Python SDK with LlamaIndex to interact with Google GenAI models.

If you're opening this notebook on Colab, you will need to install LlamaIndex 🦙 and the `google-genai` Python SDK.

In [None]:
%pip install llama-index-llms-google-genai llama-index

## Setup

You will need to get an API key from [Google AI Studio](https://makersuite.google.com/app/apikey). Once you have one, you can either pass it explicitly to the model or set the `GOOGLE_API_KEY` environment variable.

In [None]:
import os

os.environ["GOOGLE_API_KEY"] = "..."
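Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the key can instead be requested interactively only when `GOOGLE_API_KEY` is not already set. The helper name `ensure_api_key` and its `prompt` parameter are assumptions made for this illustration; they are not part of LlamaIndex or the `google-genai` SDK.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(
    var_name: str = "GOOGLE_API_KEY",
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the API key, asking for it only if the env var is unset or empty.

    `ensure_api_key` is a hypothetical helper written for this notebook.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # getpass hides the typed value so it does not end up in cell output
        key = prompt(f"Enter {var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} was not provided")
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` once at the top of the notebook leaves the key in the environment, which is where `GoogleGenAI` looks for it by default.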

## Basic Usage

You can call `complete` with a prompt:

In [None]:
from llama_index.llms.google_genai import GoogleGenAI

llm = GoogleGenAI(
    model="models/gemini-2.0-flash",
    # api_key="some key",  # uses GOOGLE_API_KEY env var by default
)

resp = llm.complete("Who is Paul Graham?")
print(resp)

Paul Graham is a well-known figure in the tech and startup world, primarily recognized for the following:

*   **Co-founder of Y Combinator (YC):** This is his most famous accomplishment. YC is a highly successful startup accelerator that has funded companies like Airbnb, Dropbox, Stripe, Reddit, and many others. He essentially pioneered a new model for early-stage startup funding and mentorship.

*   **Programmer and Essayist:** Before YC, he was a successful programmer and had a startup called Viaweb, which he sold to Yahoo! in 1998 (it became Yahoo! Store). He's also a prolific essayist, writing on topics like programming, startups, design, creativity, and societal trends. His essays are highly influential and often thought-provoking.

*   **Lisp Advocate:** Graham is a strong advocate for the Lisp programming language, particularly its dialect, Common Lisp. He wrote several books on Lisp programming and has used it in various projects, including Viaweb.

**In summary, Paul Graham i

You can also call `chat` with a list of chat messages:

In [None]:
from llama_index.core.llms import ChatMessage
from llama_index.llms.google_genai import GoogleGenAI

messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="Tell me a story"),
]
llm = GoogleGenAI(model="models/gemini-2.0-flash")
resp = llm.chat(messages)

print(resp)

assistant: Ahoy there, matey! Gather 'round, ye landlubbers, and let ol' One-Eyed Pete spin ye a yarn that'll curl yer toes and make yer grog taste all the sweeter!

This tale be about the infamous Pearl of Paradox, a jewel said to grant the wearer anything their heart desired. Not immortality, mind you, but the *perfect* slice of apple pie, the *exact* winning lottery numbers… small stuff, but enough to make a pirate a king!

Now, this Pearl was guarded by a beast, a kraken of epic proportions, with tentacles that could crush a galleon like a walnut and a beak sharp enough to shave yer beard clean off yer face. The kraken, they say, lived in the Whispering Abyss, a part of the sea so dark, even the stars feared to peek in.

I, One-Eyed Pete, was at the helm of me ship, the *Sea Serpent's Kiss*, a vessel as sleek and deadly as a viper, with a crew that were more rum-sotted scallywags than sailors, but loyal as barnacles to a ship's hull. We'd heard whispers of the Pearl, and greed, tha

## Streaming Support

Both `complete` and `chat` have streaming counterparts, `stream_complete` and `stream_chat`, which yield the response incrementally.

In [None]:
from llama_index.llms.google_genai import GoogleGenAI

llm = GoogleGenAI(model="models/gemini-2.0-flash")

resp = llm.stream_complete("Who is Paul Graham?")
for r in resp:
    print(r.delta, end="")

Paul Graham is a prominent figure in the tech world, known for his contributions to computer science, writing, and venture capital. Here's a breakdown of who he is:

*   **Computer Scientist and Programmer:** He holds a Ph.D. in computer science from Harvard, specializing in Lisp programming. He is well-regarded for his work on Lisp, particularly his book "On Lisp," which is considered a classic in the field. He also developed the Viaweb e-commerce platform using Lisp.

*   **Entrepreneur:** Graham co-founded Viaweb in 1995, which was one of the first Software as a Service (SaaS) platforms. Viaweb allowed businesses to easily create online stores. Yahoo! acquired Viaweb in 1998, and it was renamed Yahoo! Store.

*   **Writer and Essayist:** He is a prolific writer and essayist on topics ranging from technology and startups to art, philosophy, and culture. His essays are widely read and have influenced many people in the tech industry. He often writes about his experiences and perspecti

In [None]:
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="user", content="Who is Paul Graham?"),
]

resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")

Paul Graham is a well-known figure in the technology and startup world. Here's a breakdown of who he is and why he's significant:

*   **Computer Scientist and Hacker:** Graham holds a Ph.D. in computer science from Harvard University. He is a skilled programmer and is known for his work on Lisp, a programming language.

*   **Entrepreneur:** He co-founded Viaweb in 1995, which was one of the first software-as-a-service (SaaS) companies, providing tools for building and hosting online stores. Yahoo! acquired Viaweb in 1998, and it was rebranded as Yahoo! Store.

*   **Y Combinator (YC) Founder:** Graham is most famous for co-founding Y Combinator (YC) in 2005 with Jessica Livingston, Trevor Blackwell, and Robert Tappan Morris. YC is a seed accelerator that provides funding, mentorship, and networking opportunities to early-stage startups.

*   **Influential Startup Advisor and Essayist:** Through YC, Graham has advised thousands of startups, including Airbnb, Dropbox, Reddit, Stripe, a
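Each streamed chunk exposes the newly generated text as `delta`, which is what the loops above print. If the full text is also needed once streaming finishes, the deltas can be accumulated along the way. This is a minimal sketch: `collect_stream` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for response chunks so that the snippet runs without an API call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable, echo: bool = True) -> str:
    """Print each chunk's delta as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        delta = chunk.delta or ""  # treat a missing delta as empty text
        if echo:
            print(delta, end="", flush=True)
        parts.append(delta)
    return "".join(parts)


# Stand-in chunks; with a real model, pass llm.stream_complete(...) instead.
fake_stream = (
    SimpleNamespace(delta=d) for d in ["Paul ", "Graham ", None, "is..."]
)
full_text = collect_stream(fake_stream, echo=False)
```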

## Async Usage

Every synchronous method has an async counterpart, prefixed with `a` (for example, `achat` and `astream_complete`).

In [None]:
from llama_index.llms.google_genai import GoogleGenAI

llm = GoogleGenAI(model="models/gemini-2.0-flash")

resp = await llm.astream_complete("Who is Paul Graham?")
async for r in resp:
    print(r.delta, end="")

Paul Graham is a prominent figure in the tech and startup world, known for his work as a programmer, essayist, venture capitalist, and co-founder of Y Combinator. Here's a breakdown of his key contributions:

*   **Programmer:** Graham holds a Ph.D. in computer science from Harvard, where he studied Lisp. He is known for his expertise in Lisp and its applications, particularly in building web applications.

*   **Essayist:** Graham is a prolific and influential essayist, covering topics like startups, technology, design, programming, and life. His essays are known for their clear thinking, insightful observations, and often contrarian perspectives. Many of his essays are considered essential reading for entrepreneurs.

*   **Venture Capitalist:** Graham is a co-founder of Y Combinator (YC), a highly successful startup accelerator. YC has funded many well-known companies, including Airbnb, Dropbox, Reddit, Stripe, and many others.

*   **Y Combinator (YC):** Graham's most significant im

In [None]:
messages = [
    ChatMessage(role="user", content="Who is Paul Graham?"),
]

resp = await llm.achat(messages)
print(resp)

assistant: Paul Graham is a prominent figure in the world of computer science, startups, and writing. Here's a breakdown of who he is and why he's notable:

*   **Computer Scientist and Hacker:** Graham is a computer scientist with a Ph.D. in Computer Science from Harvard. He's known for his work on Lisp programming language and for co-founding Viaweb.

*   **Co-founder of Viaweb (Yahoo! Store):** Viaweb, which he co-founded in 1995, was one of the first software-as-a-service (SaaS) e-commerce platforms. Yahoo! acquired it in 1998 and rebranded it as Yahoo! Store.

*   **Co-founder of Y Combinator:** In 2005, Graham co-founded Y Combinator (YC), a highly influential startup accelerator that has funded and mentored numerous successful companies.

*   **Startup Guru/Advisor:** Through Y Combinator, Graham became a leading figure in the startup world, offering advice and guidance to countless founders on topics ranging from product development to fundraising.

*   **Essayist and Author:**
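One practical benefit of the async methods is that independent requests can run concurrently. The sketch below uses `asyncio.gather` for this. `EchoLLM` is a hypothetical stand-in that exposes an `acomplete` coroutine so the snippet runs offline; the assumption is that a `GoogleGenAI` instance can be passed in its place.

```python
import asyncio


class EchoLLM:
    """Hypothetical stand-in for an LLM with an async `acomplete` method."""

    async def acomplete(self, prompt: str) -> str:
        await asyncio.sleep(0.01)  # simulate network latency
        return f"echo: {prompt}"


async def complete_all(llm, prompts):
    """Send all prompts concurrently; results come back in prompt order."""
    return await asyncio.gather(*(llm.acomplete(p) for p in prompts))


# In a notebook cell:  results = await complete_all(llm, ["first", "second"])
# In a plain script:   results = asyncio.run(complete_all(llm, ["first", "second"]))
```

Because the requests overlap, the total wait is roughly that of the slowest request rather than the sum of all of them.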

## Vertex AI Support

By providing a Google Cloud project and location (either through environment variables or directly through `vertexai_config`), you can use a Google GenAI model through Vertex AI.

In [None]:
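# Set environment variables from Python: a `!export` line runs in a
# temporary subshell, so its value would not persist in the notebook kernel
import os

os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "true"
os.environ["GOOGLE_CLOUD_PROJECT"] = "your-project-id"
os.environ["GOOGLE_CLOUD_LOCATION"] = "us-central1"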

In [None]:
from llama_index.llms.google_genai import GoogleGenAI

# or set the parameters directly
llm = GoogleGenAI(
    model="models/gemini-2.0-flash",
    vertexai_config={"project": "your-project-id", "location": "us-central1"},
)
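To keep project settings out of the code, the `vertexai_config` dictionary can be built from the same environment variables. This is a minimal sketch that assumes the `project` and `location` keys shown above; `vertexai_config_from_env` is a hypothetical helper, and the `us-central1` fallback is an assumption chosen for illustration.

```python
import os


def vertexai_config_from_env(default_location: str = "us-central1") -> dict:
    """Build a `vertexai_config` dict from the GOOGLE_CLOUD_* environment variables."""
    project = os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        # fail early with a clear message instead of at request time
        raise RuntimeError("GOOGLE_CLOUD_PROJECT is not set")
    location = os.environ.get("GOOGLE_CLOUD_LOCATION", default_location)
    return {"project": project, "location": location}
```

The result can then be passed to the constructor as `vertexai_config=vertexai_config_from_env()`.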

## Multi-Modal Support

Using `ChatMessage` objects, you can pass in images and text to the LLM.

In [None]:
!wget https://cdn.pixabay.com/photo/2021/12/12/20/00/play-6865967_640.jpg -O image.jpg

In [None]:
from llama_index.core.llms import ChatMessage, TextBlock, ImageBlock
from llama_index.llms.google_genai import GoogleGenAI

llm = GoogleGenAI(model="models/gemini-2.0-flash")

messages = [
    ChatMessage(
        role="user",
        blocks=[
            ImageBlock(path="image.jpg"),
            TextBlock(text="What is in this image?"),
        ],
    )
]

resp = llm.chat(messages)
print(resp)

assistant: The image contains four wooden dice. Each die has black dots indicating the numbers. The dice are arranged on a dark surface.


## Structured Prediction

LlamaIndex provides an intuitive interface for converting any Google GenAI LLM into a structured LLM through `as_structured_llm` or `structured_predict`: define the target Pydantic class (it can be nested), and given a prompt, the LLM extracts the desired object.

In [None]:
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.core.prompts import PromptTemplate
from llama_index.core.bridge.pydantic import BaseModel
from typing import List


class MenuItem(BaseModel):
    """A menu item in a restaurant."""

    course_name: str
    is_vegetarian: bool


class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str
    menu_items: List[MenuItem]


llm = GoogleGenAI(model="models/gemini-2.0-flash")
prompt_tmpl = PromptTemplate(
    "Generate a restaurant in a given city {city_name}"
)

# Option 1: Use `as_structured_llm`
restaurant_obj = (
    llm.as_structured_llm(Restaurant)
    .complete(prompt_tmpl.format(city_name="Miami"))
    .raw
)
# Option 2: Use `structured_predict`
# restaurant_obj = llm.structured_predict(Restaurant, prompt_tmpl, city_name="Miami")

In [None]:
print(restaurant_obj)

name='BurgerPlace' city='Miami' cuisine='American' menu_items=[MenuItem(course_name='burger', is_vegetarian=False)]


#### Structured Prediction with Streaming

Any LLM wrapped with `as_structured_llm` supports streaming through `stream_chat`.

In [None]:
from llama_index.core.llms import ChatMessage
from IPython.display import clear_output
from pprint import pprint

input_msg = ChatMessage.from_str("Generate a restaurant in San Francisco")

sllm = llm.as_structured_llm(Restaurant)
stream_output = sllm.stream_chat([input_msg])
for partial_output in stream_output:
    clear_output(wait=True)
    pprint(partial_output.raw.model_dump())
    restaurant_obj = partial_output.raw

restaurant_obj

{'city': 'San Francisco',
 'cuisine': 'Italian',
 'menu_items': [{'course_name': 'Pasta', 'is_vegetarian': False}],
 'name': "Tony's"}

Restaurant(name="Tony's", city='San Francisco', cuisine='Italian', menu_items=[MenuItem(course_name='Pasta', is_vegetarian=False)])

## Tool/Function Calling

Google GenAI supports direct tool/function calling through the API. Using LlamaIndex, we can implement some core agentic tool calling patterns.

In [None]:
from llama_index.core.tools import FunctionTool
from llama_index.core.llms import ChatMessage
from llama_index.llms.google_genai import GoogleGenAI
from datetime import datetime

llm = GoogleGenAI(model="models/gemini-2.0-flash")


def get_current_time(timezone: str) -> dict:
    """Get the current time"""
    return {
        "time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "timezone": timezone,
    }


# uses the tool name, any type annotations, and docstring to describe the tool
tool = FunctionTool.from_defaults(fn=get_current_time)
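The comment above says the tool is described by its name, type annotations, and docstring. As a rough illustration of what information that makes available, the standard-library `inspect` module can read the same three things from a function. This is not how `FunctionTool` is implemented; `describe_function` is a hypothetical helper.

```python
import inspect
from datetime import datetime


def get_current_time(timezone: str) -> dict:
    """Get the current time"""
    return {
        "time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "timezone": timezone,
    }


def describe_function(fn) -> dict:
    """Collect the name, docstring, and parameter annotations of a function."""
    signature = inspect.signature(fn)
    params = {
        # use the type's short name when available, else its string form
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }


print(describe_function(get_current_time))
```

A descriptive docstring and precise annotations therefore matter, since they are what the tool description is built from.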

We can simply do a single pass to call the tool and get the result:

In [None]:
resp = llm.predict_and_call([tool], "What is the current time in New York?")
print(resp)



{'time': '2025-03-07 22:58:45', 'timezone': 'America/New_York'}


We can also use lower-level APIs to implement an agentic tool-calling loop!

In [None]:
chat_history = [
    ChatMessage(role="user", content="What is the current time in New York?")
]
tools_by_name = {t.metadata.name: t for t in [tool]}

resp = llm.chat_with_tools([tool], chat_history=chat_history)
tool_calls = llm.get_tool_calls_from_response(
    resp, error_on_no_tool_call=False
)

if not tool_calls:
    print(resp)
else:
    while tool_calls:
        # add the LLM's response to the chat history
        chat_history.append(resp.message)

        for tool_call in tool_calls:
            tool_name = tool_call.tool_name
            tool_kwargs = tool_call.tool_kwargs

            print(f"Calling {tool_name} with {tool_kwargs}")
            # look the tool up by name so the loop also works with several tools
            tool_output = tools_by_name[tool_name].call(**tool_kwargs)
            print("Tool output: ", tool_output)
            chat_history.append(
                ChatMessage(
                    role="tool",
                    content=str(tool_output),
                    # most LLMs like Anthropic, OpenAI, etc. need to know the tool call id
                    additional_kwargs={"tool_call_id": tool_call.tool_id},
                )
            )

        # once every tool result is in the history, ask the LLM again
        resp = llm.chat_with_tools([tool], chat_history=chat_history)
        tool_calls = llm.get_tool_calls_from_response(
            resp, error_on_no_tool_call=False
        )
    print("Final response: ", resp.message.content)



Calling get_current_time with {'timezone': 'America/New_York'}
Tool output:  {'time': '2025-03-07 22:58:46', 'timezone': 'America/New_York'}
Final response:  The current time in New York is 2025-03-07 22:58:46.
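When several tools are registered, each call has to be routed to the tool the model named, and an unknown name should not crash the loop. The following is a minimal sketch of that dispatch step. It uses plain callables and `SimpleNamespace` objects as stand-ins for LlamaIndex tools and tool calls, assuming only the `tool_name` and `tool_kwargs` attributes used above; `dispatch_tool_call` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Callable, Dict


def dispatch_tool_call(tool_call, tools_by_name: Dict[str, Callable]) -> str:
    """Run the tool named in `tool_call`; return an error string if it is unknown or fails."""
    fn = tools_by_name.get(tool_call.tool_name)
    if fn is None:
        return f"Error: unknown tool '{tool_call.tool_name}'"
    try:
        return str(fn(**tool_call.tool_kwargs))
    except Exception as exc:  # report the failure back to the model instead of raising
        return f"Error: {exc}"


# Stand-in registry and tool calls, for illustration only.
registry = {"add": lambda a, b: a + b}
ok = dispatch_tool_call(
    SimpleNamespace(tool_name="add", tool_kwargs={"a": 2, "b": 3}), registry
)
missing = dispatch_tool_call(
    SimpleNamespace(tool_name="nope", tool_kwargs={}), registry
)
```

Returning the error as text lets it be appended to the chat history like any other tool output, so the model has a chance to correct its call.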
