# Gemini 3 with LangChain - Complete Guide

This notebook demonstrates all key features of Google's Gemini 3 model family using LangChain.

## Gemini 3 Overview

**Gemini 3 Pro** is the most capable model in Google's Gemini 3 family, built on state-of-the-art reasoning capabilities.

### Key Features
- **Advanced Reasoning**: Dynamic thinking process with configurable thinking levels
- **1M Token Context**: Up to 1 million token input, 64k token output
- **Multimodal Excellence**: Images, PDFs, audio, video with granular resolution control
- **Knowledge Cutoff**: January 2025
- **Image Generation**: 4K resolution with grounded generation

### Model Variants

| Model | Context (In/Out) | Best For |
|-------|------------------|----------|
| `gemini-3-pro-preview` | 1M / 64k | Complex reasoning, coding, analysis |
| `gemini-3-pro-image-preview` | 65k / 32k | Image generation & editing |
| `gemini-2.5-flash` | 1M / 64k | Fast, cost-effective tasks (previous generation, used here for comparison) |

### New Features in Gemini 3
1. **Thinking Level**: Control reasoning depth (`low` or `high`)
2. **Media Resolution**: Granular control per media type (`low`, `medium`, `high`, `ultra_high`)
3. **Temperature**: Keep at default 1.0 (changing can cause degraded performance)
4. **Thought Signatures**: Automatic reasoning context preservation

## Setup

Load environment variables for API authentication.

Pricing details: https://ai.google.dev/gemini-api/docs/pricing

In [None]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.messages import HumanMessage, SystemMessage, AIMessage
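Before calling the model, it can help to confirm that `load_dotenv()` actually exposed a key. The sketch below assumes the key is stored under `GOOGLE_API_KEY` or `GEMINI_API_KEY`; the helper name `find_api_key` is hypothetical and not part of any library.

```python
import os

# Assumption: the API key lives under one of these variable names.
KEY_NAMES = ("GOOGLE_API_KEY", "GEMINI_API_KEY")


def find_api_key(env=os.environ):
    """Return the name of the first key variable that is set and non-empty, else None."""
    for name in KEY_NAMES:
        if env.get(name):
            return name
    return None


key_name = find_api_key()
if key_name:
    print(f"Using API key from {key_name}")
else:
    print(f"No API key found; set one of {KEY_NAMES} in your .env file")
```

Only the variable name is printed, so the key itself never lands in the notebook output.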

## Basic Usage

Demonstrates basic message formats and response structures for both Gemini 3 and 2.5 models.

In [None]:
gemini3 = "gemini-3-pro-preview"
# gemini2 = "gemini-2.5-pro"
gemini2 = "gemini-2.5-flash"

system_msg = SystemMessage("You are a helpful assistant.")

query = "Explain the theory of relativity in simple terms."
messages = [system_msg, HumanMessage(query)]


In [None]:
model = ChatGoogleGenerativeAI(model=gemini3)
response = model.invoke(messages)

In [None]:
response

In [None]:
response.text

In [None]:
model = ChatGoogleGenerativeAI(model=gemini2)
response = model.invoke(messages)

In [None]:
response

In [None]:
response.content
response.content_blocks
response.usage_metadata
response.response_metadata


## Streaming

Stream tokens in real time as they are generated, which improves the user experience for long responses.

In [None]:
model = ChatGoogleGenerativeAI(model=gemini2)

query = "Write a short story about the earth and the moon."
for chunk in model.stream(query):
    print(chunk.content, end="", flush=True)
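The loop above prints each chunk but discards it afterwards. A minimal sketch of keeping the full text while streaming follows. It assumes each chunk exposes `.content` as either a string or a list of text blocks. The helper names are hypothetical, and the chunks are stand-ins so the cell runs without an API call.

```python
from types import SimpleNamespace


def chunk_text(chunk):
    """Return the text carried by one streamed chunk."""
    content = chunk.content
    if isinstance(content, str):
        return content
    # List-style content: keep only the text of blocks marked as text.
    return "".join(
        block.get("text", "")
        for block in content
        if isinstance(block, dict) and block.get("type") == "text"
    )


def stream_and_collect(chunks):
    """Print chunks as they arrive and return the concatenated text."""
    pieces = []
    for chunk in chunks:
        text = chunk_text(chunk)
        print(text, end="", flush=True)
        pieces.append(text)
    return "".join(pieces)


# Stand-in chunks (not real model output) so the sketch runs offline.
fake_chunks = [
    SimpleNamespace(content="The Earth "),
    SimpleNamespace(content=[{"type": "text", "text": "and the Moon."}]),
]
story = stream_and_collect(fake_chunks)
```

With a real model, `stream_and_collect(model.stream(query))` would take the place of the stand-in list.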

## Multimodal Capabilities

Process images, PDFs, audio, and video alongside text. Gemini 3 supports multiple input modalities with granular resolution control.

In [None]:
model = ChatGoogleGenerativeAI(model=gemini3)

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the image provided."},
        {
            "type": "image",
            "url": "https://www.shutterstock.com/image-vector/vector-cute-baby-panda-cartoon-600nw-2427356853.jpg",
        },
    ]
)
response = model.invoke([message])

### Image Analysis from URL

In [None]:
response.content
response.text

In [None]:
## Reading media from local file and encoding to base64
## Now use smaller model for faster response

## image mime type example
# mime_type = "image/png"

## pdf mime type example
# mime_type = "application/pdf", type = "file"

## audio mime type example
# mime_type = "audio/mpeg", type = "audio"

import base64

# Read the file inside a context manager so the handle is closed promptly
with open("data/images/panda.png", "rb") as f:
    image_bytes = f.read()
bytes_base64 = base64.b64encode(image_bytes).decode("utf-8")

mime_type = "image/png"

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the image provided."},
        {
            "type": "image",
            "base64": bytes_base64,
            "mime_type": mime_type,
        },
    ]
)

model = ChatGoogleGenerativeAI(model=gemini2)
response = model.invoke([message])

### Image Analysis from Local File

Base64 encode local images, PDFs, or audio files for analysis.
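The image, PDF, and audio cases differ only in the `type` and `mime_type` fields, so the encoding step can be wrapped in one helper. This is a sketch, not part of LangChain: `build_media_block` and `block_type_for` are hypothetical names, and mapping every non-image, non-audio file to `"file"` is an assumption based on the PDF example in this notebook.

```python
import base64
import mimetypes
import tempfile
from pathlib import Path


def block_type_for(mime_type):
    """Map a MIME type to the content block type used in this notebook."""
    if mime_type.startswith("image/"):
        return "image"
    if mime_type.startswith("audio/"):
        return "audio"
    # Assumption: everything else (for example PDFs) is sent as a generic file.
    return "file"


def build_media_block(path):
    """Read a local file and return a base64 content block for HumanMessage."""
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None:
        raise ValueError(f"Cannot guess MIME type for {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return {"type": block_type_for(mime_type), "base64": encoded, "mime_type": mime_type}


# Demo with a throwaway file (placeholder bytes, not a real PDF).
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.pdf"
    sample_path.write_bytes(b"%PDF-1.4 placeholder")
    sample_block = build_media_block(sample_path)

print(sample_block["type"], sample_block["mime_type"])
```

The returned dictionary can be placed in the `content` list next to a text block, in the same way as the hand-written blocks in this section.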

In [None]:
response

In [None]:
# Read the PDF inside a context manager so the handle is closed promptly
with open("data/rag-data/apple/apple 10-q q1 2024.pdf", "rb") as f:
    pdf_bytes = f.read()
pdf_base64 = base64.b64encode(pdf_bytes).decode("utf-8")

mime_type = "application/pdf"

message = HumanMessage(
    content=[
        {"type": "text", "text": "Summarize the key financial highlights from this quarterly report."},
        {
            "type": "file",
            "base64": pdf_base64,
            "mime_type": mime_type,
        },
    ]
)
response = model.invoke([message])

### PDF Document Analysis

Extract and analyze content from PDF files. For PDFs, `media_resolution_medium` is the recommended setting.

In [None]:
response.text

In [None]:
response.content
response.content_blocks
response.usage_metadata
response.response_metadata

## Tool Calling (Function Calling)

Bind custom tools to the model for extended capabilities like web search or API calls.

In [None]:
# first show ollama web_search tool
# then build weather tool

from scripts import base_tools

In [None]:
response = base_tools.web_search.invoke({'query': 'what is the latest US stock market updates?'})
# response

In [None]:
response = base_tools.get_weather.invoke({'location': 'Mumbai'})
# response

In [None]:
model = ChatGoogleGenerativeAI(model=gemini2)
model_with_tools = model.bind_tools([base_tools.web_search, base_tools.get_weather])

In [None]:
model_with_tools
response = model_with_tools.invoke("what is the weather in Mumbai today? and latest news on US stock market?")

In [None]:
response
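The model only requests tools; running them is left to the caller. The sketch below shows one way to dispatch the entries of `response.tool_calls`, assuming each entry is a dictionary with `name`, `args`, and `id` keys. The function `run_tool_calls` and the two stand-in tools are hypothetical and replace `base_tools` so the cell runs offline.

```python
def run_tool_calls(tool_calls, registry):
    """Execute each requested tool and return one result record per call."""
    results = []
    for call in tool_calls:
        func = registry.get(call["name"])
        if func is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = func(**call["args"])
        results.append(
            {"tool_call_id": call["id"], "name": call["name"], "content": str(output)}
        )
    return results


# Hypothetical stand-ins for base_tools.get_weather and base_tools.web_search.
def fake_get_weather(location):
    return f"Sunny in {location}"


def fake_web_search(query):
    return f"Results for: {query}"


registry = {"get_weather": fake_get_weather, "web_search": fake_web_search}

# Example tool calls in the assumed shape (not real model output).
sample_calls = [
    {"name": "get_weather", "args": {"location": "Mumbai"}, "id": "call_1"},
    {"name": "web_search", "args": {"query": "US stock market"}, "id": "call_2"},
]
results = run_tool_calls(sample_calls, registry)
for result in results:
    print(result["name"], "->", result["content"])
```

Each result could then be wrapped in a `ToolMessage` with the matching `tool_call_id` and appended to the conversation, so that a second `invoke` lets the model write its final answer.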

## Thinking Support (Reasoning)

Configure the model's reasoning depth with `thinking_budget` or `thinking_level`. 

**Gemini 3 Recommendation**: Use `thinking_level="high"` (default) for complex tasks, `"low"` for simple tasks.

**Documentation**: https://ai.google.dev/gemini-api/docs/thinking

Control reasoning depth:
- `thinking_budget`: Legacy parameter (number of tokens)
- `thinking_level`: New parameter (`"low"` or `"high"`)
- `include_thoughts`: Show reasoning process in response
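Since the two parameters apply to different model generations, a small helper can pick the right one from the model name. This is a sketch under assumptions: `thinking_kwargs` is a hypothetical helper, the `gemini-3` prefix check is a simplification, and the budget values are illustrative rather than recommended.

```python
def thinking_kwargs(model_name, complex_task=True, show_thoughts=False):
    """Build reasoning-related keyword arguments for the chosen model."""
    kwargs = {"include_thoughts": show_thoughts}
    if model_name.startswith("gemini-3"):
        # Gemini 3: pick a thinking level.
        kwargs["thinking_level"] = "high" if complex_task else "low"
    else:
        # Earlier models: fall back to a token budget (illustrative values).
        kwargs["thinking_budget"] = 1024 if complex_task else 128
    return kwargs


print(thinking_kwargs("gemini-3-pro-preview"))
print(thinking_kwargs("gemini-2.5-flash", complex_task=False, show_thoughts=True))
```

The result can be unpacked into the constructor, for example `ChatGoogleGenerativeAI(model=gemini3, **thinking_kwargs(gemini3))`.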

In [None]:
model = ChatGoogleGenerativeAI(model=gemini2,
                               thinking_budget=1024,
                               include_thoughts=True)

query = "explain the theory of relativity in simple terms."
response = model.invoke(query)

In [None]:
print(response)
# print(response.text)

response.content_blocks

response

In [None]:
model_with_tools = model.bind_tools([base_tools.web_search, base_tools.get_weather])

response = model_with_tools.invoke(query)


In [None]:
response.content_blocks
response

## Built-in Tools

Google Gemini provides native tools, including Google Search and Code Execution. They require no additional setup and are bound to the model in the usual way.

In [None]:
model = ChatGoogleGenerativeAI(model=gemini2)

# Custom function tools such as base_tools.get_weather cannot be combined
# with built-in tools in the same request (the API reports it as unsupported).
model_with_tools = model.bind_tools([{"google_search": {}}, {"code_execution": {}}])

In [None]:
query = "When is the next total solar eclipse in the US and what is 3 + 2?"
response = model_with_tools.invoke(query)

In [None]:
print(response.text)

In [None]:
response.response_metadata

## Context Caching

Cache large documents to reduce costs and latency for repeated queries. Minimum 2,048 tokens required.

**Benefits**:
- Reduced API costs
- Faster response times
- Ideal for analyzing large documents repeatedly

**Resources**:
- [Caching Guide](https://ai.google.dev/gemini-api/docs/caching?hl=en&lang=python#pdfs_1)
- [Pricing Details](https://ai.google.dev/gemini-api/docs/pricing)

In [None]:
import time
from google import genai
from google.genai.types import CreateCachedContentConfig, Content, Part

client = genai.Client()

### Initialize Client and Upload Files

In [None]:
file_paths = [
    "data/rag-data/apple/apple 10-q q1 2024.pdf",
    "data/rag-data/apple/apple 10-q q2 2024.pdf"
]

uploaded_files = []
for path in file_paths:
    file = client.files.upload(file=path)
    while file.state.name == "PROCESSING":
        time.sleep(2)
        file = client.files.get(name=file.name)
    uploaded_files.append(file)
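The polling loop above waits indefinitely while a file stays in `PROCESSING`. A hedged sketch of the same loop with a timeout is shown below. `wait_for_state` is a hypothetical helper, and the state sequence is simulated so the cell runs without uploading anything.

```python
import time


def wait_for_state(get_state, pending="PROCESSING", timeout_s=120.0,
                   interval_s=2.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it stops returning `pending`, or raise on timeout."""
    deadline = clock() + timeout_s
    state = get_state()
    while state == pending:
        if clock() >= deadline:
            raise TimeoutError(f"Still {pending} after {timeout_s} seconds")
        sleep(interval_s)
        state = get_state()
    return state


# Simulated state sequence (not real API output); sleeping is skipped for the demo.
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
final_state = wait_for_state(lambda: next(states), sleep=lambda seconds: None)
print(final_state)
```

With the client above, `get_state` could be `lambda: client.files.get(name=file.name).state.name`. Checking that the returned state is `ACTIVE` rather than `FAILED` before caching would catch uploads that did not process.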

In [None]:
parts = []
for f in uploaded_files:
    part = Part.from_uri(file_uri=f.uri, mime_type=f.mime_type)
    parts.append(part)

contents = [
    Content(
        role="user",
        parts=parts,
    )
]

cache = client.caches.create(
    model=gemini2,
    config=CreateCachedContentConfig(
        display_name="Apple Q1 Q2 2024 Reports",
        system_instruction="You are a financial analyst. Use these Apple quarterly reports to answer questions.",
        contents=contents,
        ttl="1800s",
    ),
)

### Create Cache

Cache the content for 1,800 seconds (30 minutes, matching `ttl="1800s"`) together with a system instruction.

In [None]:
llm = ChatGoogleGenerativeAI(
    model=gemini2,
    cached_content=cache.name,
)

response = llm.invoke("Compare the revenue growth between Q1 and Q2 2024.")

### Query with Cached Content

First query against the cached content. Check the token counts in the usage metadata.

In [None]:
# print(response.text)
from IPython.display import Markdown, display
display(Markdown(response.text))

In [None]:
response.usage_metadata

In [None]:
response = llm.invoke("Provide a detailed analysis of Apple's Q1 and Q2 2024 earnings with key financial metrics, revenue comparison, and growth trends. Format this as bullet points suitable for an infographic.")

### Reuse Cache for Second Query

Notice `cache_read` tokens in usage metadata - shows cache is being used.
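To put a number on cache usage, the `cache_read` count can be compared with the total input tokens. The sketch below assumes `usage_metadata` has `input_tokens` and an `input_token_details` dictionary holding `cache_read`. The helper name is hypothetical, and the sample values are invented for illustration, not measured results.

```python
def cache_read_fraction(usage):
    """Return the share of input tokens that were served from the cache."""
    input_tokens = usage.get("input_tokens", 0)
    details = usage.get("input_token_details") or {}
    cache_read = details.get("cache_read", 0)
    if input_tokens == 0:
        return 0.0
    return cache_read / input_tokens


# Invented numbers in the assumed usage_metadata shape.
sample_usage = {
    "input_tokens": 40000,
    "output_tokens": 500,
    "total_tokens": 40500,
    "input_token_details": {"cache_read": 39000},
}
fraction = cache_read_fraction(sample_usage)
print(f"{fraction:.1%} of input tokens came from the cache")
```

Calling `cache_read_fraction(response.usage_metadata)` after each query gives a quick check that the cached documents are being reused.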

In [None]:
display(Markdown(response.text))

In [None]:
response.usage_metadata

## Image Generation

Generate high-quality images up to 4K resolution using `gemini-3-pro-image-preview`.

**Features**:
- Text rendering in images
- Multiple aspect ratios
- Grounded generation with Google Search
- Conversational editing

```
# Available aspect ratios
aspect_ratios = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]

# Available resolutions
resolutions = ["1K", "2K", "4K"]
```

https://github.com/langchain-ai/langchain-google/issues/1235

In [None]:
from langchain_google_genai import Modality
from IPython.display import Image, display

aspect_ratio = "16:9"
resolution = "4K"

In [None]:
image_model = ChatGoogleGenerativeAI(model="gemini-3-pro-image-preview")

image_content = f"Create a professional infographic with this data:\n\n{response.text}"

image_response = image_model.invoke(
    image_content,
    response_modalities=[Modality.TEXT, Modality.IMAGE],
)

In [None]:
image_response.content_blocks

In [None]:
image_response

In [None]:
def get_image_base64(response):
    # Go through each block in the response
    for block in response.content:
        # Check if this block is a dictionary
        if isinstance(block, dict):
            # Check if it has image data
            if "image_url" in block:
                # Extract the URL
                image_url_data = block["image_url"]
                full_url = image_url_data["url"]
                # The URL looks like: "data:image/png;base64,ACTUALBASE64DATA"
                # We only want the part after the comma
                base64_string = full_url.split(",")[1]
                return base64_string


In [None]:
image_base64 = get_image_base64(image_response)
display(Image(data=base64.b64decode(image_base64), width=800))

In [None]:
with open("data/images/apple_earnings_infographic.png", "wb") as f:
    f.write(base64.b64decode(image_base64))
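`get_image_base64` above expects an `image_url` block holding a data URL. A slightly more defensive sketch follows. It also accepts blocks in the `type`/`base64` shape used for image inputs earlier in this notebook, on the assumption that `content_blocks` may return images that way. The function name is hypothetical, and it returns decoded bytes, or `None` when no image is present.

```python
import base64


def extract_image_bytes(blocks):
    """Return the decoded bytes of the first image block, or None if absent."""
    for block in blocks:
        if not isinstance(block, dict):
            continue
        if "image_url" in block:
            # Data URL form: "data:image/png;base64,<payload>"
            _, _, payload = block["image_url"]["url"].partition(",")
            return base64.b64decode(payload)
        if block.get("type") == "image" and "base64" in block:
            # Assumed standard-block form: {"type": "image", "base64": ...}
            return base64.b64decode(block["base64"])
    return None


# Stand-in payload (not a real image) so the sketch runs offline.
payload = base64.b64encode(b"fake-png-bytes").decode("utf-8")
data_url_blocks = ["Here is your image", {"image_url": {"url": f"data:image/png;base64,{payload}"}}]
print(extract_image_bytes(data_url_blocks))
```

Returning `None` explicitly makes it easy to skip the display and save steps when the model replies with text only.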

### Native Google GenAI - Advanced Image Generation

LangChain doesn't fully support the aspect ratio and resolution parameters yet. Use the native Google GenAI SDK for full control.

**Available Options**:
- Aspect Ratios: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`
- Resolutions: `1K`, `2K`, `4K`

In [None]:
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=image_content,
    config=types.GenerateContentConfig(
        image_config=types.ImageConfig(
            aspect_ratio="1:1",
            image_size="4K"
        )
    )
)

In [None]:
image_parts = [part for part in response.parts if part.inline_data]

image = image_parts[0].as_image()
display(image)

In [None]:
if image_parts:
    image = image_parts[0].as_image()
    image.save('apple_earnings_square_4k.png')

## Structured Output

Force the model to return responses in a specific Pydantic schema format.

**Methods**:
- `function_calling`: Uses tool calling (default)
- `json_schema`: Native JSON schema (more reliable for Gemini 3)

In [None]:
# Use the weather tool for the sample structured output
# fields -> location:str, date:str, temperature:str, condition:str

from pydantic import BaseModel

class WeatherOutput(BaseModel):
    location: str
    date: str
    temperature: str
    condition: str

model = ChatGoogleGenerativeAI(model=gemini2)
model_with_tools = model.bind_tools([base_tools.get_weather])

structured_model = model_with_tools.with_structured_output(WeatherOutput)

In [None]:

response = structured_model.invoke("what is the weather in Mumbai today?")

In [None]:
response
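To see what the schema enforces independently of any model call, the Pydantic class can be exercised directly. This sketch assumes Pydantic v2 and repeats the `WeatherOutput` definition so that it runs on its own. The JSON strings are invented examples, not model output.

```python
from pydantic import BaseModel, ValidationError


class WeatherOutput(BaseModel):
    location: str
    date: str
    temperature: str
    condition: str


# The JSON schema derived from the class lists every field as required.
schema = WeatherOutput.model_json_schema()
print(sorted(schema["required"]))

# A well-formed JSON payload parses into a typed object.
raw = '{"location": "Mumbai", "date": "2025-01-01", "temperature": "30 C", "condition": "Sunny"}'
report = WeatherOutput.model_validate_json(raw)
print(report.location, report.condition)

# A payload with missing fields is rejected instead of being partially filled.
try:
    WeatherOutput.model_validate_json('{"location": "Mumbai"}')
    missing_fields_rejected = False
except ValidationError:
    missing_fields_rejected = True
print(missing_fields_rejected)
```

This validation step is what gives `with_structured_output` a typed object rather than free-form text to return.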