# Google Gemini Cheat Sheet: LangChain

The `langchain-google-genai` package provides access to Google's Gemini models through the Gemini API and Google AI Studio. Google AI Studio supports rapid prototyping and experimentation, which makes it a good starting point for individual developers.

[LangChain](https://python.langchain.com/) is a framework for developing AI applications. The `langchain-google-genai` package connects LangChain with Google's Gemini models. [LangGraph](https://python.langchain.com/docs/langgraph/) is a library for building stateful, multi-actor applications with LLMs. 

All examples use the `gemini-2.0-flash` model. Gemini 2.5 Pro and 2.5 Flash are available as `gemini-2.5-pro-preview-03-25` and `gemini-2.5-flash-preview-04-17`. All model IDs are listed in the [Gemini API docs](https://ai.google.dev/gemini-api/docs/models).

Start for free and get your API key from [Google AI Studio](https://aistudio.google.com/app/apikey).

In [4]:
import getpass
import os

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google AI API key: ")

Install the `langchain-google-genai` package:

In [None]:
%pip install langchain-google-genai

## Google Gemini with LangChain Chat Models

In [5]:
from langchain_google_genai import ChatGoogleGenerativeAI

# Initialize model
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

# Simple invocation
messages = [
    ("system", "You are a helpful assistant that translates English to French."),
    ("human", "I love programming."),
]
response = llm.invoke(messages)
print(response.content)  # Output: J'adore la programmation.

J'adore la programmation.


### Chain calls with Prompt Template

In [7]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate

# Initialize model
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0,
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{input}"),
])

chain = prompt | llm
result = chain.invoke({
    "input_language": "English",
    "output_language": "German",
    "input": "I love programming.",
})
print(result.content)  # Output: Ich liebe Programmieren.

Ich liebe Programmieren.
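The `prompt | llm` expression above composes two steps into one pipeline. As a rough, library-free illustration of that idea, the sketch below uses Python's `__or__` hook. `Step`, `fake_prompt`, and `fake_llm` are hypothetical names invented here; this is not how LangChain implements its runnables, and no model is called.

```python
class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # a | b builds a new Step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Fills the template placeholders from a dict of variables.
fake_prompt = Step(
    lambda v: "Translate {input_language} to {output_language}: {input}".format(**v)
)
# A fake "model" that only echoes what it received (no API call).
fake_llm = Step(lambda text: f"[model received] {text}")

chain = fake_prompt | fake_llm
result = chain.invoke({
    "input_language": "English",
    "output_language": "German",
    "input": "I love programming.",
})
print(result)
```

The point of the sketch is only that the left-hand step's output becomes the right-hand step's input, which is why the dictionary keys passed to `invoke` must match the template variables.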


### Image Input

In [13]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage
import base64

# Initialize model
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

# Using an image URL
message_url = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": "https://picsum.photos/seed/picsum/200/300"},
    ]
)
result_url = llm.invoke([message_url])
print(result_url.content)

# Using a local image
local_image_path = "../assets/react.png"
with open(local_image_path, "rb") as image_file:
    encoded_image = base64.b64encode(image_file.read()).decode('utf-8')

message_local = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": f"data:image/png;base64,{encoded_image}"}
    ]
)
result_local = llm.invoke([message_local])
print(result_local.content)

Here's a description of the image:

**Overall Impression:**

The image depicts a serene and majestic mountain landscape, likely at either dawn or dusk, judging by the soft, pastel-colored sky.

**Key Elements:**

*   **Mountains:** A snow-covered mountain range dominates the scene. One prominent, craggy peak is visible on the right side of the frame. The lower slopes appear to be covered in deep snow.
*   **Sky:** The sky is a major feature of the image, filled with layers of clouds. The colors are soft and muted, with hues of pink, orange, and light blue/purple, suggesting the light of sunrise or sunset.
*   **Lighting:** The lighting is soft and diffused, creating a tranquil atmosphere. The light seems to be catching the peak of the mountain, highlighting its texture.
*   **Foreground:** In the foreground, there's a dark, smooth, snow-covered slope. This creates a sense of depth and leads the eye towards the mountains in the distance.

**Atmosphere:**

The image evokes a sense of pea
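The local-image example above hard-codes `image/png` in the data URL. A minimal sketch of a helper that infers the MIME type from the file extension is shown below. `to_data_url` and `image_part` are hypothetical helper names, not part of `langchain-google-genai`; they only reproduce the dictionary shape used in the example.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def image_part(path: str) -> dict:
    """Build an image content part in the same shape as the example above."""
    return {"type": "image_url", "image_url": to_data_url(path)}
```

Under these assumptions, `image_part("../assets/react.png")` could replace the hand-built dictionary in the `HumanMessage` content list.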

### Audio Input

In [14]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage
import base64

# Initialize model
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

audio_file_path = "../assets/porsche.mp3"
audio_mime_type = "audio/mpeg"

with open(audio_file_path, "rb") as audio_file:
    encoded_audio = base64.b64encode(audio_file.read()).decode('utf-8')

message = HumanMessage(
    content=[
        {"type": "text", "text": "Transcribe this audio."},
        {"type": "media", "data": encoded_audio, "mime_type": audio_mime_type}
    ]
)
response = llm.invoke([message])
print(response.content)

If the Porsche Macan has proven anything, it's that the days of sacrificing performance for practicality are gone, long gone. Engineered to deliver a driving experience like no other, the Macan has demonstrated excellence in style and performance to become the leading sports car in its class. So don't let those five doors fool you. Once you're in the driver's seat, one thing will become immediately clear. This is a Porsche, the Macan, now leasing from 3.99%. Conditions apply.


### Video Input

In [15]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage
import base64

# Initialize model
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

video_file_path = "../assets/screen.mp4"
video_mime_type = "video/mp4"

with open(video_file_path, "rb") as video_file:
    encoded_video = base64.b64encode(video_file.read()).decode('utf-8')

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe what's happening in this video."},
        {"type": "media", "data": encoded_video, "mime_type": video_mime_type}
    ]
)
response = llm.invoke([message])
print(response.content)

Here's a breakdown of what's happening in the video:

*   **0:00-0:09:** The video starts with a gray bar at the bottom and a white space above. A Google search page then appears, with the search query "What is Gemini 2.5 Flash? When was it launched and what are its key capabilities" already entered.
*   **0:09-0:14:** A cookie consent pop-up appears on the Google search results page, written in German.
*   **0:14-0:20:** The cookie consent pop-up remains visible on the Google search results page.
*   **0:20-0:34:** The cookie consent pop-up disappears, and the Google search results page is shown.
*   **0:34-0:40:** The first search result, "Start building with Gemini 2.5 Flash" from Google Blog, is clicked, leading to the corresponding webpage.
*   **0:40-0:50:** The "Start building with Gemini 2.5 Flash" webpage is displayed.
*   **0:50-1:05:** The "Start building with Gemini 2.5 Flash" webpage remains visible.
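The audio and video examples build the same `{"type": "media", ...}` part by hand. The sketch below wraps that step in one function and guesses the MIME type from the file extension. `media_part` is a hypothetical helper introduced here for illustration; only the dictionary keys come from the examples above.

```python
import base64
import mimetypes
from pathlib import Path


def media_part(path: str) -> dict:
    """Build a base64-encoded media content part for an audio or video file."""
    mime_type, _ = mimetypes.guess_type(path)
    # Only accept types that look like audio or video.
    if mime_type is None or mime_type.split("/")[0] not in {"audio", "video"}:
        raise ValueError(f"Cannot determine an audio/video MIME type for {path!r}")
    data = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return {"type": "media", "data": data, "mime_type": mime_type}
```

Because the whole file is read into memory and embedded in the request, this approach is best suited to short clips.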


### Image Generation

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
import base64
from IPython.display import Image, display

# Initialize model for image generation
llm = ChatGoogleGenerativeAI(model="models/gemini-2.0-flash-exp-image-generation")

message = {
    "role": "user",
    "content": "Generate an image of a cat wearing a hat.",
}

response = llm.invoke(
    [message],
    generation_config=dict(response_modalities=["TEXT", "IMAGE"]),
)

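# Display the first generated image (the response can also contain text parts)
image_block = next(
    part for part in response.content
    if isinstance(part, dict) and part.get("image_url")
)
image_base64 = image_block["image_url"]["url"].split(",")[-1]
image_data = base64.b64decode(image_base64)
display(Image(data=image_data, width=300))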

### Tool Calling/Function Calling

In [16]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.tools import tool
from langchain_core.messages import ToolMessage

# Define a tool
@tool(description="Get the current weather in a given location")
def get_weather(location: str) -> str:
    return "It's sunny."

# Initialize model and bind the tool
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
llm_with_tools = llm.bind_tools([get_weather])

# Invoke with a query that should trigger the tool
query = "What's the weather in San Francisco?"
ai_msg = llm_with_tools.invoke(query)

# Access tool calls in the response
print(ai_msg.tool_calls)

# Run the tool with the arguments the model supplied, then pass the result back
tool_call = ai_msg.tool_calls[0]
tool_message = ToolMessage(
    content=get_weather.invoke(tool_call["args"]),
    tool_call_id=tool_call["id"],
)
final_response = llm_with_tools.invoke([("human", query), ai_msg, tool_message])
print(final_response.content)

[{'name': 'get_weather', 'args': {'location': 'San Francisco'}, 'id': '0a7c27f0-b905-4e5f-b941-6fbb04ac4a39', 'type': 'tool_call'}]



OK. It's sunny in San Francisco.
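When several tools are bound, each entry in `tool_calls` has to be routed to the matching function by name. The following is a minimal, library-free sketch of that dispatch step. `TOOL_REGISTRY` and `run_tool_calls` are hypothetical names, `get_weather` is a plain-function stand-in for the tool above, and the sample call is hand-written in the shape of the printed `tool_calls` output.

```python
def get_weather(location: str) -> str:
    """Plain-function stand-in for the tool defined above."""
    return "It's sunny."


# Map tool names to the callables that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list) -> list:
    """Execute each tool call and collect the fields a tool message needs."""
    results = []
    for call in tool_calls:
        fn = TOOL_REGISTRY.get(call["name"])
        if fn is None:
            # Report unknown tools instead of raising, so the model can recover.
            content = f"Unknown tool: {call['name']}"
        else:
            content = fn(**call["args"])
        results.append({"tool_call_id": call["id"], "content": content})
    return results


sample_calls = [
    {"name": "get_weather", "args": {"location": "San Francisco"},
     "id": "call-1", "type": "tool_call"},
]
print(run_tool_calls(sample_calls))
```

Each result dictionary carries the two values used to construct a `ToolMessage` in the example above (`content` and `tool_call_id`).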


### Built-in Tools (Google Search, Code Execution)

In [17]:
from langchain_google_genai import ChatGoogleGenerativeAI
from google.ai.generativelanguage_v1beta.types import Tool as GenAITool

# Initialize model
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

# Google Search
search_resp = llm.invoke(
    "When is the next total solar eclipse in US?",
    tools=[GenAITool(google_search={})],
)
print(search_resp.content)

# Code Execution
code_resp = llm.invoke(
    "What is 2*2, use python",
    tools=[GenAITool(code_execution={})],
)

for c in code_resp.content:
    if isinstance(c, dict):
        if c["type"] == 'code_execution_result':
            print(f"Code execution result: {c['code_execution_result']}")
        elif c["type"] == 'executable_code':
            print(f"Executable code: {c['executable_code']}")
    else:
        print(c)

The next total solar eclipse that will be visible from the contiguous United States will be on August 23, 2044. The eclipse will begin in Greenland and pass through Canada before reaching the U.S., ending around sunset in Montana, North Dakota, and South Dakota.

Another total solar eclipse will occur on August 12, 2045, and its path will traverse the US coast to coast.
Executable code: print(2 * 2)

Code execution result: 4

2 * 2 = 4


Note on code execution responses: the `executable_code` part is always present, while the execution result and `image_url` parts may be absent for some queries. Validate the response before using it in production.
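The loop in the code execution example distinguishes plain strings from dictionary parts by their `type` key. A small sketch of the same idea as a reusable function follows. `split_parts` is a hypothetical helper, and `sample_content` is hand-written to mirror the output shown above rather than taken from a real response.

```python
def split_parts(content):
    """Group a mixed content list into text, code, and execution results."""
    grouped = {"text": [], "code": [], "results": []}
    # A plain-string response has no parts to split.
    if isinstance(content, str):
        grouped["text"].append(content)
        return grouped
    for part in content:
        if isinstance(part, str):
            grouped["text"].append(part)
        elif part.get("type") == "executable_code":
            grouped["code"].append(part.get("executable_code", ""))
        elif part.get("type") == "code_execution_result":
            grouped["results"].append(part.get("code_execution_result", ""))
        # Dictionary parts of any other type are ignored in this sketch.
    return grouped


sample_content = [
    {"type": "executable_code", "executable_code": "print(2 * 2)\n"},
    {"type": "code_execution_result", "code_execution_result": "4\n"},
    "2 * 2 = 4",
]
grouped = split_parts(sample_content)
print(grouped)
```

Using `.get` instead of direct indexing means a missing execution result leaves the `results` list empty rather than raising a `KeyError`.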



### Structured Output

In [18]:
from pydantic import BaseModel, Field
from langchain_google_genai import ChatGoogleGenerativeAI

# Define the desired structure
class Person(BaseModel):
    '''Information about a person.'''
    name: str = Field(..., description="The person's name")
    height_m: float = Field(..., description="The person's height in meters")

# Initialize the model
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
structured_llm = llm.with_structured_output(Person)

# Invoke the model with a query asking for structured information
result = structured_llm.invoke("Who was the 16th president of the USA, and how tall was he in meters?")
print(result)  # Output: name='Abraham Lincoln' height_m=1.93




name='Abraham Lincoln' height_m=1.93


### Token Usage Tracking

In [19]:
from langchain_google_genai import ChatGoogleGenerativeAI

# Initialize model
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

result = llm.invoke("Explain the concept of prompt engineering in one sentence.")

print(result.content)
print("\nUsage Metadata:")
print(result.usage_metadata)

Prompt engineering is the art of crafting effective instructions for AI models to elicit desired and accurate outputs.

Usage Metadata:
{'input_tokens': 10, 'output_tokens': 20, 'total_tokens': 30, 'input_token_details': {'cache_read': 0}}
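To track usage across several calls, the per-response dictionaries can be summed. The sketch below assumes each `usage_metadata` value is a dictionary with the integer keys shown in the output above. `add_usage` is a hypothetical helper, and the second sample entry is invented for illustration.

```python
from collections import Counter


def add_usage(total: Counter, usage: dict) -> Counter:
    """Add one response's token counts to a running total."""
    for key in ("input_tokens", "output_tokens", "total_tokens"):
        total[key] += usage.get(key, 0)
    return total


usages = [
    # Shape copied from the output above.
    {"input_tokens": 10, "output_tokens": 20, "total_tokens": 30,
     "input_token_details": {"cache_read": 0}},
    # Hypothetical second call.
    {"input_tokens": 7, "output_tokens": 5, "total_tokens": 12},
]

total = Counter()
for usage in usages:
    add_usage(total, usage)

print(dict(total))  # {'input_tokens': 17, 'output_tokens': 25, 'total_tokens': 42}
```

In a real loop, `usage` would be `result.usage_metadata` from each `llm.invoke` call.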


## Google Gemini Embeddings with LangChain

In [20]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-exp-03-07")

# Embed a single query
vector = embeddings.embed_query("hello, world!")

# Embed multiple documents
vectors = embeddings.embed_documents([
    "Today is Monday",
    "Today is Tuesday",
    "Today is April Fools day",
])

### Using with Vector Store

In [21]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

# Initialize embeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-exp-03-07")

text = "LangChain is the framework for building context-aware reasoning applications"

# Create vector store and retriever
vectorstore = InMemoryVectorStore.from_texts([text], embedding=embeddings)
retriever = vectorstore.as_retriever()

# Retrieve similar documents
retrieved_documents = retriever.invoke("What is LangChain?")
print(retrieved_documents[0].page_content)

LangChain is the framework for building context-aware reasoning applications


### Task Types

In [None]:
%pip install scikit-learn

In [None]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from sklearn.metrics.pairwise import cosine_similarity

# Different task types for different use cases
query_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/gemini-embedding-exp-03-07", 
    task_type="RETRIEVAL_QUERY"  # For queries
)
doc_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/gemini-embedding-exp-03-07", 
    task_type="RETRIEVAL_DOCUMENT"  # For documents
)

# Compare similarity
q_embed = query_embeddings.embed_query("What is the capital of France?")
d_embed = doc_embeddings.embed_documents(["The capital of France is Paris.", "Philipp likes to eat pizza."])

for i, d in enumerate(d_embed):
    similarity = cosine_similarity([q_embed], [d])[0][0]
    print(f"Document {i+1} similarity: {similarity}")

Document 1 similarity: 0.7892893360164779
Document 2 similarity: 0.5410037458373438
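The scores above come from cosine similarity, the dot product of two vectors divided by the product of their lengths. For reference, here is a dependency-free sketch of that computation together with a simple ranking step. `cosine` and `rank` are hypothetical helper names, and the vectors are toy values, not real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional vectors standing in for embeddings.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
print(rank(query, docs))
```

With real embeddings, `query` would be the result of `embed_query` and `docs` the list returned by `embed_documents`.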
