## <b><font color='darkblue'>Preface</font></b>
<b><font size='3ptx'>LangChain provides powerful callback mechanisms that allow you to capture and analyze LLM calls</font>.</b> When using [**`langchain_google_genai.ChatGoogleGenerativeAI`**](https://python.langchain.com/docs/integrations/chat/google_generative_ai/), you can leverage these callbacks to log input prompts, model outputs, token usage, and other relevant information.

Here's how you can collect LLM calls for analysis, along with sample code:

## <b><font color='darkblue'>1. Using LangChain's Built-in Callbacks (e.g., [`LangChainTracer`](https://python.langchain.com/api_reference/core/tracers/langchain_core.tracers.langchain.LangChainTracer.html) for LangSmith)</font></b>
<b><font size='3ptx'>LangChain has deep integration with [**LangSmith**](https://www.langchain.com/langsmith), its platform for debugging, testing, evaluating, and monitoring LLM applications</font></b>. If you set up LangSmith, all your LLM calls (including those from `ChatGoogleGenerativeAI`) will be automatically logged and available for analysis in the LangSmith UI.

### <b><font color='darkgreen'>Steps:</font></b>
* **Install LangSmith**: `pip install -U langsmith`
* **Set environment variables:**
```shell
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="your_langsmith_api_key"
export LANGCHAIN_PROJECT="your_project_name" # Optional, for organizing runs
```

* **Run your LangChain code**: When these environment variables are set, LangChain automatically instruments your LLM calls, sending the data to LangSmith.
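
When exporting shell variables is inconvenient (for example, inside a notebook), the same settings can be applied from Python. The following is a minimal sketch that assumes the variable names listed above; the helper name `enable_langsmith_tracing` and the placeholder values are hypothetical. Calling it before any LangChain objects are created is the conservative ordering.

```python
import os
from typing import Optional


def enable_langsmith_tracing(api_key: str, project: Optional[str] = None) -> None:
    """Set the tracing variables from the steps above for the current process.

    Hypothetical helper, not part of LangChain or LangSmith.
    """
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = api_key
    if project:
        # Optional: only used to group runs in the LangSmith UI
        os.environ["LANGCHAIN_PROJECT"] = project


# Placeholder values; substitute your own key and project name
enable_langsmith_tracing("your_langsmith_api_key", project="your_project_name")
```

Variables set this way apply only to the current Python process, which keeps the key out of shell history but means each new kernel or script has to set them again.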

### <b><font color='darkgreen'>Sample Code (Conceptual, as LangSmith handles the collection automatically):</font></b>

In [2]:
import os
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage, SystemMessage

# Ensure your Google API Key is set as an environment variable
# os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY"

# Ensure LangSmith environment variables are set (as described above)
# export LANGCHAIN_TRACING_V2="true"
# export LANGCHAIN_API_KEY="your_langsmith_api_key"
# export LANGCHAIN_PROJECT="My Gemini App"

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?"),
]

# Invoke the LLM - this call will be automatically traced by LangSmith
response = llm.invoke(messages)

In [3]:
response

AIMessage(content='The capital of France is Paris.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.0-flash', 'safety_ratings': []}, id='run--2ee2560b-57a3-412f-99e0-0206190ad95f-0')

You should then see the traced call on the LangSmith page, as shown below:
![ui](images/1.png)

## <b><font color='darkblue'>2. Custom Callback Handlers for Local Collection/Logging</font></b>
If you don't want to use LangSmith or need more granular control over where and how the data is collected, you can implement a custom [**BaseCallbackHandler**](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html). This allows you to define what happens at different stages of the LLM call (e.g., when a call starts, ends, or streams a chunk).

### <b><font color='darkgreen'>Steps:</font></b>
* **Create a custom callback handler**: Inherit from [**`langchain_core.callbacks.BaseCallbackHandler`**](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html) and override the methods you're interested in (e.g., `on_llm_start`, `on_chat_model_start`, `on_llm_end`). Chat models such as `ChatGoogleGenerativeAI` fire `on_chat_model_start` when a call begins and `on_llm_end` when it finishes; `BaseCallbackHandler` defines no `on_chat_model_end` hook.
* **Pass the handler to your LLM**: Supply it in a `callbacks` list, for example through `config={"callbacks": [...]}` when invoking the LLM or a chain.

In [6]:
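import os
import json
from typing import Any, Dict, List, Union
from uuid import UUID  # needed by UUIDEncoder below

import pandas as pd  # used for timestamps in the handler: pip install pandas
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage, SystemMessage, BaseMessage
from langchain_core.callbacks import BaseCallbackHandler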

# Ensure your Google API Key is set as an environment variable
# os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY"

# Custom JSON Encoder to handle UUID objects
class UUIDEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, UUID):
            # Convert UUID objects to their string representation
            return str(obj)
        # Let the base class default method raise the TypeError for other types
        return json.JSONEncoder.default(self, obj)

class LLMCallCollector(BaseCallbackHandler):
    """A custom callback handler to collect LLM call details."""

    def __init__(self):
        self.llm_calls = []

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        """Run when LLM starts running."""
        print(f"LLM Call Started: {serialized.get('name')}")
        self.llm_calls.append({
            "event": "llm_start",
            "model_name": serialized.get("name"),
            "prompts": prompts,
            "timestamp": pd.Timestamp.now().isoformat(), # Using pandas for timestamp for convenience
            "kwargs": kwargs
        })

    def on_llm_end(self, response: Any, **kwargs: Any) -> None:
        """Run when an LLM or chat model call finishes."""
        print("LLM Call Ended: Response received.")
        # `response` is an LLMResult, not an AIMessage. For chat models such as
        # ChatGoogleGenerativeAI, each generation wraps the AIMessage in `.message`.
        output_content = None
        token_usage = None

        generations = getattr(response, "generations", None)
        if generations and generations[0]:
            first = generations[0][0]
            message = getattr(first, "message", None)
            output_content = message.content if message is not None else first.text
            # Token counts, when the integration reports them, live on the message
            token_usage = getattr(message, "usage_metadata", None)

        self.llm_calls.append({
            "event": "llm_end",
            "output_content": output_content,
            "token_usage": token_usage,
            "timestamp": pd.Timestamp.now().isoformat(),
            "kwargs": kwargs
        })

    def on_chat_model_start(
        self,
        serialized: Dict[str, Any],
        messages: List[List[BaseMessage]],
        **kwargs: Any,
    ) -> Any:
        """Run when Chat Model starts running."""
        print(f"Chat Model Call Started: {serialized.get('name')}")
        # Convert BaseMessage objects to dictionary for easier logging
        formatted_messages = []
        for msg_list in messages:
            formatted_inner_messages = []
            for msg in msg_list:
                formatted_inner_messages.append({"type": msg.type, "content": msg.content})
            formatted_messages.append(formatted_inner_messages)

        self.llm_calls.append({
            "event": "chat_model_start",
            "model_name": serialized.get("name"),
            "input_messages": formatted_messages,
            "timestamp": pd.Timestamp.now().isoformat(),
            "kwargs": kwargs
        })

    # Note: BaseCallbackHandler defines no `on_chat_model_end` hook, so a method
    # with that name would never be invoked. Chat models report completion
    # through `on_llm_end` above, which is why the output below shows
    # "LLM Call Ended" after "Chat Model Call Started".

In [7]:
# Initialize the LLM and your custom collector
import pandas as pd # Used for timestamp, ensure it's installed: pip install pandas

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
collector = LLMCallCollector()

messages_to_send = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Tell me a fun fact about giraffes."),
]

# Invoke the LLM with the custom callback handler
response = llm.invoke(messages_to_send, config={"callbacks": [collector]})

print("\nLLM Response:")
print(response.content)

print("\n--- Collected LLM Call Data ---")
for call in collector.llm_calls:
    print(json.dumps(call, indent=2, default=str))

# Example of saving the collected data to a JSON file
with open("/tmp/llm_calls_log.json", "w") as f:
    json.dump(collector.llm_calls, f, indent=2, default=str)

print("\nLLM call data saved to /tmp/llm_calls_log.json")

Chat Model Call Started: ChatGoogleGenerativeAI
LLM Call Ended: Response received.

LLM Response:
Here's a fun fact about giraffes:

Giraffes only need to drink water once every few days! They get most of their hydration from the plants they eat. This is pretty handy when you live in a dry, African savanna!

--- Collected LLM Call Data ---
{
  "event": "chat_model_start",
  "model_name": "ChatGoogleGenerativeAI",
  "input_messages": [
    [
      {
        "type": "system",
        "content": "You are a helpful assistant."
      },
      {
        "type": "human",
        "content": "Tell me a fun fact about giraffes."
      }
    ]
  ],
  "timestamp": "2025-07-13T15:34:45.028323",
  "kwargs": {
    "run_id": "26c0c0d6-b15c-4f2e-9646-eae766e9cccb",
    "parent_run_id": null,
    "tags": [],
    "metadata": {
      "ls_provider": "google_genai",
      "ls_model_name": "gemini-2.0-flash",
      "ls_model_type": "chat",
      "ls_temperature": 0.7,
      "revision_id": "v0.1-317-g26179b9-

By using either LangSmith or custom callback handlers, you can effectively collect and analyze all your LLM calls made with [**`langchain_google_genai.ChatGoogleGenerativeAI`**](https://python.langchain.com/docs/integrations/chat/google_generative_ai/).
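
As one possible starting point for the analysis step, the sketch below pairs start and end events by `run_id` and computes per-call latency. It assumes records shaped like the entries in `collector.llm_calls` above (`event`, `timestamp`, and a `run_id` inside `kwargs`). The function name `summarize_calls` and the sample records are hypothetical, and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def summarize_calls(records):
    """Pair start/end events by run_id and compute per-call latency in seconds."""
    by_run = defaultdict(dict)
    for rec in records:
        # run_id may be a UUID object or a string, depending on how it was logged
        run_id = str(rec.get("kwargs", {}).get("run_id"))
        phase = "start" if rec["event"].endswith("_start") else "end"
        by_run[run_id][phase] = rec

    summary = []
    for run_id, phases in by_run.items():
        if "start" not in phases or "end" not in phases:
            continue  # skip calls without a matching start/end pair
        started = datetime.fromisoformat(phases["start"]["timestamp"])
        ended = datetime.fromisoformat(phases["end"]["timestamp"])
        summary.append({
            "run_id": run_id,
            "model_name": phases["start"].get("model_name"),
            "latency_s": (ended - started).total_seconds(),
            "token_usage": phases["end"].get("token_usage"),
        })
    return summary


# Illustrative records only (made-up values), shaped like collector.llm_calls
sample = [
    {"event": "chat_model_start", "model_name": "ChatGoogleGenerativeAI",
     "timestamp": "2025-01-01T00:00:00.000000", "kwargs": {"run_id": "run-1"}},
    {"event": "llm_end", "output_content": "hi", "token_usage": None,
     "timestamp": "2025-01-01T00:00:01.500000", "kwargs": {"run_id": "run-1"}},
]
print(summarize_calls(sample))
```

The same function can be applied to the list loaded back from the saved JSON file, since `json.dump(..., default=str)` already stores `run_id` as a string.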