<a target="_blank" href="https://colab.research.google.com/github/urcraft/llm_lecture_notebooks/blob/main/07_Gemini_Langfuse_Tracing_Basics.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>


# Gemini + Langfuse: Logging and Tracing Gemini Calls

This notebook shows how to log and trace Gemini API calls in Langfuse using automatic OpenTelemetry instrumentation.


## What you will learn

- How to configure Gemini and Langfuse credentials via Google Colab Secrets.
- How to enable automatic tracing for the Google GenAI SDK.
- How to run Gemini calls and verify traces in the Langfuse UI.


## Setup notes

1. Add these secrets in Colab (`Tools -> Secrets`):
   - `GOOGLE_API_KEY`
   - `LANGFUSE_PUBLIC_KEY`
   - `LANGFUSE_SECRET_KEY`
2. Optional secret:
   - `LANGFUSE_BASE_URL` (defaults to the EU region, `https://cloud.langfuse.com`; use `https://us.cloud.langfuse.com` for the US region)
3. You can also set these as environment variables.
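
Outside Colab, one option is to set the variables from Python before the credential-loading cell runs. The sketch below uses a hypothetical helper, `apply_env_defaults` (not part of any library), that fills in only the variables that are not already set. The placeholder values are assumptions and must be replaced with your own keys.

```python
import os

def apply_env_defaults(values, env=None):
    """Set each variable only if it is not already set; return the names that were added."""
    env = os.environ if env is None else env
    added = []
    for name, value in values.items():
        if not env.get(name):
            env[name] = value
            added.append(name)
    return added

# Placeholder values for illustration only; replace with your real keys.
example_env = {'LANGFUSE_BASE_URL': 'https://us.cloud.langfuse.com'}
added = apply_env_defaults(
    {
        'GOOGLE_API_KEY': 'your-google-api-key',
        'LANGFUSE_BASE_URL': 'https://cloud.langfuse.com',
    },
    env=example_env,  # omit this argument to write to os.environ instead
)
print(added)
```

Because existing values are never overwritten, a US base URL that is already set stays in place.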


In [1]:
%pip -q install -U google-genai langfuse openinference-instrumentation-google-genai pandas

In [2]:
import os
import time
import pandas as pd

SDK_AVAILABLE = False
SDK_ERROR = None

try:
    from langfuse import get_client
    from openinference.instrumentation.google_genai import GoogleGenAIInstrumentor
    from google import genai
    SDK_AVAILABLE = True
except Exception as e:
    SDK_ERROR = e
    print('Import/setup warning:', repr(e))


In [3]:
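def get_secret(name: str, default=None):
    """Return a secret from the environment, then Colab Secrets, else `default`."""
    # Environment variables take precedence so the notebook also runs outside Colab.
    value = os.getenv(name)
    if value:
        return value

    try:
        from google.colab import userdata
        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not running in Colab, or the secret is missing or not shared with this notebook.
        pass

    return default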

GOOGLE_API_KEY = get_secret('GOOGLE_API_KEY')
LANGFUSE_PUBLIC_KEY = get_secret('LANGFUSE_PUBLIC_KEY')
LANGFUSE_SECRET_KEY = get_secret('LANGFUSE_SECRET_KEY')
LANGFUSE_BASE_URL = get_secret('LANGFUSE_BASE_URL', 'https://cloud.langfuse.com')

missing = [
    name for name, value in [
        ('GOOGLE_API_KEY', GOOGLE_API_KEY),
        ('LANGFUSE_PUBLIC_KEY', LANGFUSE_PUBLIC_KEY),
        ('LANGFUSE_SECRET_KEY', LANGFUSE_SECRET_KEY),
    ]
    if not value
]

if missing:
    raise ValueError(
        'Missing required secrets/env vars: ' + ', '.join(missing) +
        '. Add them in Colab Secrets or set environment variables.'
    )

os.environ['GOOGLE_API_KEY'] = GOOGLE_API_KEY
os.environ['LANGFUSE_PUBLIC_KEY'] = LANGFUSE_PUBLIC_KEY
os.environ['LANGFUSE_SECRET_KEY'] = LANGFUSE_SECRET_KEY
os.environ['LANGFUSE_BASE_URL'] = LANGFUSE_BASE_URL

print('Secrets loaded successfully.')
print('LANGFUSE_BASE_URL =', LANGFUSE_BASE_URL)


Secrets loaded successfully.
LANGFUSE_BASE_URL = https://cloud.langfuse.com


In [4]:
if not SDK_AVAILABLE:
    raise RuntimeError(f'SDK imports failed: {SDK_ERROR!r}')

# Initialize Langfuse client and check credentials.
langfuse = get_client()
# Raise explicitly: assert statements are skipped when Python runs with -O.
if not langfuse.auth_check():
    raise RuntimeError('Langfuse auth failed. Verify keys and LANGFUSE_BASE_URL.')

# Enable automatic tracing for Google GenAI SDK calls.
GoogleGenAIInstrumentor().instrument()

print('Langfuse authentication OK.')
print('Gemini instrumentation enabled.')


Langfuse authentication OK.
Gemini instrumentation enabled.
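
Conceptually, automatic instrumentation wraps the SDK's call methods so that each call records its input, output, and duration without changes to your own code. The sketch below illustrates that idea in plain Python. `FakeModels`, `instrument`, and the `spans` list are hypothetical stand-ins, not the actual OpenInference or OpenTelemetry implementation.

```python
import functools
import time

spans = []  # stand-in for an exporter that would send spans to a backend

class FakeModels:
    """Hypothetical stand-in for an SDK client; it just echoes the prompt."""
    def generate_content(self, model, contents):
        return f'[{model}] echo: {contents}'

def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records one span per successful call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        spans.append({
            'name': f'{cls.__name__}.{method_name}',
            'input': kwargs,
            'output': result,
            'duration_s': time.perf_counter() - start,
        })
        return result

    setattr(cls, method_name, wrapper)

instrument(FakeModels, 'generate_content')
reply = FakeModels().generate_content(model='demo-model', contents='hello')
print(reply)
print(spans[0]['name'], spans[0]['input'])
```

The same reasoning explains why the instrumentation cell must run before any Gemini call: calls made before the wrapper is installed are not recorded.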


In [5]:
MODEL_ID = 'gemini-2.5-flash'
client = genai.Client(api_key=GOOGLE_API_KEY)
print('Model:', MODEL_ID)


Model: gemini-2.5-flash


In [7]:
def run_gemini(prompt: str):
    """Call Gemini once and return a dict with status, prompt, text, and latency in seconds."""
    start = time.perf_counter()
    try:
        response = client.models.generate_content(
            model=MODEL_ID,
            contents=prompt,
        )
        elapsed = time.perf_counter() - start
        return {
            'ok': True,
            'prompt': prompt,
            'text': (response.text or '').strip(),
            'latency_s': round(elapsed, 3),
        }
    except Exception as e:
        elapsed = time.perf_counter() - start
        return {
            'ok': False,
            'prompt': prompt,
            'text': f'ERROR: {e}',
            'latency_s': round(elapsed, 3),
        }

first_result = run_gemini('In 4 bullet points, explain what observability means for LLM applications.')
print('Success:', first_result['ok'])
print('Latency (s):', first_result['latency_s'])
print('Response:', first_result['text'])


Success: True
Latency (s): 8.582
Response: Observability for LLM applications extends traditional software observability to specifically address the unique challenges and characteristics of generative AI:

*   **Logs & Context:** Capturing comprehensive records of all inputs (user prompts, system prompts, retrieved context), intermediate steps (RAG retrievals, tool calls, moderation results), and LLM outputs for every request. This allows for detailed post-hoc analysis, debugging of unexpected behavior, and understanding the LLM's reasoning path.
*   **Performance Metrics:** Monitoring key quantitative indicators such as latency per LLM call and overall request, token consumption (input, output, total) for cost tracking, API error rates, and throughput. These metrics are crucial for identifying performance bottlenecks, managing operational costs, and optimizing resource usage.
*   **End-to-End Tracing:** Visualizing the complete flow of a user request through multi-step LLM chains, cus

In [8]:
prompts = [
    'Give a one-sentence definition of tracing for LLM apps.',
    'Name three reasons to monitor prompt/response latency.',
    'Write a short note to students on why logs + traces help debugging.'
]

rows = []
for p in prompts:
    out = run_gemini(p)
    rows.append({
        'ok': out['ok'],
        'latency_s': out['latency_s'],
        'prompt': p,
        'response_preview': (out['text'][:160] + '...') if len(out['text']) > 160 else out['text'],
    })

results_df = pd.DataFrame(rows)
results_df


|   | ok | latency_s | prompt | response_preview |
|---|---|---|---|---|
| 0 | True | 4.811 | Give a one-sentence definition of tracing for ... | Tracing for LLM apps is the process of recordi... |
| 1 | True | 5.92 | Name three reasons to monitor prompt/response ... | Here are three reasons to monitor prompt/respo... |
| 2 | True | 8.142 | Write a short note to students on why logs + t... | Hey Students,\n\nEver wondered how to debug a ... |


In [9]:
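# Writing .xlsx requires an Excel engine such as openpyxl; install it if this line fails outside Colab.
results_df.to_excel('results.xlsx', index=False)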
print('DataFrame saved to results.xlsx')

try:
    from google.colab import files
    files.download('results.xlsx')
except Exception:
    print('If not running in Colab, download the file from the notebook working directory.')

DataFrame saved to results.xlsx



## View traces in Langfuse

After running the calls above:

1. Open your Langfuse project.
2. Go to the Trace view/table.
3. Filter to recent traces and inspect entries from this notebook run.
4. Confirm you can see model calls, inputs, outputs, and timing metadata.

Tip: If traces do not appear immediately, wait a few seconds and refresh.
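
The Langfuse SDK sends events in the background, so traces from the last few calls may still be queued when you open the UI. Calling `flush()` on the client asks it to send whatever is pending. A minimal sketch, assuming `langfuse` is the client created earlier with `get_client()`; the `flush_traces` helper and `StubClient` are hypothetical and exist only so the example runs without credentials.

```python
def flush_traces(client):
    """Ask a tracing client to send any queued events; return True on success."""
    try:
        client.flush()
        return True
    except Exception as exc:
        print('Flush failed:', repr(exc))
        return False

class StubClient:
    """Hypothetical stand-in so this sketch runs without Langfuse credentials."""
    def __init__(self):
        self.flushed = False

    def flush(self):
        self.flushed = True

stub = StubClient()
ok = flush_traces(stub)
print(ok, stub.flushed)

# In the notebook, pass the real client instead:
# flush_traces(langfuse)
```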


## Checkpoint

- Did at least one Gemini call produce output in the notebook?
- Do you see matching traces in Langfuse?
- Which prompt had the highest latency and why might that happen?
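
For the last question, the slowest prompt can be read off the results directly. The sketch below uses plain Python on rows shaped like those in `rows`. The latencies are copied from the sample output above and will differ on your run, and `slowest_row` is a hypothetical helper.

```python
def slowest_row(rows):
    """Return the row with the largest latency_s, or None for an empty list."""
    return max(rows, key=lambda r: r['latency_s'], default=None)

sample_rows = [
    {'latency_s': 4.811, 'prompt': 'Give a one-sentence definition of tracing for LLM apps.'},
    {'latency_s': 5.92, 'prompt': 'Name three reasons to monitor prompt/response latency.'},
    {'latency_s': 8.142, 'prompt': 'Write a short note to students on why logs + traces help debugging.'},
]

slowest = slowest_row(sample_rows)
print(slowest['latency_s'], '-', slowest['prompt'])
```

With the DataFrame, `results_df.sort_values('latency_s', ascending=False)` gives the same ranking. Longer responses generally take longer to generate, so an open-ended prompt being slowest is plausible, though network conditions and server load also vary between calls.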


## Troubleshooting

- `Missing required secrets/env vars`: verify secret names exactly.
- `Langfuse auth failed`: check `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_BASE_URL`.
- No traces visible: ensure the instrumentation cell ran before Gemini calls.
- Colab runtime reset: rerun all setup cells after reset.
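
A common cause of the first error is a name that is almost right, such as a missing underscore. The sketch below uses the standard library's `difflib` to suggest the closest required name for each unrecognized candidate. `suggest_names` is a hypothetical helper, and it only works on names you can list yourself, such as the keys of `os.environ`.

```python
import difflib

REQUIRED = ['GOOGLE_API_KEY', 'LANGFUSE_PUBLIC_KEY', 'LANGFUSE_SECRET_KEY']

def suggest_names(candidates, required=REQUIRED, cutoff=0.8):
    """Map each near-miss candidate name to the required name it most resembles."""
    suggestions = {}
    for name in candidates:
        if name in required:
            continue  # exact match, nothing to fix
        close = difflib.get_close_matches(name, required, n=1, cutoff=cutoff)
        if close:
            suggestions[name] = close[0]
    return suggestions

found = suggest_names(['GOOGLE_APIKEY', 'LANGFUSE_PUBLIC_KEY', 'PATH'])
print(found)
```

For environment variables, `suggest_names(os.environ)` checks every variable name currently set.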
