---
title: Query Data in Langfuse via the SDK
description: All data in Langfuse is available via API. This Python notebook includes a number of examples of how to use the Langfuse SDK to query data.
category: Examples
---

# Example: Query Data in Langfuse via the SDK

This notebook demonstrates how to programmatically access your LLM observability data from Langfuse using the Python SDK. As outlined in our [documentation](https://langfuse.com/docs/query-traces), Langfuse provides several methods to fetch traces, observations, and sessions for various use cases like collecting few-shot examples, creating datasets, or preparing training data for fine-tuning.

We'll explore the main query functions and show practical examples of filtering and processing the returned data.

**This notebook is a work in progress. Feel free to contribute additional examples that you find useful.**

## Setup

In [None]:
!pip install langfuse --upgrade

In [1]:
import os

# Get keys for your project from the project settings page: https://cloud.langfuse.com
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..." 
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..." 
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com" # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com" # 🇺🇸 US region

# Your OpenAI API key
os.environ["OPENAI_API_KEY"] = "sk-proj-..."

In [2]:
from langfuse import get_client

langfuse = get_client()

In [3]:
import pandas as pd


# Helper function used in the commented-out preview lines below
def pydantic_list_to_dataframe(pydantic_list):
    """Convert a list of pydantic objects to a pandas DataFrame."""
    return pd.DataFrame([item.dict() for item in pydantic_list])

## Fetch Multiple Traces

Default: get the last 50 traces

In [4]:
traces = langfuse.api.trace.list(limit=50)
# pydantic_list_to_dataframe(traces.data).head(1)

Get traces created by a specific user

In [5]:
traces = langfuse.api.trace.list(user_id="u-svQKrql")
# pydantic_list_to_dataframe(traces.data).head(4)
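
Filters can also be combined. The sketch below builds the keyword arguments for a query restricted to one user and a recent time window. It assumes that `langfuse.api.trace.list` accepts a timezone-aware `from_timestamp` alongside `user_id` and `limit`; check the API reference for your SDK version. The helper name `recent_trace_filters` is hypothetical and not part of the SDK.

```python
from datetime import datetime, timedelta, timezone


def recent_trace_filters(user_id, days=7, limit=50, now=None):
    """Build keyword arguments for a trace query covering the last `days` days.

    Hypothetical helper; `from_timestamp` is assumed to be a supported filter.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "user_id": user_id,
        "from_timestamp": now - timedelta(days=days),
        "limit": limit,
    }


filters = recent_trace_filters("u-svQKrql", days=7)
# With a configured client, the query would then be:
# traces = langfuse.api.trace.list(**filters)
```

Keeping the filter construction separate from the API call makes the time window easy to unit-test without network access.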

Fetch many traces via pagination:

In [6]:
all_traces = []
limit = 50  # Page size; adjust to balance request count against response size.
max_traces = 1000  # Safety cap so the loop also stops on large projects.
page = 1
while True:
    traces = langfuse.api.trace.list(limit=limit, page=page)
    all_traces.extend(traces.data)
    # Stop on a short (final) page or once the cap is reached.
    if len(traces.data) < limit or len(all_traces) >= max_traces:
        break
    page += 1

print(f"Retrieved {len(all_traces)} traces.")

Retrieved 1000 traces.
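
The same page-by-page pattern applies to any paginated list endpoint. As a minimal sketch, the generator below factors the loop out so it can be reused. `iter_pages`, `fake_fetch`, and `fake_store` are hypothetical names introduced for illustration, and the in-memory list stands in for the API so the example runs offline.

```python
def iter_pages(fetch_page, limit=50, max_items=1000):
    """Yield items page by page until a short page or `max_items` is reached.

    `fetch_page(page, limit)` must return a list with at most `limit` items.
    """
    page, yielded = 1, 0
    while yielded < max_items:
        items = fetch_page(page, limit)
        for item in items:
            if yielded >= max_items:
                return
            yield item
            yielded += 1
        if len(items) < limit:  # Short page: nothing left to fetch.
            return
        page += 1


# Stand-in data source so the sketch runs without network access.
fake_store = list(range(120))


def fake_fetch(page, limit):
    start = (page - 1) * limit
    return fake_store[start:start + limit]


items = list(iter_pages(fake_fetch, limit=50))
print(f"Retrieved {len(items)} items.")
```

Against Langfuse, the fetch function could be `lambda page, limit: langfuse.api.trace.list(page=page, limit=limit).data`, mirroring the call used in the loop above.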


## Fetch Single Trace

Simple example: fetch a single trace by ID. The response contains the full trace, including evals, observation inputs and outputs, timings, and costs. Uncomment the `print` line to render it as JSON.

In [8]:
trace = langfuse.api.trace.get("9a6071d27444a0f5d7cc431ecdd72e70")
# print(trace.json(indent=1))

Summarize token usage by model

In [9]:
trace = langfuse.api.trace.get("9a6071d27444a0f5d7cc431ecdd72e70")
observations = trace.observations

In [10]:
import pandas as pd

def summarize_usage(observations):
    """Summarizes usage data grouped by model."""

    usage_data = []
    for obs in observations:
        usage = obs.usage
        if usage:
            usage_data.append({
                'model': obs.model,
                'input': usage.input,
                'output': usage.output,
                'total': usage.total,
            })

    df = pd.DataFrame(usage_data)
    if df.empty:
        return pd.DataFrame()

    summary = df.groupby('model').sum()
    return summary

# Example usage with the `observations` fetched in the previous cell
summary_df = summarize_usage(observations)
summary_df

| model | input | output | total |
| --- | --- | --- | --- |
| gpt-4o-2024-08-06 | 31 | 100 | 131 |
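
If pandas is not available, the same grouping can be sketched with the standard library. The version below assumes each observation exposes `model` and a `usage` object with `input`, `output`, and `total` attributes, as in the cell above. The `SimpleNamespace` objects and the model names `model-a` and `model-b` are hypothetical stand-ins, not real API responses.

```python
from collections import defaultdict
from types import SimpleNamespace


def usage_totals_by_model(observations):
    """Sum input, output, and total token counts per model."""
    totals = defaultdict(lambda: {"input": 0, "output": 0, "total": 0})
    for obs in observations:
        usage = getattr(obs, "usage", None)
        if not usage:
            continue  # Skip observations without usage data.
        for key in ("input", "output", "total"):
            # Treat missing or None counts as zero.
            totals[obs.model][key] += getattr(usage, key, None) or 0
    return dict(totals)


# Hypothetical stand-ins for observations returned by the API.
sample = [
    SimpleNamespace(model="model-a", usage=SimpleNamespace(input=10, output=5, total=15)),
    SimpleNamespace(model="model-a", usage=SimpleNamespace(input=1, output=2, total=3)),
    SimpleNamespace(model="model-b", usage=None),
]
totals = usage_totals_by_model(sample)
print(totals)
```

Observations without usage data are skipped entirely, so a model only appears in the result if at least one of its observations reported usage.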


## Fetch Multiple Observations

Simple example:

In [11]:
observations = langfuse.api.observations.get_many(limit=50)
# pydantic_list_to_dataframe(observations.data).head(1)

In [12]:
observations

ObservationsViews(data=[ObservationsView(id='a90e84da299a0231', trace_id='e1e6a3c770b793d3605360fe3d42581a', type='GENERATION', name='OpenAI-generation', start_time=datetime.datetime(2025, 6, 12, 16, 35, 45, 184000, tzinfo=datetime.timezone.utc), end_time=datetime.datetime(2025, 6, 12, 16, 35, 49, 212000, tzinfo=datetime.timezone.utc), completion_start_time=None, model='gpt-4o-mini-2024-07-18', model_parameters={'temperature': 1, 'max_tokens': 'Infinity', 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}, input=[{'content': [{'type': 'text', 'text': "What's in this image?"}, {'type': 'image_url', 'image_url': {'url': '@@@langfuseMedia:type=image/jpeg|id=i5BuV2qX9nPaAAPf7c0gCY|source=base64_data_uri@@@'}}], 'role': 'user'}], version=None, metadata={'resourceAttributes': {'telemetry.sdk.language': 'python', 'telemetry.sdk.name': 'opentelemetry', 'telemetry.sdk.version': '1.33.1', 'service.name': 'unknown_service'}, 'scope': {'name': 'langfuse-sdk', 'version': '3.0.1', 'attribute
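
One use of a batch of observations is collecting few-shot examples. The sketch below keeps observations of type `GENERATION` that have both an input and an output. It assumes each observation exposes `type`, `input`, and `output` attributes; `to_few_shot_examples` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the example runs without API access.

```python
from types import SimpleNamespace


def to_few_shot_examples(observations, max_examples=5):
    """Collect up to `max_examples` input/output pairs from generations."""
    examples = []
    for obs in observations:
        if getattr(obs, "type", None) != "GENERATION":
            continue  # Only model generations are useful as examples.
        obs_input = getattr(obs, "input", None)
        obs_output = getattr(obs, "output", None)
        if obs_input is None or obs_output is None:
            continue  # Skip incomplete pairs.
        examples.append({"input": obs_input, "output": obs_output})
        if len(examples) >= max_examples:
            break
    return examples


# Hypothetical stand-ins for observations returned by the API.
sample_observations = [
    SimpleNamespace(type="GENERATION", input="Q1", output="A1"),
    SimpleNamespace(type="SPAN", input="x", output="y"),
    SimpleNamespace(type="GENERATION", input="Q2", output=None),
    SimpleNamespace(type="GENERATION", input="Q3", output="A3"),
]
examples = to_few_shot_examples(sample_observations)
print(examples)
```

With real data, the helper would be called on `observations.data` from the cell above.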

## Fetch Single Observation

In [13]:
observation = langfuse.api.observations.get("24ef0545d1a9d66d")
# print(observation.json(indent=1))

## Fetch Multiple Sessions

Simple example:

In [17]:
sessions = langfuse.api.sessions.list(limit=50)
# pydantic_list_to_dataframe(sessions.data).head(1)