feat: Structured logging by ncrispino · Pull Request #708 · massgen/MassGen

ncrispino · 2025-12-28T21:42:38Z

feat: Logfire Observability Integration

Summary

Add comprehensive structured logging and observability to MassGen using Logfire (by the Pydantic team). This enables detailed tracing of LLM calls, tool executions, and agent coordination for debugging, performance analysis, and production monitoring.

Key Features

1. Automatic LLM Instrumentation

OpenAI/OpenAI-compatible APIs - Automatic tracing of all Chat Completions calls
Anthropic Claude - Full request/response tracing with token usage
Google GenAI (Gemini) - Instrumented via OpenTelemetry GenAI integration
Captures: request parameters, response content, duration, token usage, costs

2. Tool Execution Tracing

MCP Tool Calls - Spans for every MCP server tool invocation
Custom Tools - Tracing for workflow-defined tools
Captures: execution time, input/output sizes, success/failure, error messages
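
The captured fields listed above could be gathered by a wrapper along the following lines. This is a minimal sketch, not the code in `massgen/tool/_manager.py`: the decorator `traced_tool`, the `RECORDED_SPANS` list, and the attribute keys other than `tool.name`, `tool.type`, and `tool.execution_time_ms` (which appear in the query examples in this PR) are assumptions made for illustration.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink standing in for a real span exporter.
RECORDED_SPANS: List[Dict[str, Any]] = []


def traced_tool(tool_name: str, tool_type: str = "custom") -> Callable:
    """Wrap a tool function and record timing, sizes, and outcome."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            attributes: Dict[str, Any] = {
                "tool.name": tool_name,
                "tool.type": tool_type,
                # Length of the argument repr, used as a rough input size.
                "tool.input_chars": len(repr((args, kwargs))),
            }
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
                attributes["tool.success"] = True
                attributes["tool.output_chars"] = len(repr(result))
                return result
            except Exception as exc:
                attributes["tool.success"] = False
                attributes["tool.error_message"] = str(exc)
                raise  # tracing must not swallow the tool's error
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                attributes["tool.execution_time_ms"] = elapsed_ms
                RECORDED_SPANS.append(attributes)

        return wrapper

    return decorator


@traced_tool("word_count")
def word_count(text: str) -> int:
    return len(text.split())


@traced_tool("always_fails")
def always_fails() -> None:
    raise ValueError("boom")
```

Recording in the `finally` clause means a span is emitted on both the success and the failure path, and the original exception still reaches the caller.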

3. Agent Coordination Observability

Per-round spans - Track each agent's coordination round (initial_answer, voting, presentation)
Token usage logging - Structured logs for input/output/reasoning/cached tokens
Coordination events - Vote submissions, winner selection, round transitions

4. Graceful Degradation

Falls back to loguru logging when Logfire is disabled/unavailable
Zero performance impact when observability is off
No-op spans ensure code works identically with or without Logfire

Usage

CLI Flag (Recommended)

massgen --logfire --config your_config.yaml "Your question"

Environment Variable

export MASSGEN_LOGFIRE_ENABLED=true
massgen --config your_config.yaml "Your question"

Programmatic

from massgen.structured_logging import configure_observability

configure_observability(enabled=True, service_name="my-app")

Setup

1. Create an account at https://logfire.pydantic.dev/
2. Authenticate: logfire auth login
3. Run MassGen with the --logfire flag

Type of Change

New feature (feat:) - Non-breaking change which adds functionality

Files Changed

New Files

File	Lines	Description
`massgen/structured_logging.py`	+1,419	Core observability module with TracerProxy, span creation, logging utilities
`massgen/tests/test_structured_logging.py`	+581	Comprehensive test coverage for all observability functions
`docs/source/user_guide/logging.rst`	+321	User documentation with SQL query examples
`massgen/skills/massgen-log-analyzer/SKILL.md`	+924	Skill for AI-assisted log analysis via Logfire MCP
`massgen/coordination_tracker.py`	+85	Coordination state tracking for observability

Modified Files

File	Changes	Description
`massgen/cli.py`	+36	`--logfire` flag, `_setup_logfire_observability()` helper
`massgen/orchestrator.py`	+172	Agent execution spans, round tracking, token usage logging
`massgen/tool/_manager.py`	+139	Custom tool execution tracing
`massgen/mcp_tools/client.py`	+121	MCP tool call spans with timing/metrics
`massgen/subagent/manager.py`	+101	Subagent execution tracing
`massgen/backend/base_with_custom_tool_and_mcp.py`	+207	Tool execution spans in unified streaming
`massgen/backend/claude_code.py`	+85	LLM span tracking for Claude Code backend
`massgen/backend/claude.py`	+61	Client instrumentation for Anthropic
`massgen/backend/chat_completions.py`	+32	Client instrumentation for OpenAI
`massgen/persona_generator.py`	+67	Persona generation tracing
`massgen/logger_config.py`	+25	Logfire-loguru integration
`docs/source/reference/cli.rst`	+18	CLI reference for --logfire flag
`pyproject.toml`	+1	logfire dependency

Architecture

TracerProxy Pattern

# Get tracer (works whether Logfire is enabled or not)
tracer = get_tracer()

# Create spans (no-op if Logfire disabled)
with tracer.span("operation", attributes={"key": "value"}) as span:
    result = do_work()
    span.set_attribute("result", result)
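
To illustrate why the calling code above does not need to branch on whether Logfire is enabled, here is a minimal sketch of a proxy with a no-op fallback. It is not the `TracerProxy` from `massgen/structured_logging.py`; the class names `SketchTracerProxy`, `NoOpSpan`, and `RecordingSpan` are hypothetical, and a real backend span is replaced by an in-memory object.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Optional, Union


class NoOpSpan:
    """Span stand-in that accepts attributes and discards them."""

    def set_attribute(self, key: str, value: Any) -> None:
        pass


class RecordingSpan:
    """Tiny in-memory span standing in for a real backend span."""

    def __init__(self, name: str, attributes: Optional[Dict[str, Any]] = None) -> None:
        self.name = name
        self.attributes: Dict[str, Any] = dict(attributes or {})

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


class SketchTracerProxy:
    """Hypothetical proxy: same interface whether tracing is on or off."""

    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled
        self.finished: List[RecordingSpan] = []

    @contextmanager
    def span(
        self, name: str, attributes: Optional[Dict[str, Any]] = None
    ) -> Iterator[Union[NoOpSpan, RecordingSpan]]:
        if not self.enabled:
            # Disabled: hand back an object with the same methods that does nothing.
            yield NoOpSpan()
            return
        span = RecordingSpan(name, attributes)
        try:
            yield span
        finally:
            # Record the span even if the body raised.
            self.finished.append(span)
```

With this shape, `with tracer.span(...) as span:` and `span.set_attribute(...)` behave the same for callers in both modes; only the disabled proxy records nothing.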

Context Propagation

# Set round context (auto-propagated to nested tool calls)
set_current_round(round_number=2, round_type="voting")

# Tool calls automatically inherit context
await mcp_client.call_tool("search", {...})  # Gets round=2, type=voting
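
One common way to get this kind of implicit propagation in async Python is `contextvars`. The sketch below assumes that approach; the PR text does not state how `set_current_round` is implemented. The helper names (`set_round`, `round_attributes`, `fake_tool_call`) and the attribute keys `massgen.round` and `massgen.round_type` are hypothetical.

```python
import asyncio
import contextvars
from typing import Any, Dict, List

# Holds (round_number, round_type) for the current task; None when unset.
_round_ctx = contextvars.ContextVar("round_ctx", default=None)


def set_round(round_number: int, round_type: str) -> None:
    """Store the round for the current task and anything it awaits."""
    _round_ctx.set((round_number, round_type))


def round_attributes() -> Dict[str, Any]:
    """Return span attributes for the active round, or {} if none is set."""
    current = _round_ctx.get()
    if current is None:
        return {}
    number, kind = current
    return {"massgen.round": number, "massgen.round_type": kind}


async def fake_tool_call(tool_name: str) -> Dict[str, Any]:
    """Stand-in for a tool call that attaches the inherited round context."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"tool.name": tool_name, **round_attributes()}


async def agent_round(round_number: int, round_type: str) -> Dict[str, Any]:
    set_round(round_number, round_type)
    # The tool call never receives the round explicitly.
    return await fake_tool_call("search")


async def main() -> List[Dict[str, Any]]:
    # gather() runs each coroutine in its own task with its own context copy,
    # so concurrent agents do not overwrite each other's round.
    return await asyncio.gather(
        agent_round(1, "initial_answer"),
        agent_round(2, "voting"),
    )


if __name__ == "__main__":
    for attributes in asyncio.run(main()):
        print(attributes)
```

Because each task gets its own copy of the context, the value set by one agent is visible to its nested calls but not to agents running concurrently.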

Span Hierarchy

coordination.session
├── coordination.iteration.1
│   ├── agent.agent_a.round_1 (initial_answer)
│   │   ├── llm.openai.stream
│   │   └── mcp.filesystem.read_file
│   ├── agent.agent_b.round_1 (initial_answer)
│   └── ...
├── coordination.iteration.2 (voting)
└── presentation.agent_a

Querying Data

Logfire data is queried with SQL via DataFusion. The main table is `records`; custom attributes are stored in its `attributes` JSON column:

-- Find slowest tool calls
SELECT
  attributes->>'tool.name' as tool_name,
  (attributes->>'tool.execution_time_ms')::float as time_ms,
  attributes->>'massgen.agent_id' as agent_id
FROM records
WHERE attributes->>'tool.type' = 'mcp'
ORDER BY (attributes->>'tool.execution_time_ms')::float DESC
LIMIT 10;

-- Token usage by agent
SELECT
  attributes->>'massgen.agent_id' as agent,
  SUM((attributes->>'massgen.usage.input')::int) as input_tokens,
  SUM((attributes->>'massgen.usage.output')::int) as output_tokens
FROM records
WHERE attributes->>'massgen.usage.input' IS NOT NULL
GROUP BY 1;
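
For checking a query result offline, the token-usage aggregation can be mirrored in plain Python. The sketch below assumes rows have already been exported as dictionaries with an `attributes` mapping; the function name `token_usage_by_agent` and the sample rows are made up for illustration and are not real MassGen output.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def token_usage_by_agent(rows: Iterable[Mapping[str, Any]]) -> Dict[Any, Dict[str, int]]:
    """Sum input/output tokens per agent, like the 'Token usage by agent' query."""
    totals: Dict[Any, Dict[str, int]] = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0}
    )
    for row in rows:
        attrs = row.get("attributes") or {}
        # Mirrors: WHERE attributes->>'massgen.usage.input' IS NOT NULL
        if attrs.get("massgen.usage.input") is None:
            continue
        agent = attrs.get("massgen.agent_id")
        totals[agent]["input_tokens"] += int(attrs["massgen.usage.input"])
        # Simplification: a missing output count is treated as 0 here.
        totals[agent]["output_tokens"] += int(attrs.get("massgen.usage.output") or 0)
    return dict(totals)


# Illustrative rows only.
sample_rows = [
    {"attributes": {"massgen.agent_id": "agent_a", "massgen.usage.input": "100", "massgen.usage.output": "20"}},
    {"attributes": {"massgen.agent_id": "agent_a", "massgen.usage.input": "50", "massgen.usage.output": "5"}},
    {"attributes": {"massgen.agent_id": "agent_b", "massgen.usage.input": "10"}},
    {"attributes": {"tool.name": "read_file"}},  # no usage data, filtered out
]
usage = token_usage_by_agent(sample_rows)
```

The values are cast with `int(...)` because `->>` returns text in the SQL version, which is also why the queries above cast with `::int` and `::float`.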

Environment Variables

Variable	Description
`MASSGEN_LOGFIRE_ENABLED`	Set to `true` to enable (alternative to `--logfire`)
`LOGFIRE_TOKEN`	API token (if not using `logfire auth login`)
`LOGFIRE_SERVICE_NAME`	Override service name (default: `massgen`)
`LOGFIRE_ENVIRONMENT`	Set environment tag (e.g., `production`)
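
How these variables might combine with the `--logfire` flag can be sketched as follows. This is an assumption-laden illustration, not the logic in `massgen/cli.py`: the `ObservabilitySettings` dataclass and `resolve_settings` function are hypothetical, and treating the flag and the variable as alternatives (either one enables tracing) and comparing `true` case-insensitively are guesses based on the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ObservabilitySettings:
    enabled: bool
    service_name: str
    environment: Optional[str]
    has_token: bool


def resolve_settings(
    cli_flag: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> ObservabilitySettings:
    """Resolve settings from the CLI flag and environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: only the literal value "true" (any case) enables tracing.
    env_enabled = env.get("MASSGEN_LOGFIRE_ENABLED", "").strip().lower() == "true"
    return ObservabilitySettings(
        enabled=cli_flag or env_enabled,
        # The default service name "massgen" is taken from the table above.
        service_name=env.get("LOGFIRE_SERVICE_NAME") or "massgen",
        environment=env.get("LOGFIRE_ENVIRONMENT") or None,
        # Without a token, credentials from `logfire auth login` would be needed.
        has_token=bool(env.get("LOGFIRE_TOKEN")),
    )
```

Passing `env` explicitly keeps the function easy to test without mutating the real process environment.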

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my feature works
I have run pre-commit on my changed files

Pre-commit Status

# TODO: Run before PR submission
uv run pre-commit run --all-files

How to Test

Basic Test

# Authenticate with Logfire first
logfire auth login

# Run with observability enabled
uv run massgen --logfire --backend openai --model gpt-4o-mini "What is 2+2?"

# Check Logfire dashboard for traces

Unit Tests

uv run pytest massgen/tests/test_structured_logging.py -v

Verify Graceful Degradation

# Without Logfire - should work identically
uv run massgen --backend openai --model gpt-4o-mini "What is 2+2?"

Henry-811 merged 7 commits into dev/v0.1.31 from structured_logging
