
feat: Structured logging #708

Merged
Henry-811 merged 7 commits into dev/v0.1.31 from structured_logging
Dec 29, 2025

@ncrispino
Collaborator

feat: Logfire Observability Integration

Summary

Add comprehensive structured logging and observability to MassGen using Logfire (by the Pydantic team). This enables detailed tracing of LLM calls, tool executions, and agent coordination for debugging, performance analysis, and production monitoring.

Key Features

1. Automatic LLM Instrumentation

  • OpenAI/OpenAI-compatible APIs - Automatic tracing of all Chat Completions calls
  • Anthropic Claude - Full request/response tracing with token usage
  • Google GenAI (Gemini) - Instrumented via OpenTelemetry GenAI integration
  • Captures: request parameters, response content, duration, token usage, costs

2. Tool Execution Tracing

  • MCP Tool Calls - Spans for every MCP server tool invocation
  • Custom Tools - Tracing for workflow-defined tools
  • Captures: execution time, input/output sizes, success/failure, error messages
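
The captured fields listed above could be gathered by a small wrapper around each tool call. The following is a minimal sketch, not the code in this PR: run_traced_tool is a hypothetical helper, and only tool.name and tool.execution_time_ms match attribute names that appear in the queries later in this description. The other attribute names are assumptions.

```python
import asyncio
import json
import time
from typing import Any, Awaitable, Callable, Dict, Tuple

ToolFunc = Callable[[Dict[str, Any]], Awaitable[Any]]


async def run_traced_tool(
    name: str, func: ToolFunc, arguments: Dict[str, Any]
) -> Tuple[Any, Dict[str, Any]]:
    """Run a tool and collect span-style attributes (hypothetical helper)."""
    attrs: Dict[str, Any] = {
        "tool.name": name,
        # Assumed attribute name: input size as serialized JSON length.
        "tool.input_chars": len(json.dumps(arguments)),
    }
    start = time.perf_counter()
    result = None
    try:
        result = await func(arguments)
        attrs["tool.success"] = True
        # Assumed attribute name: output size as string length.
        attrs["tool.output_chars"] = len(str(result))
    except Exception as exc:
        # This sketch records the failure and returns None;
        # a real tracer would probably re-raise after recording.
        attrs["tool.success"] = False
        attrs["tool.error_message"] = str(exc)
    finally:
        attrs["tool.execution_time_ms"] = (time.perf_counter() - start) * 1000.0
    return result, attrs


async def echo_tool(arguments: Dict[str, Any]) -> str:
    """Stand-in tool used only for this example."""
    return f"echo: {arguments['text']}"


result, attrs = asyncio.run(run_traced_tool("echo", echo_tool, {"text": "hi"}))
print(result, attrs["tool.success"])
```

In a real integration the attrs dictionary would be attached to the enclosing span rather than returned to the caller.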

3. Agent Coordination Observability

  • Per-round spans - Track each agent's coordination round (initial_answer, voting, presentation)
  • Token usage logging - Structured logs for input/output/reasoning/cached tokens
  • Coordination events - Vote submissions, winner selection, round transitions

4. Graceful Degradation

  • Falls back to loguru logging when Logfire is disabled/unavailable
  • Negligible overhead when observability is off
  • No-op spans ensure code works identically with or without Logfire

Usage

CLI Flag (Recommended)

massgen --logfire --config your_config.yaml "Your question"

Environment Variable

export MASSGEN_LOGFIRE_ENABLED=true
massgen --config your_config.yaml "Your question"

Programmatic

from massgen.structured_logging import configure_observability

configure_observability(enabled=True, service_name="my-app")

Setup

  1. Create an account at https://logfire.pydantic.dev/
  2. Authenticate: logfire auth login
  3. Run with the --logfire flag

Type of Change

  • New feature (feat:) - Non-breaking change which adds functionality

Files Changed

New Files

| File | Lines | Description |
|---|---|---|
| massgen/structured_logging.py | +1,419 | Core observability module with TracerProxy, span creation, logging utilities |
| massgen/tests/test_structured_logging.py | +581 | Comprehensive test coverage for all observability functions |
| docs/source/user_guide/logging.rst | +321 | User documentation with SQL query examples |
| massgen/skills/massgen-log-analyzer/SKILL.md | +924 | Skill for AI-assisted log analysis via Logfire MCP |
| massgen/coordination_tracker.py | +85 | Coordination state tracking for observability |

Modified Files

| File | Changes | Description |
|---|---|---|
| massgen/cli.py | +36 | --logfire flag, _setup_logfire_observability() helper |
| massgen/orchestrator.py | +172 | Agent execution spans, round tracking, token usage logging |
| massgen/tool/_manager.py | +139 | Custom tool execution tracing |
| massgen/mcp_tools/client.py | +121 | MCP tool call spans with timing/metrics |
| massgen/subagent/manager.py | +101 | Subagent execution tracing |
| massgen/backend/base_with_custom_tool_and_mcp.py | +207 | Tool execution spans in unified streaming |
| massgen/backend/claude_code.py | +85 | LLM span tracking for Claude Code backend |
| massgen/backend/claude.py | +61 | Client instrumentation for Anthropic |
| massgen/backend/chat_completions.py | +32 | Client instrumentation for OpenAI |
| massgen/persona_generator.py | +67 | Persona generation tracing |
| massgen/logger_config.py | +25 | Logfire-loguru integration |
| docs/source/reference/cli.rst | +18 | CLI reference for --logfire flag |
| pyproject.toml | +1 | logfire dependency |

Architecture

TracerProxy Pattern

# Get tracer (works whether Logfire is enabled or not)
tracer = get_tracer()

# Create spans (no-op if Logfire disabled)
with tracer.span("operation", attributes={"key": "value"}) as span:
    result = do_work()
    span.set_attribute("result", result)
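
For readers who want to see how such a proxy can work, here is a minimal sketch of the no-op fallback idea. It is not the TracerProxy from massgen/structured_logging.py: TracerProxySketch and NoOpSpan are hypothetical names, and the backend is assumed to expose a context-manager span(name, **attributes) method.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator, Optional


class NoOpSpan:
    """Span stand-in that accepts attributes and discards them."""

    def set_attribute(self, key: str, value: Any) -> None:
        pass


class TracerProxySketch:
    """Hypothetical proxy: delegates to a backend when one is configured."""

    def __init__(self, backend: Optional[Any] = None) -> None:
        # backend is a configured tracing client, or None when tracing is off.
        self._backend = backend

    @contextmanager
    def span(
        self, name: str, attributes: Optional[Dict[str, Any]] = None
    ) -> Iterator[Any]:
        if self._backend is None:
            # Observability is off: hand back a span that does nothing.
            yield NoOpSpan()
            return
        # Assumption: the backend offers span(name, **attributes) as a
        # context manager that yields an object with set_attribute().
        with self._backend.span(name, **(attributes or {})) as real_span:
            yield real_span


tracer = TracerProxySketch()  # no backend configured
with tracer.span("operation", attributes={"key": "value"}) as span:
    span.set_attribute("result", 4)  # silently ignored
```

Calling code never has to check whether tracing is enabled, which is the property the pattern above relies on.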

Context Propagation

# Set round context (auto-propagated to nested tool calls)
set_current_round(round_number=2, round_type="voting")

# Tool calls automatically inherit context
await mcp_client.call_tool("search", {...})  # Gets round=2, type=voting
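
One common way to get this kind of automatic propagation in async Python is the standard library contextvars module. The sketch below assumes that approach; the PR text does not say how set_current_round is implemented, and call_tool here is a hypothetical stand-in, as are the round and round_type attribute names.

```python
import asyncio
import contextvars
from typing import Any, Dict, Optional

# Holds the ambient round context for the current task (None when unset).
_current_round: contextvars.ContextVar[Optional[Dict[str, Any]]] = (
    contextvars.ContextVar("current_round", default=None)
)


def set_current_round(round_number: int, round_type: str) -> None:
    """Store round info so nested calls can read it without extra arguments."""
    _current_round.set({"round": round_number, "round_type": round_type})


def get_current_round() -> Dict[str, Any]:
    """Return a copy of the ambient round context, or an empty dict."""
    return dict(_current_round.get() or {})


async def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical tool call: builds span attributes from the ambient context.
    return {"tool.name": name, **get_current_round()}


async def main() -> Dict[str, Any]:
    set_current_round(round_number=2, round_type="voting")
    return await call_tool("search", {"query": "example"})


print(asyncio.run(main()))
```

Because asyncio tasks copy the context when they are created, tool calls spawned as separate tasks after set_current_round would see the same values.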

Span Hierarchy

coordination.session
├── coordination.iteration.1
│   ├── agent.agent_a.round_1 (initial_answer)
│   │   ├── llm.openai.stream
│   │   └── mcp.filesystem.read_file
│   ├── agent.agent_b.round_1 (initial_answer)
│   └── ...
├── coordination.iteration.2 (voting)
└── presentation.agent_a
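
The hierarchy above falls out of nesting span context managers. The toy recorder below illustrates that relationship under stated assumptions: SpanRecorder and agent_span_name are hypothetical helpers for this example, not part of MassGen or Logfire, and the span names are copied from the tree above.

```python
from contextlib import contextmanager
from typing import Iterator, List, Tuple


class SpanRecorder:
    """Toy in-memory recorder that tracks nesting depth of spans."""

    def __init__(self) -> None:
        self._depth = 0
        self.records: List[Tuple[int, str]] = []

    @contextmanager
    def span(self, name: str) -> Iterator[None]:
        self.records.append((self._depth, name))
        self._depth += 1
        try:
            yield
        finally:
            self._depth -= 1

    def render(self) -> str:
        """Indent each span name by its nesting depth."""
        return "\n".join("  " * depth + name for depth, name in self.records)


def agent_span_name(agent_id: str, round_number: int) -> str:
    # Naming scheme inferred from the tree above.
    return f"agent.{agent_id}.round_{round_number}"


recorder = SpanRecorder()
with recorder.span("coordination.session"):
    with recorder.span("coordination.iteration.1"):
        with recorder.span(agent_span_name("agent_a", 1)):
            with recorder.span("llm.openai.stream"):
                pass
            with recorder.span("mcp.filesystem.read_file"):
                pass
        with recorder.span(agent_span_name("agent_b", 1)):
            pass
    with recorder.span("presentation.agent_a"):
        pass

print(recorder.render())
```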

Querying Data

Logfire uses SQL via DataFusion. The main table is records. Custom attributes are stored in the attributes JSON column:

-- Find slowest tool calls
SELECT
  attributes->>'tool.name' as tool_name,
  (attributes->>'tool.execution_time_ms')::float as time_ms,
  attributes->>'massgen.agent_id' as agent_id
FROM records
WHERE attributes->>'tool.type' = 'mcp'
ORDER BY (attributes->>'tool.execution_time_ms')::float DESC
LIMIT 10;

-- Token usage by agent
SELECT
  attributes->>'massgen.agent_id' as agent,
  SUM((attributes->>'massgen.usage.input')::int) as input_tokens,
  SUM((attributes->>'massgen.usage.output')::int) as output_tokens
FROM records
WHERE attributes->>'massgen.usage.input' IS NOT NULL
GROUP BY 1;
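
The same aggregation can be expressed over rows already fetched into Python, which may help when checking a query against a small export. This is a sketch: token_usage_by_agent is a hypothetical helper, each row is assumed to be a dict with an attributes mapping, and the sample rows are invented for illustration only.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def token_usage_by_agent(
    records: Iterable[Mapping[str, Any]]
) -> Dict[str, Dict[str, int]]:
    """Sum input/output tokens per agent, mirroring the SQL above."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0}
    )
    for record in records:
        attrs = record.get("attributes", {})
        if attrs.get("massgen.usage.input") is None:
            continue  # mirrors WHERE ... IS NOT NULL
        agent = attrs.get("massgen.agent_id")
        totals[agent]["input_tokens"] += int(attrs["massgen.usage.input"])
        # A missing output count contributes nothing, as SUM skips NULLs.
        totals[agent]["output_tokens"] += int(attrs.get("massgen.usage.output") or 0)
    return dict(totals)


# Invented sample rows, for illustration only.
sample_rows = [
    {"attributes": {"massgen.agent_id": "agent_a",
                    "massgen.usage.input": "100", "massgen.usage.output": "20"}},
    {"attributes": {"massgen.agent_id": "agent_a",
                    "massgen.usage.input": "50", "massgen.usage.output": "5"}},
    {"attributes": {"massgen.agent_id": "agent_b", "massgen.usage.input": "70"}},
    {"attributes": {"tool.name": "search"}},  # no usage data, skipped
]
print(token_usage_by_agent(sample_rows))
```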

Environment Variables

| Variable | Description |
|---|---|
| MASSGEN_LOGFIRE_ENABLED | Set to true to enable (alternative to --logfire) |
| LOGFIRE_TOKEN | API token (if not using logfire auth login) |
| LOGFIRE_SERVICE_NAME | Override service name (default: massgen) |
| LOGFIRE_ENVIRONMENT | Set environment tag (e.g., production) |
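
As a rough illustration of how these variables and the --logfire flag might combine, consider the sketch below. resolve_settings and ObservabilitySettings are hypothetical names, not functions from this PR, and treating only a case-insensitive "true" as enabled is an assumption based on the table's wording.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ObservabilitySettings:
    enabled: bool
    service_name: str
    environment: Optional[str]
    has_token: bool


def resolve_settings(
    cli_flag: bool = False, env: Optional[Mapping[str, str]] = None
) -> ObservabilitySettings:
    """Combine the CLI flag with environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: only "true" (any casing) counts as enabled.
    env_enabled = env.get("MASSGEN_LOGFIRE_ENABLED", "").strip().lower() == "true"
    return ObservabilitySettings(
        enabled=cli_flag or env_enabled,
        service_name=env.get("LOGFIRE_SERVICE_NAME", "massgen"),
        environment=env.get("LOGFIRE_ENVIRONMENT"),
        has_token=bool(env.get("LOGFIRE_TOKEN")),
    )


settings = resolve_settings(
    cli_flag=False,
    env={"MASSGEN_LOGFIRE_ENABLED": "true", "LOGFIRE_ENVIRONMENT": "production"},
)
print(settings)
```

Passing env explicitly keeps the helper easy to test without mutating the real process environment.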

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my feature works
  • I have run pre-commit on my changed files

Pre-commit Status

# TODO: Run before PR submission
uv run pre-commit run --all-files

How to Test

Basic Test

# Authenticate with Logfire first
logfire auth login

# Run with observability enabled
uv run massgen --logfire --backend openai --model gpt-4o-mini "What is 2+2?"

# Check Logfire dashboard for traces

Unit Tests

uv run pytest massgen/tests/test_structured_logging.py -v

Verify Graceful Degradation

# Without Logfire - should work identically
uv run massgen --backend openai --model gpt-4o-mini "What is 2+2?"

Related Documentation

  • User Guide: docs/source/user_guide/logging.rst (Logfire Observability section)
  • CLI Reference: docs/source/reference/cli.rst
  • Logfire Docs: https://logfire.pydantic.dev/docs/

@ncrispino ncrispino marked this pull request as ready for review December 29, 2025 15:13
@Henry-811 Henry-811 changed the base branch from main to dev/v0.1.31 December 29, 2025 15:18
@Henry-811 Henry-811 merged commit 18ec685 into dev/v0.1.31 Dec 29, 2025
22 checks passed