Metrics

Eshan Roy edited this page Jun 21, 2026 · 1 revision

Metrics & Observability

M31 Autonomous (M31A) includes a centralized session observability pipeline that tracks tool execution, LLM interactions, and workflow phase metrics. Metrics are collected thread-safely and persisted as JSON for session review and dashboards.

Source: pkg/metrics/

Architecture

flowchart LR
    subgraph Collection
        TC[Tool Calls] --> C[Collector]
        LLM[LLM Interactions] --> C
        PT[Phase Transitions] --> C
    end

    subgraph Persistence
        C --> Snap[Snapshot]
        Snap --> Flush[Flush to JSON]
    end

    subgraph Storage
        Flush --> File["METRICS.json<br/>session directory"]
    end

    style C fill:#e1f5fe
    style File fill:#e8f5e9

Collector

Source: pkg/metrics/collector.go

type Collector struct {
    mu          sync.Mutex
    sessionID   string
    sessionsDir string
    metrics     *SessionMetrics
    enabled     bool
}

The Collector is created once per session. When enabled is false, every recording method returns immediately without recording anything, so disabled metrics add negligible overhead.
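The disabled path can be pictured with a minimal sketch. It assumes a reduced Collector whose state is a plain map; NewCollector, Calls, and the toolCalls field are hypothetical names for illustration and are not taken from pkg/metrics.

```go
package main

import (
	"fmt"
	"sync"
)

// Collector is a trimmed-down stand-in for the type in pkg/metrics/collector.go.
type Collector struct {
	mu        sync.Mutex
	enabled   bool
	toolCalls map[string]int64 // hypothetical simplification of SessionMetrics
}

// NewCollector is a hypothetical constructor; the real one may differ.
func NewCollector(enabled bool) *Collector {
	return &Collector{enabled: enabled, toolCalls: make(map[string]int64)}
}

// RecordToolCall returns before taking the lock when metrics are disabled.
// success and durationMs are ignored in this reduced sketch.
func (c *Collector) RecordToolCall(name string, success bool, durationMs int64) {
	if !c.enabled {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.toolCalls[name]++
}

// Calls reports how many times a tool was recorded.
func (c *Collector) Calls(name string) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.toolCalls[name]
}

func main() {
	on, off := NewCollector(true), NewCollector(false)
	on.RecordToolCall("Bash", true, 120)
	off.RecordToolCall("Bash", true, 120)
	fmt.Println(on.Calls("Bash"), off.Calls("Bash")) // prints "1 0"
}
```

Because enabled is set once at construction in this sketch, reading it without the lock is safe; the cost of a disabled call is a single branch.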

Recording Methods

| Method | Description |
| --- | --- |
| RecordToolCall(name, success, durationMs) | Track tool execution: count, success/fail, duration |
| RecordLLMInteraction(phase, usage, cost) | Track LLM calls: tokens (prompt/completion), cost |
| RecordPhaseTransition(phase) | Track phase entry count |
| RecordPhaseDuration(phase, durationMs, success) | Track phase completion time and outcome |
| RecordHealTrigger(phase) | Track self-heal invocation count |
| RecordBisectTrigger(phase) | Track git bisect invocation count |

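All recording methods are thread-safe: shared state is guarded by the Collector's sync.Mutex (the mu field), so they can be called from concurrent goroutines.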
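As an illustration of why the mutex matters, the following sketch has many goroutines record the same phase transition at once. transitionCounter and recordConcurrently are hypothetical names; only the lock-then-update pattern is meant to mirror the Collector.

```go
package main

import (
	"fmt"
	"sync"
)

// transitionCounter is a hypothetical, reduced collector that only counts phase entries.
type transitionCounter struct {
	mu     sync.Mutex
	counts map[string]int64
}

// RecordPhaseTransition increments the entry count for a phase under the lock.
func (c *transitionCounter) RecordPhaseTransition(phase string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[phase]++
}

// Count returns the number of recorded entries for a phase.
func (c *transitionCounter) Count(phase string) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[phase]
}

// recordConcurrently starts n goroutines that each record one transition,
// then waits for all of them to finish.
func recordConcurrently(c *transitionCounter, phase string, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.RecordPhaseTransition(phase)
		}()
	}
	wg.Wait()
}

func main() {
	c := &transitionCounter{counts: make(map[string]int64)}
	recordConcurrently(c, "execute", 100)
	fmt.Println(c.Count("execute")) // prints "100"
}
```

Without the lock, concurrent writes to the map would be a data race, which the Go race detector (go test -race) reports.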

Snapshot & Persistence

| Method | Description |
| --- | --- |
| Snapshot() | Deep copy of current metrics (safe for concurrent reads) |
| Flush() | Persist to {sessionsDir}/{sessionID}/METRICS.json |
| Stop() | Flush + log summary |
| Load(sessionsDir, sessionID) | Read metrics from disk |
| Save(sessionsDir, sessionID, m) | Write metrics to disk |
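The "deep copy" guarantee of Snapshot() can be sketched as follows. This is an assumption about how such a copy might be made, using reduced versions of the metric types; the snapshot function is hypothetical, and the real method presumably also holds the Collector's mutex while copying.

```go
package main

import "fmt"

// Reduced versions of the types in pkg/metrics/types.go.
type ToolMetric struct {
	Name      string
	CallCount int64
}

type SessionMetrics struct {
	SessionID string
	Tools     []ToolMetric
}

// snapshot returns a copy whose Tools slice has its own backing array,
// so later writes to m are not visible through the copy.
func snapshot(m *SessionMetrics) SessionMetrics {
	out := *m // copies scalar fields; slices still alias m at this point
	out.Tools = append([]ToolMetric(nil), m.Tools...)
	return out
}

func main() {
	live := &SessionMetrics{
		SessionID: "abc123",
		Tools:     []ToolMetric{{Name: "Bash", CallCount: 1}},
	}
	snap := snapshot(live)
	live.Tools[0].CallCount = 2 // mutate after the snapshot was taken
	fmt.Println(snap.Tools[0].CallCount, live.Tools[0].CallCount) // prints "1 2"
}
```

A plain struct assignment would copy only the slice header, so a reader such as the TUI metrics screen could observe later updates from the workflow engine through the shared backing array.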

Metric Types

Source: pkg/metrics/types.go

ToolMetric

Tracks per-tool execution statistics:

type ToolMetric struct {
    Name         string  `json:"name"`
    CallCount    int64   `json:"call_count"`
    SuccessCount int64   `json:"success_count"`
    FailCount    int64   `json:"fail_count"`
    TotalDurMs   int64   `json:"total_duration_ms"`
    AvgDurMs     float64 `json:"avg_duration_ms"`
}
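One plausible way these fields relate is that AvgDurMs is derived from TotalDurMs and CallCount on every update. The record helper below is a hypothetical sketch of that bookkeeping, not the code in collector.go.

```go
package main

import "fmt"

// ToolMetric mirrors the struct above, without JSON tags for brevity.
type ToolMetric struct {
	Name         string
	CallCount    int64
	SuccessCount int64
	FailCount    int64
	TotalDurMs   int64
	AvgDurMs     float64
}

// record folds one tool invocation into the running totals and
// recomputes the average duration from them.
func record(m *ToolMetric, success bool, durationMs int64) {
	m.CallCount++
	if success {
		m.SuccessCount++
	} else {
		m.FailCount++
	}
	m.TotalDurMs += durationMs
	m.AvgDurMs = float64(m.TotalDurMs) / float64(m.CallCount)
}

func main() {
	m := &ToolMetric{Name: "Bash"}
	record(m, true, 100)
	record(m, true, 200)
	record(m, false, 300)
	// prints "3 2 1 600 200"
	fmt.Println(m.CallCount, m.SuccessCount, m.FailCount, m.TotalDurMs, m.AvgDurMs)
}
```

Storing the integer total and deriving the average avoids the drift an incrementally updated floating-point mean can accumulate.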

LLMMetric

Tracks per-phase LLM token usage and cost:

type LLMMetric struct {
    Phase            WorkflowPhase `json:"phase"`
    PromptTokens     int64         `json:"prompt_tokens"`
    CompletionTokens int64         `json:"completion_tokens"`
    TotalTokens      int64         `json:"total_tokens"`
    Cost             float64       `json:"cost"`
    InteractionCount int64         `json:"interaction_count"`
}
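Per-phase aggregation might look like the sketch below. It assumes WorkflowPhase is a string-backed type (consistent with the JSON example, where phase is "execute"); the Usage struct, the llmTotals map, and its record method are hypothetical.

```go
package main

import "fmt"

// WorkflowPhase is assumed to be string-backed.
type WorkflowPhase string

// Usage is a hypothetical shape for the usage argument of RecordLLMInteraction.
type Usage struct {
	PromptTokens     int64
	CompletionTokens int64
}

// LLMMetric mirrors the struct above, without JSON tags for brevity.
type LLMMetric struct {
	Phase            WorkflowPhase
	PromptTokens     int64
	CompletionTokens int64
	TotalTokens      int64
	Cost             float64
	InteractionCount int64
}

// llmTotals keeps one running LLMMetric per phase.
type llmTotals map[WorkflowPhase]*LLMMetric

// record adds one interaction's tokens and cost to the phase's totals.
func (t llmTotals) record(phase WorkflowPhase, u Usage, cost float64) {
	m, ok := t[phase]
	if !ok {
		m = &LLMMetric{Phase: phase}
		t[phase] = m
	}
	m.PromptTokens += u.PromptTokens
	m.CompletionTokens += u.CompletionTokens
	m.TotalTokens = m.PromptTokens + m.CompletionTokens
	m.Cost += cost
	m.InteractionCount++
}

func main() {
	totals := llmTotals{}
	totals.record("execute", Usage{PromptTokens: 30000, CompletionTokens: 8000}, 0.25)
	totals.record("execute", Usage{PromptTokens: 15000, CompletionTokens: 4000}, 0.125)
	m := totals["execute"]
	// prints "45000 12000 57000 2"
	fmt.Println(m.PromptTokens, m.CompletionTokens, m.TotalTokens, m.InteractionCount)
}
```

The cost values here are illustrative only; real per-call costs depend on the provider's pricing.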

PhaseMetric

Tracks workflow phase execution:

type PhaseMetric struct {
    Phase              WorkflowPhase `json:"phase"`
    DurationMs         int64         `json:"duration_ms"`
    TransitionCount    int64         `json:"transition_count"`
    HealTriggerCount   int64         `json:"heal_trigger_count"`
    BisectTriggerCount int64         `json:"bisect_trigger_count"`
    Success            bool          `json:"success"`
}

SessionMetrics

Top-level container persisted per session:

type SessionMetrics struct {
    SessionID string        `json:"session_id"`
    StartedAt time.Time     `json:"started_at"`
    UpdatedAt time.Time     `json:"updated_at"`
    Tools     []ToolMetric  `json:"tools"`
    LLMs      []LLMMetric   `json:"llms"`
    Phases    []PhaseMetric `json:"phases"`
}

Persistence Format

Metrics are stored as pretty-printed JSON in the session directory:

~/.m31a/sessions/<session-id>/METRICS.json

File permissions: 0600 (owner read/write only). Directory permissions: 0700.
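A minimal sketch of writing the file with these permissions, using only the standard library. The save and writeToTemp functions are hypothetical and use a temporary directory instead of ~/.m31a/sessions; the real Save may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// SessionMetrics is reduced to a single field for this sketch.
type SessionMetrics struct {
	SessionID string `json:"session_id"`
}

// save writes m as indented JSON to <sessionsDir>/<sessionID>/METRICS.json,
// creating the session directory with mode 0700 and the file with mode 0600.
func save(sessionsDir, sessionID string, m *SessionMetrics) (string, error) {
	dir := filepath.Join(sessionsDir, sessionID)
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", err
	}
	data, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		return "", err
	}
	path := filepath.Join(dir, "METRICS.json")
	return path, os.WriteFile(path, data, 0o600)
}

// writeToTemp saves into a throwaway directory and reports the file name and mode.
func writeToTemp() (string, os.FileMode, error) {
	root, err := os.MkdirTemp("", "m31a-sessions-")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(root)
	path, err := save(root, "abc123", &SessionMetrics{SessionID: "abc123"})
	if err != nil {
		return "", 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return "", 0, err
	}
	return filepath.Base(path), info.Mode().Perm(), nil
}

func main() {
	name, perm, err := writeToTemp()
	if err != nil {
		panic(err)
	}
	fmt.Println(name, perm)
}
```

Note that os.WriteFile applies the mode only when it creates the file; an existing METRICS.json keeps whatever permissions it already has.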

Example Output

{
  "session_id": "abc123",
  "started_at": "2026-06-20T10:00:00Z",
  "updated_at": "2026-06-20T10:15:30Z",
  "tools": [
    {
      "name": "Bash",
      "call_count": 12,
      "success_count": 11,
      "fail_count": 1,
      "total_duration_ms": 45200,
      "avg_duration_ms": 3766.67
    },
    {
      "name": "FileRead",
      "call_count": 8,
      "success_count": 8,
      "fail_count": 0,
      "total_duration_ms": 120,
      "avg_duration_ms": 15.0
    }
  ],
  "llms": [
    {
      "phase": "execute",
      "prompt_tokens": 45000,
      "completion_tokens": 12000,
      "total_tokens": 57000,
      "cost": 0.038,
      "interaction_count": 15
    }
  ],
  "phases": [
    {
      "phase": "execute",
      "duration_ms": 180000,
      "transition_count": 1,
      "heal_trigger_count": 2,
      "bisect_trigger_count": 0,
      "success": true
    }
  ]
}

Integration Points

flowchart TD
    subgraph Workflow Engine
        Execute[Execute Phase] --> TC[RecordToolCall]
        Stream[Stream LLM] --> LI[RecordLLMInteraction]
        Transition[Phase Change] --> PT[RecordPhaseTransition]
        Heal[Self-Heal] --> HT[RecordHealTrigger]
    end

    subgraph TUI
        Metrics[Metrics Screen] --> Snap[Collector.Snapshot]
        Session[Session End] --> Stop[Collector.Stop]
    end

    TC --> Collector
    LI --> Collector
    PT --> Collector
    HT --> Collector

    style Collector fill:#e1f5fe

Configuration

Metrics collection is enabled by default. Disable via config:

[features]
metrics_enabled = false    # Disable all metrics collection

Use Cases

  • Session review: Inspect METRICS.json to understand what happened during a session
  • Cost tracking: Monitor LLM token usage and cost per phase
  • Tool efficiency: Identify slow or frequently failing tools
  • Phase analysis: Track which phases take the most time and trigger self-healing
  • Dashboard integration: JSON format enables easy ingestion into external monitoring tools
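The tool-efficiency use case can be sketched with a short program that decodes a METRICS.json payload and ranks tools. The struct tags follow the format shown in Example Output; slowestFirst, failureRate, and the embedded sample are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Only the fields needed for the ranking are declared; encoding/json
// ignores the other keys in the payload.
type ToolMetric struct {
	Name      string  `json:"name"`
	CallCount int64   `json:"call_count"`
	FailCount int64   `json:"fail_count"`
	AvgDurMs  float64 `json:"avg_duration_ms"`
}

type SessionMetrics struct {
	Tools []ToolMetric `json:"tools"`
}

// slowestFirst decodes the payload and sorts tools by average duration, descending.
func slowestFirst(raw []byte) ([]ToolMetric, error) {
	var m SessionMetrics
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	sort.Slice(m.Tools, func(i, j int) bool {
		return m.Tools[i].AvgDurMs > m.Tools[j].AvgDurMs
	})
	return m.Tools, nil
}

// failureRate returns failed calls as a fraction of all calls (0 if never called).
func failureRate(t ToolMetric) float64 {
	if t.CallCount == 0 {
		return 0
	}
	return float64(t.FailCount) / float64(t.CallCount)
}

// sample is a cut-down payload using the values from Example Output.
const sample = `{"tools":[
  {"name":"FileRead","call_count":8,"fail_count":0,"avg_duration_ms":15.0},
  {"name":"Bash","call_count":12,"fail_count":1,"avg_duration_ms":3766.67}
]}`

func main() {
	tools, err := slowestFirst([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, t := range tools {
		fmt.Printf("%-8s avg=%.2fms fail=%.1f%%\n", t.Name, t.AvgDurMs, failureRate(t)*100)
	}
}
```

In practice the payload would come from reading the session's METRICS.json with os.ReadFile rather than from an embedded constant.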
