feat(loop): expose turn_count, final_content, usage, request_metadata in result#136

Merged
chubes4 merged 1 commit into main from expose-loop-result-fields
May 10, 2026

@chubes4 chubes4 commented May 10, 2026

Summary

Closes #135. Expose four universally-needed observability fields on the conversation loop result, eliminating the need for consumers to pass mutable state by reference through their turn runner closure.

| Field | Type | Source |
| --- | --- | --- |
| turn_count | int | Existing internal loop counter, surfaced |
| final_content | string | Derived from last assistant message in $messages |
| usage | array | Summed from each turn runner's optional usage return field |
| request_metadata | array | Overwritten by each turn runner's optional request_metadata field |
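
As a rough illustration of how a consumer might read these fields, here is a minimal sketch. `describe_loop_result()` is a hypothetical helper, the `$result` array is a hand-built stand-in for what the loop returns, and the `total_tokens` and `model` keys are assumptions about what a provider might report, not part of the documented contract.

```php
<?php
// Hypothetical helper: summarize a loop result for logging.
// Each field is read with a default because all four are optional.
function describe_loop_result( array $result ): string {
    $turns   = $result['turn_count'] ?? 0;
    $content = $result['final_content'] ?? '';
    $usage   = $result['usage'] ?? array();
    $meta    = $result['request_metadata'] ?? array();

    return sprintf(
        '%d turn(s), %d total tokens, model=%s, final=%s',
        $turns,
        (int) ( $usage['total_tokens'] ?? 0 ),   // assumed key name
        (string) ( $meta['model'] ?? 'unknown' ), // assumed key name
        '' === $content ? '(none)' : $content
    );
}

// Hand-built stand-in for a loop result; not produced by the real loop.
$result = array(
    'messages'         => array(),
    'turn_count'       => 2,
    'final_content'    => 'Done.',
    'usage'            => array( 'total_tokens' => 150 ),
    'request_metadata' => array( 'model' => 'example-model' ),
);

echo describe_loop_result( $result ), PHP_EOL;
```

With fields on the result, the caller no longer needs any state shared with the turn runner to produce a summary like this.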

Why

Consumers building agents on top of WP_Agent_Conversation_Loop need more than messages + tool_execution_results. They need to know how many turns ran, what the agent finally said, what it cost (tokens), and what the last provider request looked like. The substrate already had access to all of this — it just didn't formalize it as part of the result contract.

Without these fields, consumers resort to passing mutable state by reference through their turn runner closure. The data-machine consumer (inc/Engine/AI/conversation-loop.php:datamachine_run_conversation) does exactly this with six & parameters threaded through a separate build_turn_runner factory just so the outer function can read accumulators after the loop returns. See #135 for full evidence.

What changes

Loop (class-wp-agent-conversation-loop.php)

  • Accumulates $turns_run (existing counter), $total_usage (summed across turns), $request_metadata (latest turn).
  • Reads $result['usage'] and $result['request_metadata'] from each turn runner return value (both optional — turn runners that don't report them get empty defaults).
  • Adds all four fields to the final result.
  • Two new private helpers:
    • accumulate_usage( $running, $turn ) — sums canonical token-count fields and preserves provider-specific keys like cache_creation_input_tokens or reasoning_tokens. Numeric fields are summed; non-numeric fields are taken from the latest turn.
    • extract_final_content( $messages ) — walks the transcript backward for the last non-empty assistant text content. Returns '' when none exists (e.g. tool-call-only tail).
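
The summing rule described for accumulate_usage can be sketched roughly as follows. This is not the code from the PR: `sketch_accumulate_usage()` is a hypothetical stand-in, it treats only ints and floats as numeric (the real helper may also accept numeric strings), and `service_tier` is an invented non-numeric key used purely for illustration.

```php
<?php
// Hypothetical stand-in for the private accumulate_usage() helper.
// Numeric values are summed; anything else is taken from the latest turn.
function sketch_accumulate_usage( array $running, array $turn ): array {
    foreach ( $turn as $key => $value ) {
        $previous       = $running[ $key ] ?? 0;
        $value_is_num   = is_int( $value ) || is_float( $value );
        $previous_is_num = is_int( $previous ) || is_float( $previous );

        if ( $value_is_num && $previous_is_num ) {
            // Covers canonical counts and provider-specific numeric extras alike.
            $running[ $key ] = $previous + $value;
        } else {
            // Non-numeric field: the latest turn wins.
            $running[ $key ] = $value;
        }
    }

    return $running;
}

$total = array();
$total = sketch_accumulate_usage(
    $total,
    array( 'prompt_tokens' => 100, 'completion_tokens' => 20, 'service_tier' => 'standard' )
);
$total = sketch_accumulate_usage(
    $total,
    array( 'prompt_tokens' => 140, 'completion_tokens' => 35, 'reasoning_tokens' => 12, 'service_tier' => 'priority' )
);

print_r( $total );
```

Because unknown keys are carried through rather than dropped, a provider-specific counter such as reasoning_tokens survives even when only some turns report it.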

Result (class-wp-agent-conversation-result.php)

Validates the four new fields when present:

  • turn_count must be int
  • final_content must be string
  • usage must be array
  • request_metadata must be array

All four are optional in the schema — normalize() won't reject results that don't include them, so callers that construct results without the loop (test fixtures, manual normalization) keep working.
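
A "validate only when present" check along these lines could look like the sketch below. `sketch_validate_optional_fields()` is hypothetical, and throwing `InvalidArgumentException` is an assumption; the PR does not say how normalize() reports a type mismatch.

```php
<?php
// Hypothetical sketch of optional-field validation. How the real
// normalize() signals a failure is not stated; an exception is assumed.
function sketch_validate_optional_fields( array $result ): array {
    $expected = array(
        'turn_count'       => 'is_int',
        'final_content'    => 'is_string',
        'usage'            => 'is_array',
        'request_metadata' => 'is_array',
    );

    foreach ( $expected as $field => $check ) {
        // Absent fields are skipped, so hand-built results stay valid.
        if ( array_key_exists( $field, $result ) && ! $check( $result[ $field ] ) ) {
            throw new InvalidArgumentException( "invalid_{$field}" );
        }
    }

    return $result;
}

// A result built without the loop (e.g. a test fixture) passes untouched.
$fixture = sketch_validate_optional_fields( array( 'messages' => array() ) );

// A present-but-mistyped field is rejected.
try {
    sketch_validate_optional_fields( array( 'turn_count' => '3' ) );
    $rejected = false;
} catch ( InvalidArgumentException $e ) {
    $rejected = true;
}
```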

Compatibility

100% additive. Every existing consumer keeps working:

  • Consumers reading messages, tool_execution_results, status, budget see no change.
  • Turn runners that don't return usage or request_metadata get empty defaults in the result.
  • Turn runners that already return request_metadata (the existing conversation-loop-transcript-persister-smoke test does this) continue to round-trip correctly — the loop reads it from the per-turn return shape and surfaces it on the final result, exactly as before.
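
To make the empty-default behavior concrete, here is a deliberately simplified loop. It is not WP_Agent_Conversation_Loop: `sketch_run_loop()`, the turn runner signature, and the fixed turn limit are all assumptions, usage values are assumed to be numeric, and final_content is left out for brevity. Keeping the previous request_metadata when a turn reports none is also an assumption; the PR only says each reported value overwrites the last.

```php
<?php
// Simplified stand-in for the loop; the real stop conditions are not modeled.
function sketch_run_loop( array $messages, callable $turn_runner, int $max_turns ): array {
    $turns_run        = 0;
    $total_usage      = array();
    $request_metadata = array();

    while ( $turns_run < $max_turns ) {
        $turn = $turn_runner( $messages );
        ++$turns_run;
        $messages = $turn['messages'] ?? $messages;

        // Optional field: runners that omit it contribute nothing.
        foreach ( ( $turn['usage'] ?? array() ) as $key => $value ) {
            $total_usage[ $key ] = ( $total_usage[ $key ] ?? 0 ) + $value;
        }

        // Optional field: the latest reported value overwrites earlier ones.
        if ( isset( $turn['request_metadata'] ) ) {
            $request_metadata = $turn['request_metadata'];
        }
    }

    return array(
        'messages'         => $messages,
        'turn_count'       => $turns_run,
        'usage'            => $total_usage,
        'request_metadata' => $request_metadata,
    );
}

// A runner written before this change: returns messages only.
$legacy_runner = function ( array $messages ): array {
    $messages[] = array( 'role' => 'assistant', 'content' => 'ok' );
    return array( 'messages' => $messages );
};

// A runner that reports both optional fields.
$reporting_runner = function ( array $messages ): array {
    $messages[] = array( 'role' => 'assistant', 'content' => 'ok' );
    return array(
        'messages'         => $messages,
        'usage'            => array( 'total_tokens' => 10 ),
        'request_metadata' => array( 'turn' => count( $messages ) ),
    );
};

$legacy    = sketch_run_loop( array(), $legacy_runner, 2 );
$reporting = sketch_run_loop( array(), $reporting_runner, 2 );
```

Here `$legacy` ends up with empty `usage` and `request_metadata` arrays, while `$reporting` carries the summed usage and the second turn's metadata.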

Test plan

All 30 substrate smoke tests pass on this branch:

PASS conversation-loop-smoke                              (6 assertions)
PASS conversation-loop-tool-execution-smoke               (19)
PASS conversation-loop-completion-policy-smoke            (6)
PASS conversation-loop-events-smoke                       (17)
PASS conversation-loop-budgets-smoke                      (24)
PASS conversation-loop-transcript-persister-smoke         (18 — including request_metadata round-trip)
PASS conversation-runner-contracts-smoke                  (18)
PASS iteration-budget-smoke                               (22)
PASS conversation-compaction-smoke
PASS conversation-transcript-lock-smoke
PASS effective-agent-resolver-smoke
PASS execution-principal-smoke
PASS guidelines-substrate-smoke
PASS identity-smoke
PASS markdown-section-compaction-smoke
PASS memory-metadata-contract-smoke
PASS message-envelope-smoke
PASS no-product-imports-smoke
PASS pending-action-store-contract-smoke
PASS registry-smoke
PASS remote-bridge-smoke
PASS routine-smoke                                        (22)
PASS subagents-smoke                                      (6)
PASS tool-policy-contracts-smoke
PASS tool-runtime-smoke
PASS webhook-safety-smoke
PASS workflow-bindings-smoke                              (13)
PASS workflow-runner-smoke                                (26)
PASS workflow-spec-validator-smoke                        (24)
PASS workspace-scope-smoke

Follow-up

After this merges, the data-machine consumer will refactor to drop its by-reference accumulator dance and read directly from the new result fields. Separate PR forthcoming on Extra-Chill/data-machine.

@chubes4 chubes4 force-pushed the expose-loop-result-fields branch from f438b24 to 31a8d9a May 10, 2026 20:18
… in result

The conversation loop's result contract previously surfaced only
`messages`, `tool_execution_results`, `events`, and the optional
`status`/`budget` pair. Consumers building real agents on top of the
substrate need four additional pieces of information:

* `turn_count` — how many turns actually executed
* `final_content` — text of the last assistant message
* `usage` — accumulated token counts across all turns
* `request_metadata` — most recent turn's provider request descriptor

The substrate already had access to all four — `turn_count` lives in
the loop counter, `final_content` is computable from `messages`, and
turn runners were already returning `usage` and `request_metadata` per
turn (the latter was even tested for passthrough). They just weren't
formalized as part of the result contract.

Without these fields, consumers like data-machine resort to passing
mutable state by reference through their turn runner closure so the
outer caller can read accumulators after the loop returns:

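    $turn_runner = build_turn_runner(
        // ... (other arguments elided)
        $last_request_metadata, // each of these is declared by reference (&)
        $total_usage,           // in build_turn_runner's signature; PHP does
        $turns_run,             // not allow & at the call site
        $final_content
    );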
    WP_Agent_Conversation_Loop::run( $messages, $turn_runner, $options );
    // Now read the by-reference accumulators back into the result.

This is awkward, and it signals that the substrate lacks observability
that every consumer needs. Future consumers will hit the same gap.

## What changes

* The loop accumulates `turn_count` (existing internal counter, just
  surfaced now), `final_content` (extracted from messages at result
  time), `usage` (summed from each turn runner's optional `usage`
  field), and `request_metadata` (overwritten by each turn runner's
  optional `request_metadata` field).
* `WP_Agent_Conversation_Result::normalize()` validates the four new
  fields when present (int, string, array, array respectively).
* Two new private helpers: `accumulate_usage()` sums numeric token
  counts while preserving provider-specific extras like
  `cache_creation_input_tokens` or `reasoning_tokens`;
  `extract_final_content()` walks messages backward for the last
  non-empty assistant text content.
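
The backward walk performed by `extract_final_content()` might look roughly like this sketch. `sketch_extract_final_content()` is a hypothetical stand-in, and the message shape (`role` and a string `content` key) is an assumption; the real message envelope may differ.

```php
<?php
// Hypothetical stand-in for extract_final_content(). Assumes each message
// is an array with 'role' and a string 'content'.
function sketch_extract_final_content( array $messages ): string {
    // Walk from the newest message to the oldest.
    foreach ( array_reverse( $messages ) as $message ) {
        if ( ! is_array( $message ) || 'assistant' !== ( $message['role'] ?? '' ) ) {
            continue;
        }

        $content = $message['content'] ?? '';
        if ( is_string( $content ) && '' !== trim( $content ) ) {
            return $content;
        }
    }

    // No assistant text at all, e.g. a transcript with only tool calls.
    return '';
}

$transcript = array(
    array( 'role' => 'user', 'content' => 'What is the status?' ),
    array( 'role' => 'assistant', 'content' => 'Let me check.' ),
    array( 'role' => 'assistant', 'content' => '' ), // tool-call-only turn
);

$final = sketch_extract_final_content( $transcript );
```

In this transcript the empty trailing assistant message is skipped, so `$final` holds the earlier text rather than an empty string.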

## Compatibility

All four fields are **additive** to the result shape — existing
consumers reading `messages`/`tool_execution_results`/`status`/`budget`
continue to work unchanged.

The turn-runner contract gains two optional return fields (`usage`,
`request_metadata`); turn runners that don't return them get the empty
default. Turn runners that already return `request_metadata` (the
existing `conversation-loop-transcript-persister-smoke` test does this)
continue to round-trip correctly.

## Verification

All 30 substrate smoke tests pass:

* conversation-loop-smoke (6 assertions)
* conversation-loop-tool-execution-smoke (19)
* conversation-loop-completion-policy-smoke (6)
* conversation-loop-events-smoke (17)
* conversation-loop-budgets-smoke (24)
* conversation-loop-transcript-persister-smoke (18 — including the
  pre-existing request_metadata round-trip test)
* conversation-runner-contracts-smoke (18)
* iteration-budget-smoke (22)
* plus the broader substrate suites (compaction, identity, guidelines,
  workflows, routines, subagents, etc.)

Closes #135
@chubes4 chubes4 force-pushed the expose-loop-result-fields branch from 31a8d9a to 93a08da May 10, 2026 20:25
@chubes4 chubes4 merged commit c365119 into main May 10, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Loop result is missing universally-needed fields (turn_count, final_content, usage, last_request_metadata)