feat(loop): expose turn_count, final_content, usage, request_metadata in result #136
Merged
The conversation loop's result contract previously surfaced only
`messages`, `tool_execution_results`, `events`, and the optional
`status`/`budget` pair. Consumers building real agents on top of the
substrate need four additional pieces of information:
* `turn_count` — how many turns actually executed
* `final_content` — text of the last assistant message
* `usage` — accumulated token counts across all turns
* `request_metadata` — most recent turn's provider request descriptor
The substrate already had access to all four — `turn_count` lives in
the loop counter, `final_content` is computable from `messages`, and
turn runners were already returning `usage` and `request_metadata` per
turn (the latter was even tested for passthrough). They just weren't
formalized as part of the result contract.
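As a rough illustration of how `final_content` can be computed from `messages`, the sketch below walks a transcript backward for the last non-empty assistant text. The function name `sketch_extract_final_content` and the `role`/`content` message keys are assumptions made for this example, not the substrate's confirmed message shape.

```php
<?php
/**
 * Hypothetical sketch, not the substrate's implementation.
 * Assumes each message is an array with 'role' and 'content' keys.
 */
function sketch_extract_final_content( array $messages ): string {
	// Walk backward so the most recent assistant message wins.
	foreach ( array_reverse( $messages ) as $message ) {
		if ( ! is_array( $message ) || 'assistant' !== ( $message['role'] ?? '' ) ) {
			continue;
		}
		$content = $message['content'] ?? '';
		// Skip assistant messages that carry no text (e.g. tool-call-only turns).
		if ( is_string( $content ) && '' !== trim( $content ) ) {
			return $content;
		}
	}
	// No assistant text anywhere in the transcript.
	return '';
}

$transcript = array(
	array( 'role' => 'user', 'content' => 'Hi' ),
	array( 'role' => 'assistant', 'content' => 'Hello' ),
	array( 'role' => 'assistant', 'content' => '' ),
);
$final = sketch_extract_final_content( $transcript );
```

Here the empty trailing assistant message is skipped, so `$final` holds the earlier assistant text.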
Without these fields, consumers like data-machine resort to passing
mutable state by reference through their turn runner closure so the
outer caller can read accumulators after the loop returns:
    $turn_runner = build_turn_runner(
        ...
        &$last_request_metadata,
        &$total_usage,
        &$turns_run,
        &$final_content,
    );
    WP_Agent_Conversation_Loop::run( $messages, $turn_runner, $options );
    // Now read the by-reference accumulators back into the result.
This is gross, and a sign that the substrate is missing observability
every consumer needs. Future consumers will hit the same gap.
## What changes
* The loop accumulates `turn_count` (existing internal counter, just
  surfaced now), `final_content` (extracted from messages at result
  time), `usage` (summed from each turn runner's optional
  `usage` field), and `request_metadata` (overwritten by each turn
  runner's optional `request_metadata` field).
* `WP_Agent_Conversation_Result::normalize()` validates the four new
  fields when present (int, string, array, array respectively).
* Two new private helpers: `accumulate_usage()` sums numeric token
  counts while preserving provider-specific extras like
  `cache_creation_input_tokens` or `reasoning_tokens`;
  `extract_final_content()` walks messages backward for the last
  non-empty assistant text content.
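The usage accumulation described above might look roughly like the sketch below. It is an assumption-laden illustration: the name `sketch_accumulate_usage` and the `prompt_tokens`, `completion_tokens`, and `model` keys are hypothetical, and the sketch treats every numeric key the same way rather than distinguishing canonical token fields from provider extras.

```php
<?php
/**
 * Hypothetical sketch, not the substrate's private helper.
 * Sums numeric values per key; non-numeric values take the latest turn's value.
 */
function sketch_accumulate_usage( array $running, array $turn ): array {
	foreach ( $turn as $key => $value ) {
		$previous = $running[ $key ] ?? 0;
		$numeric  = ( is_int( $value ) || is_float( $value ) )
			&& ( is_int( $previous ) || is_float( $previous ) );
		// Keys first seen on a later turn start from 0, so provider extras survive.
		$running[ $key ] = $numeric ? $previous + $value : $value;
	}
	return $running;
}

$total = array();
$total = sketch_accumulate_usage(
	$total,
	array( 'prompt_tokens' => 10, 'completion_tokens' => 5, 'model' => 'a' )
);
$total = sketch_accumulate_usage(
	$total,
	array( 'prompt_tokens' => 7, 'completion_tokens' => 3, 'reasoning_tokens' => 2, 'model' => 'b' )
);
```

After both turns, `$total` carries summed counters, the `reasoning_tokens` extra that only the second turn reported, and the second turn's `model` string.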
## Compatibility
All four fields are **additive** to the result shape — existing
consumers reading `messages`/`tool_execution_results`/`status`/`budget`
continue to work unchanged.
The turn-runner contract gains two optional return fields (`usage`,
`request_metadata`); turn runners that don't return them get the empty
default. Turn runners that already return `request_metadata` (the
existing `conversation-loop-transcript-persister-smoke` test does this)
continue to round-trip correctly.
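One way to picture the additive, validate-when-present behavior is the sketch below. It is not the real `normalize()`: the function name `sketch_validate_optional_fields` is hypothetical, and throwing `InvalidArgumentException` on a type mismatch is an assumption about error handling that the source does not specify.

```php
<?php
/**
 * Hypothetical sketch of optional-field validation.
 * Absent fields pass untouched; present fields must match the expected type.
 */
function sketch_validate_optional_fields( array $result ): array {
	$checks = array(
		'turn_count'       => 'is_int',
		'final_content'    => 'is_string',
		'usage'            => 'is_array',
		'request_metadata' => 'is_array',
	);
	foreach ( $checks as $field => $check ) {
		// All four fields are optional, so results built without the loop still pass.
		if ( ! array_key_exists( $field, $result ) ) {
			continue;
		}
		if ( ! $check( $result[ $field ] ) ) {
			// Assumed failure mode; the real contract may report errors differently.
			throw new InvalidArgumentException( "Invalid conversation result field: {$field}" );
		}
	}
	return $result;
}

// A pre-existing result shape with none of the new fields is accepted as-is.
$legacy  = array( 'messages' => array(), 'tool_execution_results' => array() );
$checked = sketch_validate_optional_fields( $legacy );
```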
## Verification
All 30 substrate smoke tests pass:
* conversation-loop-smoke (6 assertions)
* conversation-loop-tool-execution-smoke (19)
* conversation-loop-completion-policy-smoke (6)
* conversation-loop-events-smoke (17)
* conversation-loop-budgets-smoke (24)
* conversation-loop-transcript-persister-smoke (18, including the
  pre-existing `request_metadata` round-trip test)
* conversation-runner-contracts-smoke (18)
* iteration-budget-smoke (22)
* plus the broader substrate suites (compaction, identity, guidelines,
  workflows, routines, subagents, etc.)
Closes #135
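## Summary

Closes #135. Expose four universally needed observability fields on the conversation loop result, eliminating the need for consumers to pass mutable state by reference through their turn runner closure.

| Field | Type | Source |
| --- | --- | --- |
| `turn_count` | `int` | Loop counter |
| `final_content` | `string` | Last non-empty assistant text in `$messages` |
| `usage` | `array` | Summed from each turn runner's `usage` return field |
| `request_metadata` | `array` | Most recent turn runner's `request_metadata` field |

## Why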
Consumers building agents on top of `WP_Agent_Conversation_Loop` need more than `messages` + `tool_execution_results`. They need to know how many turns ran, what the agent finally said, what it cost (tokens), and what the last provider request looked like. The substrate already had access to all of this; it just didn't formalize it as part of the result contract.

Without these fields, consumers resort to passing mutable state by reference through their turn runner closure. The `data-machine` consumer (`inc/Engine/AI/conversation-loop.php:datamachine_run_conversation`) does exactly this with six `&` parameters threaded through a separate `build_turn_runner` factory just so the outer function can read accumulators after the loop returns. See #135 for full evidence.

## What changes
### Loop (`class-wp-agent-conversation-loop.php`)

* Tracks `$turns_run` (existing counter), `$total_usage` (summed across turns), and `$request_metadata` (latest turn).
* Reads `$result['usage']` and `$result['request_metadata']` from each turn runner return value (both optional: turn runners that don't report them get empty defaults).
* `accumulate_usage( $running, $turn )`: sums canonical token-count fields and preserves provider-specific keys like `cache_creation_input_tokens` or `reasoning_tokens`. Numeric fields are summed; non-numeric fields are taken from the latest turn.
* `extract_final_content( $messages )`: walks the transcript backward for the last non-empty assistant text content. Returns `''` when none exists (e.g. tool-call-only tail).

### Result (`class-wp-agent-conversation-result.php`)

Validates the four new fields when present:

* `turn_count` must be `int`
* `final_content` must be `string`
* `usage` must be `array`
* `request_metadata` must be `array`

All four are optional in the schema: `normalize()` won't reject results that don't include them, so callers that construct results without the loop (test fixtures, manual normalization) keep working.

## Compatibility
100% additive. Every existing consumer keeps working:

* Consumers reading `messages`, `tool_execution_results`, `status`, `budget` see no change.
* Turn runners that don't return `usage` or `request_metadata` get empty defaults in the result.
* Turn runners that already return `request_metadata` (the existing `conversation-loop-transcript-persister-smoke` test does this) continue to round-trip correctly: the loop reads it from the per-turn return shape and surfaces it on the final result, exactly as before.

## Test plan
All 30 substrate smoke tests pass on this branch (see the suites listed under Verification above).
## Follow-up

After this merges, the `data-machine` consumer will refactor to drop its by-reference accumulator dance and read directly from the new result fields. Separate PR forthcoming on `Extra-Chill/data-machine`.
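
For a sense of what reading directly from the new result fields could look like on the consumer side, here is a minimal sketch. It assumes the loop result is an array keyed by the four field names; the helper `sketch_summarize_result` and the sample values are hypothetical and not taken from `data-machine`.

```php
<?php
/**
 * Hypothetical consumer-side helper.
 * Reads the four observability fields, falling back to empty defaults when absent.
 */
function sketch_summarize_result( array $result ): array {
	return array(
		'turns_run'             => $result['turn_count'] ?? 0,
		'final_content'         => $result['final_content'] ?? '',
		'total_usage'           => $result['usage'] ?? array(),
		'last_request_metadata' => $result['request_metadata'] ?? array(),
	);
}

// Stand-in for a loop result (assumed shape; values invented for illustration).
$result = array(
	'messages'      => array(),
	'turn_count'    => 2,
	'final_content' => 'Done.',
	'usage'         => array( 'reasoning_tokens' => 4 ),
);
$summary = sketch_summarize_result( $result );
```

No by-reference parameters are needed in this shape: everything the outer caller wants comes back on the result itself.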