feat!: Add ManagedResult, RunnerResult, and Runner protocol; rename invoke() to run() #148

Merged
jsonbailey merged 15 commits into main from jb/aic-2388/managed-result
May 1, 2026

@jsonbailey jsonbailey commented Apr 28, 2026

Summary

Introduces the new managed-layer return type `ManagedResult`, the unified `Runner` protocol, and extends `LDAIMetricSummary` with `tool_calls`, `duration_ms` (renamed from `duration`), and `resumption_token`.

  • `ManagedModel.run()` is the new primary API; returns `ManagedResult`. `ManagedModel.invoke()` is removed — use `run()` instead.
  • `ManagedAgent.run()` now returns `ManagedResult`.
  • `RunnerResult` added (no `evaluations` field — judge dispatch lives on the managed layer).
  • `ManagedModel` and `ManagedAgent` now accept only `Runner`; the `ModelRunner`/`AgentRunner` compat branches are removed from the managed layer.
  • `RunnerFactory.create_model()` and `create_agent()` return `Optional[Runner]`.
  • `LDAIConfigTracker.__init__` seeds `LDAIMetricSummary._resumption_token` at instantiation so it's available on `get_summary()`.
  • `ModelResponse`, `StructuredResponse`, `AgentResult`, `ModelRunner`, `AgentRunner` type definitions are kept in place so OpenAI and LangChain provider packages continue to pass CI until the follow-up PRs migrate them to the unified `Runner` protocol.

Stack

This is part of the AIC-2388 stacked PR series. Targets `main` (PR #147 merged).

Order: PR 7 ✅ → PR 8 (this) → PR 8-openai → PR 8-langchain → Cleanup → PR 9 → PR 10 → PR 11 → PR 11-openai → PR 11-langchain → PR 12

Test plan

  • `make test` — all tests pass
  • `make lint` — mypy clean across all 3 packages

🤖 Generated with Claude Code


Note

Medium Risk
Medium risk due to a breaking API surface change (invoke() removed/renamed to run()) and refactors across managed model/agent and judge evaluation paths that could impact integrations and metrics tracking.

Overview
Introduces a unified runner interface and new result types. Adds a Runner protocol with a single run() method returning RunnerResult, plus a managed-layer ManagedResult that bundles content, aggregated LDAIMetricSummary, optional parsed structured output, and optional async judge evaluations.
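To make the layering described above concrete, here is a minimal sketch of how the three types could relate. It is not the SDK's actual code: the field names on `RunnerResult` and `ManagedResult`, the `async` signature of `run()`, the `EchoRunner` class, and the `managed_run` helper are all assumptions for illustration, and a plain dict stands in for `LDAIMetricSummary`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Protocol, Type, runtime_checkable


@dataclass
class RunnerResult:
    """Provider-level result (hypothetical fields). Note: no `evaluations`."""
    content: str
    parsed: Optional[Dict[str, Any]] = None
    tool_calls: Optional[List[str]] = None
    duration_ms: Optional[int] = None


@dataclass
class ManagedResult:
    """Managed-layer result (hypothetical fields)."""
    content: str
    metrics: Dict[str, Any] = field(default_factory=dict)  # stand-in for LDAIMetricSummary
    parsed: Optional[Dict[str, Any]] = None
    evaluations: Optional[Any] = None  # judge dispatch lives on the managed layer


@runtime_checkable
class Runner(Protocol):
    """Single-method protocol; the async signature is an assumption."""

    async def run(self, input: str, output_type: Optional[Type[Any]] = None) -> RunnerResult:
        ...


class EchoRunner:
    """Toy runner that satisfies the protocol structurally."""

    async def run(self, input: str, output_type: Optional[Type[Any]] = None) -> RunnerResult:
        parsed = {"echo": input} if output_type is not None else None
        return RunnerResult(content=input, parsed=parsed, duration_ms=12)


async def managed_run(runner: Runner, prompt: str) -> ManagedResult:
    """Wrap a RunnerResult into a ManagedResult, as a managed layer might."""
    result = await runner.run(prompt)
    return ManagedResult(
        content=result.content,
        metrics={"duration_ms": result.duration_ms},
        parsed=result.parsed,
    )
```

Because `Runner` is a structural protocol, a provider class only needs a compatible `run()` method; it does not have to inherit from anything.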

Updates managed and judge APIs to the new runner/result model. ManagedModel.invoke() is replaced by ManagedModel.run() (and the client examples updated), ManagedAgent.run() now returns ManagedResult, and Judge switches from invoke_structured_model/StructuredResponse to Runner.run(..., output_type=...) and reads structured output from RunnerResult.parsed.

Expands tracking/metrics payloads. LDAIMetrics gains tool_calls and duration_ms (included in to_dict()), LDAIMetricSummary adds tool_calls, duration_ms (with deprecated duration alias), and eagerly captures resumption_token; LDAIConfigTracker.track_metrics_of(_async) now supports optional metrics extraction, prefers metrics.duration_ms over wall-clock time, and tracks tool-call events once per execution.
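The duration handling above can be sketched roughly as follows. This is an illustration only: `MetricSummarySketch` and `pick_duration_ms` are hypothetical names, and whether the real `duration` alias emits a `DeprecationWarning` is an assumption, not something this PR states.

```python
import warnings
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MetricSummarySketch:
    """Hypothetical stand-in for LDAIMetricSummary's new fields."""
    duration_ms: Optional[int] = None
    tool_calls: Optional[List[str]] = None
    resumption_token: Optional[str] = None

    @property
    def duration(self) -> Optional[int]:
        # Deprecated alias kept for callers that still read `duration`.
        warnings.warn(
            "duration is deprecated; use duration_ms",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.duration_ms


def pick_duration_ms(reported_ms: Optional[int], elapsed_ms: int) -> int:
    """Prefer the runner-reported duration; fall back to wall-clock time."""
    return reported_ms if reported_ms is not None else elapsed_ms
```

Comparing against `None` rather than using `reported_ms or elapsed_ms` keeps a legitimately reported `0` from being replaced by the wall-clock value.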

Reviewed by Cursor Bugbot for commit 5925da6.

@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch from b0ca696 to d403590 on April 28, 2026 23:04
@jsonbailey jsonbailey changed the title from "feat(ldai)!: Add ManagedResult and Runner protocol" to "feat!: Add ManagedResult, RunnerResult, and Runner protocol; rename invoke() to run()" on Apr 28, 2026
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch 2 times, most recently from a564649 to bd4cd68 on April 28, 2026 23:12
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch from bd4cd68 to 45441da on April 29, 2026 13:14
@jsonbailey jsonbailey force-pushed the jb/aic-2174/evaluations branch from a997b91 to d0b3436 on April 29, 2026 13:18
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch from 45441da to 27bcfc0 on April 29, 2026 13:18
@jsonbailey jsonbailey force-pushed the jb/aic-2174/evaluations branch from d0b3436 to e56f69a on April 29, 2026 13:21
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch from 27bcfc0 to ff47ec2 on April 29, 2026 13:22
Comment thread packages/sdk/server-ai/src/ldai/providers/runner.py Outdated
Comment thread packages/sdk/server-ai/src/ldai/providers/runner.py Outdated
Comment thread packages/sdk/server-ai/src/ldai/providers/types.py Outdated
Comment thread packages/sdk/server-ai/src/ldai/providers/types.py Outdated
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch 2 times, most recently from 369242d to b8d3fad on April 29, 2026 14:37
Base automatically changed from jb/aic-2174/evaluations to main on April 29, 2026 16:14
@jsonbailey jsonbailey marked this pull request as ready for review April 29, 2026 16:19
@jsonbailey jsonbailey requested a review from a team as a code owner April 29, 2026 16:19
Comment thread packages/sdk/server-ai/src/ldai/tracker.py Outdated
Comment thread packages/sdk/server-ai/src/ldai/judge/__init__.py
Comment thread packages/sdk/server-ai/src/ldai/providers/types.py
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch from b8d3fad to 4e28ae6 on April 29, 2026 16:31
Comment thread packages/sdk/server-ai/src/ldai/managed_model.py Outdated
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch 2 times, most recently from adfd9f0 to b4d15df on April 29, 2026 17:17
Comment thread packages/sdk/server-ai/src/ldai/tracker.py
Comment thread packages/sdk/server-ai/src/ldai/tracker.py
Comment thread packages/sdk/server-ai/src/ldai/tracker.py
Comment thread packages/sdk/server-ai/src/ldai/managed_model.py
Comment thread packages/sdk/server-ai/src/ldai/managed_agent.py
Comment thread packages/sdk/server-ai/src/ldai/managed_model.py
Comment thread packages/sdk/server-ai/src/ldai/tracker.py
…nvoke() to run()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
jsonbailey and others added 12 commits May 1, 2026 09:59
The new track_tool_calls method at line 413 (with summary storage and
dedup guard) was being shadowed by the older method at line 559 (which
only fired per-tool events). Merge them into a single method that both
stores to the summary and fires per-tool events.
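For context on why the shadowing went unnoticed: Python silently lets a later `def` in a class body replace an earlier one with the same name. The sketch below reproduces the pattern and one possible merged form. The class names, attributes, and the shape of the dedup guard are hypothetical, not the tracker's real implementation.

```python
class ShadowedTracker:
    """Two methods with the same name: only the last definition survives."""

    def __init__(self):
        self.summary_tool_calls = None
        self.events = []

    def track_tool_calls(self, tool_calls):
        # Earlier definition: stores to the summary. Never reachable.
        self.summary_tool_calls = list(tool_calls)

    def track_tool_calls(self, tool_calls):  # noqa: F811
        # Later definition replaces the one above at class-creation time.
        for name in tool_calls:
            self.events.append(name)


class MergedTracker:
    """Single method that stores to the summary and fires per-tool events."""

    def __init__(self):
        self.summary_tool_calls = None
        self.events = []

    def track_tool_calls(self, tool_calls):
        if self.summary_tool_calls is not None:
            return  # dedup guard: track once per execution
        self.summary_tool_calls = list(tool_calls)
        for name in tool_calls:
            self.events.append(name)
```

With `ShadowedTracker`, a call fires events but leaves the summary empty; with `MergedTracker`, a second call is a no-op.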
Previously, metrics_extractor(result) was called twice — once in the
public track_metrics_of/track_metrics_of_async to read duration_ms,
and again inside _track_from_metrics_extractor to track success,
tokens, and tool calls. Extract metrics once in the public method and
pass the resulting metrics + elapsed_ms into the private helper, which
now also handles the duration tracking.

ManagedModel and ManagedAgent now require a Runner. The compat shims
(_invoke_runner, isinstance(result, RunnerResult) branches, Union
type annotations) are removed; result handling is direct on
RunnerResult fields.

The deprecated ManagedModel.invoke() is preserved for backwards compat
but now delegates to run() and adapts the ManagedResult into the legacy
ModelResponse shape.

ModelRunner and AgentRunner protocol definitions remain in place so
downstream provider packages that import them continue to work.

- Drop the inconsistent 'if metrics else None' guard on reported_ms;
  the next line already dereferences metrics.success unconditionally.
- Use 'is not None' for tool_calls so an explicit empty list still
  triggers tracking (preserves the distinction between 'not tracked'
  and 'tracked with no calls').
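The difference between the two checks is easy to see in isolation. The function names below are hypothetical and only illustrate the truthiness pitfall described above.

```python
from typing import List, Optional


def should_track_truthy(tool_calls: Optional[List[str]]) -> bool:
    """Old-style check: an empty list is falsy, so it is skipped."""
    return bool(tool_calls)


def should_track_explicit(tool_calls: Optional[List[str]]) -> bool:
    """New-style check: only a missing value (None) is skipped."""
    return tool_calls is not None
```

With the explicit check, `[]` means "tracked with no calls" and `None` means "not tracked", and the two no longer collapse into the same branch.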
Drop the deprecated invoke() method from the managed layer along with
its dedicated test class and the warnings/LDAIMetrics/ModelResponse
imports that were only needed by it. Type definitions in providers/
remain so downstream provider packages keep building.
…unner]

The factory's downstream consumers (ManagedModel, ManagedAgent) now
take Runner; aligning the factory's return types lets us drop the
type: ignore comments at the ManagedModel/ManagedAgent call sites.
Provider package PRs will update their concrete implementations to
match.

Judge still takes ModelRunner, so its call site picks up the
type: ignore[arg-type] in its place — that's resolved later in the
cleanup PR when Judge migrates to Runner.

Move the metrics_extractor call inside _track_from_metrics_extractor
so extraction errors are caught and logged without bubbling up. When
extraction fails or returns None, only the wall-clock duration is
tracked — success/error is left untouched since the underlying model
call itself succeeded.

Also tighten the tool_calls check to access metrics.tool_calls
directly, mirroring how metrics.usage is accessed.
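A rough sketch of the behavior described in this commit, under stated assumptions: the function names, the dict-based `metrics` and `tracked` values, and the synchronous wrapper are all hypothetical simplifications of the tracker, which works with `LDAIMetrics` objects.

```python
import logging
import time
from typing import Any, Callable, Dict, Optional, Tuple

log = logging.getLogger("tracker_sketch")

Extractor = Callable[[Any], Optional[Dict[str, Any]]]


def track_from_extractor(
    result: Any,
    extractor: Optional[Extractor],
    elapsed_ms: int,
    tracked: Dict[str, Any],
) -> None:
    """Call the extractor once; never let its errors reach the caller."""
    metrics = None
    if extractor is not None:
        try:
            metrics = extractor(result)
        except Exception as exc:
            log.warning("metrics extraction failed: %s", exc)

    if metrics is None:
        # Extraction failed or returned None: record wall-clock time only
        # and leave success/error untouched.
        tracked["duration_ms"] = elapsed_ms
        return

    reported = metrics.get("duration_ms")
    tracked["duration_ms"] = reported if reported is not None else elapsed_ms
    tracked["success"] = metrics["success"]


def track_metrics_of(
    fn: Callable[[], Any], extractor: Optional[Extractor] = None
) -> Tuple[Any, Dict[str, Any]]:
    """Time the call, then delegate all metric handling to the helper."""
    tracked: Dict[str, Any] = {}
    start = time.perf_counter()
    result = fn()
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    track_from_extractor(result, extractor, elapsed_ms, tracked)
    return result, tracked
```

Keeping the extractor call inside the helper means there is exactly one place where extraction can fail, and the model call's own result is always returned to the caller.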
- Judge now accepts Runner instead of ModelRunner
- evaluate() calls runner.run(output_type=...) instead of invoke_structured_model
- response.parsed replaces StructuredResponse.data; None guard added
- evaluate_messages() accepts RunnerResult instead of ModelResponse
- Tests updated to use RunnerResult and mock_runner.run
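The Judge change listed above might look something like the following. This is a sketch, not the real `Judge`: `ScoreSchema`, `FakeRunner`, the `evaluate` function, the dict shape of `parsed`, and the `async` signature are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, Optional, Type


@dataclass
class RunnerResult:
    """Hypothetical minimal result; `parsed` holds structured output."""
    content: str
    parsed: Optional[Dict[str, Any]] = None


@dataclass
class ScoreSchema:
    """Hypothetical structured-output type requested from the runner."""
    score: float
    reasoning: str


class FakeRunner:
    """Toy runner returning a canned `parsed` payload."""

    def __init__(self, parsed: Optional[Dict[str, Any]]):
        self._parsed = parsed

    async def run(self, input: str, output_type: Optional[Type[Any]] = None) -> RunnerResult:
        return RunnerResult(content="raw model text", parsed=self._parsed)


async def evaluate(runner: FakeRunner, prompt: str) -> Optional[float]:
    """Request structured output and guard against a missing `parsed`."""
    response = await runner.run(prompt, output_type=ScoreSchema)
    if response.parsed is None:
        return None  # the runner could not produce structured output
    return float(response.parsed["score"])
```

The `None` guard matters because `parsed` is optional on the unified result, whereas the old `StructuredResponse.data` path was specific to structured calls.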

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jsonbailey jsonbailey force-pushed the jb/aic-2388/managed-result branch from cc792ec to e2e2b6e on May 1, 2026 15:02
@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit e2e2b6e.

Comment thread packages/sdk/server-ai/tests/test_managed_model.py Outdated
jsonbailey and others added 2 commits May 1, 2026 10:53
…odel

ManagedModel.run() calls self._model_runner.run(), not invoke_model.
The previous mocks were dead code that never exercised the runner.
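One way to catch this kind of dead mock is to assert on the method that is actually awaited. The sketch below uses `unittest.mock.AsyncMock`; `ManagedModelSketch` and `make_runner` are hypothetical, and only the `self._model_runner.run()` call is taken from the commit message.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class ManagedModelSketch:
    """Hypothetical stand-in: run() delegates to self._model_runner.run()."""

    def __init__(self, runner):
        self._model_runner = runner

    async def run(self, prompt: str) -> str:
        result = await self._model_runner.run(prompt)
        return result.content


def make_runner(content: str) -> MagicMock:
    """Build a mock runner with the live `run` and the stale `invoke_model`."""
    runner = MagicMock()
    runner.run = AsyncMock(return_value=MagicMock(content=content))
    runner.invoke_model = AsyncMock()  # stale mock: nothing calls this anymore
    return runner
```

Asserting `runner.run.assert_awaited_once_with(...)` and `runner.invoke_model.assert_not_called()` in the test makes it fail loudly if the code path and the mock drift apart again.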

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jsonbailey jsonbailey merged commit 88d4ddc into main May 1, 2026
45 checks passed
@jsonbailey jsonbailey deleted the jb/aic-2388/managed-result branch May 1, 2026 16:38
@github-actions github-actions Bot mentioned this pull request May 1, 2026