
feat: introduce ManagedResult, RunnerResult, and LDAIMetricSummary#1332

Draft
jsonbailey wants to merge 2 commits into next-ai-release from jb/aic-2388/js-managed-result

Conversation


jsonbailey (Contributor) commented Apr 28, 2026

Summary

  • Adds RunnerResult interface (provider-level result: content, metrics, raw?, parsed? — no evaluations)
  • Adds ManagedResult interface (managed-layer result with evaluations: Promise<JudgeResult[]>)
  • Adds LDAIMetricSummary (flat summary: success, usage?, toolCalls?, durationMs?, resumptionToken?)
  • Adds toolCalls? and durationMs? fields to LDAIMetrics
  • TrackedChat.run() replaces/supplements invoke(), returning ManagedResult with metric summary built from tracker
  • Adds createModel() to LDAIClient and LDAIClientImpl as the preferred replacement for createChat()
  • Updates chat-judge example to use createModel() and run()
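Based on the bullets above, the new result types might look roughly like this. This is a sketch reconstructed from the field names in the summary, not the actual source; LDTokenUsage and LDJudgeResult stand in for existing SDK types, and the real RunnerResult may use LDAIMetrics rather than the flat summary shown here:

```typescript
// Stand-ins for existing SDK types (shapes assumed for illustration).
interface LDTokenUsage {
  input: number;
  output: number;
  total: number;
}

interface LDJudgeResult {
  judgeKey: string;
  score: number;
}

// Flat metric summary, per the summary bullets.
interface LDAIMetricSummary {
  success: boolean;
  usage?: LDTokenUsage;
  toolCalls?: number;
  durationMs?: number;
  resumptionToken?: string;
}

// Provider-level result: content and metrics, no evaluations.
interface RunnerResult {
  content: string;
  metrics: LDAIMetricSummary;
  raw?: unknown;
  parsed?: unknown;
}

// Managed-layer result: adds an async evaluations promise.
interface ManagedResult extends RunnerResult {
  evaluations: Promise<LDJudgeResult[]>;
}

// A value satisfying the managed shape:
const result: ManagedResult = {
  content: 'Hello!',
  metrics: { success: true, durationMs: 120 },
  evaluations: Promise.resolve([]),
};
```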

Test plan

  • All 188 tests pass
  • chat-judge example updated to use new API

🤖 Generated with Claude Code


Note

Medium risk: API shape changes (createModel/createChat, TrackedChat removal) and a new provider abstraction (Runner) can affect downstream integrations and prompt/message handling.

Overview
Introduces a new managed invocation layer via ManagedModel.run() that returns a ManagedResult with flattened LDAIMetricSummary and an asynchronous evaluations promise, alongside provider-level Runner/RunnerResult types.

Replaces the stateful TrackedChat API with stateless model execution: LDAIClient.createModel() is added as the preferred entry point, while createChat/initChat are deprecated shims; examples are updated to use createModel + run. Judge execution is split into a standalone Evaluator (parallel, best-effort) and configs now carry an internal evaluator reference for the managed layer.

Extends metrics to include optional toolCalls and durationMs, and adds initial graph runner result/metrics types to support the new runner protocol.
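Put together, the overview above implies calling code along these lines. The client below is a stub standing in for LDAIClient, and the config key, context shape, and exact run() signature are assumptions, since none of them appear verbatim in this thread:

```typescript
// Stub standing in for LDAIClient, so the sketch is self-contained.
const stubClient = {
  // Stands in for LDAIClient.createModel(configKey, context),
  // the preferred replacement for the deprecated createChat()/initChat().
  async createModel(_configKey: string, _context: unknown) {
    return {
      // run() forwards the prompt to the underlying Runner and returns
      // a ManagedResult with a flat metric summary.
      async run(prompt: string) {
        return {
          content: `echo: ${prompt}`,
          metrics: { success: true, durationMs: 1 },
          evaluations: Promise.resolve([] as unknown[]),
        };
      },
    };
  },
};

async function ask(prompt: string) {
  const model = await stubClient.createModel('my-config', { kind: 'user', key: 'u1' });
  const result = await model.run(prompt);
  // Evaluations resolve asynchronously (judges run in parallel, best-effort).
  const judgeResults = await result.evaluations;
  return { content: result.content, judgeResults };
}
```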

Reviewed by Cursor Bugbot for commit e163f7d.

jsonbailey force-pushed the jb/aic-2174/server-ai branch from 46ab0a4 to c751ce6 on April 28, 2026 at 23:12
jsonbailey force-pushed the jb/aic-2388/js-managed-result branch from fe6948b to 192315f on April 28, 2026 at 23:14
Comment thread on packages/sdk/server-ai/src/api/chat/TrackedChat.ts (outdated)
jsonbailey added a commit that referenced this pull request May 1, 2026
… (AIC-2388)

Adds RunnerProtocol.test.ts to verify that the Runner and AgentGraphRunner
interfaces can be implemented as plain objects. The Runner, AgentGraphRunner
interfaces, AIProvider deprecation, and providers/index.ts re-exports landed
in the parent PR (#1332).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Base automatically changed from jb/aic-2174/server-ai to next-ai-release May 1, 2026 16:20
@jsonbailey
Contributor Author

@cursor review

jsonbailey and others added 2 commits May 1, 2026 12:23
…IC-2388)

Adds RunnerResult (provider-level result type without evaluations), ManagedResult
(managed-layer result with async evaluations promise), and LDAIMetricSummary (flat
metric summary including resumptionToken). Adds toolCalls and durationMs to
LDAIMetrics. TrackedChat.run() replaces invoke() returning ManagedResult with
LDAIMetricSummary built from tracker. Adds createModel() to LDAIClient/LDAIClientImpl
as the preferred replacement for createChat(). Updates chat-judge example.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… conversation management

- Add `Runner` and `AgentGraphRunner` interfaces in api/providers/Runner.ts.
  Runner.run takes a prompt string + optional output schema and returns a
  RunnerResult. AgentGraphRunner.run takes a string and returns an
  AgentGraphRunnerResult. Re-export both from api/providers/index.ts.
- Add the supporting `GraphMetrics` and `AgentGraphRunnerResult` types to
  api/graph/types.ts so AgentGraphRunner has its result shape on this branch.
- Rename `TrackedChat` -> `ManagedModel` (file + class). The constructor now
  takes a `Runner` instead of an `AIProvider`. The class is stateless: it
  owns no conversation history, and `run(prompt)` forwards the prompt
  directly to the runner. Drop `invoke()`, `_evaluateWithJudges`,
  `appendMessages`, `getMessages`, `getJudges`, `getProvider`, and the
  internal `messages` field.
- Update `LDAIClientImpl.createModel` to construct a `ManagedModel` with a
  `Runner`. The factory still produces a (deprecated) `AIProvider`, so a
  small `runnerFromAIProvider` adapter wraps it: it prepends the AIConfig's
  configured messages to the user prompt to preserve existing
  system-prompt behavior under the stateless contract.
- Mark `createChat` `@deprecated` (now delegates to `createModel`); keep
  `initChat` deprecated. Update the `LDAIClient` interface accordingly.
- Mark the `AIProvider` abstract class `@deprecated` in favor of `Runner`.
- Update `tracked-chat` and `chat-observability` examples to call
  `createModel` + `model.run()` instead of `createChat` + `chat.invoke()`.
- Rewrite the test suite for the stateless ManagedModel: prompt is passed
  through verbatim, no history is retained, ManagedResult is built from the
  RunnerResult plus the tracker's resumption token. Drop the old tests for
  `appendMessages`/`getMessages`/`invoke`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
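The Runner protocol and the runnerFromAIProvider adapter described in the commit message might be sketched as follows. Names mirror the commit message, but the exact signatures, the message shape, and the AIProvider interface shown here are assumptions:

```typescript
// Provider-level result, without evaluations.
interface RunnerResult {
  content: string;
}

// Runner.run takes a prompt string plus an optional output schema.
interface Runner {
  run(prompt: string, outputSchema?: object): Promise<RunnerResult>;
}

type Message = { role: string; content: string };

// Stand-in for the deprecated AIProvider abstraction.
interface AIProviderLike {
  invoke(messages: Message[]): Promise<{ content: string }>;
}

// Adapter: wraps a legacy AIProvider as a Runner. It prepends the
// AIConfig's configured messages to the user prompt so existing
// system-prompt behavior survives under the stateless contract.
function runnerFromAIProvider(
  provider: AIProviderLike,
  configuredMessages: Message[],
): Runner {
  return {
    async run(prompt: string): Promise<RunnerResult> {
      const messages = [...configuredMessages, { role: 'user', content: prompt }];
      const response = await provider.invoke(messages);
      return { content: response.content };
    },
  };
}
```

A stateless ManagedModel would then hold a Runner and forward run(prompt) to it verbatim, with no retained conversation history.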
jsonbailey force-pushed the jb/aic-2388/js-managed-result branch from e163f7d to c906f79 on May 1, 2026 at 17:24

cursor bot left a comment:


Cursor Bugbot has reviewed your changes and found 2 potential issues.



};

// Evaluations are wired in a follow-up PR. For now, resolve empty.
const evaluations: Promise<LDJudgeResult[]> = Promise.resolve([]);

Evaluator built but never called in run()

High Severity

createModel() builds an Evaluator (initializing judges via async network calls), attaches it to configWithEvaluator, and passes that config to ManagedModel. However, ManagedModel.run() ignores this.aiConfig.evaluator entirely and hardcodes evaluations to Promise.resolve([]). Since the old TrackedChat with working judge evaluations is deleted in this same PR, judge evaluations are silently non-functional. The chat-judge example will always print empty results despite judges being configured.
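A minimal fix along the lines Bugbot describes would have run() consult the config's evaluator instead of hardcoding an empty promise. The evaluator shape and evaluate() signature below are assumptions, not the SDK's actual API:

```typescript
interface LDJudgeResult {
  judgeKey: string;
  score: number;
}

// Assumed shape of the Evaluator attached to the config.
interface EvaluatorLike {
  evaluate(content: string): Promise<LDJudgeResult[]>;
}

// Inside ManagedModel.run(), instead of Promise.resolve([]):
function buildEvaluations(
  evaluator: EvaluatorLike | undefined,
  content: string,
): Promise<LDJudgeResult[]> {
  if (!evaluator) {
    // No judges configured: empty result is correct here.
    return Promise.resolve([]);
  }
  // Best-effort: an evaluation failure should not fail the run itself.
  return evaluator.evaluate(content).catch(() => []);
}
```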

Additional Locations (1)


* Resumption token for deferred feedback association.
*/
resumptionToken?: string;
}

Conflicting LDAIMetricSummary interfaces with same name

Medium Severity

Two incompatible interfaces named LDAIMetricSummary now exist. The original in config/LDAIConfigTracker.ts has tokens?: LDTokenUsage and success?: boolean; the new one in model/types.ts has usage?: LDTokenUsage and success: boolean (required). The new one is publicly exported, but LDAIConfigTracker.getSummary() returns the old one — users importing LDAIMetricSummary to type the return of getSummary() will get a type mismatch.
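The mismatch can be reproduced in isolation. The field shapes below are copied from the report; the interface names are renamed here to let both coexist in one file:

```typescript
interface LDTokenUsage {
  total: number;
}

// Shape in config/LDAIConfigTracker.ts, per the report.
interface TrackerMetricSummary {
  tokens?: LDTokenUsage;
  success?: boolean;
}

// Shape in model/types.ts, per the report; this is the publicly exported one.
interface ModelMetricSummary {
  usage?: LDTokenUsage;
  success: boolean; // required here, optional above
}

// A tracker summary need not satisfy the exported type:
const fromTracker: TrackerMetricSummary = {};
// const typed: ModelMetricSummary = fromTracker;
// ^ would not compile: 'success' may be undefined, and 'tokens' vs
//   'usage' means token data is dropped even when present.
```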


