refactor: deduplicate models, LiteLLM helpers, prompts, and CLI across agent runners by ShuxinLin · Pull Request #273 · IBM/AssetOpsBench

ShuxinLin · 2026-04-23T19:59:18Z

Consolidation pass over the four agent runners. Follow-up to #272 (OTEL tracing) and supersedes #274 (which was merged into this branch).

Summary

src/agent/models.py — canonical ToolCall / TurnRecord / Trajectory used by every SDK runner. Three byte-identical per-runner models.py files deleted.
src/agent/_litellm.py — LITELLM_PREFIX + resolve_model(). Three copies collapsed to one.
src/agent/_prompts.py — AGENT_SYSTEM_PROMPT. Three identical 7-line prompts collapsed to one.
src/agent/_cli_common.py — setup_logging, add_common_args, print_trajectory, print_answer, print_result, and run_sdk_cli (bundles load_dotenv → parse → logging → OTEL init → asyncio.run). Each SDK CLI shrinks from ~140 LoC to ~55 LoC; the main() body is now one line.
src/llm/litellm.py — extracted _WATSONX_PREFIX constant.
DeepAgentRunner._chat_model — now a cached_property so _build_chat_model runs once per runner instance instead of once per run() call. Matches the per-instance-config pattern of the other two SDK runners, with lazy init so constructor tests don't need env set.
Tests — six duplicated _resolve_model tests consolidated into one parametrized suite at src/agent/tests/test_litellm.py. Subpackage __init__.py files slimmed to re-export only the runner class.

Net -217 lines across 22 files, no behavior change.
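The cached_property change to DeepAgentRunner._chat_model can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the real runner: the constructor argument, the dict returned by _build_chat_model, and the build_count counter are assumptions added only to make the once-per-instance behaviour observable.

```python
from functools import cached_property


class DeepAgentRunnerSketch:
    """Hypothetical stand-in for DeepAgentRunner (not the real class)."""

    def __init__(self, model_id: str) -> None:
        # Nothing expensive happens here, so the constructor works
        # even when the environment is not configured.
        self.model_id = model_id
        self.build_count = 0  # instrumentation for this sketch only

    def _build_chat_model(self) -> dict:
        # Assumption: the real method builds a chat model object and may
        # read environment variables; a dict stands in for it here.
        self.build_count += 1
        return {"model": self.model_id}

    @cached_property
    def _chat_model(self) -> dict:
        # Evaluated on first access, then stored on the instance.
        return self._build_chat_model()

    def run(self, question: str) -> str:
        model = self._chat_model  # reuses the cached value after the first call
        return f"{model['model']}: {question}"
```

Constructing the sketch runner does not call _build_chat_model; the first run() does, and later run() calls on the same instance reuse the result.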

Test plan

uv run pytest src/ -k "not integration" — 255 pass, 0 failures.
uv run pytest src/agent src/observability -v — 139 pass.
uv run {claude,openai,deep}-agent --help all render correctly.
Grep confirms no stale imports of the deleted per-runner models.py modules.

Out of scope

Unifying _build_mcp_servers / _build_mcp_connections — each SDK requires a different output shape (claude-agent-sdk dict, langchain spec, OpenAI MCPServerStdio); forcing a common adapter would be more complex than three ~10-line helpers.
src/tmp/ cleanup — user is handling separately.

The three SDK runners (claude-agent, openai-agent, deep-agent) each shipped byte-identical Trajectory/TurnRecord/ToolCall dataclasses and their own _resolve_model / _LITELLM_PREFIX helpers. Consolidates into:

- src/agent/models.py: canonical ToolCall, TurnRecord, Trajectory alongside the existing AgentResult.
- src/agent/_litellm.py: shared LITELLM_PREFIX + resolve_model().
- Removed src/agent/{claude,openai,deep}_agent/models.py.
- Collapsed six duplicated per-runner _resolve_model tests into one parametrized suite at src/agent/tests/test_litellm.py.

Net -110 lines, no behaviour change.

Signed-off-by: Shuxin Lin <linshuhsin@gmail.com>
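As a rough illustration of a shared prefix helper with a single table-driven test, here is a minimal sketch. The prefix value, the return shape of resolve_model, and the example model IDs are all assumptions; the PR only names LITELLM_PREFIX and resolve_model(), so the real helper in src/agent/_litellm.py may behave differently.

```python
# Assumed value; the real constant in src/agent/_litellm.py is not shown in the PR.
LITELLM_PREFIX = "litellm_proxy/"


def resolve_model(model_id: str) -> tuple[str, bool]:
    """Return (model name without the prefix, whether the prefix was present).

    The return shape is an assumption made for this sketch.
    """
    if model_id.startswith(LITELLM_PREFIX):
        return model_id[len(LITELLM_PREFIX):], True
    return model_id, False


# One table of cases replaces per-runner copies of the same test.
# With pytest, the same table could be passed to pytest.mark.parametrize.
CASES = [
    ("litellm_proxy/example-model", ("example-model", True)),
    ("example-model", ("example-model", False)),
    ("litellm_proxy/", ("", True)),
]


def check_cases() -> None:
    for model_id, expected in CASES:
        assert resolve_model(model_id) == expected, model_id
```

Keeping the cases in one table means a new runner needs no new test code, only an import of the shared helper.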

- src/agent/_prompts.py: single AGENT_SYSTEM_PROMPT used by the three SDK runners (claude-agent, openai-agent, deep-agent). plan_execute keeps its own planning/summarisation prompts.
- src/agent/_cli_common.py: setup_logging, add_common_args, print_trajectory, print_answer, print_result. The three SDK CLIs now only encode their prog name, default model, epilog text, and runner-specific arg (--max-turns vs --recursion-limit).
- Extract _WATSONX_PREFIX constant in LiteLLMBackend.

Net -110 lines; each CLI shrinks from ~140 LoC to ~60 LoC.

Signed-off-by: Shuxin Lin <linshuhsin@gmail.com>
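The split between shared and runner-specific CLI arguments might look roughly like the sketch below. Only the names add_common_args and --max-turns come from the commit message; the positional question argument, the --model-id and --verbose flags, the defaults, and the prog/epilog strings are hypothetical.

```python
import argparse


def add_common_args(parser: argparse.ArgumentParser, default_model: str) -> None:
    """Register arguments every SDK CLI shares (flag names are assumed)."""
    parser.add_argument("question", help="Question to send to the agent")
    parser.add_argument("--model-id", default=default_model, help="Model identifier")
    parser.add_argument("--verbose", action="store_true", help="Enable debug logging")


def build_parser() -> argparse.ArgumentParser:
    # Each CLI keeps only what differs: prog name, default model,
    # epilog text, and its runner-specific flag.
    parser = argparse.ArgumentParser(
        prog="example-agent",  # hypothetical prog name
        epilog="Example: example-agent 'What sensors are on site?'",
    )
    add_common_args(parser, default_model="example/model")
    parser.add_argument("--max-turns", type=int, default=10)  # runner-specific
    return parser
```

A runner that limits recursion rather than turns would swap the last add_argument call for --recursion-limit and leave everything else unchanged.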

refactor: share system prompt and CLI boilerplate

- src/agent/_cli_common.py: new run_sdk_cli(service_name, build_parser, run_coro) that bundles dotenv → parse → logging → init_tracing → asyncio.run. The three SDK main() bodies shrink from 9 lines each to one.
- DeepAgentRunner._chat_model is now a cached_property so _build_chat_model runs once per runner instance instead of once per run(). Matches the ClaudeAgentRunner / OpenAIAgentRunner pattern of pre-building per-instance config, with lazy init so constructor tests don't need env set.

Signed-off-by: Shuxin Lin <linshuhsin@gmail.com>
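The dotenv → parse → logging → init_tracing → asyncio.run sequence could be sketched as follows. The three-argument shape run_sdk_cli(service_name, build_parser, run_coro) is taken from the commit message; the optional argv parameter, the init_tracing stub, and the logging setup are assumptions, and the load_dotenv call is left as a comment so the sketch has no third-party dependency.

```python
import argparse
import asyncio
import logging
from typing import Awaitable, Callable, Optional, Sequence


def init_tracing(service_name: str) -> None:
    """Stub for the repo's OTEL initialisation; the real signature is not shown in the PR."""
    logging.getLogger(__name__).debug("tracing initialised for %s", service_name)


def run_sdk_cli(
    service_name: str,
    build_parser: Callable[[], argparse.ArgumentParser],
    run_coro: Callable[[argparse.Namespace], Awaitable[None]],
    argv: Optional[Sequence[str]] = None,  # assumption: added here for testability
) -> None:
    # 1. load_dotenv() from python-dotenv would run here in the real helper.
    # 2. Parse arguments with the parser supplied by the individual CLI.
    args = build_parser().parse_args(argv)
    # 3. Configure logging (the real code uses setup_logging from _cli_common).
    logging.basicConfig(level=logging.INFO)
    # 4. Initialise tracing under the CLI's service name.
    init_tracing(service_name)
    # 5. Run the CLI's coroutine to completion.
    asyncio.run(run_coro(args))
```

With a helper of this shape, each CLI's main() reduces to a single call such as run_sdk_cli("example-agent", build_parser, run), where build_parser and run are defined by that CLI.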

ShuxinLin added 2 commits April 23, 2026 15:58

ShuxinLin mentioned this pull request Apr 23, 2026

refactor: share system prompt and CLI boilerplate #274

Merged

2 tasks

Merge pull request #274 from IBM/refactor/share-prompt-and-cli

298344c

refactor: share system prompt and CLI boilerplate

ShuxinLin changed the base branch from feat/otel-observability to main April 23, 2026 20:29

ShuxinLin changed the base branch from main to feat/otel-observability April 23, 2026 20:54

ShuxinLin changed the title ~~refactor: unify trajectory models and LiteLLM helpers across runners~~ refactor: deduplicate models, LiteLLM helpers, prompts, and CLI across agent runners Apr 23, 2026

ShuxinLin merged commit aa81db6 into feat/otel-observability Apr 23, 2026
1 check passed

ShuxinLin mentioned this pull request Apr 24, 2026

feat: save agent traces to disk in OTLP-JSON for benchmark evaluation #272

Merged

5 tasks

ShuxinLin merged 4 commits into feat/otel-observability from refactor/unify-agent-models
