
refactor(agent-profiles): add getLlmistGadgets() to AgentProfile for unified backend support #424

Merged
zbigniewsobiecki merged 2 commits into dev from refactor/unify-agent-execution-paths
Feb 19, 2026

@aaight (Collaborator) commented Feb 18, 2026

Summary

This PR implements Step 1 (Audit) and Step 2 (partial) of the dual agent execution path unification refactoring.

Card: https://trello.com/c/699625975a8f4cdd89b359f0

What was implemented

  • Added getLlmistGadgets() to AgentProfile interface — each profile now knows how to construct the llmist gadget instances appropriate for its agent type, making AgentProfile the single source of truth for agent behavior across both backends
  • Added gadget builder helpers in agent-profiles.ts:
    • buildWorkItemLlmistGadgets() — for briefing, planning, implementation, respond-to-planning-comment, debug agents
    • buildReviewLlmistGadgets() — for review and respond-to-review agents (read-only, PR-focused)
    • buildPRAgentLlmistGadgets() — for respond-to-ci and respond-to-pr-comment agents (file editing + PR tools, no CreatePR)
  • Updated all 8 profile implementations to include getLlmistGadgets() with correct capabilities
  • Added gadget mocks and tests in agent-profiles.test.ts (6 new test cases covering getLlmistGadgets())
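As a rough illustration of the shape described above, the sketch below shows a profile interface that exposes getLlmistGadgets() and delegates to builder helpers. The gadget classes are empty stand-ins, the gadget lists are deliberately shortened, and the gadgetNames helper and the reduced AgentProfile interface are assumptions made for this example, not the code in agent-profiles.ts.

```typescript
// Hypothetical stand-ins for the real llmist gadget classes.
class ListDirectory {}
class ReadFile {}
class FileSearchAndReplace {}
class WriteFile {}
class GetPRDetails {}
class GetPRDiff {}
class Finish {}

type Gadget = object;

// Assumed, reduced shape: the real AgentProfile has more members.
interface AgentProfile {
  getLlmistGadgets(agentType: string): Gadget[];
}

// Read-only, PR-focused set (shortened for the example).
function buildReviewLlmistGadgets(): Gadget[] {
  return [new ListDirectory(), new ReadFile(), new GetPRDetails(), new GetPRDiff(), new Finish()];
}

// File editing plus PR tools (shortened for the example).
function buildPRAgentLlmistGadgets(): Gadget[] {
  return [
    new ListDirectory(),
    new ReadFile(),
    new FileSearchAndReplace(),
    new WriteFile(),
    new GetPRDetails(),
    new GetPRDiff(),
    new Finish(),
  ];
}

const reviewProfile: AgentProfile = {
  getLlmistGadgets: (_agentType) => buildReviewLlmistGadgets(),
};

const respondToCIProfile: AgentProfile = {
  getLlmistGadgets: (_agentType) => buildPRAgentLlmistGadgets(),
};

// Only two of the eight profiles are registered in this sketch.
const PROFILE_REGISTRY: Record<string, AgentProfile> = {
  review: reviewProfile,
  'respond-to-ci': respondToCIProfile,
};

// Hypothetical helper: list the gadget class names for an agent type.
function gadgetNames(agentType: string): string[] {
  const profile = PROFILE_REGISTRY[agentType];
  if (!profile) throw new Error(`Unknown agent type: ${agentType}`);
  return profile.getLlmistGadgets(agentType).map((g) => g.constructor.name);
}
```

With this layout a backend only needs the agent type to obtain its gadgets, for example gadgetNames('respond-to-ci'), and never has to know which builder a profile uses.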

Key decisions

  • getLlmistGadgets(agentType) takes the agent type string so it can resolve the right capabilities (e.g., implementation and planning both use buildWorkItemLlmistGadgets, but with different capabilities)
  • Review agents use buildReviewLlmistGadgets(true) to include review comment tools, aligning with the existing createPRAgentGadgets({ includeReviewComments: true }) pattern from agents/shared/gadgets.ts
  • The capability-based gadget selection mirrors getBaseAgentGadgets() from agents/base.ts, consolidating the logic into profiles
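The capability-based selection mentioned in these decisions could look roughly like the sketch below. It assumes a capability record keyed by agent type; canEditFiles appears elsewhere in this thread, while canCreatePR, the CAPABILITIES table, and getWorkItemGadgets are hypothetical names introduced only for illustration. The gadget classes are empty stand-ins.

```typescript
// Hypothetical capability flags; the real capabilities.ts may differ.
interface AgentCapabilities {
  canEditFiles: boolean;
  canCreatePR: boolean; // assumed flag name
}

// Empty stand-ins for the real gadget classes.
class ReadFile {}
class FileSearchAndReplace {}
class WriteFile {}
class CreatePR {}
class Finish {}

// Assumed capability table for two work-item agents.
const CAPABILITIES: Record<string, AgentCapabilities> = {
  planning: { canEditFiles: false, canCreatePR: false },
  implementation: { canEditFiles: true, canCreatePR: true },
};

// One builder, different output depending on the capabilities passed in.
function buildWorkItemLlmistGadgets(caps: AgentCapabilities): object[] {
  const gadgets: object[] = [new ReadFile()];
  if (caps.canEditFiles) {
    gadgets.push(new FileSearchAndReplace(), new WriteFile());
  }
  if (caps.canCreatePR) {
    gadgets.push(new CreatePR());
  }
  gadgets.push(new Finish());
  return gadgets;
}

// The agent type string is what selects the capability set.
function getWorkItemGadgets(agentType: string): object[] {
  const caps = CAPABILITIES[agentType];
  if (!caps) throw new Error(`No capabilities for agent type: ${agentType}`);
  return buildWorkItemLlmistGadgets(caps);
}
```

This is why the agent type is passed through: planning and implementation share one builder, and only the resolved capabilities decide whether editing gadgets and CreatePR are included.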

What comes next (follow-up PRs)

  • Step 2 (continued): Modify LlmistBackend.execute() to use the pre-built AgentBackendInput and gadgets from getLlmistGadgets() instead of delegating to old executor functions
  • Step 6: Simplify registry.ts to route llmist through executeWithBackend (removing the dummy input hack)
  • Steps 3-5: Consolidate context fetching, prompt building, eliminate specialized executors
  • Steps 7-8: Delete dead code, update CLAUDE.md

Testing

  • All 2193 tests pass (6 new tests added)
  • TypeScript type check passes
  • Lint passes (no new errors introduced)

🤖 Generated with Claude Code

@nhopeatall (Collaborator) commented

🔍 Reviewing PR...

@nhopeatall (Collaborator) left a comment

Summary

Two gadget-set mismatches that would break review and respond-to-review agents on the llmist backend.

Code Issues

Blocking

  • src/backends/agent-profiles.ts:170-188: buildReviewLlmistGadgets() is missing the CreatePRReview gadget. The review agent's primary purpose is submitting PR reviews via CreatePRReview. The existing review.ts agent includes it (line 160), and it's exported from gadgets/github/index.ts, but it's neither imported nor used in the new builder. A review agent on the llmist backend would be unable to submit its review, which is the core action of the agent.

  • src/backends/agent-profiles.ts line 710 (PROFILE_REGISTRY): respond-to-review maps to reviewProfile, which uses buildReviewLlmistGadgets(true), a read-only gadget set with no file editing tools. However, the existing respond-to-review agent (respond-to-review.ts:23) uses createPRAgentGadgets({ includeReviewComments: true }), which includes FileSearchAndReplace, WriteFile, and AstGrep, because respond-to-review needs to make code changes in response to review feedback. The profile should use buildPRAgentLlmistGadgets(true) instead, matching the existing behavior.
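One possible shape for the second fix is sketched below: a dedicated profile for respond-to-review that calls the PR-agent builder with the review-comment flag set. The gadget classes are empty stand-ins, ReplyToReviewComment is a hypothetical name for a review-comment tool, and the gadget list is shortened; only the builder name, the boolean flag, and its default of false come from this thread.

```typescript
// Empty stand-ins for the real gadget classes.
class ReadFile {}
class FileSearchAndReplace {}
class WriteFile {}
class AstGrep {}
class GetPRDiff {}
class ReplyToReviewComment {} // hypothetical review-comment gadget
class Finish {}

// File editing + PR tools; review-comment tools are opt-in.
function buildPRAgentLlmistGadgets(includeReviewComments = false): object[] {
  const gadgets: object[] = [
    new ReadFile(),
    new FileSearchAndReplace(),
    new WriteFile(),
    new AstGrep(),
    new GetPRDiff(),
  ];
  if (includeReviewComments) {
    gadgets.push(new ReplyToReviewComment());
  }
  gadgets.push(new Finish());
  return gadgets;
}

// Dedicated profile instead of aliasing the read-only review profile.
const respondToReviewProfile = {
  getLlmistGadgets: (_agentType: string) => buildPRAgentLlmistGadgets(true),
};

// respond-to-ci keeps the default (no review-comment tools).
const respondToCIProfile = {
  getLlmistGadgets: (_agentType: string) => buildPRAgentLlmistGadgets(),
};
```

Keeping the flag on the builder lets both PR-branch agents share one editing gadget set while only respond-to-review gains the review-comment tools.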

Should Fix

  • tests/unit/backends/agent-profiles.test.ts:287-341 — All 6 new tests only check gadgets.length > 0. They don't verify that critical gadgets are present (e.g., CreatePRReview for review, file editing gadgets for respond-to-review) or that capability-gated gadgets are correctly included/excluded (e.g., CreatePR only for implementation). Both blocking issues above would have been caught with more specific assertions. Consider testing for the presence/absence of specific gadget types per profile.

new Tmux(),
new Sleep(),
new TodoUpsert(),
new TodoUpdateStatus(),
Collaborator

BLOCKING: CreatePRReview is missing from this builder. The review agent (see review.ts:149-163) needs CreatePRReview to submit its review — it's the core action. It's exported from gadgets/github/index.ts but not imported here.

Fix: add import { CreatePRReview } from '../gadgets/github/index.js'; (or add it to the existing import block) and include new CreatePRReview() in this gadget array. Also consider whether PostPRComment belongs here — the existing review agent doesn't include it, since it posts comments via CreatePRReview.

Collaborator Author

Fixed. Added CreatePRReview to the import block and to buildReviewLlmistGadgets(), replacing PostPRComment to match the authoritative gadget list in review.ts. The review agent now has exactly: ListDirectory, ReadFile, Tmux, Sleep, TodoUpsert/UpdateStatus/Delete, GetPRDetails, GetPRDiff, GetPRChecks, CreatePRReview, UpdatePRComment, Finish — matching the real review agent.

const gadgets = profile.getLlmistGadgets('review');
expect(gadgets).toBeDefined();
expect(gadgets.length).toBeGreaterThan(0);
});
Collaborator

SHOULD_FIX: This test would pass even though CreatePRReview is missing from the review gadgets — it only checks length > 0. Consider verifying specific gadget presence, e.g.:

const gadgetNames = gadgets.map(g => g.constructor.name);
expect(gadgetNames).toContain("CreatePRReview");
expect(gadgetNames).not.toContain("FileSearchAndReplace"); // review is read-only

Similar assertions for other profiles would catch capability mismatches early.

Collaborator Author

Fixed. All tests now verify specific gadget presence/absence using constructor.name assertions. For example, the review test verifies CreatePRReview is present and FileSearchAndReplace/PostPRComment are absent. Each profile test checks capability-gated gadgets are correctly included or excluded. Both blocking bugs (missing CreatePRReview, wrong respond-to-review profile) would now be caught by these tests.

@aaight (Collaborator, Author) commented Feb 18, 2026

👀 Checking this out...

@aaight (Collaborator, Author) commented Feb 18, 2026

🔍 Reviewing PR...

@aaight (Collaborator, Author) left a comment

Review of PR #424: getLlmistGadgets() addition to AgentProfile

Great direction overall — moving gadget construction into AgentProfile as a single source of truth is the right architectural call. The structure of the three builder helpers is clean and the capability-gating logic is sound. However, there's one blocking bug that will cause the review agent to silently fail when run via the llmist backend.


🚨 BLOCKING: CreatePRReview is missing from buildReviewLlmistGadgets()

File: src/backends/agent-profiles.ts, around line 188

Comparing the new buildReviewLlmistGadgets() to the authoritative source in src/agents/review.ts (lines 149-163):

// review.ts — the real review agent gadget list
getGadgets: () => [
    new ListDirectory(),
    new ReadFile(),
    new Tmux(),
    new Sleep(),
    new TodoUpsert(),
    new TodoUpdateStatus(),
    new TodoDelete(),
    new GetPRDetails(),
    new GetPRDiff(),
    new GetPRChecks(),
    new CreatePRReview(),   // ← THE CORE ACTION GADGET
    new UpdatePRComment(),
    new Finish(),
],

buildReviewLlmistGadgets() omits CreatePRReview entirely and instead includes PostPRComment (which the real review agent does NOT have). The result: when the llmist backend uses getLlmistGadgets() for the review agent, the agent will have no way to submit a review, which is its sole purpose.

Fix required:

  1. Add CreatePRReview to the import block: import { ..., CreatePRReview, ... } from '../gadgets/github/index.js';
  2. Replace new PostPRComment() with new CreatePRReview() in buildReviewLlmistGadgets() (or add it alongside and reconsider whether PostPRComment belongs)

⚠️ Test Coverage Too Weak to Catch This Bug

File: tests/unit/backends/agent-profiles.test.ts, around line 307

The new tests only verify gadgets.length > 0, which means the missing CreatePRReview bug above goes completely undetected. A stronger assertion would have caught this:

it('includes CreatePRReview and excludes editing gadgets for review', () => {
    const profile = getAgentProfile('review');
    const gadgets = profile.getLlmistGadgets('review');
    const names = gadgets.map((g) => (g as object).constructor.name);

    expect(names).toContain('CreatePRReview');            // core action
    expect(names).toContain('GetPRDiff');                 // needs to read diff
    expect(names).not.toContain('FileSearchAndReplace');  // read-only agent
    expect(names).not.toContain('CreatePR');              // no branch creation
});

Similar constructor-name assertions for the other profiles (e.g., verifying FileSearchAndReplace is present for implementation but absent for planning) would make these tests much more useful as a safety net during future refactors.
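To extend the same idea to every profile without repeating assertions, the expectations could be kept in a small table and checked by one helper. The sketch below is framework-independent; checkGadgets, GadgetExpectation, and the two sample gadget arrays are hypothetical names for this example, and the gadget classes are empty stand-ins.

```typescript
// Which gadget class names must and must not appear for a profile.
interface GadgetExpectation {
  present: string[];
  absent: string[];
}

// Returns a list of human-readable problems; empty means the set is as expected.
function checkGadgets(gadgets: object[], expected: GadgetExpectation): string[] {
  const names = gadgets.map((g) => g.constructor.name);
  const problems: string[] = [];
  for (const name of expected.present) {
    if (names.indexOf(name) === -1) problems.push(`missing ${name}`);
  }
  for (const name of expected.absent) {
    if (names.indexOf(name) !== -1) problems.push(`unexpected ${name}`);
  }
  return problems;
}

// Empty stand-ins for the real gadget classes.
class GetPRDiff {}
class PostPRComment {}
class CreatePRReview {}

// Expectation for the review profile, based on the names discussed above.
const reviewExpectation: GadgetExpectation = {
  present: ['CreatePRReview', 'GetPRDiff'],
  absent: ['FileSearchAndReplace', 'CreatePR', 'PostPRComment'],
};

// A set shaped like the bug (PostPRComment instead of CreatePRReview).
const buggyReviewGadgets: object[] = [new GetPRDiff(), new PostPRComment()];
// A set shaped like the fix.
const fixedReviewGadgets: object[] = [new GetPRDiff(), new CreatePRReview()];

console.log(checkGadgets(buggyReviewGadgets, reviewExpectation));
// → [ 'missing CreatePRReview', 'unexpected PostPRComment' ]
console.log(checkGadgets(fixedReviewGadgets, reviewExpectation));
// → []
```

One caveat with constructor-name checks in general: they depend on class names surviving the build, so they are better suited to unit tests than to minified production code.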


Minor Observations (no change required)

buildPRAgentLlmistGadgets is missing VerifyChanges
src/agents/base.ts includes VerifyChanges for agents with canEditFiles. buildPRAgentLlmistGadgets includes all other editing gadgets (FileSearchAndReplace, FileMultiEdit, WriteFile) but skips VerifyChanges. This may be intentional for PR-branch agents, but worth confirming.

_agentType unused parameter with explicit false

getLlmistGadgets: (_agentType) => buildPRAgentLlmistGadgets(false),

The false is redundant since the default is false — calling buildPRAgentLlmistGadgets() would be slightly cleaner. Very minor.


Summary

The blocking issue is the missing CreatePRReview in buildReviewLlmistGadgets(). Please fix that and strengthen the review-agent test to verify the gadget is present. Everything else is solid.

@zbigniewsobiecki (Member) commented

@aaight address code review concerns

@aaight (Collaborator, Author) commented Feb 19, 2026

Request Addressed

What was asked

Address code review concerns raised by @nhopeatall and @aaight on the getLlmistGadgets() implementation.

Changes Made

  • Fixed missing CreatePRReview in buildReviewLlmistGadgets(): the review agent's core action gadget was missing; it now replaces the incorrect PostPRComment
  • Fixed respond-to-review profile — created dedicated respondToReviewProfile with file editing gadgets instead of reusing the read-only reviewProfile
  • Added VerifyChanges to buildPRAgentLlmistGadgets() to match base.ts pattern
  • Strengthened all getLlmistGadgets tests — now verify specific gadget presence/absence per profile using constructor name assertions
  • Minor cleanup — removed redundant explicit false parameter

Verification

  • All 2195 tests passing ✅
  • TypeScript type check passing ✅
  • Lint passing ✅ (no new warnings)

- Add missing CreatePRReview to buildReviewLlmistGadgets() (blocking bug:
  review agent couldn't submit reviews via llmist backend)
- Replace PostPRComment with CreatePRReview to match review.ts gadget list
- Create dedicated respondToReviewProfile with buildPRAgentLlmistGadgets(true)
  instead of reusing read-only reviewProfile (blocking bug: respond-to-review
  agent had no file editing gadgets)
- Add VerifyChanges to buildPRAgentLlmistGadgets() matching base.ts pattern
- Remove redundant explicit false in respondToCIProfile
- Strengthen tests: verify specific gadget presence/absence per profile using
  constructor.name assertions instead of just checking array length

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@aaight (Collaborator, Author) commented Feb 19, 2026

Addressed all code review concerns from @nhopeatall and @aaight:

Blocking fixes:

  1. Missing CreatePRReview in review gadgets — Added CreatePRReview to buildReviewLlmistGadgets() and removed PostPRComment (which the real review agent doesn't use). Now matches the authoritative gadget list in review.ts.
  2. respond-to-review using read-only profile — Created a dedicated respondToReviewProfile that uses buildPRAgentLlmistGadgets(true) (file editing + review comment tools), matching the existing respond-to-review.ts agent which uses createPRAgentGadgets({ includeReviewComments: true }).

Should-fix improvements:

  3. Weak test assertions: Rewrote all getLlmistGadgets tests to verify specific gadget presence/absence per profile using constructor.name assertions (e.g., review has CreatePRReview but not FileSearchAndReplace; implementation has CreatePR but planning doesn't).
  4. Missing VerifyChanges in PR agent gadgets: Added to buildPRAgentLlmistGadgets() to match the base.ts pattern for agents with file editing.
  5. Minor cleanup: Removed redundant explicit false in respondToCIProfile.getLlmistGadgets.

All 2195 tests pass, TypeScript and lint are clean.

@nhopeatall (Collaborator) commented

🔍 Reviewing PR...

@nhopeatall (Collaborator) left a comment

Summary

LGTM — clean refactoring step that adds getLlmistGadgets() to AgentProfile, moving toward a single source of truth for agent gadget configuration across backends.

Key Observations

Bugfix embedded in refactoring (good): The respond-to-review profile was previously aliased to reviewProfile (read-only, review-focused). This PR correctly creates a dedicated respondToReviewProfile with file editing capabilities, matching what the actual respond-to-review agent does (it addresses review feedback by making code changes). This was a real bug in the old code.

Gadget parity verified:

  • buildWorkItemLlmistGadgets matches getBaseAgentGadgets from agents/base.ts
  • buildReviewLlmistGadgets matches review.ts getGadgets()
  • buildPRAgentLlmistGadgets extends createPRAgentGadgets from agents/shared/gadgets.ts with three additional gadgets (FileMultiEdit, VerifyChanges, GetPRChecks) — reasonable enhancements that bring PR agents in line with the base agent gadget set for file-editing agents

Minor note (not blocking): The respond-to-review entry in capabilities.ts has canEditFiles: false, isReadOnly: true, which contradicts the actual agent behavior (and this PR's correct gadget assignment). This is a pre-existing inconsistency — the capabilities aren't consulted by buildPRAgentLlmistGadgets so it doesn't cause a bug, but it could confuse future readers. Worth a follow-up cleanup.
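For the follow-up cleanup, a small consistency check could flag agent types whose declared capabilities disagree with the gadgets they actually receive. The sketch below assumes capabilities are available as a plain record with the canEditFiles and isReadOnly fields named above; findCapabilityMismatches, EDITING_GADGETS, and the input shapes are hypothetical and not part of the codebase.

```typescript
// Assumed subset of the capability fields mentioned in the note above.
interface AgentCapabilities {
  canEditFiles: boolean;
  isReadOnly: boolean;
}

// Gadget class names treated as file-editing tools in this sketch.
const EDITING_GADGETS = ['FileSearchAndReplace', 'FileMultiEdit', 'WriteFile'];

// Returns agent types that receive editing gadgets while their
// capabilities claim they cannot edit files or are read-only.
function findCapabilityMismatches(
  capabilities: Record<string, AgentCapabilities>,
  gadgetNamesByAgent: Record<string, string[]>,
): string[] {
  const mismatches: string[] = [];
  for (const agentType of Object.keys(capabilities)) {
    const caps = capabilities[agentType];
    if (!caps) continue;
    const names = gadgetNamesByAgent[agentType] || [];
    const hasEditingGadget = names.some((n) => EDITING_GADGETS.indexOf(n) !== -1);
    if (hasEditingGadget && (!caps.canEditFiles || caps.isReadOnly)) {
      mismatches.push(agentType);
    }
  }
  return mismatches;
}
```

Called with the respond-to-review entry as described (canEditFiles: false, isReadOnly: true) and a gadget list containing WriteFile, the function would report 'respond-to-review', which makes the inconsistency visible before it misleads a future reader.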

Tests are thorough and all pass. CI is green. Clean implementation of the refactoring plan's Steps 1-2.

@zbigniewsobiecki zbigniewsobiecki merged commit 9fd15b9 into dev Feb 19, 2026
4 checks passed
@zbigniewsobiecki zbigniewsobiecki deleted the refactor/unify-agent-execution-paths branch March 16, 2026 16:43
3 participants