fix(server): restore claude effort remapping and exclude perf tests from default scope #5

Merged
Berkay2002 merged 1 commit into main from claude/restore-claude-effort-remapping-lwj9n on Apr 17, 2026

@Berkay2002 (Owner)

Summary

  • Restores Claude effort remapping in ClaudeAdapter.getEffectiveClaudeCodeEffort (cherry-pick drift from d81e41c) so unsupported or prompt-injected efforts cap to the highest supported non-prompt-injected level instead of silently falling back to the model default.
  • Sonnet 4.6 + effort: "max" or effort: "ultrathink" now resolves to "high" (was "medium"), and "ultrathink" keeps its "Ultrathink:\n" prompt prefix path intact via the existing buildPromptText helper.
  • Excludes integration/perf/** from the default bun run test scope; bun run test:perf opts back in via VITEST_PERF=1 so the explicitly-targeted perf script keeps working.

Changes

  • apps/server/src/provider/Layers/ClaudeAdapter.ts: rewrites getEffectiveClaudeCodeEffort(caps, rawEffort) to be capability-aware. Drops the now-unused resolveEffort import in this file (the new helper handles everything).
  • apps/server/vitest.config.ts: adds exclude: [...configDefaults.exclude, "integration/perf/**"] (preserving vitest's default exclude), gated by VITEST_PERF=1 so test:perf still discovers the perf folder.
  • apps/server/package.json: prefixes test:perf with VITEST_PERF=1 to opt back into the perf folder.
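
The gating described for `apps/server/vitest.config.ts` can be sketched roughly as follows. This is a minimal illustration, not the PR's actual config: `resolveTestExcludes` is a hypothetical helper name, and the real file may simply inline the condition. The commented-out wiring uses `configDefaults` and `defineConfig` from `vitest/config`.

```typescript
// Hypothetical helper (not from the PR): builds the Vitest `exclude` list.
// Perf tests are excluded unless VITEST_PERF=1 is set in the environment.
function resolveTestExcludes(
  defaultExcludes: readonly string[],
  env: Record<string, string | undefined>,
): string[] {
  const perfEnabled = env["VITEST_PERF"] === "1";
  return perfEnabled
    ? [...defaultExcludes] // test:perf path: keep only Vitest's default excludes
    : [...defaultExcludes, "integration/perf/**"]; // default path: also skip perf
}

// Possible wiring, commented out so the sketch runs without vitest installed:
// import { configDefaults, defineConfig } from "vitest/config";
// export default defineConfig({
//   test: { exclude: resolveTestExcludes(configDefaults.exclude, process.env) },
// });
```

Spreading the default list first is what preserves Vitest's own excludes (for example `node_modules`) in both modes.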

Verify

  • bun run typecheck — passes (8/8 packages).
  • bun run test src/provider/Layers/ClaudeAdapter.test.ts — 57/57 passing (was 55/57 with the two effort tests failing).
  • bun run test:perf — still discovers integration/perf/** (verified via vitest list --filesOnly).

Test plan

  • CI: server suite no longer runs the wall-clock-sensitive perf benchmarks under default scope
  • CI: ClaudeAdapter Sonnet 4.6 + max/ultrathink paths produce the expected effort and prompt prefix

https://claude.ai/code/session_01CCbLqcPV5imhah2QTotE1H

…rom default scope

Cherry-pick drift from d81e41c: ClaudeAdapter never remapped unsupported
or prompt-injected effort levels to a model-supported tier. For Claude
Sonnet 4.6 a request for "max" or "ultrathink" silently fell back to the
model default ("medium") instead of capping to the highest supported
non-prompt-injected level ("high"); ultrathink also lost its
"Ultrathink:\n" prompt prefix path because callers saw `null` effort.

Replace the post-resolveEffort `getEffectiveClaudeCodeEffort` helper with
a caps-aware version that:
- passes supported, non-prompt-injected efforts through unchanged
- falls back to the model default when no effort is requested
- caps unsupported or prompt-injected efforts ("max", "ultrathink") to
  the highest supported non-prompt-injected level
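
A minimal sketch of the three rules above, under stated assumptions: the `EffortCaps` shape, the function name `getEffectiveEffortSketch`, and the `sonnetLikeCaps` data are all hypothetical and chosen only to reproduce the outcome this commit message describes. The real `getEffectiveClaudeCodeEffort(caps, rawEffort)` and its capability types are not shown here and may differ.

```typescript
// Hypothetical capability shape for illustration only.
interface EffortCaps {
  // Supported effort levels, ordered from lowest to highest.
  supported: readonly string[];
  // Levels realized through a prompt prefix rather than an API effort value.
  promptInjected: readonly string[];
  // Level the model uses when the caller requests nothing.
  defaultEffort: string | null;
}

function getEffectiveEffortSketch(
  caps: EffortCaps,
  rawEffort: string | null | undefined,
): string | null {
  // No effort requested: fall back to the model default.
  if (rawEffort == null) return caps.defaultEffort;

  // Levels that can be sent as-is (supported and not prompt-injected).
  const apiLevels = caps.supported.filter(
    (level) => !caps.promptInjected.includes(level),
  );

  // Supported, non-prompt-injected effort: pass through unchanged.
  if (apiLevels.includes(rawEffort)) return rawEffort;

  // Unsupported or prompt-injected effort: cap to the highest API level.
  return apiLevels[apiLevels.length - 1] ?? caps.defaultEffort;
}

// Made-up capability data mirroring the Sonnet 4.6 case described above.
const sonnetLikeCaps: EffortCaps = {
  supported: ["low", "medium", "high", "ultrathink"],
  promptInjected: ["ultrathink"],
  defaultEffort: "medium",
};

const cappedMax = getEffectiveEffortSketch(sonnetLikeCaps, "max");
const cappedUltrathink = getEffectiveEffortSketch(sonnetLikeCaps, "ultrathink");
```

With this data, both `cappedMax` and `cappedUltrathink` resolve to `"high"`, while an omitted effort resolves to `"medium"`.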

The existing prompt-prefix path in `buildPromptText` already handles the
"Ultrathink:\n" injection from the raw effort, so it now works end-to-end
once the effort assertion stops short-circuiting the test.
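
For illustration, the prompt-prefix behavior could look roughly like the sketch below. The signature of the real `buildPromptText` is not shown in this PR, so `buildPromptTextSketch` and its parameters are assumptions; only the `"Ultrathink:\n"` prefix and the fact that it is derived from the raw (uncapped) effort come from the text above.

```typescript
// Hypothetical stand-in for buildPromptText; the real helper may take
// different arguments. The prefix is keyed off the RAW effort, so it still
// applies after the API effort has been capped to a supported level.
function buildPromptTextSketch(
  rawEffort: string | null | undefined,
  text: string,
): string {
  return rawEffort === "ultrathink" ? `Ultrathink:\n${text}` : text;
}

const prefixed = buildPromptTextSketch("ultrathink", "Refactor the adapter.");
const unchanged = buildPromptTextSketch("high", "Refactor the adapter.");
```

Keeping the prefix decision separate from the effort value is what lets one request carry both a capped API effort and the injected prompt prefix.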

Also exclude `integration/perf/**` from the default `bun run test`
scope. The perf benchmarks run ~90s each and fail deterministically on
shared CI hardware; opt back in via `bun run test:perf`
(VITEST_PERF=1) which targets the folder explicitly.
@github-actions (bot) added the size:M and vouch:trusted labels on Apr 17, 2026. (vouch:trusted: PR author is trusted by repo permissions or the VOUCHED list.)
@Berkay2002 Berkay2002 marked this pull request as ready for review April 17, 2026 00:08
Copilot AI review requested due to automatic review settings April 17, 2026 00:08
Copilot AI left a comment

Pull request overview

Restores capability-aware Claude “effort” resolution (including correct handling of prompt-injected efforts like ultrathink) and adjusts the server’s Vitest configuration so wall-clock-sensitive perf benchmarks don’t run in the default test suite.

Changes:

  • Reworked ClaudeAdapter.getEffectiveClaudeCodeEffort to cap unsupported/prompt-injected efforts to the highest supported non-prompt-injected level.
  • Updated server Vitest config to exclude integration/perf/** by default, with an opt-in via VITEST_PERF=1.
  • Updated test:perf script to set VITEST_PERF=1 so perf tests still run when explicitly requested.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| apps/server/src/provider/Layers/ClaudeAdapter.ts | Makes effort selection capability-aware and preserves prompt-injected effort behavior while choosing a safe API effort value. |
| apps/server/vitest.config.ts | Excludes perf tests from default runs and gates inclusion behind VITEST_PERF=1, preserving Vitest default excludes. |
| apps/server/package.json | Updates test:perf to opt back into perf tests via VITEST_PERF=1. |

@Berkay2002 Berkay2002 merged commit c5f7292 into main Apr 17, 2026
15 of 16 checks passed
Berkay2002 added a commit that referenced this pull request Apr 17, 2026
Resolve conflicts with PR #5 (restore claude effort remapping):
- Adopt main's new getEffectiveClaudeCode/AgentEffort signature (caps, rawEffort)
  with capability-aware remapping, while keeping our rename to ClaudeAgentEffort.
- Drop the pre-computed resolveEffort variable; the new function subsumes it.
- Revert test assertions back to "high" now that the remapping caps to the top
  supported non-prompt-injected level.
3 participants