@ethanndickson

Problem

The enforceThinkingPolicy tests in policy.test.ts fail intermittently depending on test execution order. When providerOptions.test.ts runs before policy.test.ts, the policy tests fail because they receive a mocked pass-through function instead of the real implementation.

Root cause: providerOptions.test.ts uses mock.module("@/browser/utils/thinking/policy", ...) at module load time, which globally replaces the real module. Bun's mock.module does not isolate between test files—this is a known limitation.

Why it's flaky: the result depends on the order in which Bun runs the test files, and different seeds produce different orderings:

  • --seed=2: policy tests pass (providerOptions runs after policy)
  • --seed=1, --seed=42, default: policy tests fail (providerOptions runs before policy)
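The order dependence can be illustrated with a toy model. This is a minimal sketch, not Bun's implementation: it assumes a single registry shared by every test file, and all names in it (createRegistry, loadProviderOptionsTest, runPolicyTest, runInOrder, and the stand-in policy that clamps "high" to "medium") are hypothetical.

```typescript
// Toy model of a process-wide module registry shared by all test files.
// This is NOT Bun's implementation; every name here is hypothetical.
type ThinkingLevel = "off" | "low" | "medium" | "high";
type Policy = (level: ThinkingLevel) => ThinkingLevel;

// Stand-in for the real policy: clamps "high" down to "medium".
const realPolicy: Policy = (level) => (level === "high" ? "medium" : level);

function createRegistry(): Map<string, Policy> {
  return new Map<string, Policy>([["policy", realPolicy]]);
}

// Models providerOptions.test.ts: replaces the module when the file loads
// and never restores it.
function loadProviderOptionsTest(registry: Map<string, Policy>): void {
  registry.set("policy", (level) => level); // pass-through mock
}

// Models policy.test.ts: expects the real clamping behavior.
function runPolicyTest(registry: Map<string, Policy>): boolean {
  const policy = registry.get("policy")!;
  return policy("high") === "medium";
}

// Runs the two modeled files in the given order against one shared registry
// and reports whether the policy test passed.
function runInOrder(order: Array<"providerOptions" | "policy">): boolean {
  const registry = createRegistry();
  let policyPassed = false;
  for (const file of order) {
    if (file === "providerOptions") {
      loadProviderOptionsTest(registry);
    } else {
      policyPassed = runPolicyTest(registry);
    }
  }
  return policyPassed;
}

console.log(runInOrder(["policy", "providerOptions"])); // true
console.log(runInOrder(["providerOptions", "policy"])); // false
```

In this model the policy test passes only when it runs before the file that installs the mock, which matches the seed-dependent behavior listed above.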

Solution

Remove the mock.module for enforceThinkingPolicy from providerOptions.test.ts.

Why this fix is safe

The mock was added to test buildProviderOptions output formatting in isolation, so that arbitrary thinking levels could be passed without policy clamping. However, the tests do not actually depend on it:

  • They use thinking levels ("medium", "high", "low", "off") that are valid for the models being tested
  • enforceThinkingPolicy returns these levels unchanged for these model/level combinations
  • The tests still pass without the mock because the real policy allows these levels
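The reasoning above can be sketched as follows. This is an illustration under stated assumptions, not the real enforceThinkingPolicy: clampingPolicy, passThroughMock and agreeOn are hypothetical names, and the clamping rule (fall back to the first allowed level) is invented for the example. The point is only that a pass-through mock and a clamping policy are indistinguishable when every requested level is already allowed.

```typescript
// Hypothetical sketch; the real policy's signature and rules may differ.
type Level = "off" | "low" | "medium" | "high";
type PolicyFn = (allowed: readonly Level[], requested: Level) => Level;

// Assumed rule: keep the level if it is allowed, otherwise fall back to the
// first allowed level (or "off" if nothing is allowed).
const clampingPolicy: PolicyFn = (allowed, requested) =>
  allowed.includes(requested) ? requested : (allowed[0] ?? "off");

// What the removed mock did: return the requested level untouched.
const passThroughMock: PolicyFn = (_allowed, requested) => requested;

// True when both functions return the same result for every requested level.
function agreeOn(allowed: readonly Level[], requested: readonly Level[]): boolean {
  return requested.every(
    (level) => clampingPolicy(allowed, level) === passThroughMock(allowed, level),
  );
}

const allLevels: readonly Level[] = ["off", "low", "medium", "high"];

// Every requested level is allowed, so the mock changes nothing.
console.log(agreeOn(allLevels, ["medium", "high", "low", "off"])); // true

// A disallowed level is where the mock and the policy would diverge.
console.log(agreeOn(["high"], ["medium"])); // false
```

If a future test passes a level that the model does not support, its expected output would need to account for clamping rather than reintroducing a module-level mock.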

The mock was defensive but unnecessary. Removing it:

  1. Fixes the flaky test
  2. Follows the codebase's preference for avoiding mocks (per AGENTS.md)
  3. Makes tests more honest—they now test real behavior

Verification

# Before fix (default seed): 5 failures
bun test src

# After fix: 0 failures across all seeds
bun test src                    # 1571 pass, 0 fail
bun test --seed=1 src           # passes
bun test --seed=42 src          # passes

Generated with mux

Remove the mock.module for enforceThinkingPolicy from providerOptions.test.ts.
This mock was polluting the global module cache and causing policy.test.ts
to fail when run after providerOptions.test.ts (order depends on bun's
random test scheduling).

@ethanndickson ethanndickson added this pull request to the merge queue Dec 9, 2025
Merged via the queue into main with commit e07ef9d Dec 9, 2025
18 of 19 checks passed
@ethanndickson ethanndickson deleted the fix-thinkingpolicy-tests-local branch December 9, 2025 05:36