fix(docs): correct contains* case-sensitivity in grader.md #1171

Merged
christso merged 4 commits into main from fix/1154-contains-case-sensitive-docs
Apr 27, 2026

@christso
Collaborator

Closes #1154

What changed

grader.md:42 incorrectly documented contains as "case-insensitive by default". The implementation uses raw .includes() which is case-sensitive. This also made the icontains* entries internally inconsistent (they would only make sense as a distinct variant if contains* is already case-sensitive).
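The distinction can be illustrated with a minimal sketch. The function names and the result shape below are hypothetical stand-ins, not the real graders in packages/core; the only behaviour taken from this PR is that a case mismatch yields a score of 0. A score of 1 on a match is an assumption.

```typescript
// Hypothetical stand-ins; the real graders in packages/core may differ.
interface SketchResult {
  score: number; // assumed: 1 when the assertion holds, 0 otherwise
}

// Case-sensitive: String.prototype.includes compares characters exactly.
function containsSketch(output: string, value: string): SketchResult {
  return { score: output.includes(value) ? 1 : 0 };
}

// Case-insensitive variant: lower-case both sides before comparing.
function icontainsSketch(output: string, value: string): SketchResult {
  return { score: output.toLowerCase().includes(value.toLowerCase()) ? 1 : 0 };
}

console.log(containsSketch('Hello, world!', 'hello').score); // 0
console.log(containsSketch('Hello, world!', 'Hello').score); // 1
console.log(icontainsSketch('Hello, world!', 'hello').score); // 1
```

With a case-sensitive contains, a separate icontains variant has a reason to exist, which is the consistency point made above.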

Option taken: Option 1 from the issue — fix the documentation to match the implementation.

Changes

  • plugins/agentv-dev/skills/agentv-bench/agents/grader.md — corrected contains, contains-any, and contains-all descriptions to say (case-sensitive). Simplified icontains* row to just say case-insensitive.
  • packages/core/test/evaluation/graders/assertions.test.ts — added regression tests pinning case-sensitivity for runContainsAssertion, runContainsAnyAssertion, and runContainsAllAssertion.
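As a rough sketch of what case-sensitive contains-any and contains-all mean, assuming each takes the output string and a list of candidate substrings (the names and signatures here are hypothetical, not the real runContainsAnyAssertion or runContainsAllAssertion):

```typescript
// Hypothetical helpers; argument order and return type are assumptions.

// Passes when at least one candidate appears in the output, exact case.
function containsAnySketch(output: string, values: string[]): boolean {
  return values.some((value) => output.includes(value));
}

// Passes only when every candidate appears in the output, exact case.
function containsAllSketch(output: string, values: string[]): boolean {
  return values.every((value) => output.includes(value));
}

const output = 'Error: File not found';

console.log(containsAnySketch(output, ['error', 'File'])); // true ('File' matches)
console.log(containsAnySketch(output, ['error', 'file'])); // false (no exact-case match)
console.log(containsAllSketch(output, ['Error', 'File'])); // true
console.log(containsAllSketch(output, ['Error', 'file'])); // false ('file' differs in case)
```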

Incidental fixes (unrelated to #1154, caught by pre-push hook)

  • examples/red-team/archetypes/coding-agent/fixtures/poisoned-mcp-server.js — biome lint/format fixes (double → single quotes, template literal) introduced by feat(examples): scenario-based red-team suites for coding and customer-facing agent archetypes #1168.
  • biome.json — added **/__tmp_*/** to files.ignore so test-created temp dirs don't trigger biome during the pre-push test→lint sequence.
  • apps/cli/test/commands/eval/pipeline/pipeline-e2e.test.ts — set explicit 30s timeout; the test spawns multiple bun child processes and was hitting the 5s default in constrained environments.

Red/green UAT

Red (before fix): grader.md:42 says (case-insensitive by default).

Green (with fix): grader.md:42 says (case-sensitive). The regression test asserting runContainsAssertion('Hello, world!', 'hello').score === 0 passes, confirming that the documented and tested behaviour agree.

Test plan

  • bun test packages/core/test/evaluation/graders/assertions.test.ts — 12 pass, 0 fail
  • Pre-push hook: Build / Typecheck / Lint / Test / Validate — all Passed

christso and others added 4 commits April 27, 2026 11:55
grader.md:42 incorrectly stated `contains` is case-insensitive by default;
the implementation uses raw `.includes()` which is case-sensitive. Updated
`contains`, `contains-any`, and `contains-all` descriptions to reflect the
actual behaviour, and added regression tests pinning case-sensitivity for all
three functions.

Closes #1154

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Test runs create temp directories (e.g. __tmp_bench_test__) that biome
picks up after the test step in the pre-push hook, causing spurious lint
failures. Adding the pattern to files.ignore prevents this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The test spawns multiple bun child processes (pipeline input/grade/bench)
which takes 5-10s in constrained environments, exceeding bun's default
per-test timeout of 5000ms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cloudflare-workers-and-pages

Deploying agentv with Cloudflare Pages

Latest commit: c601ec3
Status: ✅  Deploy successful!
Preview URL: https://f73fc9a4.agentv.pages.dev
Branch Preview URL: https://fix-1154-contains-case-sensi.agentv.pages.dev

@christso christso marked this pull request as ready for review April 27, 2026 10:57
@christso christso merged commit dcc1c82 into main Apr 27, 2026
4 checks passed
@christso christso deleted the fix/1154-contains-case-sensitive-docs branch April 27, 2026 10:57

Development

Successfully merging this pull request may close these issues.

bug: contains grader doc claims case-insensitive default but implementation is case-sensitive