fix(arena): promptarena validate catches unknown assertion types with fuzzy suggestions #945

Merged
chaholl merged 1 commit into main from fix/939-validate-assertion-types
Apr 12, 2026
chaholl commented Apr 12, 2026

Summary

  • promptarena validate now checks assertion types against the eval handler registry
  • Unknown types produce clear errors with "did you mean?" suggestions via Levenshtein distance
  • Catches common mistakes like substring_present → contains, tool_called → tools_called

Root Cause

Scenario assertion types (turns[].assertions[].type and conversation_assertions[].type) were validated only structurally (JSON Schema accepts any string). The business logic validator didn't cross-reference against the eval handler registry. Invalid types passed validation and silently failed at runtime — the run still reported exit code 0 with errors buried in JSON output.

Fix

  • New ValidateAssertionTypes() function in tools/arena/assertions/ iterates all loaded scenarios and checks each assertion type against the eval registry (including aliases)
  • Fuzzy matching via Levenshtein distance provides "did you mean?" suggestions for close matches
  • Wired into performBusinessLogicValidation() in the validate command, running after config validation
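
The registry lookup and the "did you mean?" step described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: `levenshtein`, `suggest`, `validateTypes`, the two-entry registry, and the `maxDist` threshold are all hypothetical names and values chosen for the example. A small threshold like the one used here would not on its own produce a distant suggestion such as substring_present → contains; how the actual implementation arrives at that suggestion is not shown in this description.

```go
package main

import (
	"fmt"
	"sort"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b, where each
// insertion, deletion, or substitution costs 1.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(rb)]
}

// suggest returns the known type closest to name, or "" if none is within
// maxDist edits. Candidates are sorted first so ties resolve deterministically.
func suggest(name string, known []string, maxDist int) string {
	sorted := append([]string(nil), known...)
	sort.Strings(sorted)
	best, bestDist := "", maxDist+1
	for _, k := range sorted {
		if d := levenshtein(name, k); d < bestDist {
			best, bestDist = k, d
		}
	}
	return best
}

// validateTypes returns one message per type that is missing from the
// registry (a hypothetical stand-in for the eval handler registry).
func validateTypes(types []string, registry map[string]bool, maxDist int) []string {
	known := make([]string, 0, len(registry))
	for k := range registry {
		known = append(known, k)
	}
	var errs []string
	for _, t := range types {
		if registry[t] {
			continue
		}
		msg := fmt.Sprintf("unknown assertion type %q", t)
		if s := suggest(t, known, maxDist); s != "" {
			msg += fmt.Sprintf(" (did you mean %q?)", s)
		}
		errs = append(errs, msg)
	}
	return errs
}

func main() {
	// Assumed registry contents, limited to the two types named in this PR.
	registry := map[string]bool{"contains": true, "tools_called": true}
	for _, e := range validateTypes([]string{"contains", "tool_called", "xyz"}, registry, 2) {
		fmt.Println(e)
	}
}
```

Capping the distance keeps the validator from suggesting an unrelated type for input that is nowhere near any registered name; in that case the error is reported without a suggestion.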

Example output:

❌ Unknown assertion types (2):
  - hero-scenario: conversation_assertions: unknown assertion type "substring_present" (did you mean "contains"?)
  - hero-scenario: turns[0].assertions: unknown assertion type "tool_called" (did you mean "tools_called"?)
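
The location prefixes in the output above (scenario ID plus `conversation_assertions` or `turns[i].assertions`) suggest a walk over both assertion locations. A minimal sketch of such a walk is below, under the assumption of simplified scenario shapes: the `Scenario`, `Turn`, and `Assertion` types, the `unknownTypes` function, and the `known` set are hypothetical and are not the project's actual types. Suggestions are left out to keep the traversal in focus.

```go
package main

import "fmt"

// Hypothetical, simplified scenario shapes; the real types are not shown in this PR.
type Assertion struct{ Type string }

type Turn struct{ Assertions []Assertion }

type Scenario struct {
	ID                     string
	Turns                  []Turn
	ConversationAssertions []Assertion
}

// unknownTypes walks conversation-level and per-turn assertions and returns
// one located message for each type that is not in the known set.
func unknownTypes(scenarios []Scenario, known map[string]bool) []string {
	var errs []string
	for _, s := range scenarios {
		for _, a := range s.ConversationAssertions {
			if !known[a.Type] {
				errs = append(errs, fmt.Sprintf(
					"%s: conversation_assertions: unknown assertion type %q", s.ID, a.Type))
			}
		}
		for i, t := range s.Turns {
			for _, a := range t.Assertions {
				if !known[a.Type] {
					errs = append(errs, fmt.Sprintf(
						"%s: turns[%d].assertions: unknown assertion type %q", s.ID, i, a.Type))
				}
			}
		}
	}
	return errs
}

func main() {
	// Assumed set of valid types, limited to the two named in this PR.
	known := map[string]bool{"contains": true, "tools_called": true}
	scenarios := []Scenario{{
		ID:                     "hero-scenario",
		Turns:                  []Turn{{Assertions: []Assertion{{Type: "tool_called"}}}},
		ConversationAssertions: []Assertion{{Type: "substring_present"}},
	}}
	errs := unknownTypes(scenarios, known)
	fmt.Printf("Unknown assertion types (%d):\n", len(errs))
	for _, e := range errs {
		fmt.Printf("  - %s\n", e)
	}
}
```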

Test plan

  • TestValidateAssertionTypes/valid_types_produce_no_errors
  • TestValidateAssertionTypes/aliases_are_accepted
  • TestValidateAssertionTypes/invalid_types_produce_errors_with_suggestions
  • TestValidateAssertionTypes/suggestion_includes_close_match
  • TestPerformBusinessLogicValidation_InvalidAssertionType — end-to-end with arena config + scenario file
  • TestPerformBusinessLogicValidation_ValidAssertionType — no error with valid types

Closes #939

…mptarena validate (#939)

promptarena validate now checks all assertion types in loaded scenarios
against the eval handler registry. Unknown types (e.g., substring_present
instead of contains) produce errors with fuzzy "did you mean?" suggestions.

Closes #939

chaholl merged commit 86f2ecb into main Apr 12, 2026
29 checks passed
chaholl deleted the fix/939-validate-assertion-types branch April 12, 2026 14:04

Linked issue: promptarena validate: unknown assertion types pass silently, only fail at runtime
