
fix(cli): validate --gate metric names early, exit 2 for unknown metrics #156

Merged
decko merged 2 commits into main from soda/136
Apr 23, 2026

Conversation


@decko decko commented Apr 22, 2026

Summary

Fixes ticket 136: when --gate unknown_metric>0.5 was specified, the CLI silently skipped the gate (exit 0), giving no indication that the metric name was invalid. This was especially dangerous for typos like faithfullness instead of faithfulness.

  • Added early validation of --gate metric names against the full known metric registry (_all_metric_names()) before any heavy loading, mirroring the existing --metrics validation pattern.
  • Unknown metrics → exit 2 with Error: Invalid value for '--gate': Unknown metric(s) in --gate: ..., Valid metrics: ...
  • Known-but-not-computed metrics (e.g. faithfulness without --judge) → exit 0 SKIP, preserving existing behaviour.
  • Removed a spurious [dependency-groups] block accidentally committed to pyproject.toml (local dev-tooling workaround).
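
The fail-fast ordering described above can be sketched as follows. This is a minimal illustration, not the code in `src/raki/cli.py`: the function names, the `load_data` callback, and the use of `ValueError` are assumptions, and the message only approximates the shape quoted in the summary. In the real CLI the error surfaces as exit code 2.

```python
def check_gate_metrics(gate_metric_names, known_metrics):
    """Raise ValueError if any gate metric name is missing from the registry.

    Hypothetical helper: `known_metrics` stands in for whatever
    `_all_metric_names()` returns in the real code.
    """
    unknown = sorted(set(gate_metric_names) - set(known_metrics))
    if unknown:
        raise ValueError(
            "Unknown metric(s) in --gate: " + ", ".join(unknown)
            + ". Valid metrics: " + ", ".join(sorted(known_metrics))
        )


def run_evaluation(gate_metric_names, known_metrics, load_data):
    """Validate gate names first, then do the expensive work."""
    # Fail fast: a typo is reported before any heavy loading starts.
    check_gate_metrics(gate_metric_names, known_metrics)
    # Only reached when every gate name is known.
    return load_data()
```

With a registry containing `faithfulness`, calling `run_evaluation(["faithfullness"], ...)` raises before `load_data` is ever invoked, which is the property the early validation is meant to guarantee.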

Files Changed

| File | Purpose |
| --- | --- |
| src/raki/cli.py | Added --gate metric-name validation block (lines 222–238) |
| tests/test_cli.py | 5 new tests in TestGateThresholdCLI |
| changes/136.fix | Towncrier changelog fragment |
| pyproject.toml | Removed spurious [dependency-groups] block |

Acceptance Criteria

  • --gate typo_metric>0.5 exits 2 with a user-friendly error listing valid metrics
  • --gate faithfulness>0.85 (valid name, LLM not running) exits 0 (skip preserved)
  • --gate bad_syntax exits 2 (parse error preserved)
  • All 695 non-slow tests pass
  • ruff check, ty check, and raki validate --deep are clean
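
The first three criteria amount to a small decision table. The sketch below restates it under assumptions: the gate grammar (`name`, comparison operator, number), the function name, and the outcome labels are all hypothetical, and the exit code for a gate that is actually evaluated is left out because this PR does not state it.

```python
import re

# Assumed gate grammar: <metric><op><threshold>, e.g. "faithfulness>0.85".
# The operators the real CLI accepts are not listed in this PR.
_GATE_RE = re.compile(r"^([A-Za-z_]\w*)(>=|<=|>|<)(\d+(?:\.\d+)?)$")

# Exit codes taken from the acceptance criteria above.
EXIT_CODES = {"parse_error": 2, "unknown_metric": 2, "skip": 0}


def classify_gate(expr, known_metrics, computed_metrics):
    """Return the outcome label for a single --gate expression."""
    match = _GATE_RE.match(expr.strip())
    if match is None:
        return "parse_error"      # e.g. "bad_syntax"
    name = match.group(1)
    if name not in known_metrics:
        return "unknown_metric"   # e.g. a typo such as "typo_metric"
    if name not in computed_metrics:
        return "skip"             # valid name, but not computed in this run
    return "evaluate"             # threshold is compared against the score
```

The order of the checks matters: syntax is checked before the name, and the name before availability, so a typo can never fall through to the silent-skip branch.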

Review Results

Python Specialist — needs_fixes (addressed)

| Severity | Finding | Status |
| --- | --- | --- |
| IMPORTANT | pyproject.toml: [dependency-groups].dev duplicated and conflicted with [project.optional-dependencies].dev; spurious local workaround, should not ship | Fixed in second commit |
| MINOR | _all_metric_names() called redundantly when both --metrics and --gate are provided | Accepted: negligible cost, keeps code readable |
| MINOR | Gate thresholds parsed twice (early validation + evaluation); parsed_gates_early discarded | Accepted: correctness preserved, not on hot path |
| MINOR | Three tests cover same CLI invocation but assert single aspects each | Accepted: clarity over concision |

Core fix logic (cli.py:222–238) confirmed correct and well-structured.

Security Specialist — clean

No credential, path-traversal, or exit-code-bypass concerns. Input validation is strictly improved. Manifest-sourced thresholds (line 467) continue to SKIP unknown metrics — existing safe behaviour, unchanged.
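
For contrast with the strict CLI path, the lenient manifest behaviour noted here could look roughly like this. It is a sketch only: the function name and the dict-of-thresholds shape are assumptions, not the code at line 467.

```python
def split_manifest_thresholds(thresholds, known_metrics):
    """Separate manifest thresholds into those to check and those to skip.

    Hypothetical helper: unknown names are collected rather than rejected,
    which matches the SKIP behaviour described for manifest-sourced gates.
    """
    to_check = {}
    skipped = []
    for name, threshold in thresholds.items():
        if name in known_metrics:
            to_check[name] = threshold
        else:
            skipped.append(name)  # lenient: reported as SKIP, never an error
    return to_check, skipped
```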

References

Refs #136


Assisted-by: Claude Opus 4.6 (1M context) noreply@anthropic.com
Assigned-by: decko

decko and others added 2 commits April 22, 2026 19:29
Previously, --gate with a completely unknown metric name (e.g. a typo
like `faithfullness>0.85`) would silently skip the gate check and exit 0,
making it impossible to detect typos or misconfigurations. Users received
no feedback that their gate was ineffective.

Now the CLI validates all metric names in --gate against the full known
metric registry (operational + knowledge + Ragas) before running any
evaluation. Unknown metric names produce exit code 2 with a friendly
error message listing all valid metric names — consistent with how
--metrics already validates metric names.

Known metrics that are not computed for the current run (e.g.,
`faithfulness` without `--judge`) continue to SKIP gracefully (exit 0),
preserving the intended behaviour for conditional gate configuration.

Closes #136

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Local dev-tooling workaround that was accidentally committed; the
existing [project.optional-dependencies].dev section is the source
of truth for development dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@decko decko added the ai-assisted Implemented with AI assistance label Apr 22, 2026
@decko decko merged commit eab6163 into main Apr 23, 2026
4 checks passed
@decko decko deleted the soda/136 branch April 23, 2026 15:33