Previously, `--gate` with a completely unknown metric name (e.g. a typo like `faithfullness>0.85`) would silently skip the gate check and exit 0, making it impossible to detect typos or misconfigurations. Users received no feedback that their gate was ineffective.

Now the CLI validates all metric names in `--gate` against the full known metric registry (operational + knowledge + Ragas) before running any evaluation. Unknown metric names produce exit code 2 with a friendly error message listing all valid metric names, consistent with how `--metrics` already validates metric names.

Known metrics that are not computed for the current run (e.g. `faithfulness` without `--judge`) continue to SKIP gracefully (exit 0), preserving the intended behaviour for conditional gate configuration.

Closes #136

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the `[dependency-groups]` block from `pyproject.toml`. It was a local dev-tooling workaround that was accidentally committed; the existing `[project.optional-dependencies].dev` section is the source of truth for development dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes ticket 136: when `--gate unknown_metric>0.5` was specified, the CLI silently skipped the gate (exit 0), giving no indication that the metric name was invalid. This was especially dangerous for typos like `faithfullness` instead of `faithfulness`.

- Validates `--gate` metric names against the full known metric registry (`_all_metric_names()`) before any heavy loading, mirroring the existing `--metrics` validation pattern.
- Unknown metric names → exit 2 with `Error: Invalid value for '--gate': Unknown metric(s) in --gate: ..., Valid metrics: ...`
- Known metrics that are not computed for the current run (e.g. `faithfulness` without `--judge`) → exit 0 SKIP, preserving existing behaviour.
- Removes the `[dependency-groups]` block accidentally committed to `pyproject.toml` (local dev-tooling workaround).

Files Changed
- `src/raki/cli.py`: `--gate` metric-name validation block (lines 222–238)
- `tests/test_cli.py`: `TestGateThresholdCLI`
- `changes/136.fix`
- `pyproject.toml`: `[dependency-groups]` block (removed)

Acceptance Criteria
- `--gate typo_metric>0.5` exits 2 with a user-friendly error listing valid metrics
- `--gate faithfulness>0.85` (valid name, LLM not running) exits 0 (skip preserved)
- `--gate bad_syntax` exits 2 (parse error preserved)
- `ruff check`, `ty check`, and `raki validate --deep` are clean

Review Results
Python Specialist: `needs_fixes` (addressed)
- `pyproject.toml`: `[dependency-groups].dev` duplicated and conflicted with `[project.optional-dependencies].dev`; a spurious local workaround that should not ship
- `_all_metric_names()` called redundantly when both `--metrics` and `--gate` are provided
- `parsed_gates_early` discarded

Core fix logic (`cli.py:222–238`) confirmed correct and well-structured.

Security Specialist: `clean`
No credential, path-traversal, or exit-code-bypass concerns. Input validation is strictly improved. Manifest-sourced thresholds (line 467) continue to SKIP unknown metrics (existing safe behaviour, unchanged).
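For readers who have not opened the diff, the behaviour described in the Summary and Acceptance Criteria can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code in `src/raki/cli.py`: the names `KNOWN_METRICS`, `GateError`, `parse_gate`, `validate_gates` and `gate_status`, the sample registry contents, and the accepted comparison operators are all assumptions made for the example. The real CLI builds its registry via `_all_metric_names()`.

```python
import operator
import re

# Hypothetical stand-in for the registry returned by _all_metric_names().
KNOWN_METRICS = {"faithfulness", "answer_relevancy", "latency_p95"}

# Assumed gate grammar: <metric><op><number>, e.g. "faithfulness>0.85".
_GATE_RE = re.compile(
    r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(>=|<=|>|<)\s*([0-9]*\.?[0-9]+)\s*$"
)
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


class GateError(Exception):
    """Malformed gate or unknown metric name; the CLI would map this to exit 2."""


def parse_gate(expr):
    """Split one gate expression into (metric, operator, threshold)."""
    match = _GATE_RE.match(expr)
    if match is None:
        raise GateError(f"Cannot parse gate {expr!r}; expected e.g. 'metric>0.5'")
    name, op, threshold = match.groups()
    return name, op, float(threshold)


def validate_gates(exprs, known=KNOWN_METRICS):
    """Parse every gate and reject unknown metric names before any evaluation."""
    parsed = [parse_gate(expr) for expr in exprs]
    unknown = sorted({name for name, _, _ in parsed if name not in known})
    if unknown:
        raise GateError(
            f"Unknown metric(s) in --gate: {', '.join(unknown)}. "
            f"Valid metrics: {', '.join(sorted(known))}"
        )
    return parsed


def gate_status(parsed, computed):
    """Report PASS/FAIL per gate; known metrics absent from this run are SKIP."""
    status = {}
    for name, op, threshold in parsed:
        if name not in computed:
            status[name] = "SKIP"  # valid name, just not computed this run
        else:
            status[name] = "PASS" if _OPS[op](computed[name], threshold) else "FAIL"
    return status
```

The point of the split is that name validation happens up front and fails loudly, while the SKIP path is reserved for names that are valid but absent from the current run's results. In this sketch, `validate_gates(["faithfullness>0.85"])` and `validate_gates(["bad_syntax"])` both raise `GateError`, whereas `gate_status(validate_gates(["faithfulness>0.85"]), {})` reports the gate as skipped.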
References
Refs #136
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Assigned-by: decko