Feature Request: Numeric comparison operators for shell assertions
Current Behavior
Shell assertions in EVAL.yaml only support exact string matching via expected:
- type: shell
  command: "pdfinfo report.pdf | grep Pages | awk '{print $2}'"
  expected: "14"  # Exact match only - fails if "15"
Desired Behavior
Support numeric operators for shell assertions:
- type: shell
  command: "pdfinfo report.pdf | grep Pages | awk '{print $2}'"
  operator: ">="
  expected: "5"
Supported operators: >, <, >=, <=, ==, !=
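To make the intended semantics concrete, here is a minimal sketch of how the comparison could behave. The function name `numeric_assert`, the integer-only restriction, and the exit-status convention (0 = holds, 1 = fails, 2 = invalid input) are assumptions for illustration, not part of any existing implementation.

```shell
# Hypothetical helper illustrating the proposed operator semantics.
# Usage: numeric_assert ACTUAL OPERATOR EXPECTED
# Exit status: 0 if the comparison holds, 1 if it does not, 2 on bad input.
numeric_assert() {
  actual=$1 op=$2 expected=$3

  # This sketch accepts integers only (optional leading minus sign).
  for v in "$actual" "$expected"; do
    case ${v#-} in
      ''|*[!0-9]*) echo "not an integer: '$v'" >&2; return 2 ;;
    esac
  done

  # Map each proposed operator onto the matching POSIX test primitive.
  case $op in
    '>')  [ "$actual" -gt "$expected" ] ;;
    '<')  [ "$actual" -lt "$expected" ] ;;
    '>=') [ "$actual" -ge "$expected" ] ;;
    '<=') [ "$actual" -le "$expected" ] ;;
    '==') [ "$actual" -eq "$expected" ] ;;
    '!=') [ "$actual" -ne "$expected" ] ;;
    *) echo "unsupported operator: $op" >&2; return 2 ;;
  esac
}
```

With this sketch, the example above would amount to `numeric_assert "$(pdfinfo report.pdf | grep Pages | awk '{print $2}')" '>=' 5`. Giving non-numeric output its own status, separate from an ordinary failed comparison, would let a broken command surface as an error instead of a silent FAIL.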
Use Case
From property-inspection-video-analysis E2E evals:
- Verify PDF has "at least 5 pages"
- Check compression reduced images below 200KB
- Confirm frame extraction produced more than 50 frames across videos
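Each of these checks reduces to a command that prints a single integer, which a numeric operator could then compare. The helpers below are a rough sketch of such commands. The function names are hypothetical, and any directory layout or file extension passed to them is an assumption, since the issue does not say how the benchmark stores its outputs.

```shell
# Hypothetical helper: count regular files under directory $1 whose
# names match the glob pattern $2 (e.g. extracted frames).
count_files() {
  find "$1" -type f -name "$2" | wc -l | tr -d '[:space:]'
}

# Hypothetical helper: count regular files under directory $1 that are
# strictly larger than $2 KiB (e.g. images that compression missed).
count_larger_than_kib() {
  find "$1" -type f -size +"$2"k | wc -l | tr -d '[:space:]'
}
```

Under these assumptions, the frame check becomes "the output of `count_files` is `>` 50" and the compression check becomes "the output of `count_larger_than_kib` with a limit of 200 is `==` 0". The `tr` call strips the padding that some `wc` implementations add, so the output is a bare number.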
Current Workaround
Using shell one-liners that do the comparison themselves and print a PASS/FAIL string for expected to match:
pages=$(pdfinfo report.pdf | grep Pages | awk '{print $2}')
[ "$pages" -ge 5 ] && echo "PASS" || echo "FAIL"
This is verbose, error-prone, and clutters EVAL files.
Reproduction
Repo: https://github.com/tsoyang-org/property-inspection-bench
File: e2e-2-markdown-to-pdf/EVAL.yaml (lines 18-22)
This file currently uses the workaround above instead of a direct numeric comparison.