[EVAL.yaml] Feature: Numeric comparison operators for shell assertions #1207

@tsoyangbot

Feature Request: Numeric comparison operators for shell assertions

Current Behavior

Shell assertions in EVAL.yaml only support exact string matching via expected:

- type: shell
  command: "pdfinfo report.pdf | grep Pages | awk '{print $2}'"
  expected: "14"  # Exact match only; fails if the output is "15"

Desired Behavior

Support numeric operators for shell assertions:

- type: shell
  command: "pdfinfo report.pdf | grep Pages | awk '{print $2}'"
  operator: ">="
  expected: "5"

Supported operators: >, <, >=, <=, ==, !=
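To make the intended semantics concrete, here is a minimal sketch of how such a comparison could behave, written as a POSIX shell function. The name `numeric_compare` and its exit-code convention are hypothetical and not part of any existing EVAL.yaml tooling. It assumes the command output is a single integer or decimal number and treats anything else as an error rather than a failed comparison.

```shell
# Hypothetical helper: numeric_compare ACTUAL OPERATOR EXPECTED
# Exit status: 0 if the comparison holds, 1 if it does not,
# 2 if either value is not a number or the operator is unknown.
numeric_compare() {
  actual=$1 op=$2 expected=$3
  awk -v a="$actual" -v op="$op" -v e="$expected" 'BEGIN {
    # Accept plain integers and decimals only (for example 14, -3, 199.5).
    num = "^[-+]?([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
    if (a !~ num || e !~ num) exit 2
    # Force numeric comparison; as strings, "14" would sort before "5".
    a += 0; e += 0
    if      (op == ">")  ok = (a >  e)
    else if (op == "<")  ok = (a <  e)
    else if (op == ">=") ok = (a >= e)
    else if (op == "<=") ok = (a <= e)
    else if (op == "==") ok = (a == e)
    else if (op == "!=") ok = (a != e)
    else exit 2
    exit (ok ? 0 : 1)
  }'
}
```

With this sketch, `numeric_compare 14 '>=' 5` succeeds and `numeric_compare 14 '==' 15` fails. Reporting non-numeric output separately (status 2) is one possible design choice; it would let the runner tell "the command produced garbage" apart from "the number was too small".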

Use Case

From property-inspection-video-analysis E2E evals:

  • Verify PDF has "at least 5 pages"
  • Check compression reduced images below 200KB
  • Confirm frame extraction produced more than 50 frames across videos

Current Workaround

Using a shell script that performs the comparison itself and prints PASS or FAIL:

pages=$(pdfinfo report.pdf | grep Pages | awk '{print $2}')
[ "$pages" -ge 5 ] && echo "PASS" || echo "FAIL"

This is verbose, error-prone, and clutters EVAL files.
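To illustrate the "error-prone" part: if `pdfinfo` fails or prints something unexpected, `$pages` is empty or non-numeric and the `[ ... -ge ... ]` test itself errors out. A more defensive version of the workaround might look like the sketch below. The function name `check_min_pages` is hypothetical, and the sketch assumes `pdfinfo` prints a line of the form `Pages: N`.

```shell
# Hypothetical helper: check_min_pages PDF MIN
# Prints PASS or FAIL and returns a matching exit status.
check_min_pages() {
  pdf=$1 min=$2
  # Extract the page count; stays empty if pdfinfo fails or has no Pages line.
  pages=$(pdfinfo "$pdf" 2>/dev/null | awk '/^Pages:/ {print $2}')
  # Guard against empty or non-integer output before comparing.
  case $pages in
    ''|*[!0-9]*) echo "FAIL"; return 1 ;;
  esac
  if [ "$pages" -ge "$min" ]; then
    echo "PASS"
  else
    echo "FAIL"
    return 1
  fi
}
```

That is roughly a dozen lines of boilerplate per assertion just to compare one number safely, which is the clutter a built-in operator would remove.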

Reproduction

Repo: https://github.com/tsoyang-org/property-inspection-bench

File: e2e-2-markdown-to-pdf/EVAL.yaml (lines 18-22)

The file currently uses the workaround above instead of a direct numeric comparison.
