auto select evaluators correctly by benjibc · Pull Request #323 · eval-protocol/python-sdk

benjibc · 2025-11-09T01:58:09Z

Changes made:
- Added robust evaluator-id inference and interactive selection in eval_protocol/cli_commands/create_rft.py (handles last-used, project/home traces, multiple traces with interactive/--yes behavior, fallback to single test).
- Persist last used evaluator id for seamless subsequent eval-protocol create rft runs.
- Added tests covering all branches in tests/test_cli_create_rft_infer.py.
You can now run eval-protocol create rft without specifying --evaluator-id. It will:
- Use the last selected evaluator if available.
- Pick the only trace if just one exists.
- Prompt to choose when multiple are available (or auto-pick most recent with --yes).
- Fall back to a single discovered test when no traces exist.
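
The selection order above can be sketched as a small pure function. This is only an illustration of the described precedence, not the code in create_rft.py: the name infer_evaluator_id, its parameters, and the assumption that trace ids arrive sorted most recent first are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def infer_evaluator_id(
    last_used: Optional[str],
    trace_ids: Sequence[str],
    test_ids: Sequence[str],
    assume_yes: bool = False,
    prompt: Optional[Callable[[Sequence[str]], str]] = None,
) -> Optional[str]:
    """Hypothetical sketch of the precedence described in the PR text.

    Assumes trace_ids is sorted most recent first.
    """
    # 1. The last selected evaluator wins if one was recorded.
    if last_used:
        return last_used
    # 2. A single trace is unambiguous.
    if len(trace_ids) == 1:
        return trace_ids[0]
    # 3. Several traces: prompt, or take the most recent with --yes.
    if len(trace_ids) > 1:
        if assume_yes or prompt is None:
            # Falling back to the most recent when no prompt callback
            # is supplied is an assumption of this sketch.
            return trace_ids[0]
        return prompt(trace_ids)
    # 4. No traces: fall back to a single discovered test.
    if len(test_ids) == 1:
        return test_ids[0]
    # Nothing can be inferred; the caller must ask for --evaluator-id.
    return None
```

Passing the prompt as a callback keeps the interactive branch easy to exercise in tests without a terminal.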

Note

Adds robust evaluator auto-selection (from the last-used pointer or from traces, with an interactive prompt or a most-recent pick), persists the last evaluator, skips the upload when the evaluator already exists and polls until it is ACTIVE, and adds comprehensive tests.

CLI create_rft:
- Evaluator inference: Auto-select via last-used pointer, project/home trace discovery, interactive prompt (or most-recent when --yes).
- Persistence: Save last-used evaluator to .eval_protocol/last_evaluator.json after successful ACTIVE ensure.
- Upload short-circuit: If evaluator exists (via GET), skip upload; poll until ACTIVE, with dashboard guidance on timeout.
- Entry resolution: Map evaluator_id to discovered tests; fail fast if multiple tests and no match.
- Dataset ID: _build_trimmed_dataset_id hardened to handle empty/non-alpha starts.
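
The upload short-circuit and ACTIVE polling could look roughly like the sketch below. It is a minimal illustration under stated assumptions: ensure_active, get_state, and upload are hypothetical names, get_state is assumed to return None when the evaluator does not exist, and the timeout and interval values are placeholders rather than the CLI's real settings.

```python
import time
from typing import Callable, Optional


def ensure_active(
    evaluator_id: str,
    get_state: Callable[[str], Optional[str]],
    upload: Callable[[str], None],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the evaluator reports ACTIVE, False on timeout.

    get_state and upload are hypothetical stand-ins for the real
    GET and upload calls; get_state returns None if nothing exists.
    """
    # Skip the upload when the evaluator already exists.
    if get_state(evaluator_id) is None:
        upload(evaluator_id)
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state(evaluator_id) == "ACTIVE":
            return True
        if time.monotonic() >= deadline:
            # The caller can print dashboard guidance at this point.
            return False
        sleep(interval_s)
```

Injecting sleep makes the loop testable without real delays; a caller would print the dashboard hint when the function returns False.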
Tests:
- Add tests/test_cli_create_rft_infer.py covering last-used loading/saving, trace selection (single/multiple, interactive/non-interactive), fallback to single test, end-to-end create_rft paths, and dataset-id derivation.
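
As an illustration of the last-used loading/saving that these tests cover, here is a minimal round-trip sketch. Only the path .eval_protocol/last_evaluator.json comes from the PR text; the helper names and the {"evaluator_id": ...} JSON shape are assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

# Path taken from the PR description; the file's schema is assumed.
POINTER_RELPATH = Path(".eval_protocol") / "last_evaluator.json"


def save_last_evaluator(project_root: Path, evaluator_id: str) -> Path:
    """Write the last-used pointer (hypothetical helper)."""
    path = project_root / POINTER_RELPATH
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"evaluator_id": evaluator_id}), encoding="utf-8")
    return path


def load_last_evaluator(project_root: Path) -> Optional[str]:
    """Read the pointer back, returning None if missing or malformed."""
    path = project_root / POINTER_RELPATH
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    value = data.get("evaluator_id") if isinstance(data, dict) else None
    return value if isinstance(value, str) and value else None


def roundtrip_check() -> bool:
    """Save then load inside a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        if load_last_evaluator(root) is not None:
            return False
        save_last_evaluator(root, "my-evaluator")
        return load_last_evaluator(root) == "my-evaluator"
```

Treating a missing or corrupt file as "no last evaluator" lets inference fall through to trace discovery instead of failing.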

Written by Cursor Bugbot for commit cca18e6. This will update automatically on new commits.

eval_protocol/cli_commands/create_rft.py

auto select evaluators correctly

64efd1c

cursor bot reviewed Nov 9, 2025

xzrderek approved these changes Nov 9, 2025

add new test to verify dataset id and fix code

e6cbe86

cursor bot reviewed Nov 9, 2025

xzrderek added 2 commits November 8, 2025 19:12

try skipping if possible

fb12028

fix

cca18e6

xzrderek merged commit c1df8b5 into main Nov 9, 2025
8 checks passed

xzrderek deleted the auto_select_evaluator_correctly branch November 9, 2025 03:28

xzrderek merged 4 commits into main from auto_select_evaluator_correctly
