support Datasets and additional output date in Evals #11

Merged: Andrew Kent (realark) merged 1 commit into main from ark/eval-task-output on Nov 12, 2025

Conversation

@realark Andrew Kent (realark) commented Nov 11, 2025

  • Adds a datasets interface
    • Only in-memory datasets are supported for now, but the interface is future-proofed for fetching datasets from Braintrust
  • Task and scorer outputs are now record classes
    • This lets them pass secondary data along with the result
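
To make the two bullets above concrete, here is a rough sketch of what a dataset abstraction and record-based outputs could look like. Apart from `DatasetCase`, which is named elsewhere in this PR, every name here (`Dataset`, `inMemory`, `TaskResult`, `Score`, `classify`, `exactMatch`) is a hypothetical stand-in and not the SDK's actual API; the real signatures may differ.

```java
import java.util.List;
import java.util.Map;

public class DatasetSketch {
    /** Hypothetical case type: an input paired with its expected output. */
    record DatasetCase<I, O>(I input, O expected) {
        static <I, O> DatasetCase<I, O> of(I input, O expected) {
            return new DatasetCase<>(input, expected);
        }
    }

    /**
     * Hypothetical dataset abstraction. Only an in-memory factory exists here;
     * a remote-backed implementation could be added later without changing callers.
     */
    interface Dataset<I, O> extends Iterable<DatasetCase<I, O>> {
        @SafeVarargs
        static <I, O> Dataset<I, O> inMemory(DatasetCase<I, O>... cases) {
            List<DatasetCase<I, O>> copy = List.of(cases); // immutable snapshot
            return copy::iterator;
        }
    }

    /** Hypothetical task output: the primary output plus secondary data. */
    record TaskResult<O>(O output, Map<String, Object> metadata) {}

    /** Hypothetical scorer output: a named score plus secondary data. */
    record Score(String name, double value, Map<String, Object> metadata) {}

    /** Toy task: a hard-coded classifier, used only to exercise the types. */
    static TaskResult<String> classify(String food) {
        String label = food.equals("asparagus") ? "vegetable" : "fruit";
        return new TaskResult<>(label, Map.<String, Object>of("rule", "hard-coded"));
    }

    /** Toy scorer: 1.0 when the output equals the expected value, else 0.0. */
    static <O> Score exactMatch(DatasetCase<?, O> c, TaskResult<O> result) {
        boolean match = c.expected().equals(result.output());
        return new Score("exact_match", match ? 1.0 : 0.0,
                Map.<String, Object>of("expected", c.expected()));
    }

    public static void main(String[] args) {
        Dataset<String, String> dataset = Dataset.inMemory(
                DatasetCase.of("strawberry", "fruit"),
                DatasetCase.of("asparagus", "vegetable"),
                DatasetCase.of("apple", "fruit"),
                DatasetCase.of("banana", "fruit"));

        for (DatasetCase<String, String> c : dataset) {
            TaskResult<String> result = classify(c.input());
            Score score = exactMatch(c, result);
            System.out.println(c.input() + " -> " + result.output()
                    + " (" + score.name() + "=" + score.value() + ")");
        }
    }
}
```

Returning records rather than bare values means a field such as `metadata` can be added or extended later without changing the return type of every task and scorer.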

@realark Andrew Kent (realark) force-pushed the ark/eval-task-output branch 2 times, most recently from a95f56c to 879a229 Compare November 12, 2025 16:23
EvalCase.of("strawberry", "fruit"),
EvalCase.of("asparagus", "vegetable"),
EvalCase.of("apple", "fruit"),
EvalCase.of("banana", "fruit"))

NOTE: EvalCase is still supported for backwards compatibility (I've marked the class as deprecated and encouraged use of DatasetCase instead).
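
One way such a deprecation could be arranged is sketched below. This is only an illustration under assumptions: the shapes of `EvalCase` and `DatasetCase` and the `toDatasetCase` helper are hypothetical, and the actual classes in this PR may be related differently (for example by inheritance or delegation).

```java
public class DeprecationSketch {
    /** Hypothetical replacement type. */
    record DatasetCase<I, O>(I input, O expected) {
        static <I, O> DatasetCase<I, O> of(I input, O expected) {
            return new DatasetCase<>(input, expected);
        }
    }

    /**
     * Hypothetical legacy type, kept so existing call sites still compile.
     *
     * @deprecated use {@link DatasetCase} instead
     */
    @Deprecated
    record EvalCase<I, O>(I input, O expected) {
        static <I, O> EvalCase<I, O> of(I input, O expected) {
            return new EvalCase<>(input, expected);
        }

        /** Hypothetical bridge so old cases can feed newer dataset-based code. */
        DatasetCase<I, O> toDatasetCase() {
            return DatasetCase.of(input, expected);
        }
    }

    public static void main(String[] args) {
        // Old-style call site keeps working; compilers flag it as deprecated
        // when it is used from outside this class.
        EvalCase<String, String> legacy = EvalCase.of("strawberry", "fruit");
        DatasetCase<String, String> migrated = legacy.toDatasetCase();
        System.out.println(migrated);
    }
}
```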

@realark Andrew Kent (realark) force-pushed the ark/eval-task-output branch 3 times, most recently from 9ed8278 to 75ee381 Compare November 12, 2025 16:50
@realark Andrew Kent (realark) marked this pull request as ready for review November 12, 2025 16:52
@realark Andrew Kent (realark) force-pushed the ark/eval-task-output branch 3 times, most recently from 56dd8c0 to 9058941 Compare November 12, 2025 17:36
@realark Andrew Kent (realark) merged commit f8682b7 into main Nov 12, 2025
1 check passed
@realark Andrew Kent (realark) added the enhancement label Nov 12, 2025
@realark Andrew Kent (realark) changed the title from "Rework evals api" to "Support Datasets and additional output date in Evals" Nov 12, 2025
@realark Andrew Kent (realark) changed the title from "Support Datasets and additional output date in Evals" to "support Datasets and additional output date in Evals" Nov 12, 2025