Closed
Labels
enhancement (New feature or request)
Context
The data type for the results field of the BenchmarkResults class is hard to parse.
Currently, the results field is a dict that may be nested up to 3 levels deep. The depth depends on whether the benchmark is single-task or multi-task and on whether it includes one test set or several. As a result, the same level can hold different information for result objects that come from different benchmarks.
Because of this inconsistency, it's hard to parse the results downstream (e.g. to build the leaderboard or to serialize the field).
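To make the problem concrete, here is a minimal sketch of how the ambiguity can arise. The key layouts and the names used (`test_iid`, `logP`, `mean_absolute_error`) are assumptions for illustration and are not copied from the Polaris library; only the varying depth comes from the description above.

```python
# Hypothetical result payloads. The layouts and key names are assumptions
# chosen to illustrate the varying depth, not the library's actual output.
single_task_single_set = {"mean_absolute_error": 0.42}
multi_task_single_set = {"logP": {"mean_absolute_error": 0.42}}
single_task_multi_set = {"test_iid": {"mean_absolute_error": 0.42}}
multi_task_multi_set = {"test_iid": {"logP": {"mean_absolute_error": 0.42}}}


def depth(obj):
    """Return the nesting depth of a dict (0 for a non-dict leaf)."""
    if not isinstance(obj, dict):
        return 0
    return 1 + max((depth(value) for value in obj.values()), default=0)


# Two payloads with the same depth whose top-level keys mean different things:
# a target name in one case, a test set name in the other.
same_depth = depth(multi_task_single_set) == depth(single_task_multi_set)
```

In this sketch a consumer cannot tell from the shape alone whether a top-level key is a target or a test set, so it needs extra knowledge about the benchmark to interpret the dict.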
Description
Consider the downstream use cases for the results field and devise a new data structure that is easy to parse and supports those use cases.
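One candidate that could be weighed in this decision is a flat list of records with a fixed set of fields, so that every result has the same shape regardless of the benchmark type. The sketch below is only an illustration of that idea; the `ResultRecord` class, its field names, and the helper functions are hypothetical and do not describe the structure that was eventually chosen.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResultRecord:
    """Hypothetical flat record: one score for one (test set, target, metric)."""

    test_set: Optional[str]  # None could stand for a single, unnamed test set
    target: Optional[str]  # None could stand for a single-task benchmark
    metric: str
    score: float


def to_json(records: List[ResultRecord]) -> str:
    """Serialize records as a JSON array of flat objects."""
    return json.dumps([asdict(record) for record in records])


def from_json(payload: str) -> List[ResultRecord]:
    """Rebuild records from the JSON produced by to_json."""
    return [ResultRecord(**row) for row in json.loads(payload)]


# Example values are made up for illustration.
records = [
    ResultRecord("test_iid", "logP", "mean_absolute_error", 0.42),
    ResultRecord("test_ood", "logP", "mean_absolute_error", 0.57),
]
restored = from_json(to_json(records))
```

A flat layout like this maps directly onto table rows, which may suit both serialization and a leaderboard view, at the cost of repeating the test set and target names in every record.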
Acceptance Criteria
- An informed decision has been made on how to revise the results data structure.
- The new data structure has been implemented in the Polaris library.
Links
- On the visualization of the results, see https://github.com/polaris-hub/polaris-hub/issues/131