Add an artifact that requires time series comparisons (e.g., from a line graph) to confirm results were reproduced

The initial release of `ArtEvalBench` (v0.9, PR #15) contains a single artifact, [Wasabi](https://github.com/bastoica/wasabi), whose results can ultimately be summarized as a single integer: the number of bugs triggered by the current attempt.

The goal of this feature request is to add an artifact that produces a more diverse set of results/outputs, including time series used for plots and figures. Such outputs require a more elaborate "results reproduced"/"experiment runs" evaluator oracle than a single-integer comparison.
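To make the request more concrete, here is a minimal sketch of what a time series oracle could look like. It is not part of `ArtEvalBench`: the function names (`interpolate`, `series_reproduced`), the `(x, y)` point format, and the tolerance defaults are all hypothetical and would need to be tuned per artifact. The idea is to resample the observed series onto the reference x-values and accept the run if enough points fall within a relative tolerance.

```python
from bisect import bisect_left
from typing import Sequence, Tuple

# Assumed format: a series is a non-empty list of (x, y) points with
# strictly increasing x.
Series = Sequence[Tuple[float, float]]


def interpolate(series: Series, x: float) -> float:
    """Linearly interpolate y at x, clamping to the endpoints outside the range."""
    xs = [point[0] for point in series]
    i = bisect_left(xs, x)
    if i == 0:
        return series[0][1]
    if i == len(series):
        return series[-1][1]
    (x0, y0), (x1, y1) = series[i - 1], series[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def series_reproduced(
    reference: Series,
    observed: Series,
    rel_tol: float = 0.10,       # hypothetical default: 10% relative error
    abs_tol: float = 1e-9,       # floor so reference values near zero still compare
    min_fraction: float = 0.95,  # hypothetical: share of points that must match
) -> bool:
    """Return True if the observed series tracks the reference closely enough."""
    if not reference or not observed:
        return False
    within = 0
    for x, y_ref in reference:
        y_obs = interpolate(observed, x)
        if abs(y_obs - y_ref) <= max(abs_tol, rel_tol * abs(y_ref)):
            within += 1
    return within / len(reference) >= min_fraction
```

A pointwise tolerance check is only one option; an artifact whose curves are noisy or shifted in time might need a different metric (for example, comparing trends or aggregate error instead of individual points).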

Add an artifact that requires time series comparisons (e.g., from a line graph) to confirm results were reproduced #19
