
[Feature] EvalAlgorithmInterface.evaluate should accept a list of DataConfigs for consistency #269

Open
athewsey opened this issue May 3, 2024 · 1 comment

athewsey (Contributor) commented May 3, 2024

Today EvalAlgorithmInterface.evaluate is typed to return List[EvalOutput] ("for dataset(s)", per the docstring), but its dataset_config argument only accepts Optional[DataConfig].

It seems like most concrete eval algorithms (like QAAccuracy) either take the user's data_config for a single dataset, or take all of the pre-defined DATASET_CONFIGS relevant to the evaluator's problem type.

...So the internal logic of the evaluators already supports providing multiple datasets and returning multiple results, but users seem to be prevented from calling evaluate() with multiple datasets of their own for no particular reason?
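For concreteness, a minimal sketch of what the widened signature might look like, assuming the per-dataset loop described above. The class name and the `_builtin_dataset_configs` / `_evaluate_dataset` helpers are hypothetical; only `DataConfig`, `EvalOutput`, and `evaluate` are names from fmeval (import paths may differ by version):

```python
from typing import List, Optional, Union

# Import paths as commonly shown in fmeval examples; verify against your installed version.
from fmeval.data_loaders.data_config import DataConfig
from fmeval.eval_algorithms import EvalOutput


class MultiDatasetEvalAlgorithm:
    """Sketch only -- NOT the actual EvalAlgorithmInterface code."""

    def evaluate(
        self,
        dataset_config: Optional[Union[DataConfig, List[DataConfig]]] = None,
        **kwargs,
    ) -> List[EvalOutput]:
        if dataset_config is None:
            # Fall back to the built-in configs for this problem type,
            # as the concrete algorithms already do.
            configs = self._builtin_dataset_configs()  # hypothetical helper
        elif isinstance(dataset_config, DataConfig):
            configs = [dataset_config]
        else:
            configs = list(dataset_config)
        # The existing per-dataset logic then runs unchanged over `configs`.
        return [self._evaluate_dataset(cfg, **kwargs) for cfg in configs]  # hypothetical helper
```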

danielezhu (Contributor) commented:

Your understanding is correct. Currently, evaluate can either be configured to use a single user-provided dataset (via data_config) or to use all of the "built-in" datasets. Your feature request certainly makes sense; I can't think of a particularly compelling reason why we shouldn't be able to evaluate multiple "custom" (i.e. user-provided) datasets.
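For illustration, roughly how a caller might then pass several custom datasets at once. The dataset names, URIs, and `model_runner` are placeholders, and only the commonly used DataConfig fields are shown; this is a sketch of the requested behaviour, not a verified invocation:

```python
from fmeval.data_loaders.data_config import DataConfig
from fmeval.eval_algorithms.qa_accuracy import QAAccuracy

# Hypothetical call if dataset_config accepted a list; field values are placeholders.
custom_configs = [
    DataConfig(
        dataset_name="my_qa_dataset_1",
        dataset_uri="s3://my-bucket/qa_dataset_1.jsonl",
        dataset_mime_type="application/jsonlines",
        model_input_location="question",
        target_output_location="answer",
    ),
    DataConfig(
        dataset_name="my_qa_dataset_2",
        dataset_uri="s3://my-bucket/qa_dataset_2.jsonl",
        dataset_mime_type="application/jsonlines",
        model_input_location="question",
        target_output_location="answer",
    ),
]

# `model_runner` is assumed to be a ModelRunner the user has already constructed (not shown).
eval_outputs = QAAccuracy().evaluate(model=model_runner, dataset_config=custom_configs)
assert len(eval_outputs) == len(custom_configs)  # one EvalOutput per dataset
```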
