Evaluating Arc Easy Got NonMatchingSplitsSizesError #1217

Closed
yuanyehome opened this issue Dec 26, 2023 · 4 comments

Comments

@yuanyehome

Hi, when evaluating on ARC-Easy with the script below:

lm-eval \
    --model hf \
    --model_args trust_remote_code=True,pretrained=$ckpt \
    --tasks arc_easy \
    --num_fewshot 0 \
    --device cuda:0 \
    --output_path "./eval_scripts/arc_easy.json"

I got datasets.utils.info_utils.NonMatchingSplitsSizesError.
I noticed that the arc_easy repo on Hugging Face was updated 5 days ago. Is there a problem with it?

@baberabb
Contributor

I can't reproduce this. Maybe try clearing the dataset from the cache and running it again?
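One way to try this from Python is to ask `datasets` to ignore its cached copy and download the data again. This is only a sketch: it assumes the task reads the `ARC-Easy` config of `allenai/ai2_arc`, and the helper names `fresh_load_kwargs` and `load_fresh` are made up for illustration.

```python
def fresh_load_kwargs(path="allenai/ai2_arc", name="ARC-Easy"):
    """Build load_dataset arguments that bypass any cached copy.

    Assumption: `path` and `name` match what the arc_easy task loads.
    """
    return {
        "path": path,
        "name": name,
        # Documented download mode: re-download even if a cached copy exists.
        "download_mode": "force_redownload",
    }


def load_fresh(**overrides):
    """Reload the dataset from the Hub (hypothetical helper, needs network)."""
    from datasets import load_dataset  # imported lazily so the helper above stays importable

    kwargs = {**fresh_load_kwargs(), **overrides}
    return load_dataset(**kwargs)
```

Calling `load_fresh()` needs network access and an installed `datasets` package. A stale cache is only one possible cause of this error, so a fresh download may not be enough on its own.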

@yuanyehome
Author

I updated my datasets package from 2.12.0 to 2.16.0 and the issue disappeared. Perhaps this should be added to the dependencies. Thanks anyway.

@haileyschoelkopf
Collaborator

haileyschoelkopf commented Dec 26, 2023

We may need to pin datasets to >= 2.16.0 or 2.17.0 following #1135.

Other HF datasets will probably be migrated on the backend in a similar way. It shouldn’t change their contents, though; HF is just phasing out dataset loading scripts.
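Until a pin lands, a quick local check can tell whether the installed `datasets` is new enough. This is a minimal sketch: the 2.16.0 threshold is simply the version reported to work earlier in this thread, and `parse_release` and `meets_minimum` are hypothetical helpers, not part of `datasets` or lm-eval.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: 2.16.0 is the version reported to work in this thread.
MIN_DATASETS = "2.16.0"


def parse_release(v, width=3):
    """Return the leading numeric components of a version string, zero-padded."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at suffixes such as "dev0"
        parts.append(int(digits))
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare release numbers only; pre-release tags are ignored (a simplification)."""
    return parse_release(installed) >= parse_release(minimum)


if __name__ == "__main__":
    try:
        installed = version("datasets")
    except PackageNotFoundError:
        print("datasets is not installed")
    else:
        status = "OK" if meets_minimum(installed, MIN_DATASETS) else "too old"
        print(f"datasets {installed}: {status}")
```

The comparison is numeric per component, so `2.9.1` correctly sorts below `2.14.0`, which a plain string comparison would get wrong.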

@lhoestq
Contributor

lhoestq commented Jan 18, 2024

This dataset used to be defined using a dataset script, and we recently converted it to Parquet to enable the datasets security features. However, because of this change, datasets 2.14 is now needed to load this dataset. Sorry for the inconvenience.

Alternatively, it's possible to load the old version of this dataset (with the dataset script) by specifying a revision from before the change; see https://huggingface.co/datasets/allenai/ai2_arc/commits/main
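Pinning a revision could look roughly like the sketch below. It relies on the `revision` argument of `load_dataset`; the `ARC-Easy` config name and the helpers `pinned_load_kwargs` and `load_pinned` are assumptions for illustration, and the commit hash has to be picked by hand from the commit history linked above.

```python
def pinned_load_kwargs(revision, path="allenai/ai2_arc", name="ARC-Easy"):
    """Build load_dataset arguments for a specific dataset revision.

    `revision` is a commit hash, branch, or tag of the dataset repo.
    No default is provided: choose a commit from before the Parquet conversion.
    """
    if not revision or not revision.strip():
        raise ValueError("pass a commit hash, branch, or tag from the dataset's commit history")
    return {"path": path, "name": name, "revision": revision.strip()}


def load_pinned(revision):
    """Load the dataset at the given revision (hypothetical helper, needs network)."""
    from datasets import load_dataset  # imported lazily so the helper above stays importable

    return load_dataset(**pinned_load_kwargs(revision))
```

Usage would be `load_pinned("<commit-sha>")`, with the placeholder replaced by a real hash from the commits page.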
