Cannot load hugging face dataset for function calling #371

Closed
sebastiangonsal opened this issue Apr 19, 2024 · 1 comment
@sebastiangonsal

To repro, try the following:

from datasets import load_dataset

dataset = load_dataset("gorilla-llm/Berkeley-Function-Calling-Leaderboard")

You will get the following error:

  File "pyarrow/_json.pyx", line 308, in pyarrow._json.read_json
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: JSON parse error: Column(/execution_result/[]) changed from number to string in row 13
@CharlieJCJ
Contributor

CharlieJCJ commented Apr 19, 2024

Hi @sebastiangonsal,

We describe how to access the dataset in this section of the README: https://github.com/ShishirPatil/gorilla/tree/main/berkeley-function-call-leaderboard#prepare-evaluation-dataset. Please download the Hugging Face dataset by following that procedure, which is fully tested and functional.

You can load the dataset with the helper function defined here: https://github.com/ShishirPatil/gorilla/blob/main/berkeley-function-call-leaderboard/eval_checker/eval_runner_helper.py#L409C1-L415C18. The dataset is organized as JSON files, where each file represents a test category. The helper function loads a JSON file into a list of dictionaries.
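For illustration, a minimal sketch of that kind of loader is below. It is not the code from eval_runner_helper.py: the name load_json_lines is hypothetical, and it assumes each file holds one JSON object per line, which the row number in the error above suggests but the thread does not confirm. Because the standard json module does not infer a per-column schema, a field such as execution_result can hold numbers in one row and strings in another without raising an error.

```python
import json
from pathlib import Path


def load_json_lines(file_path):
    """Hypothetical helper: read a file with one JSON object per line.

    Returns a list of dictionaries, one per non-empty line.
    Assumption: the file is in JSON Lines layout.
    """
    records = []
    with Path(file_path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Call it with the path to one downloaded category file; the result is a plain Python list, so no Arrow type inference is involved.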

We are looking into making the dataset compatible with the Hugging Face datasets package so that there is an alternative way to access it. Thanks for bringing this to our attention! In the meantime, please use the approach outlined above.

Best,
BFCL Team
