Cannot load dataset with 2.14.5: FileNotFound error #6305

Closed
finiteautomata opened this issue Oct 16, 2023 · 2 comments · Fixed by #6309
@finiteautomata

Describe the bug

I'm trying to load [piuba-bigdata/articles_and_comments] and I get the error below with `datasets` 2.14.5. The same call works with 2.10.0.

Steps to reproduce the bug

Colab link

Downloading readme: 100% 1.19k/1.19k [00:00<00:00, 30.9kB/s]
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
[<ipython-input-2-807c3583d297>](https://localhost:8080/#) in <cell line: 3>()
      1 from datasets import load_dataset
      2 
----> 3 load_dataset("piuba-bigdata/articles_and_comments", split="train")

2 frames
[/usr/local/lib/python3.10/dist-packages/datasets/load.py](https://localhost:8080/#) in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, verification_mode, ignore_verifications, keep_in_memory, save_infos, revision, token, use_auth_token, task, streaming, num_proc, storage_options, **config_kwargs)
   2127 
   2128     # Create a dataset builder
-> 2129     builder_instance = load_dataset_builder(
   2130         path=path,
   2131         name=name,

[/usr/local/lib/python3.10/dist-packages/datasets/load.py](https://localhost:8080/#) in load_dataset_builder(path, name, data_dir, data_files, cache_dir, features, download_config, download_mode, revision, token, use_auth_token, storage_options, **config_kwargs)
   1813         download_config = download_config.copy() if download_config else DownloadConfig()
   1814         download_config.storage_options.update(storage_options)
-> 1815     dataset_module = dataset_module_factory(
   1816         path,
   1817         revision=revision,

[/usr/local/lib/python3.10/dist-packages/datasets/load.py](https://localhost:8080/#) in dataset_module_factory(path, revision, download_config, download_mode, dynamic_modules_path, data_dir, data_files, **download_kwargs)
   1506                     raise e1 from None
   1507                 if isinstance(e1, FileNotFoundError):
-> 1508                     raise FileNotFoundError(
   1509                         f"Couldn't find a dataset script at {relative_to_absolute_path(combined_path)} or any data file in the same directory. "
   1510                         f"Couldn't find '{path}' on the Hugging Face Hub either: {type(e1).__name__}: {e1}"

FileNotFoundError: Couldn't find a dataset script at /content/piuba-bigdata/articles_and_comments/articles_and_comments.py or any data file in the same directory. Couldn't find 'piuba-bigdata/articles_and_comments' on the Hugging Face Hub either: FileNotFoundError: No (supported) data files or dataset script found in piuba-bigdata/articles_and_comments.
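
For anyone wanting to check whether their own environment is affected, the call above can be wrapped so that the installed version is printed next to the outcome. This is only a sketch: `installed_version`, `try_load`, and `reproduce` are hypothetical helper names, not part of `datasets`, and `reproduce()` assumes `datasets` is installed and the Hub is reachable.

```python
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def try_load(loader, repo_id: str, **kwargs):
    """Call `loader(repo_id, **kwargs)`.

    Returns (True, result) on success, or (False, message) when the loader
    raises FileNotFoundError, which is the error shown in the traceback above.
    """
    try:
        return True, loader(repo_id, **kwargs)
    except FileNotFoundError as err:
        return False, f"{type(err).__name__}: {err}"


def reproduce():
    """Run the call from this report and print the version alongside the outcome."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    print("datasets version:", installed_version("datasets"))
    ok, result = try_load(
        load_dataset, "piuba-bigdata/articles_and_comments", split="train"
    )
    print("loaded" if ok else "failed", result)
```

Calling `reproduce()` on 2.14.5 should print the `FileNotFoundError` message from the traceback; per this report, the same call loads the dataset on 2.10.0.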

Expected behavior

It should load normally.

Environment info

- `datasets` version: 2.14.5
- Platform: Linux-5.15.120+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.18.0
- PyArrow version: 9.0.0
- Pandas version: 1.5.3

@albertvillanova self-assigned this Oct 17, 2023

@albertvillanova
Member

Thanks for reporting, @finiteautomata.

We are investigating it.

@albertvillanova
Member

There is a bug in datasets. You can see our proposed fix:
