At least for 'reclor', 'telugu_books', 'turkish_movie_sentiment', 'ubuntu_dialogs_corpus', 'wikihow', trying to load_dataset in streaming mode raises a TypeError without any detail about why it fails.
AssertionError: The dataset reclor with config default requires manual data.
Please follow the manual download instructions: to use ReClor you need to download it manually. Please go to its homepage (http://whyu.me/reclor/) fill the google
form and you will receive a download link and a password to extract it.Please extract all files in one folder and use the path folder in datasets.load_dataset('reclor', data_dir='path/to/folder/folder_name')
.
Manual data can be loaded with `datasets.load_dataset(reclor, data_dir='<path/to/manual/data>')
Actual results
TypeError: expected str, bytes or os.PathLike object, not NoneType
Environment info
datasets version: 1.11.0
Platform: macOS-11.5-x86_64-i386-64bit
Python version: 3.8.11
PyArrow version: 4.0.1
As discussed, datasets requiring manual download should be:
programmatically identifiable
properly handled, with a clearer error message, when trying to load them in streaming mode
Regarding programmatic identifiability, note that for datasets requiring manual download, the builder has a property manual_download_instructions which is not None:
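As a rough illustration of that check, here is a minimal sketch. `FakeBuilder` and `requires_manual_download` are hypothetical names used only for this example; `FakeBuilder` is a stand-in and not the real builder class from `datasets`. The only thing taken from the discussion above is the `manual_download_instructions` attribute and the "is not None" test.

```python
from typing import Optional


class FakeBuilder:
    """Hypothetical stand-in for a dataset builder (NOT the real datasets class)."""

    def __init__(self, name: str, manual_download_instructions: Optional[str] = None):
        self.name = name
        # None means the data can be fetched automatically.
        self.manual_download_instructions = manual_download_instructions


def requires_manual_download(builder) -> bool:
    """Return True when the builder carries non-None manual download instructions."""
    return getattr(builder, "manual_download_instructions", None) is not None


# Illustrative builders; the instruction text here is a placeholder.
manual_builder = FakeBuilder("reclor", "Download the archive manually and pass data_dir.")
auto_builder = FakeBuilder("some_auto_dataset")

print(requires_manual_download(manual_builder))
print(requires_manual_download(auto_builder))
```

With a real builder object, the same predicate would presumably apply unchanged, since it only reads the one attribute.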
Describe the bug
At least for 'reclor', 'telugu_books', 'turkish_movie_sentiment', 'ubuntu_dialogs_corpus', 'wikihow', trying to load_dataset in streaming mode raises a TypeError without any detail about why it fails.
Steps to reproduce the bug
Expected results
Ideally: raise a specific exception, something like ManualDownloadError.
Or at least give the reason in the message, as happens when loading in normal mode (see the AssertionError above).
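One possible shape for such an exception, as a minimal sketch only. `ManualDownloadError` is the name suggested in this issue, not an existing class in `datasets`, and `check_streamable` is a hypothetical helper invented for this example. The message wording is modelled loosely on the normal-mode error.

```python
from typing import Optional


class ManualDownloadError(Exception):
    """Hypothetical exception for datasets that need manually downloaded data."""


def check_streamable(name: str, instructions: Optional[str]) -> None:
    """Raise ManualDownloadError if the dataset has manual download instructions.

    `instructions` plays the role of the builder's manual_download_instructions.
    """
    if instructions is not None:
        raise ManualDownloadError(
            f"The dataset {name} requires manual data and cannot be loaded "
            f"in streaming mode.\n"
            f"Please follow the manual download instructions: {instructions}"
        )


# Example: the error states the reason instead of an opaque TypeError.
try:
    check_streamable("reclor", "Download the archive manually and pass data_dir.")
except ManualDownloadError as err:
    message = str(err)
    print(message)
```

Raising a dedicated exception type lets callers catch this case specifically, while the message still explains why loading failed.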