Pull requests: huggingface/datasets

Pull requests list

Better error handling in dataset_module_factory
#6959 opened Jun 7, 2024 by Wauplin
Move info_utils errors to exceptions module
#6952 opened Jun 4, 2024 by albertvillanova
Support folder-based datasets with large metadata.jsonl
#6859 opened May 2, 2024 by gbenson
LargeListType support #6834
#6835 opened Apr 24, 2024 by Modexus
Support downloading specific splits in load_dataset
#6832 opened Apr 23, 2024 by mariosasko
Make Image cast storage faster
#6786 opened Apr 5, 2024 by Modexus
Allow polars as valid output type
#6762 opened Mar 28, 2024 by psmyth94
3x Faster Text Preprocessing
#6711 opened Mar 3, 2024 by ashvardanian
Persist IterableDataset epoch in workers
#6710 opened Mar 2, 2024 by lhoestq
__add__ for Dataset, IterableDataset
#6694 opened Feb 26, 2024 by oh-gnues-iohc
Run download_and_prepare if missing splits
#6639 opened Feb 2, 2024 by lhoestq
Add return_file_name in load_dataset
#6310 opened Oct 17, 2023 by juliendenize
Move exceptions.py to utils/exceptions.py
#6296 opened Oct 11, 2023 by mariosasko
Drop data_files duplicates
#6282 opened Oct 5, 2023 by lhoestq
Add repo_id to DatasetInfo
#6268 opened Sep 29, 2023 by lhoestq Draft
2 tasks
Use LibYAML with PyYAML if available
#6266 opened Sep 27, 2023 by bryant1410