New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Finish deprecating the fs argument #5393
Finish deprecating the fs argument #5393
Conversation
…o and datasets.load
The documentation is not available anymore as the PR was closed or merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the deprecation. Some minor suggested fixes below...
Also note that the corresponding tests should be updated as well.
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Thanks for the suggestions/typo fixes. I updated the failing test - passing locally now |
Nice thanks ! I believe you also need to update This should remove the remaining warnings in the CI such as tests/test_builder.py::test_builder_with_filesystem_download_and_prepare_reload
tests/test_load.py::test_load_dataset_local[False]
tests/test_load.py::test_load_dataset_local[True]
tests/test_load.py::test_load_dataset_zip_csv[csv_path-False]
tests/test_load.py::test_load_dataset_then_move_then_reload
/opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/datasets/info.py:344: FutureWarning: 'fs' was deprecated in favor of 'storage_options' in version 2.9.0 and will be removed in 3.0.0.
You can remove this warning by passing 'storage_options=fs.storage_options' instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for all the work, @dconathan.
I agree with @lhoestq that we should better address all remaining fs
deprecation warnings with this PR.
For example, there are still some deprecation warnings when calling Dataset.load_from_disk
with fs
. See:
Line 1819 in 232a439
return Dataset.load_from_disk(dataset_path, fs, keep_in_memory=keep_in_memory) |
or DatasetDict.load_from_disk
with fs
. See:
Line 1821 in 232a439
return DatasetDict.load_from_disk(dataset_path, fs, keep_in_memory=keep_in_memory) |
These docstrings should also be updated:
>>> dataset = load_from_disk('s3://my-private-datasets/imdb/train', fs=s3) # doctest: +SKIP |
>>> dataset.save_to_disk('s3://my-private-datasets/imdb/train', fs=s3) # doctest: +SKIP |
fs=
arg
re: docstring, I assume passing in |
what about datasets/src/datasets/filesystems/__init__.py Lines 43 to 54 in 5b793dd
leave as is? Is this function no longer necessary? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for all your work on this PR, @dconathan.
I think the function is_remote_filesystem
should be kept as it is.
We are going to re-run the CI. Once all green, we can merge.
Show benchmarksPyArrow==6.0.0 Show updated benchmarks!Benchmark: benchmark_array_xd.json
Benchmark: benchmark_getitem_100B.json
Benchmark: benchmark_indices_mapping.json
Benchmark: benchmark_iterating.json
Benchmark: benchmark_map_filter.json
Show updated benchmarks!Benchmark: benchmark_array_xd.json
Benchmark: benchmark_getitem_100B.json
Benchmark: benchmark_indices_mapping.json
Benchmark: benchmark_iterating.json
Benchmark: benchmark_map_filter.json
|
See #5385 for some discussion on this
The
fs=
arg was depcrecated fromDataset.save_to_disk
andDataset.load_from_disk
in2.8.0
(to be removed in3.0.0
). There are a few other places where thefs=
arg was still used (functions/methods indatasets.info
anddatasets.load
). This PR adds a similar behavior, warnings and thestorage_options=
arg to these functions and methods.One question: should the "deprecated" / "added" versions be
2.8.1
for the docs/warnings on these? Right now I'm going with "fs was deprecated in 2.8.0" but "storage_options= was added in 2.8.1" where appropriate.@mariosasko