Tweak readme #5210
The documentation is not available anymore as the PR was closed or merged.
Nit: We should also update the
Updated the disclaimers section, thanks! Does it sound good to you, @albertvillanova?
Thanks for the docs improvements.
Some nits below: feel free to ignore them.
README.md
Outdated
    @@ -46,6 +46,8 @@
    - Smart caching: never wait for your data to process several times.
    - Lightweight and fast with a transparent and pythonic API (multi-processing/caching/memory-mapping).
    - Built-in interoperability with NumPy, pandas, PyTorch, Tensorflow 2 and JAX.
    - Native support for audio and image data
    - Stream datasets without downloading them completely
I think it may be unclear what "without downloading them completely" means.
- This could be understood as: stream datasets while saving them only partially to disk.
I changed it to
Enable streaming mode to save disk space and start iterating over the dataset immediately.
README.md
Outdated
    If your dataset is bigger than your disk or if you don't want to wait to download the data, you can use streaming:

    ```python
    # If you want to efficiently download the data as you iterate over the dataset
Again here, "download" can be understood as "save to disk".
Changed to
If you want to use the dataset immediately and efficiently stream the data as you iterate over the dataset
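For readers of this thread, here is a minimal sketch of the distinction being discussed: "download" in the sense of "save everything to disk first" versus "stream" in the sense of "hand over each example as it arrives". It uses only the standard library and is not how `datasets` implements streaming. `remote_records`, `download`, `iterate_cached` and `stream` are hypothetical names made up for illustration. In the library itself, the behaviour is requested by passing `streaming=True` to `load_dataset`.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def remote_records(n: int) -> Iterator[dict]:
    """Hypothetical stand-in for a remote source: yields records one at a time."""
    for i in range(n):
        yield {"id": i, "text": f"example {i}"}


def download(n: int, cache_dir: Path) -> Path:
    """Non-streaming: write every record to disk before anything can be read."""
    cache_file = cache_dir / "data.jsonl"
    with cache_file.open("w") as f:
        for record in remote_records(n):
            f.write(json.dumps(record) + "\n")
    return cache_file


def iterate_cached(cache_file: Path) -> Iterator[dict]:
    """Read back the records that `download` saved."""
    with cache_file.open() as f:
        for line in f:
            yield json.loads(line)


def stream(n: int) -> Iterator[dict]:
    """Streaming: pass each record to the caller as it arrives; nothing is saved."""
    yield from remote_records(n)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = Path(tmp)

        # Streaming: the first example is available immediately and the
        # cache directory stays empty, however large the source is.
        first = next(stream(1_000_000_000))
        print(first, [p.name for p in cache_dir.iterdir()])

        # Download first: the whole (small) dataset lands on disk before use.
        cache_file = download(5, cache_dir)
        print(list(iterate_cached(cache_file))[0], [p.name for p in cache_dir.iterdir()])
```

In this sketch the streaming path never touches the disk, which is the reading the reworded sentences are meant to convey.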
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Tweaked some paragraphs mentioning the modalities we support + added a paragraph on security