Document to compress data files before uploading #5687
Comments
Great idea! Should we also take this opportunity to include some audio/image file formats? Currently, it still reads very text heavy. Something like:
Hi @stevhliu, thanks for your suggestion. I agree it is a good opportunity to mention that audio/image file formats are also supported. Nit: What about something similar to:
Note that for compressions I have mentioned:
Perfect, thanks for making the distinction between compression and data extensions!
In our docs on how to Share a dataset to the Hub, we tell users to upload their data files directly, like CSV, JSON, JSON-Lines, text,... However, these extensions are not tracked by Git LFS by default, as they are not in the .gitattributes file. Therefore, if the files are too large, Git will fail to commit/upload them.

I think for those file extensions (.csv, .json, .jsonl, .txt), we should recommend compressing the data files (using ZIP for example) before uploading them to the Hub.
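
To illustrate the suggestion, here is a minimal sketch of compressing a data file before uploading it, using only Python's standard `zipfile` module. The helper name `compress_file` and the file name `data.csv` are hypothetical, and the sketch assumes that `.zip` is among the extensions listed in the repository's .gitattributes file.

```python
import zipfile
from pathlib import Path


def compress_file(src: str) -> Path:
    """Write `src` into a ZIP archive next to it and return the archive path.

    Hypothetical helper for illustration; not part of any library.
    """
    src_path = Path(src)
    # e.g. data.csv -> data.csv.zip, so the original extension stays visible
    zip_path = src_path.with_name(src_path.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name inside the archive, not the full path
        zf.write(src_path, arcname=src_path.name)
    return zip_path
```

Calling `compress_file("data.csv")` would write `data.csv.zip` alongside the original file; the archive, rather than the raw CSV, is then what gets committed and uploaded.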

What do you think?
CC: @stevhliu
See related issue: