[FEA] This repo is too big. Consider using git-lfs or keeping the data outside of git. #290

Open
harrism opened this issue Apr 24, 2020 · 2 comments
Assignees: taureandyernv
Labels: enhancement (New feature or request), on deck (next in line to be merged)

Comments


harrism commented Apr 24, 2020

Describe the bug
This repo is about 500 MB and growing, so cloning is slow. Most users probably don't use all the datasets. We should consider externalizing the datasets, or using git-lfs (Git Large File Storage):
https://git-lfs.github.com

(base) mharris@dgx02:~/rapids$ git clone git@github.com:harrism/notebooks-contrib.git
Cloning into 'notebooks-contrib'...
remote: Enumerating objects: 4547, done.
remote: Total 4547 (delta 0), reused 0 (delta 0), pack-reused 4547
Receiving objects: 100% (4547/4547), 489.83 MiB | 24.91 MiB/s, done.
Resolving deltas: 100% (2392/2392), done.
Checking out files: 100% (213/213), done.
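
For the git-lfs option, here is a minimal sketch of what tracking looks like. It assumes the datasets are CSV files; the `*.csv` pattern is an illustration, not something taken from this repo. Running `git lfs track "*.csv"` writes a line like the first block below to `.gitattributes`. After that, git stores only a small pointer file for each matching file, shaped like the second block (the angle-bracket values are placeholders), and the real contents live in LFS storage.

```
*.csv filter=lfs diff=lfs merge=lfs -text
```

```
version https://git-lfs.github.com/spec/v1
oid sha256:<sha256 of the file contents>
size <size in bytes>
```

Note that tracking only affects files committed from then on; anything already committed stays in the existing history.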
@harrism harrism added the enhancement New feature or request label Apr 24, 2020
@taureandyernv
Contributor

@harrism, keeping the repo size down has been a problem, but we should not be at 500 MB; IIRC it was around 300 MB. We may have had an accidental push of a dataset or of history. If Josh okays it, I will search for and remove the old datasets. Thanks, man!
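
One way to do that search is to list the largest blobs anywhere in the history, not just in the current checkout. The sketch below is an assumption about how it could be done with stock git commands; `largest_blobs` is a hypothetical helper name, not something in this repo.

```shell
#!/usr/bin/env bash
# List the N largest blobs reachable from any ref, including files that
# were deleted in later commits. Output: "<size in bytes> <path>" per line.
largest_blobs() {
  local count="${1:-20}"
  # rev-list --objects prints "<oid> <path>"; cat-file resolves the oid
  # and passes the path through as %(rest).
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { print substr($0, 6) }' |  # drop the "blob " prefix
    sort -n -r -k1,1 |
    head -n "$count"
}
```

Run from inside a clone, `largest_blobs 10` prints the ten biggest files ever committed, which should show whether a dataset was pushed by accident.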

@taureandyernv taureandyernv self-assigned this Apr 28, 2020
@taureandyernv taureandyernv added the on deck next in line to be merged label Apr 28, 2020
@harrism
Author

harrism commented Apr 29, 2020

@taureandyernv Even if you remove them, they will still be in the git history (and still downloaded by everybody who clones) unless you rewrite the history. This is why large datasets belong in a different system such as git-lfs: it replaces the files in the repository with small text pointers, and the contents are kept in external storage on GitHub.
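
The point about history can be checked in a throwaway repo. This is a sketch, not a procedure for this repo: `blob_still_in_history` and `demo` are hypothetical helper names, `dataset.bin` is a stand-in file, and the path match assumes a path without spaces.

```shell
#!/usr/bin/env bash
# Print the object id of a path's blob if any commit still references it.
blob_still_in_history() {
  local path="$1"
  # rev-list --objects prints "<oid> <path>" for blobs in every commit.
  git rev-list --objects --all | awk -v p="$path" '$2 == p { print $1 }'
}

# Build a scratch repo, commit a file, delete it, then look for its blob.
demo() {
  local repo
  repo="$(mktemp -d)" || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity
    git config user.name "Demo"
    head -c 1000000 /dev/zero > dataset.bin    # stand-in for a dataset
    git add dataset.bin && git commit -q -m "add dataset"
    git rm -q dataset.bin && git commit -q -m "remove dataset"
    blob_still_in_history dataset.bin
  )
  rm -rf "$repo"
}
```

Calling `demo` prints an object id even though the file is gone from the working tree, which is why a plain delete does not shrink clones. Actually reclaiming the space means rewriting history (for example with `git filter-repo` or `git lfs migrate import`) and force-pushing, which everyone with an existing clone would need to coordinate on.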
