Repository is unexpectedly large #71
Comments
This is a problem when trying to create a Docker image from these tools. As of today:
I can filter test and git out of the Docker setup, but the build process for normami generates a Debian package (which, AFAICT, isn't used when running the tools) that relies on some example files from test. Locally I've just commented out building the deb file for now. Once filtered, only 350 MB is left for building the image (5% of the storage!). I suspect that could go down further, but it would be a massive start just to move the test data out and then purge it from the git history. Git really isn't the best place to store large test data; if you are going to do this, you at least want it in a submodule so that the main repository can remain lean. Git LFS may also be a solution here.
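To see where the space actually goes before deciding what to filter or move, something like the following could help. It is only a rough sketch: the function names are made up for illustration, and it assumes it is pointed at a local checkout (it simply sums file sizes per top-level entry, so a test directory or the git metadata shows up as one line each).

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):  # os.walk does not follow symlinked dirs by default
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def top_level_sizes(repo_root):
    """Return {entry_name: size_in_bytes} for each top-level entry of repo_root."""
    sizes = {}
    for entry in os.listdir(repo_root):
        full = os.path.join(repo_root, entry)
        if os.path.islink(full):
            continue  # ignore symlinks rather than guess what they point at
        if os.path.isdir(full):
            sizes[entry] = dir_size_bytes(full)
        elif os.path.isfile(full):
            sizes[entry] = os.path.getsize(full)
    return sizes


def report(repo_root):
    """Print top-level entries, largest first (hypothetical helper)."""
    sizes = top_level_sizes(repo_root)
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1e6:10.1f} MB  {name}")
```

Calling `report(".")` from the root of a checkout lists the largest entries first, which should make it obvious how much of the clone is test data versus history.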
Agreed.
The GitHub API gives the size of the Norma repository as 362425 KB and the AMI repository as 301415 KB.
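For anyone wanting to re-check these figures, here is a minimal sketch of how they could be retrieved. It relies on the `size` field of the GitHub REST API repository endpoint, which is reported in kilobytes. The repository name `ContentMine/norma` is an assumption (only `ContentMine/ami` is named in this issue), and the helper names are illustrative.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/repos"


def fetch_repo(owner, repo):
    """Fetch the repository metadata JSON from the GitHub REST API (needs network access)."""
    url = f"{API_ROOT}/{owner}/{repo}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def repo_size_kb(payload):
    """Extract the 'size' field, which GitHub reports in kilobytes."""
    return int(payload["size"])


def format_size(kb):
    """Render a KB figure with an approximate MB equivalent."""
    return f"{kb} KB (~{kb / 1024:.0f} MB)"


def main():
    # "norma" is an assumed repository name; "ami" comes from the linked issue.
    for owner, repo in [("ContentMine", "norma"), ("ContentMine", "ami")]:
        print(repo, format_size(repo_size_kb(fetch_repo(owner, repo))))
```

Running `main()` prints one line per repository. Unauthenticated requests are rate-limited, so this is best suited to an occasional check rather than anything automated.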
The recent experience of two new developers, both of whom needed to buy additional hardware in order to be able to clone and work with these repositories, suggests that new users or developers are unlikely to expect these repositories to be so large.
Ways of reducing the size of the repositories should be investigated. For instance, could the repositories' test corpora be factored out into a different module that can be shared, as a dependency, between Norma, AMI, and perhaps other modules in the AMI stack?
(Corresponding AMI issue: ContentMine/ami#70.)