You're working on a project and find yourself accumulating large files that are not well suited to Git, so you add them to your .gitignore.
Months pass, and you want to share your repository along with all of the large files, which are now strewn about the repository and hidden from Git.
At this point your options are limited to git-lfs (which now requires a paid subscription for truly large files) or linking to an external storage container outside of the repository, which will often require refactoring.
RepoPacker is a command-line tool built to work in harmony with Git to solve this problem. In a nutshell, RepoPacker follows a simple workflow:
- You specify which files should be tracked by RepoPacker.
- RepoPacker collects tracked files and packs them into a standard zip file.
- When sharing the repository, the person downloading your code downloads both the repository and the zip file generated by RepoPacker.
- RepoPacker unpacks the zip file into the tracked locations in their local repository.
The person downloading your code is left with a local Git repository with the large files in the right places.
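To make the pack and unpack steps above concrete, here is a minimal sketch using Python's standard zipfile module. It is not RepoPacker's actual implementation; the function names pack and unpack and their signatures are assumptions made for illustration only.

```python
import zipfile
from pathlib import Path


def pack(repo_root: Path, tracked: list[str], archive: Path) -> None:
    """Write every tracked file into a zip, keyed by its repo-relative path."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for rel in tracked:
            source = repo_root / rel
            if source.is_dir():
                # Directories are expanded to the files they contain.
                for child in sorted(p for p in source.rglob("*") if p.is_file()):
                    zf.write(child, child.relative_to(repo_root).as_posix())
            else:
                zf.write(source, Path(rel).as_posix())


def unpack(repo_root: Path, archive: Path) -> list[str]:
    """Extract the archive so each file lands at its original relative path."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(repo_root)
        return sorted(name for name in zf.namelist() if not name.endswith("/"))
```

Because each archive entry is stored under its path relative to the repository root, extracting into a fresh clone puts the files back where they were.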
Installation requires Git and Python 3.9+. Once you've installed the required dependencies, you can install RepoPacker via pip:
pip install repopacker
- Add files: Add any large files or directories that you want tracked by RepoPacker:
repopacker add largefile.txt
repopacker add dir/
This will create a file named .repopacker.json at the Git root of your project. It contains some basic configuration information and acts as a storage container for RepoPacker. This command does two things:
(1) largefile.txt and all the files in dir will be added to the list of files tracked by RepoPacker.
(2) largefile.txt and all the files in dir will be added to the .gitignore for your project (you can disable this behavior by setting the .gitignore flag in the .repopacker.json to false).
Note: RepoPacker does not track changes to the files it tracks. All RepoPacker stores is a static link to each file.
- Pack: Once you have added all the files you want tracked, you can create a zip file:
repopacker pack repopack.zip
You should upload this to an accessible location such as Dropbox or Zenodo. Optionally, you can also add the link to the downloadpath in the .repopacker.json.
- Unpack: Populate directories with large files in their original locations:
repopacker unpack repopack.zip
- Setup: When running pack, RepoPacker will store the hash for the pack in the .repopacker.json. You can set up a download link for your pack with:
repopacker config --downloadpath <your-download-link.zip>
- Download: You can now automatically download the zip and unpack as usual:
repopacker download repopack.zip
repopacker unpack repopack.zip
Note: this will download a zip file from the internet. RepoPacker performs a basic check using the SHA256 hash of the file, but this does not guarantee that the zip file is safe to unpack. Only use this option with repositories you trust.
- Initialize: From an existing Git repository, initialize the RepoPacker system without adding any files first:
repopacker init
This is useful if you want to set up some configuration options before adding any files.
- List files: See all files currently tracked by RepoPacker:
repopacker list
- Remove files: If you accidentally add files to RepoPacker that you shouldn't have, you can remove them:
repopacker remove largefile.txt
This will also remove largefile.txt from the .gitignore if you have the option enabled.
- Configuration: See all configuration options:
repopacker config -h
- Clean: When downloading an updated version of the RepoPacker zip file, it can be useful to clear all preexisting files tracked by RepoPacker:
repopacker clean
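The bookkeeping that the add command is described as doing (recording a path in .repopacker.json and listing it in .gitignore) can be sketched as follows. This is an illustration under stated assumptions, not RepoPacker's code: the function add_path and the JSON keys "files" and "gitignore" are hypothetical, since the real schema of .repopacker.json is not documented here.

```python
import json
from pathlib import Path

CONFIG_NAME = ".repopacker.json"  # file name taken from the text above


def add_path(repo_root: Path, rel_path: str) -> dict:
    """Record rel_path as tracked and, if enabled, list it in .gitignore."""
    config_file = repo_root / CONFIG_NAME
    if config_file.exists():
        config = json.loads(config_file.read_text())
    else:
        # Hypothetical schema: the real key names are an assumption.
        config = {"files": [], "gitignore": True}

    # Track the path once, even if it is added repeatedly.
    if rel_path not in config["files"]:
        config["files"].append(rel_path)

    if config["gitignore"]:
        ignore_file = repo_root / ".gitignore"
        existing = ignore_file.read_text().splitlines() if ignore_file.exists() else []
        if rel_path not in existing:
            existing.append(rel_path)
            ignore_file.write_text("\n".join(existing) + "\n")

    config_file.write_text(json.dumps(config, indent=2))
    return config
```

Only the path is stored, which matches the note that tracked files are static links and their changes are not tracked.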
- RepoPacker packs all of your files into a single zip file, making it poorly suited to extremely large files (more than 100 GB will probably become unwieldy).
- To verify the integrity of zip files, RepoPacker computes a checksum of the zip file before unpacking. This can be slow for large zips. You can disable this behavior by modifying the config.
- RepoPacker is not directly integrated with Git, meaning any operations that update the file tree (like git mv) will not be known to RepoPacker. You must remove these files and re-add them using RepoPacker for them to be properly tracked.
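The checksum step mentioned in the limitations above can be illustrated with Python's hashlib. This is a minimal sketch of SHA256 verification in general, not RepoPacker's own code; the helper names sha256_of and matches_recorded_hash are assumptions. It also shows why the check is slow for large zips: every byte of the archive has to be read once.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so it never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path: Path, recorded_hex: str) -> bool:
    """Compare the file's SHA256 digest with a previously recorded one."""
    return sha256_of(path) == recorded_hex.lower()
```

A matching digest shows that the archive is the same one that was hashed at pack time; it says nothing about whether the original contents were trustworthy.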