coursekevin/repopacker

RepoPacker

A low-tech solution to large files in Git

The problem

You're working on a project and accumulating large files that are not well suited to Git, so you add them to your .gitignore.

Months pass, and you want to share your repository along with all of the large files, which are now strewn about the repository and hidden from Git.

At this point, your options are limited to git-lfs (which now requires a paid subscription for truly large files) or linking to an external storage container outside of the repository, which will often require refactoring.

Enter RepoPacker

RepoPacker is a command-line tool built to work in harmony with Git to solve this problem. In a nutshell, RepoPacker follows a simple workflow:

  1. You specify which files should be tracked by RepoPacker.
  2. RepoPacker collects tracked files and packs them into a standard zip file.
  3. When sharing the repository, the person downloading your code downloads both the repository and the zip file generated by RepoPacker.
  4. On their machine, RepoPacker unpacks the zip file into the tracked locations in their local copy of the repository.

The person downloading your code is left with a local git repository with the large files in their right place.
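
The pack and unpack steps above can be sketched with Python's standard zipfile module. This is a minimal illustration of the idea, not RepoPacker's actual implementation: the function names `pack` and `unpack` and the plain list of tracked paths are assumptions made for this example.

```python
import zipfile
from pathlib import Path


def pack(repo_root: Path, tracked: list[str], archive: Path) -> None:
    """Write each tracked file into the zip under its repo-relative path."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for rel in tracked:
            # arcname keeps the path relative to the repository root,
            # so the archive does not depend on where the repo lives on disk.
            zf.write(repo_root / rel, arcname=rel)


def unpack(repo_root: Path, archive: Path) -> None:
    """Extract every member back to the same repo-relative location."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(repo_root)
```

Because the archive stores repo-relative paths, extracting it at the root of a fresh clone puts each file back where the original author had it.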

Installation

Installation requires Git and Python 3.9+. Once you've installed these dependencies, you can install repopacker via pip:

pip install repopacker

Usage

Basic functionality

  • Add files: Add any large files or directories that you want tracked by RepoPacker:

    repopacker add largefile.txt
    repopacker add dir/

    This will create a file named .repopacker.json at the Git root of your project. It contains some basic config information and will act as a storage container for RepoPacker.

    This command does two things:

    (1) largefile.txt and all the files in dir will be added to the list of files tracked by RepoPacker.

    (2) largefile.txt and all the files in dir will be added to the .gitignore for your project (you can disable this behavior by setting the .gitignore flag in the .repopacker.json to false).

    Note: files tracked by RepoPacker will not have their changes tracked. All RepoPacker stores is a static link to the file.

  • Pack: Once you have added all the files you want tracked, you can create a zip file:

    repopacker pack repopack.zip

    You should upload this to an accessible location such as Dropbox or Zenodo. Optionally, you can also add the link to the downloadpath in the .repopacker.json.

  • Unpack: Populate directories with large files in their original locations:

    repopacker unpack repopack.zip
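
To make the two effects of the add step concrete, here is a rough sketch of recording a path in a JSON config and appending it to .gitignore. The layout of .repopacker.json is not described in this README, so the key names `tracked_files` and `gitignore`, and the function name `track`, are hypothetical and chosen only for illustration.

```python
import json
from pathlib import Path


def track(repo_root: Path, rel_path: str) -> None:
    """Record rel_path in a JSON config and, if enabled, ignore it in Git."""
    config_file = repo_root / ".repopacker.json"
    # Hypothetical schema: these key names are illustrative, not RepoPacker's.
    config = {"tracked_files": [], "gitignore": True}
    if config_file.exists():
        config.update(json.loads(config_file.read_text()))

    # Store only the path (a static link), never the file contents.
    if rel_path not in config["tracked_files"]:
        config["tracked_files"].append(rel_path)
    config_file.write_text(json.dumps(config, indent=2))

    if config["gitignore"]:
        ignore_file = repo_root / ".gitignore"
        lines = ignore_file.read_text().splitlines() if ignore_file.exists() else []
        if rel_path not in lines:
            lines.append(rel_path)
            ignore_file.write_text("\n".join(lines) + "\n")
```

Calling the function twice with the same path leaves a single entry in both files.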

Easy downloading

  • Setup: When running pack, RepoPacker will store the hash for the pack in the .repopacker.json. You can set up a download link for your pack with:

    repopacker config --downloadpath <your-download-link.zip>

  • Download: You can now automatically download the zip and unpack as usual:

    repopacker download repopack.zip
    repopacker unpack repopack.zip

Note that this will download a zip file from the internet. RepoPacker performs a basic checksum using the SHA256 hash of the file, but this does not guarantee the zip file's integrity. Only use this option with repositories you trust.
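
For readers who want to see what a SHA256 check of this kind involves, here is a minimal sketch using Python's hashlib. It is not RepoPacker's own code: the function names `sha256_of` and `matches` are assumptions for this example. A matching hash only shows that the downloaded file is the same one whose hash was recorded in the repository, which is why the repository itself has to be trusted.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in fixed-size chunks so large zips never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's SHA256 digest against a recorded hex string."""
    return sha256_of(path) == expected_hex.lower()
```

The whole file still has to be read once, so the check takes longer as the zip grows.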

Additional utilities

  • Initialize: From an existing Git repository, initialize the RepoPacker system without adding any files first.

    repopacker init

    This is useful if you want to set up some configuration options before adding any files.

  • List files: See all files currently tracked by RepoPacker:

    repopacker list

  • Remove files: If you accidentally add files to RepoPacker that you shouldn't have, you can remove them:

    repopacker remove largefile.txt

    This will also remove largefile.txt from the .gitignore if you have the option enabled.

  • Configuration: See all configuration options:

    repopacker config -h

  • Clean: When downloading an updated version of the zip file, it can be useful to clear all pre-existing files tracked by RepoPacker:

    repopacker clean

Gotchas

  • RepoPacker will attempt to zip your files into a single zip, making it poorly suited to extremely large files (more than 100 GB will probably become unwieldy).

  • To verify the integrity of zip files, RepoPacker will perform a checksum of the zipped file before unpacking. This can be slow for large zips. You can disable this behavior by modifying the config.

  • RepoPacker is not directly integrated with Git, meaning any operations which update the file tree (like git mv) will not be known to RepoPacker. You must remove these files and re-add them using RepoPacker for them to be properly tracked.
