This repository has been archived by the owner on Sep 27, 2021. It is now read-only.



We back up and archive GitHub.

GitBackup was built at Storj by @super3, @montyanderson, and @calebcase.


We have a single central server exposing a REST API used by both the user interface and by workers.

Workers operate statelessly and can be scaled, limited only by the central server's ability to provision work.

Storj serves as our durable store for all data and metadata. Redis will serve as the store for ephemeral data and for data cached for speed.

  • Locks and last sync time in Redis (per username)
  • Everything else in Storj (usernames, repos, last sync, repo count, etc.)


The durable store needs to support the following operations:

  • Listing usernames
  • Getting the last sync time for a username
  • Getting the repository count for a username
  • Listing a user's repositories
  • Getting the last update time for a repository
  • Getting the last error for a repository

To avoid directories with a very large number of entries the paths will be constructed with a hash prefix.

The general layout scheme:

bucket sha256sum(username)[:8] username repository archive
2b/cb/c2/d5/ octocat/ Hello-World.bundle
2b/cb/c2/d5/ octocat/
2b/cb/c2/d5/ octocat/ Hello-World.error
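
The prefix construction in the layout above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names `hash_prefix` and `object_path` are hypothetical, and it assumes the hash is taken over the UTF-8 bytes of the username exactly as given (no case folding, no trailing newline) and rendered as lowercase hex.

```python
import hashlib


def hash_prefix(username: str) -> str:
    """Return the first 8 hex digits of sha256(username) as 'aa/bb/cc/dd/'."""
    # Assumption: hash the UTF-8 bytes of the username as-is.
    digest = hashlib.sha256(username.encode("utf-8")).hexdigest()
    # Split the first 8 hex characters into four 2-character directories.
    pairs = [digest[i:i + 2] for i in range(0, 8, 2)]
    return "/".join(pairs) + "/"


def object_path(username: str, filename: str) -> str:
    """Build the path below the bucket: <hash prefix><username>/<filename>."""
    return f"{hash_prefix(username)}{username}/{filename}"


if __name__ == "__main__":
    print(object_path("octocat", "Hello-World.bundle"))
```

Spreading users over four levels of two-character directories keeps each directory at no more than 256 entries until the username level is reached.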

For example, the ZIP archive of would be located at:


The data will be sharded across all production satellites to maximize our total throughput and available storage. The sharding will be done per user based on the first byte of the sha256sum of the username and then equally split among the satellites.

Sharding allocations with our current satellites:

satellite min max
asia-east-1 00 55
europe-west-1 56 aa
us-central-1 ab ff
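
As a rough sketch of how a worker might pick a satellite from the table above, the following Python maps the first byte of the username's SHA-256 digest to a shard. The names `SHARDS`, `satellite_for_byte`, and `satellite_for` are hypothetical, and hashing the UTF-8 bytes of the username as-is is an assumption.

```python
import hashlib

# (satellite, min first byte, max first byte), both bounds inclusive,
# copied from the allocation table above.
SHARDS = [
    ("asia-east-1", 0x00, 0x55),
    ("europe-west-1", 0x56, 0xAA),
    ("us-central-1", 0xAB, 0xFF),
]


def satellite_for_byte(first_byte: int) -> str:
    """Return the satellite whose range contains the given byte value."""
    for name, low, high in SHARDS:
        if low <= first_byte <= high:
            return name
    raise ValueError(f"byte out of range: {first_byte!r}")


def satellite_for(username: str) -> str:
    """Pick the satellite from the first byte of sha256(username)."""
    # Assumption: hash the UTF-8 bytes of the username as-is.
    first_byte = hashlib.sha256(username.encode("utf-8")).digest()[0]
    return satellite_for_byte(first_byte)
```

The three ranges cover 86, 85, and 85 byte values respectively, which is as close to an equal split of 256 values as three shards allow.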

Listing a user's repositories

rclone ls ''

Getting the last update time for a repository

rclone ls ''

Getting last error for a repository

rclone cat ''



Locks are stored as normal Redis keys with a TTL as described by Redlock. The lock must be refreshed by the worker before it expires. For example, if locks expire every 10 seconds, the worker should attempt to relock after 5 seconds.

Initially getting the lock:

SET "lock:octocat" 1 EX 10 NX

Refreshing the lock:


EXPIRE "lock:octocat" 10

Locks are not explicitly deleted and are left to expire.
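
The lock lifecycle can be illustrated without a Redis server. The sketch below is an in-memory stand-in, not a Redis client: the class `InMemoryLocks` and its methods are hypothetical, `acquire` imitates `SET key 1 EX ttl NX`, `refresh` imitates `EXPIRE key ttl`, and the current time is passed in explicitly so the timing is easy to follow.

```python
class InMemoryLocks:
    """Toy model of TTL-based locks keyed as 'lock:<username>'."""

    def __init__(self):
        self._expires_at = {}  # key -> absolute expiry time in seconds

    def _purge(self, key, now):
        # Drop the key if its TTL has elapsed, as Redis would.
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry <= now:
            del self._expires_at[key]

    def acquire(self, username, ttl, now):
        """Imitates SET key 1 EX ttl NX: succeeds only if the key is absent."""
        key = f"lock:{username}"
        self._purge(key, now)
        if key in self._expires_at:
            return False
        self._expires_at[key] = now + ttl
        return True

    def refresh(self, username, ttl, now):
        """Imitates EXPIRE key ttl: succeeds only if the key still exists."""
        key = f"lock:{username}"
        self._purge(key, now)
        if key not in self._expires_at:
            return False
        self._expires_at[key] = now + ttl
        return True


if __name__ == "__main__":
    locks = InMemoryLocks()
    print(locks.acquire("octocat", ttl=10, now=0))  # first worker gets the lock
    print(locks.refresh("octocat", ttl=10, now=5))  # refreshed at half the TTL
    print(locks.acquire("octocat", ttl=10, now=12))  # still held by the first worker
```

Refreshing at half the TTL leaves a worker a full 5 seconds of slack before the lock would lapse and another worker could take it.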

Last Sync

Last sync data is stored in Redis to facilitate fast calculation of which user should be synced next. This data is rebuilt from the Storj bucket metadata on startup.

Initially each user is added to the tracked sorted set:

ZADD tracked 0 ""

Where 0 stands for the score: the last time the user fully synced, or -1 if the user has never been synced.

Getting the next user to sync is accomplished by retrieving users sorted by score (and then skipping any that are locked):

ZRANGEBYSCORE tracked "-inf" "+inf" LIMIT 0 1
SET "lock:octocat" 1 EX 10 NX
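
The selection logic can be sketched in plain Python, with a dict standing in for the `tracked` sorted set and a set standing in for the currently held locks. The function name `next_user` and the sample usernames and timestamps are hypothetical. A real worker would page through `ZRANGEBYSCORE` results and attempt the `SET ... NX` lock on each candidate rather than consult a local set.

```python
from typing import Dict, Optional, Set


def next_user(tracked: Dict[str, float], locked: Set[str]) -> Optional[str]:
    """Return the unlocked user with the lowest last-sync score, if any."""
    # Redis orders sorted-set members by score, breaking ties
    # lexicographically by member; sort the same way here.
    ordered = sorted(tracked.items(), key=lambda item: (item[1], item[0]))
    for username, _score in ordered:
        if username not in locked:
            return username
    return None


if __name__ == "__main__":
    tracked = {"octocat": 1600000000, "hubot": -1, "defunkt": 1500000000}
    print(next_user(tracked, locked=set()))
    print(next_user(tracked, locked={"hubot"}))
```

Because never-synced users carry a score of -1, they sort ahead of every user with a real timestamp and are picked up first.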









