github-statistics is a workflow repository that pulls data from the GitHub Repositories API and GitHub Users API on a regular schedule to generate distribution statistics for a subset of GitHub's earliest repositories and users.

Google Sheets


As of 2021, GitHub has over 73 million registered users. The github-users.db SQLite database in this repository includes the first 1.5 million registered users. It reflects 15 CI runs of 100,000 users each and is compressed with Zstandard, the same compression algorithm GitHub uses for actions/cache@v3.
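For illustration, a batch of users can be paged out of GET /users by following its ID-based `since` cursor, 100 users per request. The sketch below is not this repository's workflow code: `fetch_page` and `iter_users` are hypothetical names, and authentication, rate-limit handling, and the database inserts are left out.

```python
import json
import urllib.request
from typing import Callable, Iterator

API_URL = "https://api.github.com/users"
PAGE_SIZE = 100  # GET /users returns at most 100 users per request


def fetch_page(since: int) -> list[dict]:
    """Fetch one page of users whose ID is greater than `since`."""
    url = f"{API_URL}?since={since}&per_page={PAGE_SIZE}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_users(
    since: int,
    limit: int,
    fetch: Callable[[int], list[dict]] = fetch_page,
) -> Iterator[dict]:
    """Yield up to `limit` users, starting after user ID `since`."""
    yielded = 0
    while yielded < limit:
        page = fetch(since)
        if not page:
            return  # no users left after this ID
        for user in page:
            if yielded >= limit:
                return
            yield user
            yielded += 1
        # The next page starts after the last ID seen on this one.
        since = page[-1]["id"]
```

Under these assumptions, one run of 100,000 users corresponds to `iter_users(since=last_stored_id, limit=100_000)`, which is 1,000 requests, so a real run would need authenticated requests to stay within the API rate limit.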

The studies planned for this repository are bounded by GitHub repository limits, following the recommendations set out in the Managing large files article. 1.5 million is the maximum number of users that fits in a full series of 100,000-user inserts after compression with Zstandard.

As of Jun 17, 2022, github-statistics also collects repositories.

Note: Do not use Git LFS. It is not possible to remove Git LFS objects from a repository without deleting and recreating the repository.


  • github-repositories.db
  • github-users.db


  • repositories NEW
    GitHub repositories as listed by GET /repositories
  • repositories_stargazers NEW
    GitHub repositories from the repositories table and their stargazer counts
  • users
    GitHub users as listed by GET /users
  • users_followers
    GitHub users from the users table and their follower counts
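Once a database has been decompressed, its tables can be inspected without knowing their column layout. The following is a minimal sketch using Python's standard sqlite3 module; `table_row_counts` is a hypothetical helper name, and the only thing it assumes about the file is that it is a SQLite database.

```python
import sqlite3


def table_row_counts(db_path: str) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    connection = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every object in the file; skip SQLite's own tables.
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Table names cannot be bound as parameters, so quote them instead.
            quoted = '"' + name.replace('"', '""') + '"'
            query = f"SELECT COUNT(*) FROM {quoted}"
            counts[name] = connection.execute(query).fetchone()[0]
        return counts
    finally:
        connection.close()
```

Calling `table_row_counts("github-users.db")` on the decompressed file would presumably report the users and users_followers tables, though the split of tables between the two databases is an assumption here.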

Decompress database


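# Two steps: decompress to github-users.tar, then unpack the archive
zstd -d github-users.tzst
tar xf github-users.tar


# Or in one step, letting tar invoke zstd
tar --use-compress-program zstd -xf github-users.tzst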


GNU General Public License v2.0