By Alex Free
Github Data Export Extractor makes the .tar.gz file generated by the GitHub data export feature more useful for archival purposes. It takes the .tar.gz file downloaded from GitHub and generates a new backup directory that contains two subdirectories:
- repositories: contains all of your repos, cloned recursively and locally from the GitHub export data.
- releases: contains all of the released files for your repos. If multiple GitHub Releases files from different repos share the same filename, each file is copied. Duplicate filenames have .~1~ or similar appended to differentiate them. Your file explorer may hide these files by default, so you may want to enable showing hidden files. Alternatively, ls shows these files.
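The .~1~ suffix described above matches the numbered-backup naming used by GNU cp. The following is a minimal sketch of that behaviour, assuming GNU coreutils is installed. Whether gdee itself uses cp --backup=numbered is an assumption, and the repo-a, repo-b and tool.zip names are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch only: shows how numbered backup suffixes arise with GNU cp.
# It is an assumption that gdee copies release files this way.
set -euo pipefail

demo=$(mktemp -d)
mkdir -p "$demo/repo-a" "$demo/repo-b" "$demo/releases"

# Two hypothetical repos that each released a file with the same name.
echo "from repo-a" > "$demo/repo-a/tool.zip"
echo "from repo-b" > "$demo/repo-b/tool.zip"

# --backup=numbered renames an existing destination to NAME.~N~
# before the new file is copied into place.
cp --backup=numbered "$demo/repo-a/tool.zip" "$demo/releases/"
cp --backup=numbered "$demo/repo-b/tool.zip" "$demo/releases/"

# Lists both tool.zip and tool.zip.~1~
ls "$demo/releases"
```

In this sketch the file copied first ends up with the .~1~ suffix, because the backup is made of whatever was already in the destination.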
This backup layout is much nicer to have than what GitHub provides with its data export feature alone. I find the raw GitHub data export terrible for backups because:
- It does not contain git repos; instead it contains .pack files of the repos.
- Every file released for your git repos is in its own directory, which contains only that single GitHub Releases file.
- Besides my repositories and the releases for each repository, I don't want any other files in a backup. The GitHub data export contains a bunch of .json files and other things I don't really need in my case.
Github Data Export Extractor v1.0.1
Changes:
- Recursively clones all repos to ensure any and all submodules are archived correctly.
- Does not overwrite duplicate GitHub Releases filenames. If multiple GitHub Releases files from different repos share the same filename, each file is copied. Duplicate filenames have .~1~ or similar appended to differentiate them. Your file explorer may hide these files by default, so you may want to enable showing hidden files. Alternatively, ls shows these files.
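To illustrate what recursive cloning means here, the sketch below clones a local bare repository with git clone --recursive, which would also fetch any submodules. The throwaway src repo and the exported.git path are stand-ins invented for the demo. The real layout inside the GitHub export and the exact commands gdee runs are not shown in this README, so treat this as an assumption about the general approach.

```shell
#!/usr/bin/env bash
# Sketch only: clone a working tree from a local bare repository.
# Paths and repo names are hypothetical, not taken from gdee.
set -euo pipefail

work=$(mktemp -d)

# Build a throwaway source repo with a single commit.
git init -q "$work/src"
echo "hello" > "$work/src/README"
git -C "$work/src" add README
git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial"

# A bare copy stands in for a repository found in the export data.
git clone -q --bare "$work/src" "$work/exported.git"

# --recursive also initialises and fetches submodules (none in this demo).
mkdir -p "$work/repositories"
git clone -q --recursive "$work/exported.git" "$work/repositories/exported"

ls "$work/repositories/exported"
```

The result is a normal git working tree rather than a set of .pack files, which is the form most people expect from a backup.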
Github Data Export Extractor v1.0
Download and extract the latest Github Data Export Extractor release. Inside is gdee, a bash script that takes exactly two arguments:
- The first argument is the .tar.gz file you download through the data export feature on GitHub.
- The second argument is the name of the backup directory you want to create from the .tar.gz file.
Example command line usage:
./gdee 9d9617f2-11b5-11ec-35c8-2dfe00aa20a5.tar.gz alex-free-2-21-2022
This extracts all repos and released files from the GitHub data export .tar.gz file and puts them in a new directory named "alex-free-2-21-2022".
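Before running gdee on a large export, it can be worth confirming that the two arguments make sense. The check_export function below is a hypothetical pre-flight helper, not part of gdee. It assumes you want to stop if the archive cannot be read as a gzip-compressed tar or if the backup directory name is already taken.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the two gdee arguments.
# This helper is an illustration and is not shipped with gdee.

check_export() {
    local archive=$1 backup_dir=$2

    # The archive must be readable as a gzip-compressed tar file.
    if ! tar -tzf "$archive" > /dev/null 2>&1; then
        echo "not a readable .tar.gz: $archive" >&2
        return 1
    fi

    # Conservative assumption: refuse to reuse an existing name.
    if [ -e "$backup_dir" ]; then
        echo "backup directory already exists: $backup_dir" >&2
        return 1
    fi

    echo "ok: $archive -> $backup_dir"
}
```

It could be used as check_export export.tar.gz my-backup && ./gdee export.tar.gz my-backup, where both names are placeholders.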
Github Data Export Extractor is released into the public domain; see the file unlicense.txt in each release for more info.