A collection of scripts for moving data between git and IPFS/IPLD. Here are the main points to consider when comparing this project with others:
- This repo provides a collection of narrowly-defined scripts, rather than an overarching tool (like a git-remote-helper or a UI).
- These scripts treat git and IPFS as "object stores". In particular, they are not directly compatible with "filesystem" approaches like IPFS's UnixFS.
The contents of this repo are currently split into three parts:
- Generic utilities live in the top-level, and are described below.
- The car/ directory contains utilities for converting between Git data and CAR files (content addressed archives). These work standalone and offline.
- The kubo/ directory contains utilities for sending Git data into/out-of a running Kubo daemon (a widely used IPFS implementation).
All of the utilities in this repo are subject to the following caveats, which should be kept in mind when deciding if they're appropriate for your use-case.
These scripts use one IPFS block for each git object. This ensures their hashes stay the same, but this may cause problems for large blobs (files), since most IPFS nodes "in the wild" will reject blocks that are larger than ~1MB (for security reasons).
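To see whether a repo is likely to hit this limit, something like the following could be run first. It is only a sketch: `list_large_objects` is a hypothetical helper (not one of this repo's scripts), and the 1 MiB default is an assumption based on the rough figure above.

```shell
# Hypothetical helper (not part of this repo): list every object whose
# uncompressed size is above a limit in bytes (default: 1 MiB).
# %(objectsize) excludes the short "<type> <size>\0" header that git
# hashes along with the content, so treat the limit as approximate.
list_large_objects() {
  limit="${1:-1048576}"
  git cat-file --batch-all-objects \
      --batch-check='%(objectname) %(objecttype) %(objectsize)' |
    awk -v limit="$limit" '$3 + 0 > limit + 0'
}
```

Run from inside a git repo, `list_large_objects` prints one "ID type size" line per oversized object, and prints nothing if every object is under the limit.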
We currently assume hashes are SHA1. This is the most common case for git repos, although there's a slow transition to better hash algorithms like SHA-256. It's not hard to support others; I just haven't needed to.
Tags aren't supported/tested, since I haven't needed them.
Some of these commands assume the presence of a .git/objects directory, so
they don't work on bare repos yet. This would be straightforward to fix, but I
haven't needed to.
These scripts assume objects in the git repo are stored unpacked (as loose objects, not in packfiles).
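If a repo has been packed (e.g. by `git gc`), its objects can be turned back into loose ones with stock git commands. The helpers below are a rough sketch, not part of this repo; the names `count_packed_objects` and `unpack_all_objects` are made up for illustration. It's worth trying on a copy of the repo first.

```shell
# Hypothetical helper (not part of this repo): print how many objects
# live in packfiles; 0 means everything is stored loose.
count_packed_objects() {
  git count-objects -v | awk '$1 == "in-pack:" { print $2 }'
}

# Hypothetical helper: explode every packfile into loose objects.
# Packs are moved aside first, because `git unpack-objects` skips any
# object that the repository already contains.
unpack_all_objects() {
  git_dir=$(git rev-parse --git-dir) || return 1
  stash=$(mktemp -d) || return 1
  for pack in "$git_dir"/objects/pack/*.pack; do
    [ -e "$pack" ] || continue        # no packfiles: nothing to do
    mv "${pack%.pack}".* "$stash"/    # move .pack, .idx, etc. together
  done
  for pack in "$stash"/*.pack; do
    [ -e "$pack" ] || continue
    git unpack-objects -q < "$pack" || return 1
  done
  rm -rf "$stash"
}
```

After `unpack_all_objects`, `count_packed_objects` should report 0. Note that a later `git gc` will pack the objects again.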
Adds metadata to a hex-encoded git SHA1 object ID, turning it into a CID.
$ git2cid 95296e419bcf7a7e84efe1396925ac55ee22b1f2
baf4bcfevffxedg6ppj7ij37bhfusllcv5yrld4q
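For the curious, the "metadata" is a four-byte prefix: CID version 1 (0x01), the git-raw codec (0x78), the SHA1 multihash code (0x11) and the digest length (0x14, i.e. 20 bytes). The result is base32-encoded in lower case without padding, and prefixed with "b". The sketch below reproduces that by hand; `git2cid_sketch` and `hex_to_bytes` are hypothetical names, the real script may be implemented differently, and a `base32` command (as in GNU coreutils) is assumed.

```shell
# Turn a string of hex digit pairs into raw bytes, using only printf.
hex_to_bytes() {
  hex=$1
  while [ "${#hex}" -ge 2 ]; do
    rest=${hex#??}          # everything after the first two digits
    pair=${hex%"$rest"}     # the first two digits
    printf "\\$(printf '%03o' "0x$pair")"
    hex=$rest
  done
}

# Hypothetical re-implementation for illustration; the real git2cid may differ.
git2cid_sketch() {
  case $1 in
    *[!0-9a-fA-F]*) echo "not a hex object ID: $1" >&2; return 1 ;;
  esac
  [ "${#1}" -eq 40 ] || { echo "expected 40 hex digits: $1" >&2; return 1; }
  # 01 = CIDv1, 78 = git-raw, 11 = sha1, 14 = digest length (20 bytes)
  cid=$(hex_to_bytes "01781114$1" |
          base32 |
          tr '[:upper:]' '[:lower:]' |
          tr -d '=\n')
  printf 'b%s\n' "$cid"     # "b" is the multibase prefix for base32
}
```

Calling `git2cid_sketch 95296e419bcf7a7e84efe1396925ac55ee22b1f2` should print the same CID as the example above.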
Extracts the hash from a CID. Warns if it's not a "git-raw" SHA1.
$ cid2git baf4bcfevffxedg6ppj7ij37bhfusllcv5yrld4q
95296e419bcf7a7e84efe1396925ac55ee22b1f2
$ cid2git bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am
WARNING: CID 'bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am' has codec 'raw', not 'git-raw'
WARNING: CID 'bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am' has 'sha2-256' hash, not 'sha1'
5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
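Going the other way is a matter of base32-decoding the CID and dropping the four header bytes. Here is a minimal sketch, assuming a base32 ("b"-prefixed) CIDv1 whose header fields each fit in one byte; `cid2git_sketch` is a hypothetical name, its warning text is not the real script's, and GNU-style `base32 -d` is assumed.

```shell
# Hypothetical re-implementation for illustration; the real cid2git may differ.
cid2git_sketch() {
  case $1 in
    b*) ;;
    *) echo "only base32 ('b' prefix) CIDs are handled: $1" >&2; return 1 ;;
  esac
  body=$(printf '%s' "${1#b}" | tr '[:lower:]' '[:upper:]')
  # base32 -d wants '=' padding up to a multiple of 8 characters.
  while [ $(( ${#body} % 8 )) -ne 0 ]; do body="$body="; done
  hex=$(printf '%s' "$body" | base32 -d | od -An -v -tx1 | tr -d ' \n')
  # Header: version, codec, hash function, digest length (one byte each).
  header=$(printf '%s' "$hex" | cut -c1-8)
  [ "$header" = 01781114 ] ||
    echo "WARNING: header $header is not a git-raw SHA1 CIDv1 (01781114)" >&2
  printf '%s\n' "$hex" | cut -c9-    # the digest is everything after the header
}
```

Like the examples above, this prints the digest even when the header doesn't match, with the warning going to stderr.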
When run from a git repo, this populates a directory with a particular tree from that repo.
$ git init -q
$ touch hello
$ git add hello
$ git commit -m "Add hello"
[master (root-commit) 3f016cd] Add hello
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 hello
$ touch goodbye
$ git add goodbye
$ git rm hello
rm 'hello'
$ git commit -m "Goodbye"
[master f1dfc13] Goodbye
1 file changed, 0 insertions(+), 0 deletions(-)
rename hello => goodbye (100%)
$ ls
goodbye
$ ls ../my-old-tree
ls: cannot access '../my-old-tree': No such file or directory
$ git2dir 3f016cd^{tree} ../my-old-tree
$ ls ../my-old-tree
hello
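A similar effect can be had with stock git and tar, which may help when checking results. This is only a stand-in under my own assumptions, not how git2dir is necessarily implemented; `git2dir_sketch` is a hypothetical name.

```shell
# Hypothetical stand-in for illustration; the real git2dir may work differently.
git2dir_sketch() {
  tree=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # `git archive` writes the tree as a tar stream; tar unpacks it into $dest.
  git archive --format=tar "$tree" | tar -x -f - -C "$dest"
}
```

It takes the same arguments as the example above, e.g. `git2dir_sketch '3f016cd^{tree}' ../my-old-tree`.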
Runs cid2git and reads the result (warning: uses import-from-derivation).
This Nix function takes a CID (which should identify a Git tree), fetches its contents as a CAR file from IPFS, and realises it into a directory. The result is a fixed-output derivation, whose hash is taken from the CID.