Chunking for CAR files. Split a single CAR into multiple CARs.
go get github.com/alanshaw/go-carbites
Carbites supports 2 different strategies:
- Simple (default) - fast but naive. Only the first CAR output has the roots in its header; subsequent CARs have a placeholder "empty" root CID, bafkqaaa, as recommended.
- Treewalk - walks the DAG to pack sub-graphs into each CAR file that is output. Every CAR file has the same root CID but contains a different portion of the DAG. The DAG is traversed from the root node, and each block is decoded and its links extracted to determine which sub-graph to include in each CAR.
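To make the placeholder CID less opaque, here is a minimal sketch of where the string bafkqaaa comes from: it is the base32 multibase encoding of a CIDv1 with the raw codec and a zero-length identity multihash. The helper name `emptyCID` is made up for illustration and is not part of carbites; the sketch uses only the Go standard library.

```go
package main

import (
	"encoding/base32"
	"fmt"
	"strings"
)

// emptyCID is a hypothetical helper (not part of carbites) that builds the
// string form of a CIDv1 with the raw codec and an empty identity multihash.
func emptyCID() string {
	raw := []byte{
		0x01, // CID version 1
		0x55, // multicodec: raw
		0x00, // multihash function: identity
		0x00, // multihash digest length: 0 bytes
	}
	// Multibase prefix "b" means lowercase RFC 4648 base32 without padding.
	enc := base32.StdEncoding.WithPadding(base32.NoPadding)
	return "b" + strings.ToLower(enc.EncodeToString(raw))
}

func main() {
	fmt.Println(emptyCID()) // prints "bafkqaaa"
}
```

Because the identity multihash carries no digest bytes, the CID refers to no block at all, which is why it can serve as an "empty" root.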
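package main

import (
	"fmt"
	"io"
	"os"

	"github.com/alanshaw/go-carbites"
)

func main() {
	bigCar, err := os.Open("big.car")
	if err != nil {
		panic(err)
	}
	defer bigCar.Close()

	targetSize := 1024 * 1024   // 1MiB chunks
	strategy := carbites.Simple // also carbites.Treewalk
	spltr, err := carbites.Split(bigCar, targetSize, strategy)
	if err != nil {
		panic(err)
	}

	var i int
	for {
		car, err := spltr.Next()
		if err != nil {
			if err == io.EOF {
				break // no more chunks
			}
			panic(err)
		}
		// Read the whole chunk into memory and write it to its own file.
		b, err := io.ReadAll(car)
		if err != nil {
			panic(err)
		}
		if err := os.WriteFile(fmt.Sprintf("chunk-%d.car", i), b, 0644); err != nil {
			panic(err)
		}
		i++
	}
}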
Feel free to dive in! Open an issue or submit PRs.
Dual-licensed under MIT + Apache 2.0