Skip to content

glukasiknuro/bazel-cache-chunking

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Bazel cache chunking stats

Contains sample binary that scans files in provided directory and display total size of the files if those were chunked, only unique chunks left, and the chunks compressed.

Written to check gains if data in bazel remote cache was chunked, see e.g. bazelbuild/remote-apis#178

Available chunking strageties include chunk on 1MiB boundary, and content based chunking based on FastCDC with block sizes around 1MiB - see chunker.go

Using zstd compression in speed fastest, which is most suitable for caching use case. See compressor.go for other options.

Building

Build binary - below using static binary that allows to run properly in environments that are missing some libraries.

GOOS=linux GOARCH=amd64 go build -a -ldflags "-extldflags '-static'" github.com/glukasiknuro/bazel-cache-chunking

Running

Below execution will scan all files in a given directory and print stats if data was chunked and compressed. Note, that it does not decompress any found files, so it does not make sense to run it on already compressed data.

./bazel-cache-chunking -workers 10 -dir /data/cas

Sample stats

Below is an execution on data from bazel-remote CAS data, from system that caches builds mainly build with --compilation_mode=fastbuild.

Processed successfully 468997 files, total size: 1.7 TB

No chunking ZSTD CGO SPEED  size: 343 GB (avg: 732 kB), 20.53% of total  (4.87x effective space)  chunks: 468997    duplicate: 0
        1MB ZSTD CGO SPEED  size: 323 GB (avg: 688 kB), 19.29% of total  (5.18x effective space)  chunks: 1888847   duplicate: 134986
       Gear ZSTD CGO SPEED  size: 244 GB (avg: 519 kB), 14.55% of total  (6.87x effective space)  chunks: 1171361   duplicate: 440797

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages