Release v1: diffah container image delta export/import/inspect#1
Merged
Captures the agreed scope, architecture, CLI surface, delta archive format, and export/import algorithms before implementation begins, so the upcoming implementation plan and code can be evaluated against a stable contract.
7 stages, 26 TDD tasks with concrete file paths, failing tests, minimal implementations, and per-task commits. Each containers-image API binding task starts with a 'go doc' pin step so signatures are verified live rather than recalled.
Pin versions:
- github.com/spf13/cobra@v1.8.1
- go.podman.io/image/v5@v5.39.2
- github.com/klauspost/compress@v1.18.5
- github.com/stretchr/testify@v1.11.1

Removed deprecated bakgo.mod with old github.com/containers/image reference.
go mod tidy strips pre-pinned deps that no code imports yet, so the prior commit's go.mod plus empty go.sum was an inconsistent state. Deps will land via go mod tidy as later tasks add real imports.
Bumps pre-commit-hooks to v4.6.0. Runs golangci-lint through a local hook entry instead of pre-commit's hosted git-based install, so the repo does not require contributors' machines to reach github.com on each new clone (also works around SSL issues in restricted networks).
The originally written v1 config does not parse under the locally installed golangci-lint v2.1.6. golangci-lint migrate produced this v2 layout. gofmt and goimports are now expressed under the formatters: section per the v2 schema. Linter selection is unchanged. run.go is pinned to 1.24 to match the toolchain golangci-lint v2.1.6 itself was built with; bumping the binary to a Go 1.25 release will let us drop this constraint.
The locally installed golangci-lint v2.1.6 was built with Go 1.24 and panics when analysing source compiled against Go 1.25, so the local hook cannot run reliably until contributors upgrade their binary. CI pins a working golangci-lint version (added in Task 1.10), so move the lint gate there. Local devs can run make lint when their toolchain is current.
Provides the standard developer entry points called out in the spec (§11.6). Build pins CGO_ENABLED=0 plus the containers_image_openpgp tag so the binary stays static and free of GnuPG cgo. VERSION is injected via -ldflags into cmd.version.
Wires the diffah binary so make build produces a usable CLI:
- main.go bootstraps cmd.Execute and propagates exit code 1 on error.
- cmd/root.go owns the cobra root, the version variable injected via
-ldflags, the persistent --log-level flag, and a shared reportError
helper for downstream subcommands.
- cmd/version.go prints the injected version.
- cmd/{export,import,inspect}.go are minimal stubs registered against
the root so the help output matches the spec; later tasks replace
the bodies.
Tests in cmd/root_test.go and cmd/version_test.go cover subcommand
registration, --help listing, and the version output. go.mod is
populated by go mod tidy now that real imports exist.
Both adapters wrap go.podman.io/image/v5 so service code can stay independent of the upstream package. ParseReference normalises any transport:reference string and wraps errors with the offending value; DefaultPolicyContext returns an insecure-accept-any signature policy appropriate for v1 (signing is out of scope per spec §2.2).
lint.yml runs golangci-lint via the official action against Go 1.25.4 on ubuntu. test.yml runs go test with -race -cover across ubuntu and macos. Pinning version: latest in the lint action keeps us on a golangci-lint build that targets the project's Go version.
pkg/diff hosts the domain layer with no framework dependencies. errors.go defines the domain error types from spec section 9.1, all implementing Error and (where useful) Unwrap so service callers can chain context with fmt.Errorf and consumers can errors.As. plan.go owns BlobRef, Plan, and ComputePlan. ComputePlan partitions target layer references into RequiredFromBaseline and ShippedInDelta according to which digests already live in baseline, preserving the target's original ordering. sidecar.go defines the diffah.json v1 schema, atomic Marshal that validates before encoding, and ParseSidecar that rejects unknown versions and missing required fields. The schema matches spec section 6.2 verbatim. go test ./pkg/diff/... -cover reports 94.4% coverage.
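The partition that ComputePlan performs can be sketched roughly as follows. This is a minimal illustration only: the field layout of BlobRef and Plan and the function signature are assumptions, not the actual pkg/diff definitions.

```go
package main

import "fmt"

// BlobRef identifies a layer blob. The fields are assumed for illustration;
// the real type may carry more (for example a media type).
type BlobRef struct {
	Digest string
	Size   int64
}

// Plan holds the two partitions named in the commit message.
type Plan struct {
	RequiredFromBaseline []BlobRef
	ShippedInDelta       []BlobRef
}

// ComputePlan splits the target layers by whether their digest already
// lives in the baseline. Both partitions keep the target's original order.
func ComputePlan(target []BlobRef, baselineDigests []string) Plan {
	known := make(map[string]struct{}, len(baselineDigests))
	for _, d := range baselineDigests {
		known[d] = struct{}{}
	}
	var p Plan
	for _, layer := range target {
		if _, ok := known[layer.Digest]; ok {
			p.RequiredFromBaseline = append(p.RequiredFromBaseline, layer)
		} else {
			p.ShippedInDelta = append(p.ShippedInDelta, layer)
		}
	}
	return p
}

func main() {
	target := []BlobRef{
		{Digest: "sha256:base", Size: 100},
		{Digest: "sha256:app-v2", Size: 20},
	}
	p := ComputePlan(target, []string{"sha256:base", "sha256:app-v1"})
	fmt.Println(len(p.RequiredFromBaseline), len(p.ShippedInDelta))
}
```

Using a set keyed by digest keeps the partition linear in the number of layers, and iterating the target slice in order is what preserves the original layer ordering.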
…e Go Adds a Progress (resumable handoff) section pinning HEAD, completed stages, and the local-environment caveats discovered during execution so the next session can pick up without rediscovering them. Replaces Task 3.4's bash + buildah recipe with a pure-Go generator (scripts/build_fixtures/main.go) that uses go.podman.io/image/v5 itself for the destination write. The original recipe required buildah, which is not available on the development host; the Go generator runs anywhere go run does and keeps determinism control fully in our code.
Pack atomically writes a tar archive containing srcDir contents plus sidecar JSON to outPath. Uses tmp file + rename pattern to guarantee no observers see partial archives. Supports optional zstd compression.
…ection Extract writes every entry of a delta archive into a destination directory and returns the sidecar bytes. ReadSidecar returns only sidecar bytes without extracting the full archive, allowing fast metadata inspection. Both functions auto-detect zstd compression by sniffing the stream's magic bytes (0x28B52FFD). All 5 new tests pass; combined coverage 70.5%.
…ed fixtures Replaces the placeholder bash script with a self-contained Go program that produces bit-identical OCI and Docker Schema 2 archives on every run. Determinism is achieved by pinning all tar/gzip headers to fixed values, and post-processing each outer archive via normalizeTar to sort entries and zero out any variable fields written by the upstream transport library. Shared base layer digest is identical between v1 and v2 archives, which makes the fixtures useful for testing ComputePlan and delta distribution.
Wire the full export pipeline: open baseline, collect layer digests, copy target into a temp dir via KnownBlobsDest (wrapped through a knownBlobsRef to inject at the ImageReference level), build and pack the sidecar, then verify digest round-trip. Adds derivePlatformFromConfig to read os/arch from the config blob written by copy.Image into the directory transport layout, so the sidecar always has a non-empty Platform without requiring the caller to pass --platform.
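Reading the platform from the config blob might look like the following sketch. The "os", "architecture", and "variant" keys are the standard image config fields; the function name, return shape, and error handling are assumptions rather than the actual derivePlatformFromConfig.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// derivePlatform (simplified stand-in) extracts "os/arch[/variant]" from
// an image config blob.
func derivePlatform(config []byte) (string, error) {
	var c struct {
		OS           string `json:"os"`
		Architecture string `json:"architecture"`
		Variant      string `json:"variant"`
	}
	if err := json.Unmarshal(config, &c); err != nil {
		return "", fmt.Errorf("parse image config: %w", err)
	}
	if c.OS == "" || c.Architecture == "" {
		return "", errors.New("image config lacks os or architecture")
	}
	platform := c.OS + "/" + c.Architecture
	if c.Variant != "" {
		platform += "/" + c.Variant
	}
	return platform, nil
}

func main() {
	blob := []byte(`{"os":"linux","architecture":"arm64","variant":"v8","rootfs":{}}`)
	p, err := derivePlatform(blob)
	fmt.Println(p, err)
}
```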
- Add DryRunStats type and DryRun() function that computes the layer
  partition plan without invoking copy.Image or writing any output files
- Extract loadTargetManifest() helper to open the target image source,
  read the manifest, resolve manifest lists via platform selection, and
  return the parsed manifest, shared by both Export and DryRun
- Add TestExport_ManifestOnlyBaseline to prove the BaselineManifestPath
  code path works end-to-end through Export
- Add TestExport_DryRun_DoesNotWriteOutput and
  TestExport_DryRun_ManifestOnlyBaseline to verify DryRun behavior
Wire the full import pipeline (spec §8): extract delta archive, parse sidecar, probe baseline, build CompositeSource, copy to output, rename atomically, and verify. PreserveDigests is only set for dir output as the spec requires — docker-archive/oci-archive need manifest rewriting which PreserveDigests=true would refuse.
…fixture
- Extend scripts/build_fixtures to emit unrelated_oci.tar (1 layer,
  /unrelated.bin = 16 KiB of 0xFF), whose layer digest does not overlap
  with any v1/v2 layer, enabling fail-fast probe testing
- Add DryRunReport + DryRun to pkg/importer: runs steps 1-4 (extract,
  parse, open baseline, probe) without writing output, reports reachable
  vs. missing blobs
- Add TestImport_FailFast_MissingBaselineBlob, TestImport_DryRun_Reachable,
  and TestImport_DryRun_Missing; all three pass
Adds TestImport_Matrix, a table-driven integration test that exercises the full export → import pipeline across source formats (OCI and Docker Schema 2) and output formats (docker-archive, oci-archive, dir). Five test cases cover all combinations with a helper buildDeltaS2() for schema-2 deltas.
Replace stub with full implementation that reads sidecar metadata from delta archives and displays platform info, manifest references, blob counts, and compression savings percentage (required bytes / total bytes).
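The savings figure is a simple ratio. A sketch, assuming "total bytes" means required plus shipped bytes (the commit message does not spell this out) and using a hypothetical function name:

```go
package main

import "fmt"

// savingsPercent (hypothetical name) reports the share of the full image
// that does not need to travel because the consumer resolves it from its
// baseline. Total is assumed to be required + shipped bytes.
func savingsPercent(requiredBytes, shippedBytes int64) float64 {
	total := requiredBytes + shippedBytes
	if total == 0 {
		return 0 // avoid dividing by zero for an empty sidecar
	}
	return float64(requiredBytes) / float64(total) * 100
}

func main() {
	// 900 MiB resolved from the baseline, 100 MiB shipped in the delta.
	fmt.Printf("%.1f%%\n", savingsPercent(900<<20, 100<<20))
}
```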
Keep dist/ (goreleaser output), bin/ (go build output), build_fixtures (scripts helper binary), *.bck.yaml config backups, and .tool-versions (asdf pin) out of the published repository. These are generated or machine-local and should not be imposed on contributors.
The local golangci-lint binary was built with Go 1.24 and could not load this 1.25 module, so the real v2 lint output never surfaced during development. Upgrading the local binary revealed 47 issues; this commit clears all of them.

Config adjustments (.golangci.yaml):
- Disable gocritic hugeParam globally. Many flagged methods implement
  go.podman.io/image types interfaces (PutBlob, GetBlob, TryReusingBlob)
  which take BlobInfo by value; switching to pointer parameters would
  break interface satisfaction.
- Bump gocyclo to 15 and exclude gosec from _test.go (0o644 on fixture
  paths is normal test practice; gosec G305 in reader.go is still checked).
- Sync run.go to 1.25 to match go.mod.

Code fixes:
- internal/archive/reader.go: wrap io.EOF checks with errors.Is, defend
  Extract against zip slip via safeJoin, split loop body into
  extractEntry for clarity.
- pkg/exporter/exporter.go, pkg/importer/importer.go: capture
  policyCtx.Destroy error via a defer closure.
- pkg/importer: promote "docker-archive", "oci-archive", and "dir" to
  exported Format* constants so external callers do not hardcode them.
- cmd/root.go: drop the unused reportError helper and its fmt/os imports.
- internal/archive/writer.go: mark the unused addFile parameter as _.
- Split overlong signatures in pkg/exporter and pkg/importer to keep
  every line under 120 columns.
go.podman.io/storage pulls in btrfs and devicemapper drivers that need system headers (btrfs/version.h, libdevmapper) present on the Linux runner. Ubuntu-latest does not ship them, and go test needs cgo for the race detector, so the previous invocation failed at package load with "btrfs/version.h: No such file or directory". diffah never instantiates these drivers - the containers-image copy path only uses directory/docker-archive/oci-archive transports - so excluding them via build tags is strictly correct. Mirror the same set in the Makefile test targets so make test matches CI. goreleaser builds are unaffected (they already use CGO_ENABLED=0 and the btrfs driver has a linux+cgo build constraint).
Two follow-ups to the v2 lint cleanup:

1. golangci-lint on CI still typechecked the podman/containers-image
   imports without our build tags, so it tried to compile the btrfs
   driver and gpgme-cgo path and failed with "btrfs/version.h: No such
   file or directory". Hoist the same tag set into run.build-tags so
   lint, test, and integration all agree. Unlocks 9 tag-guarded findings
   in scripts/build_fixtures/main.go that the reviewer flagged as latent;
   this commit clears them inline (errorlint io.EOF wrap, staticcheck
   QF1008 embedded-Header removal, two lll splits) and excludes
   gocyclo/funlen from scripts/ since the fixture builder is linear
   orchestration.
2. Add regression tests for safeJoin (zip-slip defense added in the prior
   commit with zero coverage). TestSafeJoin is table-driven across
   accept/reject cases; TestExtract_RejectsPathTraversal crafts a tar
   with a "../escape.txt" entry and confirms Extract rejects it before
   any file lands on disk. internal/archive coverage rises to 72.8%.
Overview
`diffah` is a CLI for shipping container images as portable layer
deltas when registry-to-registry replication is unavailable
(air-gapped deployments, customer deliveries, offline mirrors). This PR
lands the complete v1 surface: `export`, `import`, `inspect`, and
`version` subcommands, all cross-tested against both OCI and Docker
schema 2 manifest formats.
A v2 image that shares base layers with a v1 baseline typically ships as
a delta archive that is 10% or less of the full image size — only
the layers that actually changed travel.
What v1 ships
CLI commands
- `diffah export`: reads a target image and baseline manifest, computes
  which layers are new, and packages only the new blobs plus the target
  manifest and config into a portable `.tar` with a `diffah.json` sidecar
  describing which blobs the consumer must resolve from its local baseline.
- `diffah import`: extracts the delta, opens the local baseline image,
  verifies every required baseline blob is reachable (fail-fast), and
  reconstructs the full target image in `docker-archive`, `oci-archive`,
  or `dir` format.
- `diffah inspect`: previews the contents of a delta archive without
  writing anything, showing version, platform, manifest digests, shipped
  vs required blob counts, and the estimated size saving vs the full image.
- `diffah version`: prints the build version.

Capabilities
- `--target` and `--baseline` accept any `containers-image` transport:
  `docker://`, `docker-archive:`, `oci-archive:`, `dir:`.
- `--baseline-manifest` accepts a standalone `manifest.json` when the
  original baseline image is no longer available but its manifest
  digest set is known.
- `--dry-run` on `export` and `import` validates reachability and
  manifest structure without touching the filesystem.
- `--compress=zstd` on the outer archive for additional on-wire savings.
- … transparently; manifest bytes are preserved verbatim in the internal
  `dir:` layout to keep target digests stable.

Architecture
Strict `Interface → Service → Domain → Infrastructure` layering:
- `cmd/`: Cobra CLI surface (interface layer).
- `pkg/exporter/`, `pkg/importer/`: service orchestration that wraps
  `containers-image` `ImageSource` / `ImageDestination` interfaces and
  delegates heavy lifting to `go.podman.io/image/v5/copy.Image`.
- `pkg/diff/`: pure domain types, sidecar schema, plan partition.
- `internal/imageio/`, `internal/archive/`, `internal/oci/`:
  infrastructure adapters for transports, tar/zstd packaging, and
  dir-layout helpers.
Testing
- … `pkg/*`, ≥ 60% for `internal/*`).
- … oci-archive / dir) under `pkg/importer/integration_test.go` and
  `cmd/*_integration_test.go`.
- … `testdata/fixtures/` and verified via `testdata/fixtures/CHECKSUMS`.
  Regenerate with `go run ./scripts/build_fixtures`.
- … `lint` and `test` run on every PR and push to `master`;
  `integration` runs nightly and on manual dispatch.

Build and release
- … `containers_image_openpgp`.
- … `linux_{amd64,arm64}` and `darwin_{amd64,arm64}` binaries; the
  `release` workflow triggers on `v*` tags.
- … `go install -tags containers_image_openpgp github.com/leosocy/diffah@latest`.

Usage snapshot
Producer:
    diffah export \
      --target docker://registry.example.com/app:v2 \
      --baseline docker://registry.example.com/app:v1 \
      --platform linux/amd64 \
      --output ./app_v1_to_v2.tar

Consumer:
Preview:
Design document
Full specification in
`docs/superpowers/specs/2026-04-20-diffah-design.md`:
archive format, export / import algorithms, error contracts, testing
strategy, and explicit non-goals.
Post-merge
- … first set of binaries on GitHub Releases.
- … cosign signature verification) are tracked outside this PR.
Test plan
- … passes locally on darwin/arm64 (Go 1.25.4).
- … end-to-end smoke against local fixtures produces a byte-identical
  target image.