Optimize deploy upload with delta file transfer#670
Conversation
📝 WalkthroughWalkthroughAdds a manifest-based delta-upload workflow and per-app source caching to the builder. New RPC types FileManifestEntry and PrepareUploadResult and RPC methods PrepareUpload and BuildFromPrepared were introduced. Server-side builder now tracks upload sessions, stages prepared uploads, and can build from prepared sessions; NewBuilder gains a dataPath parameter and Builder contains a DataPath field. pkg/tarx gained manifest computation and filtered-tar utilities. CLI deploy was updated to use prepared uploads when available. Tests added for tarx and the source cache. 📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
pkg/tarx/tarx_test.go (1)
374-389: Determinism test is valuable but could be strengthened.The test verifies hash consistency across runs, which is essential for delta comparison. Consider adding a brief delay or file modification between runs to ensure the test isn't trivially passing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tarx/tarx_test.go` around lines 374 - 389, In TestComputeManifestDeterministic, strengthen the determinism check by changing the file system state between ComputeManifest calls (e.g., sleep briefly and update the test file's modification time or rewrite the same content) so the second ComputeManifest(tmpDir, nil) runs after a visible change; keep assertions the same to ensure ComputeManifest still returns identical hashes. Use the existing TestComputeManifestDeterministic test and the ComputeManifest function as anchors when making the change.pkg/tarx/tarx.go (1)
326-354: Lazy directory emission logic is correct but silently ignores errors.Lines 343-349 silently continue on
os.Statortar.FileInfoHeadererrors for parent directories. While this prevents one bad directory from blocking the entire tar, consider logging these errors for debugging.Consider logging directory emission errors
dirInfo, dirErr := os.Stat(filepath.Join(dir, dirPath)) if dirErr != nil { + // Log but continue - parent dir might be missing in filtered scenarios continue } dirHdr, hdrErr := tar.FileInfoHeader(dirInfo, "") if hdrErr != nil { + // Log but continue continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tarx/tarx.go` around lines 326 - 354, The parent-directory emission loop silently swallows errors from os.Stat and tar.FileInfoHeader (inside the loop that uses parts, dirPath, emittedDirs and writes with tw.WriteHeader), making debugging hard; change the two continue branches to log their errors (including dirPath and the returned error) before continuing — use the project's logger or log.Printf if none exists — so os.Stat(filepath.Join(dir, dirPath)) and tar.FileInfoHeader(dirInfo, "") failures are recorded with clear context while still continuing to the next parent directory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@servers/build/build.go`:
- Around line 919-933: BuildFromPrepared currently consumes a prepared session
without re-checking per-app authorization; before calling
b.sessions.LoadAndDelete (or immediately after retrieving sess but before using
sess.appName), perform the same app authorization check used in PrepareUpload
(e.g., call rpc.AllowApp or the equivalent auth helper) against sess.appName and
the request context, and return an authorization error if it fails; ensure the
check runs before any session state is consumed or sess.cancelFunc/dir cleanup
is invoked so unauthorized callers cannot use a valid sessionID to build for
another app.
- Around line 870-884: The session stored after PrepareUpload only saves dir and
appName so BuildFromPrepared can't detect missing uploads; fix by adding the
negotiated delta set (the "needed" slice/paths returned by
stageMatchingFiles/PrepareUpload) to buildSession (e.g., expectedPaths or
neededPaths) when creating the session, and then in BuildFromPrepared validate
that each expected path exists in the extracted tempDir before calling
buildFromDir; if any are missing, clean up and return an error. Ensure you
update the buildSession struct and references in PrepareUpload and
BuildFromPrepared to use the new field.
In `@servers/build/source_cache.go`:
- Around line 216-235: The Walk callback opens each regular file with
os.Open(path) and defers f.Close(), which delays closing until after Walk
returns and can exhaust file descriptors; replace defer f.Close() with an
explicit f.Close() immediately after io.Copy(w, f) (ensure it still runs on copy
error by closing in a short defer-like pattern or via a named cleanup call),
keeping the existing error handling and manifest assignment (references: the
local variables f/fErr, os.Open, io.Copy(w, f), f.Close(), manifest and
cachedFileInfo).
---
Nitpick comments:
In `@pkg/tarx/tarx_test.go`:
- Around line 374-389: In TestComputeManifestDeterministic, strengthen the
determinism check by changing the file system state between ComputeManifest
calls (e.g., sleep briefly and update the test file's modification time or
rewrite the same content) so the second ComputeManifest(tmpDir, nil) runs after
a visible change; keep assertions the same to ensure ComputeManifest still
returns identical hashes. Use the existing TestComputeManifestDeterministic test
and the ComputeManifest function as anchors when making the change.
In `@pkg/tarx/tarx.go`:
- Around line 326-354: The parent-directory emission loop silently swallows
errors from os.Stat and tar.FileInfoHeader (inside the loop that uses parts,
dirPath, emittedDirs and writes with tw.WriteHeader), making debugging hard;
change the two continue branches to log their errors (including dirPath and the
returned error) before continuing — use the project's logger or log.Printf if
none exists — so os.Stat(filepath.Join(dir, dirPath)) and
tar.FileInfoHeader(dirInfo, "") failures are recorded with clear context while
still continuing to the next parent directory.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 532942f7-9214-4db0-9642-7112b8fa0b2d
📒 Files selected for processing (10)
api/build/build_v1alpha/rpc.gen.goapi/build/rpc.ymlcli/commands/deploy.gocli/commands/deploy_ui.gocomponents/coordinate/coordinate.gopkg/tarx/tarx.gopkg/tarx/tarx_test.goservers/build/build.goservers/build/source_cache.goservers/build/source_cache_test.go
phinze
left a comment
There was a problem hiding this comment.
Claude and I dug into this one together — traced the full flow from manifest computation through cache staging and the new build paths. I've already dubbed this "mirsync" in my head.
Really like the design. The fallback-to-full-upload approach means this is pure upside with no risk to existing deploys, and the buildFromDir extraction is a clean refactor.
CodeRabbit flagged a few things (auth check on BuildFromPrepared, session path tracking, deferred close in Walk) — skipping those here since you've seen them.
Two small optional nits below. Also, food for thought: now that we have per-file sizes in the manifest and the cached layer.tar.gz on disk, we've got everything needed to estimate compression ratio and show real upload progress bars with ETAs + rsync-style speedup stats. Going to file a follow-up for that.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
servers/build/source_cache.go (2)
193-198: Consider checkingChmoderror.The
Chmodcall on line 197 silently ignores any error. While permission failures are rare and the file is still usable, logging or returning the error would help diagnose issues on restrictive filesystems.🔧 Proposed fix
if _, err := io.Copy(outFile, tr); err != nil { outFile.Close() return err } - outFile.Chmod(os.FileMode(hdr.Mode) & os.ModePerm) + if err := outFile.Chmod(os.FileMode(hdr.Mode) & os.ModePerm); err != nil { + outFile.Close() + return fmt.Errorf("chmod %s: %w", outPath, err) + } outFile.Close()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@servers/build/source_cache.go` around lines 193 - 198, The outFile.Chmod call currently ignores any returned error; modify the block in the function that performs io.Copy (the code using io.Copy(outFile, tr) then outFile.Chmod(os.FileMode(hdr.Mode) & os.ModePerm) and outFile.Close) to capture the Chmod error, ensure outFile is closed on all paths, and either return the Chmod error (or log it and return a wrapped error) so permission failures are not silently ignored; preserve the existing behavior of closing outFile (use defer or explicit closes) and return the captured error if non-nil.
48-53: Potential panic ifunlockcalled for non-existent app.If
unlockis called for an app that was never locked (e.g., due to a coding error), the map lookup will returnnil, andl.Unlock()will panic.While currently the code always pairs
lock/unlockcorrectly, adding a defensive check prevents future bugs.🛡️ Proposed defensive check
func (al *appLocks) unlock(app string) { al.mu.Lock() - l := al.locks[app] + l, ok := al.locks[app] al.mu.Unlock() + if !ok { + return // or log a warning + } l.Unlock() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@servers/build/source_cache.go` around lines 48 - 53, The unlock method may panic if the map lookup returns nil; update appLocks.unlock to defensively handle a missing lock by checking the retrieved pointer (l) from the locks map before calling l.Unlock(). Specifically, inside appLocks.unlock (which uses al.mu and al.locks), hold al.mu while reading al.locks[app], if l == nil then release al.mu and return (or log a warning) instead of calling l.Unlock(); otherwise release al.mu and call l.Unlock() as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@servers/build/source_cache.go`:
- Around line 193-198: The outFile.Chmod call currently ignores any returned
error; modify the block in the function that performs io.Copy (the code using
io.Copy(outFile, tr) then outFile.Chmod(os.FileMode(hdr.Mode) & os.ModePerm) and
outFile.Close) to capture the Chmod error, ensure outFile is closed on all
paths, and either return the Chmod error (or log it and return a wrapped error)
so permission failures are not silently ignored; preserve the existing behavior
of closing outFile (use defer or explicit closes) and return the captured error
if non-nil.
- Around line 48-53: The unlock method may panic if the map lookup returns nil;
update appLocks.unlock to defensively handle a missing lock by checking the
retrieved pointer (l) from the locks map before calling l.Unlock().
Specifically, inside appLocks.unlock (which uses al.mu and al.locks), hold al.mu
while reading al.locks[app], if l == nil then release al.mu and return (or log a
warning) instead of calling l.Unlock(); otherwise release al.mu and call
l.Unlock() as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a53a8a54-e036-406e-a43d-6910efb12d6c
📒 Files selected for processing (3)
servers/build/build.goservers/build/source_cache.goservers/build/source_cache_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- servers/build/source_cache_test.go
Instead of uploading a complete tar of all source files on every deploy, the client now sends a file manifest first, lets the server compare against a cached source code image, and only uploads changed files. This significantly reduces upload time for iterative development where typically only a few files change between deploys. New RPC methods: prepareUpload and buildFromPrepared. The original buildFromTar remains unchanged for backward compatibility and fallback.
- Add app authorization check in BuildFromPrepared - Fix deferred file close leak inside Walk callback in saveSourceImage - Add debug log for corrupt manifest.json in source cache - Reorder rename calls for safer crash recovery (manifest before layer) - Add per-app mutex to serialize concurrent cache reads and writes
febd363 to
8740097
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/tarx/tarx.go (1)
265-378:⚠️ Potential issue | 🟠 MajorError from
filepath.Walkis silently discarded in the goroutine.The goroutine at line 272 calls
filepath.Walkbut ignores its return value. If the walk fails (e.g., permission denied on a subdirectory), the error is lost, and the caller receives a valid-looking but incomplete tar stream.Additionally, using
defer f.Close()inside the loop (line 371) accumulates deferred closures until the goroutine exits, which can exhaust file descriptors on large directory trees.🛠️ Suggested direction
go func() { + var walkErr error defer w.Close() defer tw.Close() defer gzw.Close() emittedDirs := make(map[string]bool) - filepath.Walk(dir, func(path string, info os.FileInfo, err error) error { + walkErr = filepath.Walk(dir, func(path string, info os.FileInfo, err error) error { // ... existing code ... if info.Mode().IsRegular() { f, fErr := os.Open(path) if fErr != nil { return fErr } - defer f.Close() if _, err := io.Copy(tw, f); err != nil { + f.Close() return err } + f.Close() } return nil }) + if walkErr != nil { + // Consider closing the pipe with an error so the reader sees it + w.CloseWithError(walkErr) + } }()Using
*os.Filewith an explicitClose()call afterio.Copyavoids accumulating deferred closures. To propagatewalkErrto the reader, useio.Pipe()which supportsCloseWithError.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tarx/tarx.go` around lines 265 - 378, The goroutine ignores the error returned by filepath.Walk and defers file closes inside the loop, which can leak descriptors; change the code so you capture the error returned by filepath.Walk(...) and if non-nil call the pipe writer's CloseWithError(err) (w.CloseWithError) instead of silently returning, and replace the deferred f.Close() inside the file-handling branch with an explicit f.Close() immediately after io.Copy completes (use a short-lived close and handle its error), ensuring emittedDirs/tw/gzw are still closed in the goroutine's defer but the walk error is propagated to the reader.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/tarx/tarx.go`:
- Around line 342-349: The code silently skips errors when calling os.Stat
(dirErr) and tar.FileInfoHeader (hdrErr) for parent directories (variables
dirInfo/dirErr and dirHdr/hdrErr), which can produce malformed tars; change the
behavior to surface failures instead of silently continuing by returning or
wrapping the error (with context including dirPath and the operation) from the
surrounding tar creation function, or at minimum log the error and fail the tar
write; ensure you handle the error paths for both os.Stat and tar.FileInfoHeader
consistently so parent directory entries are created or the caller is informed
of the failure.
---
Outside diff comments:
In `@pkg/tarx/tarx.go`:
- Around line 265-378: The goroutine ignores the error returned by filepath.Walk
and defers file closes inside the loop, which can leak descriptors; change the
code so you capture the error returned by filepath.Walk(...) and if non-nil call
the pipe writer's CloseWithError(err) (w.CloseWithError) instead of silently
returning, and replace the deferred f.Close() inside the file-handling branch
with an explicit f.Close() immediately after io.Copy completes (use a
short-lived close and handle its error), ensuring emittedDirs/tw/gzw are still
closed in the goroutine's defer but the walk error is propagated to the reader.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8aa21a85-875d-4e98-851d-8f8ab3e1b1ab
📒 Files selected for processing (10)
api/build/build_v1alpha/rpc.gen.goapi/build/rpc.ymlcli/commands/deploy.gocli/commands/deploy_ui.gocomponents/coordinate/coordinate.gopkg/tarx/tarx.gopkg/tarx/tarx_test.goservers/build/build.goservers/build/source_cache.goservers/build/source_cache_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- servers/build/source_cache.go
Summary
prepareUploadandbuildFromPreparedon the Builder service{DataPath}/source-code/{app}/with atomic writes to prevent corruptionMakeTar/MakeFilteredTarto share a commonmakeTarWithFilterimplementation with lazy directory emissionTest plan
ComputeManifestandMakeFilteredTarinpkg/tarx/tarx_test.goservers/build/source_cache_test.gomake testpasses (3579 tests)make lintclean