fix(compression): correctness findings from compression audit #2803
PR Summary: Low Risk. Reviewed by Cursor Bugbot for commit 6f9e52c.
❌ 2 tests failed (both flagged as ❄️ flaky).
Code Review
The alignment check in WriteAtWithoutLock only verifies that the write is at least one block long; it does not ensure that the total length is a multiple of the block size. A write with a trailing partial block could therefore panic when the buffer is sliced for the final block. The condition should require that len(b) is an exact multiple of c.blockSize.
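A minimal sketch of the stricter check described above. The helper `checkAligned` and the sentinel `errMisaligned` are hypothetical names, not taken from the PR, and the offset check is an added assumption; the review comment itself only asks for the length to be an exact multiple of the block size.

```go
package main

import (
	"errors"
	"fmt"
)

// errMisaligned is a hypothetical sentinel used only in this sketch.
var errMisaligned = errors.New("misaligned write")

// checkAligned (hypothetical) accepts a write only when it is non-empty and
// both the offset and the length are exact multiples of blockSize.
// Checking "length >= blockSize" alone would let a trailing partial block through.
func checkAligned(off int64, n int, blockSize int64) error {
	if blockSize <= 0 {
		return fmt.Errorf("invalid block size %d", blockSize)
	}
	if off < 0 || n == 0 || off%blockSize != 0 || int64(n)%blockSize != 0 {
		return fmt.Errorf("%w: off=%d len=%d blockSize=%d", errMisaligned, off, n, blockSize)
	}
	return nil
}

func main() {
	const blockSize = 4096
	// Two full blocks: accepted.
	fmt.Println(checkAligned(0, 2*blockSize, blockSize))
	// At least one block long, but with a 100-byte tail: rejected.
	fmt.Println(checkAligned(0, blockSize+100, blockSize))
}
```

Rejecting the write up front keeps the per-block slicing loop free of bounds special cases.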
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Idle timer not reset after successful Read completion
- Added timer.Reset(r.idle) after successful reads to give consumers the full idle budget between reads instead of reducing it by the read duration.
Or push these changes by commenting:
@cursor push f40a25dca8
Preview (f40a25dca8)
diff --git a/packages/shared/pkg/storage/storage_google.go b/packages/shared/pkg/storage/storage_google.go
--- a/packages/shared/pkg/storage/storage_google.go
+++ b/packages/shared/pkg/storage/storage_google.go
@@ -303,6 +303,8 @@
 	n, err := r.ReadCloser.Read(p)
 	if err != nil {
 		r.timer.Stop()
+	} else {
+		r.timer.Reset(r.idle)
 	}
 	return n, err
 }
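For context, a self-contained sketch of a per-Read idle timeout wrapper in the spirit of the diff above. The type `idleReader`, its constructor, and the `onIdle` callback are hypothetical; the real reader in storage_google.go presumably cancels a context instead, which this sketch does not reproduce.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"time"
)

// idleReader (hypothetical) calls onIdle if no Read completes within idle.
type idleReader struct {
	io.ReadCloser
	idle  time.Duration
	timer *time.Timer
}

func newIdleReader(rc io.ReadCloser, idle time.Duration, onIdle func()) *idleReader {
	return &idleReader{ReadCloser: rc, idle: idle, timer: time.AfterFunc(idle, onIdle)}
}

// Read re-arms the timer after every successful read, so the consumer gets
// the full idle budget between reads. On error (including io.EOF) the timer
// is stopped because no further reads are expected.
func (r *idleReader) Read(p []byte) (int, error) {
	n, err := r.ReadCloser.Read(p)
	if err != nil {
		r.timer.Stop()
	} else {
		r.timer.Reset(r.idle)
	}
	return n, err
}

func (r *idleReader) Close() error {
	r.timer.Stop()
	return r.ReadCloser.Close()
}

// demo reads a small payload with a generous idle budget and reports
// the payload and whether the idle callback fired.
func demo() (string, bool) {
	fired := false
	r := newIdleReader(io.NopCloser(strings.NewReader("hello")), time.Hour, func() { fired = true })
	defer r.Close()
	data, _ := io.ReadAll(r)
	return string(data), fired
}

func main() {
	data, fired := demo()
	fmt.Println(data, fired)
}
```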
1a7b59d to c8b0411
c8b0411 to d26c7de
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: GCS ErrObjectNotExist check never matches in peerSeekable
- Added error translation in openRangeReader to wrap cloud library ErrObjectNotExist with local package sentinel, matching the pattern used in Size() and WriteTo() methods.
Or push these changes by commenting:
@cursor push 3750f3cd4c
Preview (3750f3cd4c)
diff --git a/packages/shared/pkg/storage/storage_google.go b/packages/shared/pkg/storage/storage_google.go
--- a/packages/shared/pkg/storage/storage_google.go
+++ b/packages/shared/pkg/storage/storage_google.go
@@ -274,6 +274,10 @@
 	if err != nil {
 		cancel()
+		if errors.Is(err, storage.ErrObjectNotExist) {
+			return nil, fmt.Errorf("failed to create GCS range reader for %q at %d+%d: %w", o.path, off, length, ErrObjectNotExist)
+		}
+
 		return nil, fmt.Errorf("failed to create GCS range reader for %q at %d+%d: %w", o.path, off, length, err)
 	}
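The reason the translation matters can be sketched in isolation: `errors.Is` only matches a sentinel that is actually in the wrap chain, so a caller checking the local sentinel never sees the cloud library's one. In the sketch below `errUpstreamNotExist` stands in for the cloud library's sentinel and `ErrObjectNotExist` for the local package sentinel; both, along with `translate` and `isLocalNotExist`, are illustrative names only.

```go
package main

import (
	"errors"
	"fmt"
)

// errUpstreamNotExist stands in for the cloud library's sentinel (assumption).
var errUpstreamNotExist = errors.New("upstream: object doesn't exist")

// ErrObjectNotExist stands in for the local package sentinel callers check.
var ErrObjectNotExist = errors.New("object does not exist")

// translate (hypothetical) wraps the local sentinel when the upstream error
// is a not-exist error, and wraps the original error otherwise.
func translate(path string, off, length int64, err error) error {
	if errors.Is(err, errUpstreamNotExist) {
		return fmt.Errorf("failed to create range reader for %q at %d+%d: %w", path, off, length, ErrObjectNotExist)
	}
	return fmt.Errorf("failed to create range reader for %q at %d+%d: %w", path, off, length, err)
}

// isLocalNotExist is the check a caller such as a peer reader would perform.
func isLocalNotExist(err error) bool {
	return errors.Is(err, ErrObjectNotExist)
}

func main() {
	upstream := fmt.Errorf("open: %w", errUpstreamNotExist)
	// Without translation the local check does not match.
	fmt.Println(isLocalNotExist(upstream))
	// With translation it does.
	fmt.Println(isLocalNotExist(translate("obj", 0, 10, upstream)))
}
```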
ddf1bab to 5584b06
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef6431c76c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Concurrent reads lose transition retry
- Removed the conditional CAS check so all concurrent goroutines in the retry window receive PeerTransitionedError and can retry after header swap.
Or push these changes by commenting:
@cursor push 2c923d7144
Preview (2c923d7144)
diff --git a/packages/orchestrator/pkg/sandbox/template/peerclient/seekable.go b/packages/orchestrator/pkg/sandbox/template/peerclient/seekable.go
--- a/packages/orchestrator/pkg/sandbox/template/peerclient/seekable.go
+++ b/packages/orchestrator/pkg/sandbox/template/peerclient/seekable.go
@@ -145,9 +145,8 @@
 	if errors.Is(err, storage.ErrObjectNotExist) {
 		at := s.transitionAt.Load()
 		if at != 0 && time.Since(time.Unix(0, at)) < postTransitionRetryWindow {
-			if s.transitionAt.CompareAndSwap(at, 0) {
-				return nil, &storage.PeerTransitionedError{}
-			}
+			s.transitionAt.CompareAndSwap(at, 0)
+			return nil, &storage.PeerTransitionedError{}
 		}
 	}
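The effect of dropping the conditional can be shown deterministically by simulating two readers that both loaded the transition timestamp before either cleared it. This is only a sketch: the `seekable` type, `classifyLoaded`, `errTransitioned`, and the one-minute `retryWindow` are stand-ins, not the orchestrator's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

var (
	errNotExist     = errors.New("object does not exist")
	errTransitioned = errors.New("peer transitioned") // stand-in for PeerTransitionedError
)

const retryWindow = time.Minute // hypothetical value

type seekable struct {
	transitionAt atomic.Int64 // UnixNano of the transition, 0 when cleared
}

// classifyLoaded applies the unconditional variant: any reader whose loaded
// timestamp is non-zero and inside the window gets errTransitioned, whether
// or not it is the one that wins the CompareAndSwap.
func (s *seekable) classifyLoaded(at int64, err error) error {
	if errors.Is(err, errNotExist) && at != 0 && time.Since(time.Unix(0, at)) < retryWindow {
		s.transitionAt.CompareAndSwap(at, 0)
		return errTransitioned
	}
	return err
}

func (s *seekable) classify(err error) error {
	return s.classifyLoaded(s.transitionAt.Load(), err)
}

// demo simulates two readers that loaded the same timestamp, then a late one.
func demo() (first, second, late bool) {
	s := &seekable{}
	s.transitionAt.Store(time.Now().UnixNano())
	at := s.transitionAt.Load() // both readers observed this value
	first = errors.Is(s.classifyLoaded(at, errNotExist), errTransitioned)
	second = errors.Is(s.classifyLoaded(at, errNotExist), errTransitioned)
	late = errors.Is(s.classify(errNotExist), errTransitioned)
	return first, second, late
}

func main() {
	fmt.Println(demo())
}
```

Note that a reader that loads the timestamp only after it has been cleared (the `late` case) still receives the plain not-exist error; the change covers readers already inside the retry branch.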
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 27eeb26591
6b02855 to 393ddbe
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Stat failure deletes valid cache
- Modified logic to only remove cached files on confirmed size mismatch, treating Stat failures as transient errors that fall through to cache miss without deletion.
Or push these changes by commenting:
@cursor push 93e492f915
Preview (93e492f915)
diff --git a/packages/shared/pkg/storage/storage_cache_seekable_compressed.go b/packages/shared/pkg/storage/storage_cache_seekable_compressed.go
--- a/packages/shared/pkg/storage/storage_cache_seekable_compressed.go
+++ b/packages/shared/pkg/storage/storage_cache_seekable_compressed.go
@@ -32,23 +32,29 @@
 	// Cache hit: open compressed frame from NFS, validate size, wrap with decompressor.
 	// On size mismatch the file is corrupt (truncated write, disk full), drop it.
 	if f, err := os.Open(path); err == nil {
-		if fi, statErr := f.Stat(); statErr == nil && fi.Size() == int64(r.Length) {
-			recordCacheRead(ctx, true, int64(r.Length), cacheTypeSeekable, cacheOpOpenRangeReader)
-			timer.Success(ctx, int64(r.Length))
+		if fi, statErr := f.Stat(); statErr == nil {
+			if fi.Size() == int64(r.Length) {
+				recordCacheRead(ctx, true, int64(r.Length), cacheTypeSeekable, cacheOpOpenRangeReader)
+				timer.Success(ctx, int64(r.Length))
-			decompressed, err := newDecompressingReadCloser(f, frameTable.CompressionType())
-			if err != nil {
-				f.Close()
+				decompressed, err := newDecompressingReadCloser(f, frameTable.CompressionType())
+				if err != nil {
+					f.Close()
-				return nil, fmt.Errorf("decompress cached frame: %w", err)
+					return nil, fmt.Errorf("decompress cached frame: %w", err)
+				}
+
+				return withNFSGauge(ctx, decompressed), nil
 			}
-
-			return withNFSGauge(ctx, decompressed), nil
+			f.Close()
+			_ = os.Remove(path)
+			recordCacheReadError(ctx, cacheTypeSeekable, cacheOpOpenRangeReader,
+				fmt.Errorf("cached frame %s size mismatch", path))
+		} else {
+			f.Close()
+			recordCacheReadError(ctx, cacheTypeSeekable, cacheOpOpenRangeReader,
+				fmt.Errorf("cached frame %s stat failed: %w", path, statErr))
 		}
-		f.Close()
-		_ = os.Remove(path)
-		recordCacheReadError(ctx, cacheTypeSeekable, cacheOpOpenRangeReader,
-			fmt.Errorf("cached frame %s invalid", path))
 	} else if !os.IsNotExist(err) {
 		recordCacheReadError(ctx, cacheTypeSeekable, cacheOpOpenRangeReader, err)
 	}
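The decision the diff above encodes can be separated from the I/O, which makes the three outcomes easy to see: a matching size is a hit, a confirmed mismatch deletes the entry, and a failed Stat leaves the file alone. The sketch below is an illustration under that reading; `decide`, `openValidated`, `cacheAction`, and `errStatFailed` are hypothetical names, and metrics and decompression are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

type cacheAction int

const (
	useCache  cacheAction = iota // size matches: serve from cache
	dropCache                    // confirmed size mismatch: delete the entry
	skipCache                    // Stat failed: treat as a miss, keep the file
)

// errStatFailed is a placeholder error used to exercise the Stat-failure path.
var errStatFailed = errors.New("stat failed")

// decide (hypothetical) only drops the entry on a confirmed mismatch.
func decide(size int64, statErr error, want int64) cacheAction {
	if statErr != nil {
		return skipCache
	}
	if size != want {
		return dropCache
	}
	return useCache
}

// openValidated (hypothetical) opens a cached file and applies decide.
func openValidated(path string, want int64) (*os.File, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	var size int64
	fi, statErr := f.Stat()
	if statErr == nil {
		size = fi.Size()
	}
	switch decide(size, statErr, want) {
	case useCache:
		return f, nil
	case dropCache:
		f.Close()
		_ = os.Remove(path)
		return nil, fmt.Errorf("cached frame %s size mismatch", path)
	default:
		f.Close()
		return nil, fmt.Errorf("cached frame %s stat failed: %w", path, statErr)
	}
}

// demo writes a 5-byte entry, reads it with the right and then the wrong
// expected size, and reports (hit, removedAfterMismatch).
func demo() (hit, removed bool) {
	dir, err := os.MkdirTemp("", "cache-sketch")
	if err != nil {
		return false, false
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "frame")
	if err := os.WriteFile(path, []byte("12345"), 0o600); err != nil {
		return false, false
	}
	if f, err := openValidated(path, 5); err == nil {
		hit = true
		f.Close()
	}
	_, err = openValidated(path, 9)
	_, statErr := os.Stat(path)
	removed = err != nil && errors.Is(statErr, os.ErrNotExist)
	return hit, removed
}

func main() {
	fmt.Println(demo())
}
```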
Reviewed by Cursor Bugbot for commit 393ddbe.
c5d0197 to ea94b8f
- reject misaligned writes in Cache.WriteAtWithoutLock to avoid OOB slice
- cap V4 header LZ4 decompression at the size prefix
- drop corrupt compressed cache entries on size mismatch
- retry GCS 404 within post-transition window with backoff
- use per-Read idle timeout on GCS range reads
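As an illustration of the "retry within a window with backoff" item, here is a generic sketch. It is not the PR's implementation: `retryNotFound`, `errNotFound`, the doubling backoff, and the durations used in the demos are all assumptions chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound stands in for a 404 / not-exist error (assumption).
var errNotFound = errors.New("not found")

// retryNotFound (hypothetical) retries op while it returns errNotFound,
// doubling the backoff each time, and gives up once the next sleep would
// pass the end of the window. Other errors and success return immediately.
func retryNotFound(window, backoff time.Duration, op func() error) error {
	deadline := time.Now().Add(window)
	for {
		err := op()
		if !errors.Is(err, errNotFound) {
			return err // success or a non-retryable error
		}
		if time.Now().Add(backoff).After(deadline) {
			return err // window exhausted
		}
		time.Sleep(backoff)
		backoff *= 2
	}
}

// demo: the object "appears" on the third attempt.
func demo() (int, error) {
	attempts := 0
	err := retryNotFound(time.Second, time.Millisecond, func() error {
		attempts++
		if attempts < 3 {
			return errNotFound
		}
		return nil
	})
	return attempts, err
}

// demoExpired: with a zero-length window there is exactly one attempt.
func demoExpired() (int, error) {
	attempts := 0
	err := retryNotFound(0, time.Millisecond, func() error {
		attempts++
		return errNotFound
	})
	return attempts, err
}

func main() {
	fmt.Println(demo())
	fmt.Println(demoExpired())
}
```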
ea94b8f to 819c7a3
The API takes ~20s after it starts listening before its readiness gate clears (cluster init), so the default 30s polling window can lose the race and report 'API failed to become healthy in time' on slower runners. Bump only the API leg to 60s.


Compression-audit correctness fixes:
Cache.WriteAtWithoutLock