
Add rootfs reboot fallback for snapshots #2875

Draft
ValentaTomas wants to merge 15 commits into main from cova/rootfs-reboot-snapshot-fallback

@ValentaTomas
Member

Summary

  • Add a rootfs reboot fallback for paused snapshots when memory artifacts are skipped or unavailable.
  • Allow snapshot creation to request disk-only persistence while keeping normal memory snapshots as the default.
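The "disk-only unless stated otherwise" behaviour described above can be sketched as an optional flag whose absence means the default. This is only an illustration: `SnapshotRequest` and `wantsMemory` are hypothetical names, not the types used in this PR, and the real request shape is not shown in this conversation.

```go
package main

import "fmt"

// SnapshotRequest is a hypothetical request type. The Memory field mirrors
// the optional "memory" setting mentioned in this PR; the actual API type
// is an assumption.
type SnapshotRequest struct {
	// Memory is a pointer so "not set" can be told apart from "false".
	Memory *bool
}

// wantsMemory reports whether a memory snapshot should be captured.
// A nil flag means the caller made no choice, so the default (true) applies.
func wantsMemory(req SnapshotRequest) bool {
	if req.Memory == nil {
		return true
	}
	return *req.Memory
}

func main() {
	diskOnly := false
	fmt.Println(wantsMemory(SnapshotRequest{}))                  // flag omitted
	fmt.Println(wantsMemory(SnapshotRequest{Memory: &diskOnly})) // disk-only requested
}
```

Using a pointer rather than a plain `bool` keeps existing callers, which never set the flag, on the normal memory-snapshot path.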

Tests

  • go test ./internal/handlers ./internal/orchestrator ./internal/sandbox/...
  • go test ./pkg/server ./pkg/sandbox ./pkg/template/metadata -run '^$'
  • go test ./pkg/types
  • go test ./internal/api

@cla-bot cla-bot Bot added the cla-signed label May 30, 2026
@cursor

cursor Bot commented May 30, 2026

PR Summary

High Risk
Changes core sandbox create, pause, checkpoint, and upload paths; incorrect memory/reboot handling could lose process state or fail restores for existing snapshots.

Overview
This PR extends snapshot and resume so callers can keep only the filesystem (optional memory: false on snapshot/pause flows) and start again from a fresh VM backed by rootfs instead of restoring RAM. Connect and resume accept an optional reboot flag to force that path; the orchestrator can also fall back automatically when memory snapshot objects are missing. Disk-only pause skips memfile/snap upload work, runs a guest sync before snapshot, and uses rootfs reboot creation (empty memfile, systemd init) on the next start or after checkpoint when memory was not captured.
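A minimal sketch of the start-path decision this overview describes, assuming three inputs: an explicit reboot request, whether memory was captured at pause time, and whether the memory snapshot objects can be found. All names here (`ResumeInput`, `chooseStartMode`, the mode constants) are hypothetical and do not come from the PR's code.

```go
package main

import "fmt"

// StartMode says how a paused sandbox is brought back.
type StartMode string

const (
	RestoreMemory    StartMode = "restore-memory"
	RebootFromRootfs StartMode = "reboot-from-rootfs"
)

// ResumeInput is a hypothetical summary of what the resume path knows.
type ResumeInput struct {
	Reboot             bool // caller explicitly asked for a fresh boot
	MemoryCaptured     bool // the pause or snapshot included memory
	MemoryObjectsFound bool // memory snapshot objects exist in storage
}

// chooseStartMode picks the rootfs reboot path when it is forced, when
// memory was never captured, or when the memory objects are missing.
func chooseStartMode(in ResumeInput) StartMode {
	if in.Reboot {
		return RebootFromRootfs
	}
	if !in.MemoryCaptured || !in.MemoryObjectsFound {
		return RebootFromRootfs
	}
	return RestoreMemory
}

func main() {
	fmt.Println(chooseStartMode(ResumeInput{MemoryCaptured: true, MemoryObjectsFound: true}))
	fmt.Println(chooseStartMode(ResumeInput{MemoryCaptured: true})) // objects missing
}
```

Keeping the decision in one pure function makes the risk called out above easier to test: every combination that should preserve process state can be asserted to return the memory-restore mode.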

Reviewed by Cursor Bugbot for commit f162320. Bugbot is set up for automated code reviews on this repo.

@codecov

codecov Bot commented May 30, 2026

❌ 4 Tests Failed:

Tests completed | Failed | Passed | Skipped
2704            | 4      | 2700   | 7
View the full list of 4 ❄️ flaky test(s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/sandboxes::TestSandboxListPaginationRunningLargerLimit

Flake rate in main: 42.80% (Passed 747 times, Failed 559 times)

Stack Traces | 96.7s run time
=== RUN   TestSandboxListPaginationRunningLargerLimit
    sandbox_list_test.go:327: Created sandbox 1/12: imyb7ylrlkm4ju1qynlvs
    sandbox_list_test.go:327: Created sandbox 2/12: iiw2pmt1s80g7v82vgyo1
    sandbox_list_test.go:327: Created sandbox 3/12: i5xaa20o4z80nyykjkfca
    sandbox_list_test.go:327: Created sandbox 4/12: ig7mfvut86f8xzn6kh0ic
    sandbox_list_test.go:327: Created sandbox 5/12: il95214ymogu939vb68jc
    sandbox_list_test.go:327: Created sandbox 6/12: ii2gxqbwy3y1f3oxnj426
    sandbox_list_test.go:327: Created sandbox 7/12: iw700rqxq4jkofp41zxso
    sandbox_list_test.go:327: Created sandbox 8/12: ipvu6exljj8z1vsie2v3u
    sandbox_list_test.go:327: Created sandbox 9/12: ii8xqk5q7oudkh3wd1fto
    sandbox_list_test.go:327: Created sandbox 10/12: izkesmnvxwo3dvlww6i4r
    sandbox_list_test.go:327: Created sandbox 11/12: ixao6zu6ubp9lrqmhx1b5
    sandbox_list_test.go:327: Created sandbox 12/12: ivlrcdoio9vcetd0uz9x6
    sandbox_list_test.go:330: 
        	Error Trace:	.../api/sandboxes/sandbox_list_test.go:340
        	            				.../hostedtoolcache/go/1.26.3.../src/runtime/asm_amd64.s:1771
        	Error:      	"[]" should have 12 item(s), but has 0
    sandbox_list_test.go:330: 
        	Error Trace:	.../api/sandboxes/sandbox_list_test.go:330
        	Error:      	Condition never satisfied
        	Test:       	TestSandboxListPaginationRunningLargerLimit
--- FAIL: TestSandboxListPaginationRunningLargerLimit (96.70s)
github.com/e2b-dev/infra/tests/integration/internal/tests/orchestrator::TestSandboxMemoryIntegrity

Flake rate in main: 57.71% (Passed 740 times, Failed 1010 times)

Stack Traces | 65.9s run time
=== RUN   TestSandboxMemoryIntegrity
=== PAUSE TestSandboxMemoryIntegrity
=== CONT  TestSandboxMemoryIntegrity
    sandbox_memory_integrity_test.go:27: Build completed successfully
--- FAIL: TestSandboxMemoryIntegrity (65.87s)
github.com/e2b-dev/infra/tests/integration/internal/tests/orchestrator::TestSandboxMemoryIntegrity/tmpfs_hash

Flake rate in main: 57.83% (Passed 730 times, Failed 1001 times)

Stack Traces | 202s run time
=== RUN   TestSandboxMemoryIntegrity/tmpfs_hash
=== PAUSE TestSandboxMemoryIntegrity/tmpfs_hash
=== CONT  TestSandboxMemoryIntegrity/tmpfs_hash
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{start:{pid:1264}}
Executing command bash in sandbox iovt1eldinalctxmbpfov (user: root)
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Total memory: 985 MB\nUsed memory before tmpfs mount: 194 MB\nFree memory before tmpfs mount: 790 MB\nMemory to use in integrity test (60% of free, min 64MB): 474 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"474+0 records in\n474+0 records out\n497025024 bytes (497 MB, 474 MiB) copied, 1.9737 s, 252 MB/s\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"\tCommand being timed: \"dd if=/dev/urandom of=/mnt/testfile bs=1M count=474\"\n\tUser time (seconds): 0.00\n\tSystem time (seconds): 1.96\n\tPercent of CPU this job got: 99%\n\tElapsed (wall clock) time (h:mm:ss or m:ss): 0:01.97\n\tAverage shared text size (kbytes): 0\n\tAverage unshared data size (kbytes): 0\n\tAverage stack size (kbytes): 0\n\tAverage total size (kbytes): 0\n\tMaximum resident set size (kbytes): 2612\n\tAverage resident set size (kbytes): 0\n\tMajor (requiring I/O) page faults: 3\n\tMinor (reclaiming a frame) page faults: 340\n\tVoluntary context switches: 4\n\tInvoluntary context switches: 30\n\tSwaps: 0\n\tFile system inputs: 176\n\tFile system outputs: 0\n\tSocket messages sent: 0\n\tSocket messages received: 0\n\tSignals delivered: 0\n\tPage size (bytes): 4096\n\tExit status: 0\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Used memory after tmpfs mount and file fill: 668 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{end:{exited:true  status:"exit status 0"}}
    sandbox_memory_integrity_test.go:70: Command [bash] completed successfully in sandbox i51vg7ahzoj54w37hfsqm
Executing command bash in sandbox i51vg7ahzoj54w37hfsqm (user: root)
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{start:{pid:1280}}
Executing command bash in sandbox ihybdyoj9jz98jgej9tpy (user: root)
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{data:{stdout:"48678be3d83684ac3b6412c86fd09bc41e2314f8bb4712d5278648d8e7af904b\n"}}
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{end:{exited:true  status:"exit status 0"}}
    sandbox_memory_integrity_test.go:80: Command [bash] completed successfully in sandbox i51vg7ahzoj54w37hfsqm
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{start:{pid:1283}}
Executing command bash in sandbox i51vg7ahzoj54w37hfsqm (user: root)
[previous line repeated 81 more times]
    sandbox_memory_integrity_test.go:110: 
        	Error Trace:	.../tests/orchestrator/sandbox_memory_integrity_test.go:81
        	            				.../hostedtoolcache/go/1.26.3.../src/runtime/asm_amd64.s:1771
        	Error:      	Received unexpected error:
        	            	failed to execute command bash in sandbox i51vg7ahzoj54w37hfsqm: unavailable: HTTP status 502 Bad Gateway
    sandbox_memory_integrity_test.go:110: 
        	Error Trace:	.../tests/orchestrator/sandbox_memory_integrity_test.go:78
        	            				.../tests/orchestrator/sandbox_memory_integrity_test.go:110
        	Error:      	Condition never satisfied
        	Test:       	TestSandboxMemoryIntegrity/tmpfs_hash
--- FAIL: TestSandboxMemoryIntegrity/tmpfs_hash (201.72s)
github.com/e2b-dev/infra/tests/integration/internal/tests/proxies::TestEnvdAccessTokenAutoResumeViaProxy

Flake rate in main: 42.99% (Passed 736 times, Failed 555 times)

Stack Traces | 10.8s run time
=== RUN   TestEnvdAccessTokenAutoResumeViaProxy
=== PAUSE TestEnvdAccessTokenAutoResumeViaProxy
=== CONT  TestEnvdAccessTokenAutoResumeViaProxy
    traffic_access_token_test.go:357: 
        	Error Trace:	.../tests/proxies/traffic_access_token_test.go:357
        	Error:      	Received unexpected error:
        	            	Get "http://localhost:3002/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
        	Test:       	TestEnvdAccessTokenAutoResumeViaProxy
--- FAIL: TestEnvdAccessTokenAutoResumeViaProxy (10.84s)


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The empty memfile created by block.NewEmpty in createSandboxFromRootfs is never closed, leading to a file descriptor and resource leak on both successful and failed sandbox creation paths. This should be resolved by deferring a call to close the memfile immediately after its creation.

Comment thread packages/orchestrator/pkg/server/sandboxes.go

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Rootfs reboot breaks memory snapshots
    • Removed empty memfile masking in createSandboxFromRootfs so pause/checkpoint operations use the underlying template's memfile as the appropriate base for memory diffing instead of an incorrect empty memfile.

To push these changes, comment:

@cursor push c0f29e8484
Preview (c0f29e8484)
diff --git a/packages/orchestrator/pkg/server/sandboxes.go b/packages/orchestrator/pkg/server/sandboxes.go
--- a/packages/orchestrator/pkg/server/sandboxes.go
+++ b/packages/orchestrator/pkg/server/sandboxes.go
@@ -31,7 +31,6 @@
 	sbxtemplate "github.com/e2b-dev/infra/packages/orchestrator/pkg/sandbox/template"
 	"github.com/e2b-dev/infra/packages/orchestrator/pkg/template/constants"
 	"github.com/e2b-dev/infra/packages/orchestrator/pkg/template/metadata"
-	"github.com/e2b-dev/infra/packages/orchestrator/pkg/units"
 	"github.com/e2b-dev/infra/packages/shared/pkg/events"
 	fcmodels "github.com/e2b-dev/infra/packages/shared/pkg/fc/models"
 	"github.com/e2b-dev/infra/packages/shared/pkg/featureflags"
@@ -39,7 +38,6 @@
 	"github.com/e2b-dev/infra/packages/shared/pkg/logger"
 	sbxlogger "github.com/e2b-dev/infra/packages/shared/pkg/logger/sandbox"
 	"github.com/e2b-dev/infra/packages/shared/pkg/storage"
-	"github.com/e2b-dev/infra/packages/shared/pkg/storage/header"
 	"github.com/e2b-dev/infra/packages/shared/pkg/telemetry"
 	"github.com/e2b-dev/infra/packages/shared/pkg/utils"
 )
@@ -315,27 +313,7 @@
 	runtime sandbox.RuntimeMetadata,
 	req *orchestrator.SandboxCreateRequest,
 ) (*sandbox.Sandbox, error) {
-	pageSize := int64(header.PageSize)
-	if config.HugePages {
-		pageSize = int64(header.HugepageSize)
-	}
-
-	buildID, err := uuid.Parse(template.Files().BuildID)
-	if err != nil {
-		return nil, fmt.Errorf("parse build id: %w", err)
-	}
-
-	memfile, err := block.NewEmpty(
-		units.MBToBytes(config.RamMB),
-		pageSize,
-		buildID,
-	)
-	if err != nil {
-		return nil, fmt.Errorf("create empty memfile: %w", err)
-	}
-	defer memfile.Close()
-
-	maskedTemplate := sbxtemplate.NewMaskTemplate(template, sbxtemplate.WithMemfile(memfile))
+	maskedTemplate := sbxtemplate.NewMaskTemplate(template)
 	ioEngine := fcmodels.DriveIoEngineSync
 	kvmClock, err := utils.IsGTEVersion(config.Envd.Version, "0.2.11")
 	if err != nil {

Reviewed by Cursor Bugbot for commit 9e58d28.

Comment thread packages/orchestrator/pkg/server/sandboxes.go