feat(envd): support user-defined file metadata via xattrs#2732

Open
mishushakov wants to merge 22 commits into main from mishushakov/envd-file-xattr-metadata

Conversation

@mishushakov
Member

@mishushakov mishushakov commented May 19, 2026

Summary

  • Adds optional user-defined file metadata to POST /files. Any request header of the form X-Metadata-<key>: <value> is persisted as an extended attribute (xattr) on the uploaded file. The X-Metadata- prefix is stripped and the remaining header name is lowercased to form the key. Multiple files in one multipart upload receive the same metadata.
  • Surfaces that metadata on every EntryInfo: the HTTP upload response, and the filesystem gRPC service (Stat, ListDir, Move, MakeDir, etc.) via a new map<string, string> metadata proto field (field 11). The wiring is centralized in filesystem.GetEntryInfo, so all callers pick it up.
  • xattrs are stored under the user.e2b. namespace — user. is required by the Linux VFS for unprivileged xattrs, and e2b. namespaces our keys so they don't collide with other tooling writing to user.*. Foreign user.* xattrs are filtered out and not surfaced.
  • Replace-on-upload semantics: each upload rewrites the file's full metadata set. Keys present from a prior upload but absent from the new request are removed, and an upload with no X-Metadata-* headers clears all existing metadata (O_TRUNC preserves xattrs, so we always rewrite).
  • Validation: keys and values must be printable US-ASCII (0x20-0x7E), else HTTP 400. Keys are capped at 246 bytes (255-byte VFS xattr-name limit minus the user.e2b. prefix); values at 1024 bytes.
  • Error handling on write: xattr-unsupported filesystems (e.g. /proc, /sys) are best-effort — the body is persisted and we log a warning rather than fail; ENOSPC/EDQUOT map to HTTP 507. The response EntryInfo reads xattrs back from disk, so it never falsely claims metadata was persisted.
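
A minimal sketch of the header-to-key mapping described in the first bullet, assuming the handler sees a standard net/http Header. The helper name metadataFromHeaders and the sample headers are hypothetical and not taken from the PR's code; keeping the first value of a repeated header follows the OpenAPI note mentioned later in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// metadataHeaderPrefix is the request-header prefix named in the summary.
const metadataHeaderPrefix = "X-Metadata-"

// metadataFromHeaders (hypothetical helper) collects X-Metadata-<key> headers,
// strips the prefix, lowercases the remaining name, and keeps the first value
// when the same header is sent more than once.
func metadataFromHeaders(h http.Header) map[string]string {
	out := make(map[string]string)
	for name, values := range h {
		if len(values) == 0 || len(name) <= len(metadataHeaderPrefix) {
			continue
		}
		if !strings.EqualFold(name[:len(metadataHeaderPrefix)], metadataHeaderPrefix) {
			continue
		}
		key := strings.ToLower(name[len(metadataHeaderPrefix):])
		out[key] = values[0]
	}
	return out
}

// sampleHeaders builds an illustrative request header set.
func sampleHeaders() http.Header {
	h := http.Header{}
	h.Add("X-Metadata-Author", "mish")
	h.Add("x-metadata-purpose", "upload") // Add canonicalizes the name
	h.Add("X-Metadata-Author", "ignored duplicate")
	h.Set("Content-Type", "text/plain") // not metadata, skipped
	return h
}

func main() {
	fmt.Println(metadataFromHeaders(sampleHeaders()))
}
```

The sketch leaves out validation and the xattr write; it only shows how header names could become metadata keys.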

Implementation

  • xattr read/write/validate helpers live in a single cross-platform packages/shared/pkg/filesystem/xattr.go, backed by the github.com/pkg/xattr library rather than hand-rolled syscalls. The library handles the size-probe/retry and name parsing that previously lived in our own xattr_linux.go.
  • ReadMetadata/WriteMetadata carry the e2b-specific policy on top of the library: user.e2b. namespacing, full-set replace semantics, and ValidateMetadata. IsXattrUnsupported is exported so upload.go can treat xattr-less filesystems as best-effort without duplicating the errno check.
  • No platform shims needed — pkg/xattr is cross-platform, so the metadata tests run on Linux and macOS alike (only extractStatTimes keeps a linux/darwin split, for Stat_t field differences).
  • Bumps envd version to 0.5.28 and the OpenAPI spec to 0.1.3.
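
The full-set replace semantics can be sketched as a pure planning step, shown here without the github.com/pkg/xattr calls so it runs with the standard library alone. planReplace and its signature are hypothetical; only the user.e2b. prefix and the behavior (stale keys removed, foreign xattrs untouched, empty input clears everything) come from the description above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// metadataXattrPrefix is the namespace described in the PR summary.
const metadataXattrPrefix = "user.e2b."

// planReplace (hypothetical) takes the xattr names currently on a file and the
// desired metadata, and returns which names to remove and which to set.
// Names outside user.e2b. are never touched.
func planReplace(existing []string, desired map[string]string) (remove []string, set map[string]string) {
	set = make(map[string]string, len(desired))
	for k, v := range desired {
		set[metadataXattrPrefix+k] = v
	}
	for _, name := range existing {
		if !strings.HasPrefix(name, metadataXattrPrefix) {
			continue // foreign xattr: leave it alone
		}
		if _, keep := set[name]; !keep {
			remove = append(remove, name)
		}
	}
	sort.Strings(remove) // deterministic order for logging and tests
	return remove, set
}

func main() {
	existing := []string{"user.e2b.author", "user.e2b.purpose", "user.mime_type"}
	remove, set := planReplace(existing, map[string]string{"author": "alice"})
	fmt.Println("remove:", remove)
	fmt.Println("set:", set)
}
```

Passing a nil or empty map yields a plan that removes every user.e2b.* name, which matches the "no X-Metadata-* headers clears all existing metadata" rule.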

Test plan

  • make lint clean for envd and shared
  • go test ./... passes for packages/envd and packages/shared/pkg/filesystem (metadata tests now run on macOS too)
  • Upload a file with -H 'X-Metadata-author: mish' -H 'X-Metadata-purpose: upload' and confirm Stat returns the same map
  • Confirm xattrs land under user.e2b.* via getfattr -d <path> inside a sandbox
  • Re-upload the same path with different headers and confirm prior keys are cleared

🤖 Generated with Claude Code

Adds an optional `metadata` query parameter on POST /files (deepObject
style — `metadata[key]=value`) that envd persists as xattrs under the
`user.` namespace. The same metadata is returned on every EntryInfo
surfaced by the HTTP upload response and the filesystem gRPC service
(Stat, ListDir, Move, MakeDir).
@cla-bot cla-bot Bot added the cla-signed label May 19, 2026
@cursor


cursor Bot commented May 19, 2026

PR Summary

Medium Risk
Touches core file upload and stat/list paths with new xattr I/O and replace semantics; failures are mostly bounded by validation and best-effort handling on non-xattr filesystems.

Overview
This PR lets clients attach user-defined metadata on file upload via X-Metadata-* request headers, stores it as user.e2b.* extended attributes, and returns it on EntryInfo from uploads and filesystem lookups (HTTP and gRPC). Each upload replaces the full metadata set (missing keys are removed; uploads with no metadata headers clear prior xattrs). Keys and values are validated (printable US-ASCII, per-key and total size limits) before write; unsupported virtual filesystems skip persistence with a warning while the file body still saves.

Reviewed by Cursor Bugbot for commit 7dbc57c. Bugbot is set up for automated code reviews on this repo.

@codecov

codecov Bot commented May 19, 2026

❌ 4 Tests Failed:

Tests completed: 2741 | Failed: 4 | Passed: 2737 | Skipped: 7
View the full list of 4 ❄️ flaky test(s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/sandboxes::TestFuseDevicePermissions

Flake rate in main: 38.82% (Passed 862 times, Failed 547 times)

Stack Traces | 0.86s run time
=== RUN   TestFuseDevicePermissions
=== PAUSE TestFuseDevicePermissions
=== CONT  TestFuseDevicePermissions
Executing command curl in sandbox i0qqtehwc3n0bsixnyc42
    sandbox_fuse_test.go:24: Command [ls] output: event:{start:{pid:1270}}
    sandbox_fuse_test.go:24: Command [ls] output: event:{data:{stdout:"crw-rw-rw- 1 root root 10, 229 Jun  4 14:23 /dev/fuse\n"}}
    sandbox_fuse_test.go:24: Command [ls] output: event:{end:{exited:true  status:"exit status 0"}}
    sandbox_fuse_test.go:24: Command [ls] completed successfully in sandbox itozvgf5g4wuhvq355s0r
    sandbox_fuse_test.go:25: /dev/fuse listing: crw-rw-rw- 1 root root 10, 229 Jun  4 14:23 /dev/fuse
Executing command stat in sandbox itozvgf5g4wuhvq355s0r (user: root)
    sandbox_fuse_test.go:28: Command [stat] output: event:{start:{pid:1271}}
    sandbox_fuse_test.go:29: 
        	Error Trace:	.../api/sandboxes/sandbox_fuse_test.go:29
        	Error:      	Received unexpected error:
        	            	failed to execute command stat in sandbox itozvgf5g4wuhvq355s0r: invalid_argument: protocol error: incomplete envelope: unexpected EOF
        	Test:       	TestFuseDevicePermissions
        	Messages:   	Failed to stat /dev/fuse
--- FAIL: TestFuseDevicePermissions (0.86s)
github.com/e2b-dev/infra/tests/integration/internal/tests/orchestrator::TestSandboxMemoryIntegrity

Flake rate in main: 55.18% (Passed 866 times, Failed 1066 times)

Stack Traces | 70s run time
=== RUN   TestSandboxMemoryIntegrity
=== PAUSE TestSandboxMemoryIntegrity
=== CONT  TestSandboxMemoryIntegrity
    sandbox_memory_integrity_test.go:27: Build completed successfully
--- FAIL: TestSandboxMemoryIntegrity (70.00s)
github.com/e2b-dev/infra/tests/integration/internal/tests/orchestrator::TestSandboxMemoryIntegrity/tmpfs_hash

Flake rate in main: 55.25% (Passed 856 times, Failed 1057 times)

Stack Traces | 199s run time
=== RUN   TestSandboxMemoryIntegrity/tmpfs_hash
=== PAUSE TestSandboxMemoryIntegrity/tmpfs_hash
=== CONT  TestSandboxMemoryIntegrity/tmpfs_hash
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{start:{pid:1263}}
Executing command bash in sandbox itulqqspuq6n6oj0oduuw (user: root)
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Total memory: 985 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Used memory before tmpfs mount: 188 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Free memory before tmpfs mount: 795 MB\nMemory to use in integrity test (60% of free, min 64MB): 477 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"477+0 records in\n477+0 records out\n500170752 bytes (500 MB, 477 MiB) copied, 1.93217 s, 259 MB/s\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"\t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"C"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"o"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"m"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"m"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"a"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"d"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"b"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"e"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"i"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"g"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"i"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"m"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"e"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"d"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:":"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"\""}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"dd if=/dev/urandom of=/mnt/testfile bs=1M count=477\"\n\tUser time (seconds): 0.00\n\tSystem time (seconds): 1.92\n\tPercent of CPU this job got: 99%\n\tElapsed (wall clock) time (h:mm:ss or m:ss): 0:01.94\n\tAverage shared text s"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"i"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ze (kby"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"tes):"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" 0\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"\tAverage un"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"shared"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" data"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" size "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"(kbyt"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"es): 0\n\t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"Avera"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ge st"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ack s"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ize (kby"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"tes):"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" 0\n\tA"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"vera"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ge to"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"tal size"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" (kby"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"tes):"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" 0\n\t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"Maxim"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"um resi"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"dent "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"set s"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ize "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"(kbyt"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"es): 2700\n\t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"Avera"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ge re"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"side"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"nt set s"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ize ("}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"kbytes): 0\n\tM"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ajor ("}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"requi"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ring I/O) p"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"age fa"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ults:"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" 3\n\t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"Mino"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"r (reclaiming a frame) page faults: 344\n\tVoluntary context switches: 4\n\tInvoluntary context switches: 9\n\tSwaps: 0\n\tFile system inputs: 176\n\tFile system outputs: 0\n\tSocket messages sent: 0\n\tSocket messages received: 0\n\tSignals delivered: 0\n\tPage size (bytes): 4096\n\tExit status: 0\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Used memory after tmpfs mount and file fill: 670 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{end:{exited:true  status:"exit status 0"}}
    sandbox_memory_integrity_test.go:70: Command [bash] completed successfully in sandbox ioah89hpffn6dvnw5ubv1
Executing command bash in sandbox ioah89hpffn6dvnw5ubv1 (user: root)
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{start:{pid:1279}}
Executing command bash in sandbox i0xrhjtox1piapjkj8od4 (user: root)
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{data:{stdout:"261f5813ed273000a0e414e7c4a490bdeb93fded0d472184ff3f3de329e1924b\n"}}
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{end:{exited:true  status:"exit status 0"}}
    sandbox_memory_integrity_test.go:80: Command [bash] completed successfully in sandbox ioah89hpffn6dvnw5ubv1
    sandbox_memory_integrity_test.go:80: Command [bash] output: event:{start:{pid:1283}}
Executing command bash in sandbox i0xrhjtox1piapjkj8od4 (user: root)
    sandbox_memory_integrity_test.go:110: 
        	Error Trace:	.../tests/orchestrator/sandbox_memory_integrity_test.go:81
        	            				.../hostedtoolcache/go/1.26.3.../src/runtime/asm_amd64.s:1771
        	Error:      	Received unexpected error:
        	            	failed to execute command bash in sandbox ioah89hpffn6dvnw5ubv1: unavailable: HTTP status 502 Bad Gateway
    sandbox_memory_integrity_test.go:110: 
        	Error Trace:	.../tests/orchestrator/sandbox_memory_integrity_test.go:78
        	            				.../tests/orchestrator/sandbox_memory_integrity_test.go:110
        	Error:      	Condition never satisfied
        	Test:       	TestSandboxMemoryIntegrity/tmpfs_hash
--- FAIL: TestSandboxMemoryIntegrity/tmpfs_hash (198.71s)
github.com/e2b-dev/infra/tests/integration/internal/tests/proxies::TestSandboxAutoResumeViaProxy

Flake rate in main: 40.73% (Passed 860 times, Failed 591 times)

Stack Traces | 14.3s run time
=== RUN   TestSandboxAutoResumeViaProxy
=== PAUSE TestSandboxAutoResumeViaProxy
=== CONT  TestSandboxAutoResumeViaProxy
Executing command cat in sandbox iiiiyb625etdbs47livv9 (user: root)
    auto_resume_test.go:116: 
        	Error Trace:	.../tests/proxies/auto_resume_test.go:116
        	Error:      	Received unexpected error:
        	            	Get "http://localhost:3002": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
        	Test:       	TestSandboxAutoResumeViaProxy
--- FAIL: TestSandboxAutoResumeViaProxy (14.31s)


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The listxattr and getxattr functions in xattr_linux.go are susceptible to race conditions where the size of extended attributes can increase between the initial size check and the subsequent data retrieval. These calls should be wrapped in retry loops that handle unix.ERANGE by re-querying the size and re-allocating the buffer to ensure correctness in environments with concurrent file modifications.
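
The retry loop this review asks for could look roughly like the sketch below. To keep it runnable anywhere, the syscall is abstracted behind a hypothetical getFunc type that follows the getxattr convention (an empty buffer returns the required size, a too-small buffer returns ERANGE); readWithRetry and growingValue are illustrative names, not code from xattr_linux.go.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// getFunc mimics the getxattr contract: with an empty buffer it reports the
// size needed; otherwise it fills buf, or fails with ERANGE if buf is too small.
type getFunc func(buf []byte) (int, error)

// readWithRetry probes the size, reads, and starts over when the value grew
// between the two calls.
func readWithRetry(get getFunc, maxAttempts int) ([]byte, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		size, err := get(nil)
		if err != nil {
			return nil, err
		}
		if size == 0 {
			// Short-circuit: a zero-length buffer would be read as a size query.
			return []byte{}, nil
		}
		buf := make([]byte, size)
		n, err := get(buf)
		if errors.Is(err, syscall.ERANGE) {
			continue // value grew after the probe; measure again
		}
		if err != nil {
			return nil, err
		}
		return buf[:n], nil
	}
	return nil, fmt.Errorf("value kept changing after %d attempts: %w", maxAttempts, syscall.ERANGE)
}

// growingValue simulates a concurrent writer: the value grows once, right
// after the first size probe.
func growingValue() getFunc {
	value := []byte("v1")
	probes := 0
	return func(buf []byte) (int, error) {
		if len(buf) == 0 {
			probes++
			size := len(value)
			if probes == 1 {
				value = []byte("version-2")
			}
			return size, nil
		}
		if len(buf) < len(value) {
			return 0, syscall.ERANGE
		}
		return copy(buf, value), nil
	}
}

func main() {
	got, err := readWithRetry(growingValue(), 3)
	fmt.Printf("%q %v\n", got, err)
}
```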

Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
@mishushakov
Member Author

@claude review

- Retry listxattr/getxattr on ERANGE to tolerate concurrent xattr writers.
- Short-circuit empty-value reads to avoid a slice-bounds panic when the
  kernel reinterprets a zero-length buffer as a size query.
- Validate metadata keys/values (non-empty, no NUL, length caps) at the
  HTTP boundary and inside WriteMetadata so invalid input returns 400
  instead of a bare 500.
- Add unit tests for validation and xattr round-trip.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread packages/envd/internal/api/upload.go
mishushakov and others added 4 commits May 20, 2026 15:30
envd writes metadata once at upload time and only reads it afterwards —
there are no concurrent xattr writers in the design, so the size-then-fetch
race the retry loops guarded against cannot happen in practice. Restore the
simpler single-pass form.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the deepObject `?metadata[key]=value` query parameter with
`X-Metadata-<key>: <value>` request headers, matching the S3/GCS/Azure
convention for object metadata. The header form works uniformly for raw
and multipart uploads, avoids URL-encoding `[` / `]`, and sidesteps query
string length caps.

Keys are lowercased and the `X-Metadata-` prefix is stripped before they
become `user.<key>` xattrs. Bumps envd to 0.5.27 and the OpenAPI spec to
0.1.3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
revive's inefficient-map-lookup correctly flagged the loop — a direct
`got[MetadataXattrPrefix+"keep"]` lookup expresses the assertion better.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
@mishushakov
Member Author

@claude review
@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a5fbaa9523

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/envd/internal/api/upload.go Outdated
mishushakov and others added 2 commits May 20, 2026 17:11
Storage exhaustion during the metadata write was returning 500, breaking
clients that key off the 507 the rest of /files already returns for
ENOSPC. Map both ENOSPC and EDQUOT from filesystem.WriteMetadata to
StatusInsufficientStorage so the same resource condition is reported
consistently.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Uploads to /sys, /proc, and other filesystems without xattr support
were succeeding for the file body but then returning 500 from the
WriteMetadata call, breaking the supported case of writing to e.g.
/sys/fs/cgroup that the upload tests already exercise.

Match ReadMetadata's behavior: when WriteMetadata returns ENOTSUP /
EOPNOTSUPP, log a warning and drop the in-flight metadata so the
response EntryInfo doesn't falsely claim it was persisted, then
complete the upload normally.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
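
The two commits above describe how a metadata write error is classified. A rough sketch of that decision, using only the standard library: metadataWriteStatus is a hypothetical name, the sample errors are made up for illustration, and the fallback to 500 for any other error is an assumption rather than something stated in the commits.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
	"syscall"
)

// isXattrUnsupported reports whether err means the filesystem has no xattr support.
func isXattrUnsupported(err error) bool {
	return errors.Is(err, syscall.ENOTSUP) || errors.Is(err, syscall.EOPNOTSUPP)
}

// metadataWriteStatus (hypothetical) returns the HTTP status to fail the
// upload with, or 0 when the upload should continue.
func metadataWriteStatus(err error) int {
	switch {
	case err == nil:
		return 0
	case isXattrUnsupported(err):
		return 0 // best-effort: caller logs a warning and carries on
	case errors.Is(err, syscall.ENOSPC), errors.Is(err, syscall.EDQUOT):
		return http.StatusInsufficientStorage // 507, like the rest of /files
	default:
		return http.StatusInternalServerError // assumption: other errors stay 500
	}
}

// Illustrative errors, wrapped the way os-level calls usually report them.
var (
	errDiskFull    = &os.PathError{Op: "setxattr", Path: "/data/f", Err: syscall.ENOSPC}
	errQuota       = &os.PathError{Op: "setxattr", Path: "/data/f", Err: syscall.EDQUOT}
	errUnsupported = &os.PathError{Op: "setxattr", Path: "/sys/fs/cgroup/f", Err: syscall.ENOTSUP}
	errOther       = errors.New("unexpected failure")
)

func main() {
	fmt.Println(metadataWriteStatus(errDiskFull))
	fmt.Println(metadataWriteStatus(errQuota))
	fmt.Println(metadataWriteStatus(errUnsupported))
	fmt.Println(metadataWriteStatus(errOther))
}
```
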
Comment thread packages/envd/internal/api/upload.go Outdated
processFile previously mutated the shared metadata map via clear() to
hide it from the response when the FS didn't support xattrs. That broke
multipart uploads: clearing on the first file's path also stripped
metadata from every subsequent file in the request.

Drop the clear() and the request-side maps.Copy; instead, read xattrs
back from the file after writing and use that for EntryInfo.Metadata.
The response always reflects what's actually on disk, regardless of FS
support, partial writes, or pre-existing xattrs — matching Stat /
ListDir semantics.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
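
The multipart bug described in this commit is an aliasing problem: every file in the request shares one map. The toy model below illustrates it with an in-memory fakeFS standing in for the disk; all names here (fakeFS, uploadClearing, uploadReadBack) are hypothetical and the real handler is more involved.

```go
package main

import "fmt"

// fakeFS stands in for the disk: path -> stored metadata. Paths in noXattr
// simulate filesystems without xattr support.
type fakeFS struct {
	stored  map[string]map[string]string
	noXattr map[string]bool
}

func newFakeFS(unsupported ...string) *fakeFS {
	fs := &fakeFS{stored: map[string]map[string]string{}, noXattr: map[string]bool{}}
	for _, p := range unsupported {
		fs.noXattr[p] = true
	}
	return fs
}

// write stores a copy of md and reports whether xattrs are supported at path.
func (fs *fakeFS) write(path string, md map[string]string) bool {
	if fs.noXattr[path] {
		return false
	}
	cp := make(map[string]string, len(md))
	for k, v := range md {
		cp[k] = v
	}
	fs.stored[path] = cp
	return true
}

// uploadClearing mirrors the earlier approach: it empties the shared request
// map on an unsupported path, so later files lose their metadata too.
func uploadClearing(fs *fakeFS, paths []string, md map[string]string) {
	for _, p := range paths {
		if !fs.write(p, md) {
			for k := range md {
				delete(md, k) // same effect as clear(md)
			}
		}
	}
}

// uploadReadBack never mutates md; the response is whatever is stored.
func uploadReadBack(fs *fakeFS, paths []string, md map[string]string) map[string]map[string]string {
	resp := make(map[string]map[string]string, len(paths))
	for _, p := range paths {
		fs.write(p, md)        // unsupported paths are best-effort
		resp[p] = fs.stored[p] // nil when nothing was persisted
	}
	return resp
}

func main() {
	paths := []string{"/sys/a", "/data/b"}
	buggy := newFakeFS("/sys/a")
	uploadClearing(buggy, paths, map[string]string{"author": "mish"})
	fmt.Println("clearing:", buggy.stored["/data/b"])

	fixed := newFakeFS("/sys/a")
	fmt.Println("read-back:", uploadReadBack(fixed, paths, map[string]string{"author": "mish"}))
}
```
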
@mishushakov mishushakov marked this pull request as ready for review May 20, 2026 16:26

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a878e36aff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/envd/internal/api/upload.go Outdated
Comment thread packages/envd/internal/api/upload.go
Comment thread packages/envd/internal/api/upload.go
Comment thread packages/shared/pkg/filesystem/entry_linux.go Outdated
Comment thread packages/shared/pkg/filesystem/entry.go Outdated
Comment thread packages/envd/internal/services/filesystem/utils.go
Comment thread packages/envd/spec/envd.yaml
Comment thread packages/envd/spec/envd.yaml Outdated
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/envd/internal/api/upload_metadata_test.go

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 79d1ea7e67

Comment thread packages/shared/pkg/filesystem/xattr.go Outdated
Comment thread packages/envd/internal/api/upload.go Outdated
- Namespace xattr keys under user.e2b. to avoid clashes with foreign user.* xattrs
- Lower MaxMetadataValueLen to 1 KiB to fit a single ext4 4 KiB block alongside other xattrs and the inode header
- Enforce printable US-ASCII on metadata keys and values (matches the spec)
- WriteMetadata: empty/nil = no-op; non-empty = replace the full user.e2b.* set so re-uploading with new metadata clears stale keys without affecting foreign xattrs
- Simplify splitNullTerminated via bytes.Split
- Drop the readEntryMetadata wrapper and call ReadMetadata directly
- Document upload metadata semantics, size caps, and ASCII constraint in envd.yaml
- Bump envd to 0.5.28

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
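
A sketch of the per-entry checks this commit lists, assuming the rules stated in the PR (non-empty key, printable US-ASCII 0x20-0x7E, key cap derived from the 255-byte xattr-name limit). validateEntry and isPrintableASCII are hypothetical names, and the value-size limit is deliberately left out because it changes later in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	metadataXattrPrefix = "user.e2b."
	maxXattrNameLen     = 255 // VFS xattr-name limit cited in the PR
	// 255 minus the 9-byte prefix leaves 246 bytes for the user-supplied key.
	maxMetadataKeyLen = maxXattrNameLen - len(metadataXattrPrefix)
)

// isPrintableASCII reports whether every byte of s is in 0x20..0x7E.
func isPrintableASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] < 0x20 || s[i] > 0x7E {
			return false
		}
	}
	return true
}

// validateEntry (hypothetical) checks one key/value pair; a handler would
// turn a non-nil error into HTTP 400.
func validateEntry(key, value string) error {
	switch {
	case key == "":
		return errors.New("metadata key is empty")
	case len(key) > maxMetadataKeyLen:
		return fmt.Errorf("metadata key exceeds %d bytes", maxMetadataKeyLen)
	case !isPrintableASCII(key):
		return errors.New("metadata key is not printable US-ASCII")
	case !isPrintableASCII(value):
		return errors.New("metadata value is not printable US-ASCII")
	}
	return nil
}

func main() {
	fmt.Println(validateEntry("author", "mish"))
	fmt.Println(validateEntry("author", "caf\u00e9"))
	fmt.Println(validateEntry(strings.Repeat("k", maxMetadataKeyLen+1), "v"))
}
```
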
Comment thread packages/shared/pkg/filesystem/entry.go
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/shared/pkg/filesystem/xattr_linux.go Outdated
Comment thread packages/envd/spec/filesystem/filesystem.proto
Comment thread packages/shared/pkg/filesystem/entry.go
@mishushakov mishushakov marked this pull request as draft June 4, 2026 12:45
Replace the hand-rolled xattr syscall wrappers with the cross-platform
github.com/pkg/xattr library:

- Drop the custom listxattr/getxattr size-then-read helpers and the
  null-separator name parsing; the library handles them.
- Unify the Linux-only implementation and the darwin no-op shims into a
  single cross-platform xattr.go, removing the //go:build linux
  constraint. Rename xattr_linux_test.go to xattr_metadata_test.go so the
  metadata tests run on all platforms.
- Export IsXattrUnsupported so upload.go reuses it instead of
  duplicating the ENOTSUP/EOPNOTSUPP check.

No behavioral change; the user.e2b.* namespacing, full-set replace
semantics, and validation are unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/shared/pkg/filesystem/xattr.go
The EntryInfo.metadata field comment said keys live in the `user.`
namespace, but the implementation uses `user.e2b.` and filters out
foreign `user.*` xattrs. Update the .proto and the two generated
filesystem.pb.go files to match.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/envd/internal/api/upload.go Outdated
Comment thread packages/envd/internal/api/upload.go Outdated
Comment thread packages/envd/internal/api/upload.go Outdated
Comment thread packages/envd/spec/envd.yaml
Comment thread packages/shared/pkg/filesystem/entry.go

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit c6e843f.

Comment thread packages/envd/internal/api/upload.go
Comment thread packages/shared/pkg/filesystem/xattr.go Outdated
mishushakov and others added 3 commits June 4, 2026 15:54
- Log (instead of silently dropping) ReadMetadata errors when building
  upload responses and in GetEntryInfo, so xattr read failures are
  observable. Metadata stays best-effort — a read failure no longer
  hides itself but still doesn't fail the entry lookup.
- Correct the metadataHeaderPrefix comment: the resulting xattr is
  user.e2b.<key>, not user.<key>.
- Document in the OpenAPI spec that duplicate X-Metadata-<key> headers
  use the first value.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the per-value 1 KiB limit with a single 4 KiB cap on the
combined size of all metadata on a file (stored name + value, summed
across entries). ext4 keeps an inode's xattrs in one filesystem block
(~4 KiB), so a large set would otherwise fail late inside WriteMetadata
with ENOSPC/E2BIG; validating the total up front rejects it cleanly with
HTTP 400. The per-key length cap stays (it's the hard VFS xattr-name
limit). Spec and tests updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
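
The combined-size check this commit describes might be sketched as follows. The 4096-byte constant and the "stored name + value" accounting come from the commit message; treating the stored name as user.e2b. plus the key, and the function names, are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	metadataXattrPrefix  = "user.e2b."
	maxMetadataTotalSize = 4096 // 4 KiB cap across all entries on one file
)

// totalMetadataSize sums stored name (prefix + key) and value lengths.
func totalMetadataSize(md map[string]string) int {
	total := 0
	for k, v := range md {
		total += len(metadataXattrPrefix) + len(k) + len(v)
	}
	return total
}

// checkTotalSize (hypothetical) rejects a metadata set before any xattr is
// written, so the caller can answer with HTTP 400 instead of failing late.
func checkTotalSize(md map[string]string) error {
	if total := totalMetadataSize(md); total > maxMetadataTotalSize {
		return fmt.Errorf("metadata totals %d bytes, limit is %d", total, maxMetadataTotalSize)
	}
	return nil
}

func main() {
	small := map[string]string{"author": "mish", "purpose": "upload"}
	fmt.Println(totalMetadataSize(small), checkTotalSize(small))

	big := map[string]string{"blob": strings.Repeat("x", 5000)}
	fmt.Println(checkTotalSize(big))
}
```
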
GetEntryInfo is a pure shared helper with no logger in scope, and zap.L()
isn't used anywhere in the shared module outside the logger package, so
reaching for the global logger here introduced a non-idiomatic dependency
on a hot path. Revert to the best-effort silent read; the upload handler
still calls ReadMetadata directly and logs failures where it has a logger.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment on lines +78 to +82
// Metadata is best-effort: a read failure shouldn't fail the entry lookup
// (Size/Mode/times are still valid), and this helper has no logger to
// report it through. Callers that need the error (e.g. the upload handler)
// call ReadMetadata directly.
entry.Metadata, _ = ReadMetadata(path)

🟡 The new entry.Metadata, _ = ReadMetadata(path) in GetEntryInfo (entry.go:82) issues path-based listxattr/getxattr syscalls in the caller's mount namespace. The orchestrator volume service's toEntry (packages/orchestrator/pkg/volumes/service.go:136) invokes this outside fs.act/ns.Do (unlike path_stat.go:35, which correctly goes through fs.GetEntry), so for every entry returned by dir_list.go:97 (depth up to 10), dir_create.go:85, file_create.go:115, and path_update.go:90 we issue an extra path-based xattr syscall against the host's view of the chroot-relative path — typically ENOENT, but pure waste in recursive listings. Today the result is silently discarded by fromEntryInfo because the orchestrator EntryInfo proto has no metadata field, so this is only wasted syscalls plus a latent footgun: if anyone later adds metadata to that proto without rewiring toEntry through fs.act, the service will return host-side xattrs labeled as the volume entry's. Fix is either to route toEntry through fs.GetEntry/fs.act like path_stat.go already does, or to give GetEntryInfo an opt-out for callers that don't surface metadata.

Extended reasoning...

What the bug is

GetEntryInfo (packages/shared/pkg/filesystem/entry.go:82) now unconditionally calls ReadMetadata(path), which performs path-based xattr.List/xattr.Get syscalls. Unlike every other field that GetEntryInfo populates (all derived from the passed-in fileInfo), these syscalls run in whichever mount namespace the calling goroutine is in, against whatever inode the kernel resolves path to in that namespace. That is fine when the caller has already arranged for the goroutine to be inside the right chroot, but it is wrong when the caller hands in a chroot-relative path string from the outside.

Where the call goes wrong

The orchestrator volume service does exactly that. toEntry (packages/orchestrator/pkg/volumes/service.go:136-140) calls filesystem.GetEntryInfo directly, with no fs.act wrapper:

func toEntry(fullVolumePath string, fileInfo os.FileInfo) *orchestrator.EntryInfo {
    entryInfo := filesystem.GetEntryInfo(fullVolumePath, fileInfo)
    return fromEntryInfo(fullVolumePath, entryInfo)
}

The four callers all obtain fileInfo via fs.ReadDir/fs.Stat (which correctly enter the chroot via the pinned-thread goroutine in chrooted.Chrooted.act → mountns.Do), but then call toEntry from the outer goroutine, which lives in the host mount namespace:

  • packages/orchestrator/pkg/volumes/dir_list.go:97 (per item in listRecursive, depth up to 10)
  • packages/orchestrator/pkg/volumes/dir_create.go:85
  • packages/orchestrator/pkg/volumes/file_create.go:115
  • packages/orchestrator/pkg/volumes/path_update.go:90

Contrast with packages/orchestrator/pkg/volumes/path_stat.go:35, which uses fs.GetEntry(path). That helper wraps filesystem.GetEntryFromPath inside fs.act so the ReadMetadata syscalls run in the chrooted namespace. The exact same data on Stat versus ListDir therefore goes through two different code paths today, only one of which is correct.
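
The difference between the two code paths can be modelled in a few lines. This is a minimal sketch, not the repository's code: the `chrooted` type, its `inside` flag, `errWrongNamespace`, and the sample path are all hypothetical stand-ins, and a boolean replaces the real thread pinning and namespace switch. It only illustrates why the metadata read has to happen inside the `act` closure.

```go
package main

import (
	"errors"
	"fmt"
)

// chrooted is a hypothetical stand-in for the volume filesystem wrapper.
// inside is true only while a closure passed to act is running.
type chrooted struct {
	inside bool
	xattrs map[string]map[string]string // volume-side metadata keyed by path
}

// errWrongNamespace marks a read that ran outside the volume's view.
var errWrongNamespace = errors.New("path resolved outside the volume")

// act runs fn "inside the chroot". The real helper pins a thread and
// switches namespaces; a flag is enough to show the ordering (and is not
// safe for concurrent use).
func (c *chrooted) act(fn func() error) error {
	c.inside = true
	defer func() { c.inside = false }()
	return fn()
}

// readMetadata only sees volume xattrs while it runs inside act.
func (c *chrooted) readMetadata(path string) (map[string]string, error) {
	if !c.inside {
		return nil, errWrongNamespace
	}
	return c.xattrs[path], nil
}

// getEntry mirrors the shape of the Stat path: the read happens within act.
func (c *chrooted) getEntry(path string) (map[string]string, error) {
	var md map[string]string
	err := c.act(func() error {
		var readErr error
		md, readErr = c.readMetadata(path)
		return readErr
	})
	return md, err
}

func main() {
	fs := &chrooted{xattrs: map[string]map[string]string{
		"/foo.txt": {"owner": "sandbox-user"},
	}}

	// Stat-style path: metadata is read inside act.
	md, err := fs.getEntry("/foo.txt")
	fmt.Println(md["owner"], err)

	// ListDir-style path: the read runs from the outer goroutine.
	_, err = fs.readMetadata("/foo.txt")
	fmt.Println(err)
}
```

In the real service the outer-goroutine read does not return a sentinel error; it resolves the path against the host, which is what makes the mismatch easy to miss.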

Addressing the refutations

The refuters note that this has the same root cause as a separate bug_005 finding, that there is no current user-visible bug because the orchestrator EntryInfo proto (packages/shared/pkg/grpc/orchestrator/volume.pb.go) has no metadata field, and that the leak scenario requires a hypothetical future proto change. All three points are correct, and they are why I'm filing this as a nit rather than at normal severity:

  1. fromEntryInfo (service.go:142-156) does not copy entryInfo.Metadata, and the orchestrator proto has no place to surface it. Whatever xattr.List returns for the host's view of the path string is silently discarded.
  2. The leak-on-collision case (e.g. /etc/passwd exists on both host and chroot) is currently harmless for the same reason.
  3. The wasted-syscall framing alone is small per call — one extra listxattr per FileInfo returned, typically failing with ENOENT in the host namespace.

What pushes this above zero-cost-to-flag is the latent footgun. The wiring in toEntry is wrong in a way that will not be visible to anyone who later adds a metadata field to the orchestrator proto, because the resulting fromEntryInfo → entryInfo.Metadata plumbing will appear to work end-to-end, except that the xattrs it returns will come from the host, not the volume. That is the kind of mistake that escapes review precisely because the path looks symmetric to the envd side (where the goroutine running GetEntryInfo is the request goroutine inside the sandbox, so xattrs resolve correctly).

The refutation that "this is just a duplicate of bug_005" is also fine — they share a root cause. But the synthesis description here adds the structural framing (chroot/namespace mismatch + which call sites lose, contrasted with the correctly-wired path_stat.go) that a code reviewer needs in order to fix it once rather than playing whack-a-mole when the proto is extended.

Step-by-step proof for the latent leak (one-line proto change away)

  1. A sandbox volume is rooted at /var/data/volumes/sandbox-X on the host; inside the chroot the same data appears at /.
  2. Someone adds map<string,string> metadata = 11; to packages/shared/pkg/grpc/orchestrator/volume.proto's EntryInfo and a one-line Metadata: entryInfo.Metadata copy to fromEntryInfo (mirroring what the envd-side filesystem.pb.go already does). No other change.
  3. A caller invokes ListDir("/") on the volume service. dir_list.go runs fs.ReadDir("/") inside fs.act and obtains correct FileInfo for foo.txt from the chrooted view.
  4. Back in the outer goroutine (host namespace), toEntry("/foo.txt", item) calls filesystem.GetEntryInfo → ReadMetadata("/foo.txt") → unix.Listxattr("/foo.txt", ...). The kernel resolves /foo.txt in the host namespace: usually ENOENT, but on path collision (e.g. a real /foo.txt exists on the host) it succeeds and returns the host file's user.e2b.* xattrs.
  5. fromEntryInfo copies that map into the response. The client sees metadata attributed to the volume entry that actually came from a host inode of the same path.
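
The collision in steps 3 to 5 can be reproduced without a real chroot. The sketch below assumes two hypothetical in-memory views standing in for the host and volume mount namespaces; the `view` type, `readMetadata`, and the `owner` values are illustrative only and do not exist in the PR.

```go
package main

import "fmt"

// view is a hypothetical stand-in for a mount namespace: it maps a path to
// the user.e2b.* metadata of whatever inode that path resolves to there.
type view map[string]map[string]string

// readMetadata mimics a path-based xattr read. The result depends on the
// view the caller is in, not on where the FileInfo was obtained.
func (v view) readMetadata(path string) (map[string]string, bool) {
	md, ok := v[path]
	return md, ok
}

func main() {
	// The same path string exists in both views but names different inodes.
	host := view{"/foo.txt": {"owner": "host-tool"}}
	volume := view{"/foo.txt": {"owner": "sandbox-user"}}

	// Correct: resolve the path in the view that produced the listing.
	right, _ := volume.readMetadata("/foo.txt")

	// Buggy: the listing came from the volume, but the path is resolved
	// against the host, so the host file's metadata is returned.
	wrong, _ := host.readMetadata("/foo.txt")

	fmt.Println("volume entry metadata:", right["owner"])
	fmt.Println("what the client would see:", wrong["owner"])
}
```

When the host has no file at that path, the lookup simply misses, which corresponds to the ENOENT case and explains why the problem stays invisible in normal operation.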

This is the future bug that the present code is one line away from. The fix is to remove the divergence between Stat (correct via fs.GetEntry) and ListDir/MakeDir/CreateFile/UpdatePath (incorrect via direct GetEntryInfo) now, before the proto grows the field that exposes it.

How to fix

Two reasonable options:

  • Rewire toEntry through fs.GetEntry (which already does fs.act(filesystem.GetEntryFromPath)). Callers would need to pass fs to toEntry or split the work so the chrooted call happens inside the existing fs.act block; the latter is simpler in dir_list.go/dir_create.go/file_create.go/path_update.go because they already hold fs.
  • Add a GetEntryInfoWithoutMetadata (or a bool argument) that skips the ReadMetadata call, and use it from the orchestrator volume service since the orchestrator proto does not surface metadata anyway. This keeps the orchestrator behavior unchanged and removes the wasted syscalls today; it would still require revisiting if the proto gains a metadata field, but at least the silent discard is replaced with a visible opt-out.
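
The second option could look roughly like the following. This is a sketch under stated assumptions, not the shared package's API: `EntryInfo` is trimmed to two fields, `getEntryInfo` and its `withMetadata` flag are hypothetical, and the metadata reader is injected as a function so the example runs without touching real xattrs.

```go
package main

import "fmt"

// EntryInfo is a trimmed, hypothetical stand-in for the shared EntryInfo type.
type EntryInfo struct {
	Path     string
	Metadata map[string]string
}

// metadataReader is injected so the sketch runs without real xattr syscalls.
type metadataReader func(path string) (map[string]string, error)

// getEntryInfo reads metadata only when withMetadata is true, so callers that
// never surface metadata skip the path-based lookup entirely.
func getEntryInfo(path string, withMetadata bool, read metadataReader) EntryInfo {
	entry := EntryInfo{Path: path}
	if withMetadata {
		// The error is dropped, mirroring the best-effort read quoted above.
		entry.Metadata, _ = read(path)
	}
	return entry
}

func main() {
	calls := 0
	read := func(path string) (map[string]string, error) {
		calls++ // counts how many lookups were actually issued
		return map[string]string{"k": "v"}, nil
	}

	with := getEntryInfo("/foo.txt", true, read)     // envd-style caller
	without := getEntryInfo("/foo.txt", false, read) // orchestrator-style caller

	fmt.Println(len(with.Metadata), len(without.Metadata), calls)
}
```

An explicit flag makes the orchestrator's choice visible at the call site, so a later proto change has an obvious place to revisit it; it does not by itself make the read namespace-correct.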
