feat: WASI filesystem interceptor for /ipfs/ path interception #263

Merged
lthibault merged 7 commits into master from feat/membrane-ipfs-intercept on Mar 27, 2026

@lthibault lthibault commented Mar 27, 2026

Summary

Intercepts open_at calls for /ipfs/<CID>/... paths in WASI guests, resolving them lazily through the pinset cache. Everything else delegates to the standard wasmtime-wasi filesystem implementation.

IPFS Interceptor (src/fs_intercept.rs)

  • Custom HostDescriptor impl that intercepts open_at for ipfs/ paths
  • Parses CID + optional subpath, rejects path traversal (.. components)
  • Ensures CID is pinned via cache, fetches bytes to staging dir, opens as real fd
  • Writes to /ipfs/ rejected with NotPermitted (content-addressed = immutable)
  • Linker override via allow_shadowing — clean, non-invasive integration
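The parsing and traversal check described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/fs_intercept.rs: the `IpfsPath` struct and the `Option` return type are assumptions, and the CID is treated as an opaque string rather than validated.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical result type: the CID segment plus an optional subpath.
#[derive(Debug, PartialEq)]
pub struct IpfsPath {
    pub cid: String,
    pub subpath: Option<PathBuf>,
}

/// Splits "ipfs/<CID>/<subpath>" (with or without a leading slash).
/// Returns None for non-IPFS paths and for subpaths that could leave
/// the staging directory.
pub fn parse_ipfs_path(path: &str) -> Option<IpfsPath> {
    let rest = path.trim_start_matches('/').strip_prefix("ipfs/")?;
    let (cid, sub) = match rest.split_once('/') {
        Some((cid, sub)) => (cid, sub),
        None => (rest, ""),
    };
    if cid.is_empty() {
        return None;
    }

    // Keep only plain name components. "..", a root, or a Windows prefix
    // in the subpath causes the whole path to be rejected.
    let mut clean = PathBuf::new();
    for component in Path::new(sub).components() {
        match component {
            Component::Normal(name) => clean.push(name),
            Component::CurDir => {}
            _ => return None,
        }
    }

    let subpath = if clean.as_os_str().is_empty() {
        None
    } else {
        Some(clean)
    };
    Some(IpfsPath { cid: cid.to_string(), subpath })
}
```

Rejecting the path outright, rather than normalizing ".." away, keeps the guest-visible behavior simple: a traversal attempt fails instead of silently resolving to a different file.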

Cache Refactoring (crates/cache/src/pinset.rs)

  • Dropped in-memory byte cache (PinEntry.data, inline_threshold) — OS page cache handles hot files
  • PinsetCache and IsolatedPinset now own their staging TempDir:
    • CacheMode::Shared = host-wide shared staging dir (all procs on a host share one cache)
    • CacheMode::Isolated = per-proc staging dir (for cache timing attack isolation)
  • CacheMode gains staging_dir(), ensure(), and fetch() convenience methods
  • ensure() returns Result<()> instead of Result<Option<Arc<[u8]>>>
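As a rough picture of the convenience methods, the sketch below shows how an enum can forward to whichever cache it wraps so callers never match on the variant. The variant payloads, the `new` constructors, the `io::Result` error type, and the placeholder method bodies are all assumptions made for illustration; the real types own a TempDir and talk to a Pinner.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Stand-in for the host-wide cache. A plain PathBuf replaces the owned
/// TempDir so the sketch has no external dependencies.
pub struct PinsetCache {
    staging: PathBuf,
}

/// Stand-in for the per-process cache.
pub struct IsolatedPinset {
    staging: PathBuf,
}

impl PinsetCache {
    pub fn new(staging: PathBuf) -> Self {
        Self { staging }
    }
    pub fn staging_dir(&self) -> &Path {
        &self.staging
    }
    /// Pinning is elided here; only the Result<()> shape matters.
    pub fn ensure(&self, _cid: &str) -> io::Result<()> {
        Ok(())
    }
    /// Placeholder: returns the CID bytes instead of real content.
    pub fn fetch(&self, cid: &str) -> io::Result<Vec<u8>> {
        Ok(cid.as_bytes().to_vec())
    }
}

impl IsolatedPinset {
    pub fn new(staging: PathBuf) -> Self {
        Self { staging }
    }
    pub fn staging_dir(&self) -> &Path {
        &self.staging
    }
    pub fn ensure(&self, _cid: &str) -> io::Result<()> {
        Ok(())
    }
    pub fn fetch(&self, cid: &str) -> io::Result<Vec<u8>> {
        Ok(cid.as_bytes().to_vec())
    }
}

/// Hypothetical payloads: a shared handle versus an owned per-process cache.
pub enum CacheMode {
    Shared(Arc<PinsetCache>),
    Isolated(IsolatedPinset),
}

impl CacheMode {
    pub fn staging_dir(&self) -> &Path {
        match self {
            CacheMode::Shared(cache) => cache.staging_dir(),
            CacheMode::Isolated(cache) => cache.staging_dir(),
        }
    }
    pub fn ensure(&self, cid: &str) -> io::Result<()> {
        match self {
            CacheMode::Shared(cache) => cache.ensure(cid),
            CacheMode::Isolated(cache) => cache.ensure(cid),
        }
    }
    pub fn fetch(&self, cid: &str) -> io::Result<Vec<u8>> {
        match self {
            CacheMode::Shared(cache) => cache.fetch(cid),
            CacheMode::Isolated(cache) => cache.fetch(cid),
        }
    }
}
```

With this shape, the `match` lives in one place and call sites reduce to `mode.ensure(cid)?` followed by `mode.staging_dir()`.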

Security

  • Path traversal rejection for IPFS subpaths (commit f7b0e45)
  • Large objects no longer silently return NoEntry — all objects fetch to staging

Test Coverage

All new code paths have test coverage (100%).

  • 33 cache crate tests (ARC + pinset)
  • 12 fs_intercept tests (5 unit + 7 integration)
  • 116 other tests unaffected

Pre-Landing Review

No issues found.

Plan Completion

All plan items addressed (12 DONE, 1 CHANGED). The staging dir was moved into cache types instead of ComponentRunStates — better co-location with cache lifecycle.

Test plan

  • All cache crate tests pass (33 tests)
  • All ww crate tests pass (111 tests)
  • All integration tests pass (15 tests)
  • cargo clippy -p cache clean

Custom wasi:filesystem/types Host impl that intercepts open_at calls
for /ipfs/<CID>/... paths, resolving them lazily through PinsetCache.
All other filesystem operations delegate to the standard wasmtime-wasi
implementation via allow_shadowing linker override.

- IpfsFilesystemView wraps WasiFilesystemCtxView with cache + staging
- 27-method HostDescriptor delegation, open_at intercept for ipfs/ paths
- CID parsed from path, ensure() via cache, materialize to staging TempDir
- Activated only when cache_mode is set on the process Builder
- cap-std added for ambient dir/file construction

- MockPinner + TestHarness for exercising open_ipfs without a real IPFS node
- test_open_ipfs_file_materializes_and_returns_descriptor: full flow
- test_open_ipfs_write_rejected: /ipfs/ is read-only
- test_open_ipfs_no_cache_returns_error: no cache_mode → error
- test_open_ipfs_unknown_cid_returns_error: CID not in pinner → error
- test_open_ipfs_with_shared_cache: Shared CacheMode works
- test_open_ipfs_with_subpath: nested paths materialized correctly

A guest opening "ipfs/QmCID/../../etc/passwd" could escape the staging
directory via ".." components in the subpath. Reject any subpath
containing ".." segments in parse_ipfs_path().

When ensure() returns None (object above inline_threshold), the
interceptor now calls fetch() to pull bytes from IPFS and writes
them to the staging directory. Previously this was a silent NoEntry,
meaning any CID above ~1MB just didn't work.

Adds PinsetCache::fetch() and IsolatedPinset::fetch() which delegate
to the underlying Pinner. Adds test with inline_threshold=0 to force
the large-object path.

The ARC now manages pin lifecycle only (which CIDs stay pinned in IPFS).
File content is not held in memory — the staging TempDir serves as the
local cache. On open_ipfs: ensure CID is pinned, check if already staged
(disk cache hit), fetch to staging if not.
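The three steps in that flow (ensure pinned, check the staging directory, fetch on a miss) might look like the sketch below. It is a synchronous simplification under stated assumptions: the `Cache` trait, `materialize`, and `CountingCache` are hypothetical names, the staged file is keyed by the CID alone, and the write is not atomic.

```rust
use std::cell::Cell;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the cache's pin and fetch interface.
pub trait Cache {
    fn ensure(&self, cid: &str) -> io::Result<()>;
    fn fetch(&self, cid: &str) -> io::Result<Vec<u8>>;
}

/// Returns the path of the staged file for `cid`, fetching it only when
/// it is not already on disk.
pub fn materialize(cache: &dyn Cache, staging: &Path, cid: &str) -> io::Result<PathBuf> {
    // 1. Make sure the CID is pinned.
    cache.ensure(cid)?;

    // 2. Disk cache hit: the file was staged by an earlier open.
    let staged = staging.join(cid);
    if staged.exists() {
        return Ok(staged);
    }

    // 3. Miss: pull the bytes and write them into the staging directory.
    let bytes = cache.fetch(cid)?;
    fs::write(&staged, bytes)?;
    Ok(staged)
}

/// Demo cache that counts fetches, loosely in the spirit of a mock pinner.
#[derive(Default)]
pub struct CountingCache {
    pub fetches: Cell<usize>,
}

impl Cache for CountingCache {
    fn ensure(&self, _cid: &str) -> io::Result<()> {
        Ok(())
    }
    fn fetch(&self, cid: &str) -> io::Result<Vec<u8>> {
        self.fetches.set(self.fetches.get() + 1);
        Ok(format!("bytes for {cid}").into_bytes())
    }
}
```

Calling `materialize` twice with the same CID and staging directory should hit `fetch` only once; the second call is served from the file already on disk, which the OS page cache keeps hot.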

Removes: PinEntry.data, inline_threshold parameter, PinEntry re-export.
Simplifies PinsetCache::new to (pinner, budget) and IsolatedPinset::new
to (pinner). One config knob instead of two.

PinsetCache and IsolatedPinset now own their staging TempDir:
- Shared cache = shared staging dir (all procs on a host)
- Isolated cache = per-proc staging dir (cleaned up on drop)

CacheMode gains staging_dir(), ensure(), and fetch() methods,
eliminating match arms at every call site. ipfs_staging removed
from ComponentRunStates since it's now co-located with the cache.

@lthibault lthibault changed the title from "feat: WASI filesystem interceptor for /ipfs/ paths" to "feat: WASI filesystem interceptor for /ipfs/ path interception" on Mar 27, 2026

Each IsolatedPinset pins independently in IPFS, giving cross-process
refcounting: content stays pinned as long as any process holds a pin.
Holding the lock across the async pin prevents the TOCTOU race that
could double-pin within a single process, keeping the refcount clean.
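The locking argument can be illustrated with a synchronous sketch. Everything here is an assumption for illustration: `IsolatedPins`, its `ensure` signature, and the closure standing in for the IPFS pin call are hypothetical. The real code is async, where an async-aware mutex would typically be needed to hold a guard across an await point.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical per-process pin tracker.
pub struct IsolatedPins {
    pinned: Mutex<HashSet<String>>,
}

impl IsolatedPins {
    pub fn new() -> Self {
        Self { pinned: Mutex::new(HashSet::new()) }
    }

    /// Pins `cid` at most once. Returns true if this call issued the pin.
    pub fn ensure(&self, cid: &str, pin: impl FnOnce(&str)) -> bool {
        // The guard is taken before the check and released only after the
        // insert, so no other caller can observe "not pinned" in between.
        let mut pinned = self.pinned.lock().unwrap();
        if pinned.contains(cid) {
            return false;
        }
        pin(cid); // lock is still held while the pin is issued
        pinned.insert(cid.to_string());
        true
    }
}
```

If the lock were dropped between the `contains` check and the `insert`, two callers could both see the CID as unpinned and both issue a pin. The cost of holding it is that pins within one process are serialized.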