
Remove GH Actions e2e-docker workflow, use local Docker exclusively #18

Merged
NathanFlurry merged 41 commits into main from ralph/e2e-docker-local-only on Mar 20, 2026



@NathanFlurry
Member

Summary

  • Deletes .github/workflows/e2e-docker.yml (GH Actions service containers for Postgres, MySQL, Redis, SSH)
  • Simplifies e2e-docker.test.ts to always use local Docker containers via startContainer() — removes all CI branching logic (E2E_DOCKER_CI, env var reads for PG_HOST/MYSQL_HOST/etc.)
  • All 17 e2e-docker tests verified passing with local Docker
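
For reviewers unfamiliar with the local-only flow, here is a minimal sketch of how a helper like `startContainer()` might assemble its `docker run` arguments. The real signature is not shown in this PR, so `ContainerSpec`, `buildDockerRunArgs`, the image tag, and the port numbers are all hypothetical; only the Docker CLI flags themselves are standard.

```typescript
// Hypothetical container description; the real startContainer() options may differ.
interface ContainerSpec {
  image: string;
  env?: Record<string, string>;
  ports?: Array<{ host: number; container: number }>;
}

// Build the argument list for `docker run` without invoking Docker.
function buildDockerRunArgs(spec: ContainerSpec): string[] {
  const args = ["run", "--detach", "--rm"];
  for (const [key, value] of Object.entries(spec.env ?? {})) {
    args.push("--env", `${key}=${value}`);
  }
  for (const { host, container } of spec.ports ?? []) {
    // Bind to loopback only so test services are not exposed on the network.
    args.push("--publish", `127.0.0.1:${host}:${container}`);
  }
  args.push(spec.image);
  return args;
}

// Example: a Postgres container using trust auth (image tag and port are assumptions).
const postgresArgs = buildDockerRunArgs({
  image: "postgres:16",
  env: { POSTGRES_HOST_AUTH_METHOD: "trust" },
  ports: [{ host: 55432, container: 5432 }],
});
console.log(["docker", ...postgresArgs].join(" "));
```

Keeping argument construction separate from process spawning makes the container setup testable without a Docker daemon.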

Test plan

  • All 17 e2e-docker tests pass (pg-connect, pg-pool, pg-types, pg-errors, pg-prepared, pg-ssl, mysql2-connect, ioredis-connect, ssh2-connect, ssh2-key-auth, ssh2-tunnel, ssh2-sftp-dirs, ssh2-sftp-transfer, ssh2-sftp-large, ssh2-auth-fail, ssh2-connect-refused, services are configured)
  • Tests skip gracefully via skipUnlessDocker() when Docker is unavailable
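
The graceful-skip behavior could look roughly like the sketch below. This is not the repository's `skipUnlessDocker()` implementation, which is not shown here; `isDockerAvailable` and `skipReason` are hypothetical names, and the probe simply assumes that `docker info` exits with status 0 when a daemon is reachable.

```typescript
import { spawnSync } from "node:child_process";

// Probe for a usable Docker daemon. A missing binary sets `result.error`
// (ENOENT) rather than throwing, so both failure modes return false.
function isDockerAvailable(command: string = "docker"): boolean {
  const result = spawnSync(command, ["info"], { stdio: "ignore", timeout: 10_000 });
  return result.error === undefined && result.status === 0;
}

// Turn the probe result into a skip reason a test runner could report.
function skipReason(available: boolean): string | null {
  return available ? null : "Docker is not available; skipping e2e-docker tests";
}

const reason = skipReason(isDockerAvailable());
if (reason !== null) {
  console.log(reason);
}
```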

🤖 Generated with Claude Code

NathanFlurry and others added 30 commits March 19, 2026 14:30
…e stories

Add Testing Policy section requiring real services (no mocking), sandbox
execution for all tests, and real Docker containers for e2e fixtures.
Expand PRD with 5 stories to run Docker fixtures against real containers
and fix net bridge issues until parity passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e-exec sandbox VM

Rewrite Pi headless E2E tests to load and execute Pi's JavaScript inside
the sandbox VM via NodeRuntime instead of spawning Pi as a host process.

- Create NodeRuntime with NodeFileSystem, moduleAccess overlay, network
  bridge (with SSRF exemption for mock server port), and allowAll permissions
- Load Pi via dynamic import() inside the sandbox VM
- Patch globalThis.fetch in sandbox to redirect Anthropic API calls to mock server
- Use Pi's SDK (createAgentSession + runPrintMode) for headless execution
- File read/write tests go through the sandbox's fs bridge
- Bash tool test uses a real CommandExecutor through the child_process bridge
- Probe Pi VM loading in beforeAll; skip all tests with clear reason if Pi
  cannot load (currently blocked by pnpm transitive dependency resolution
  in the moduleAccess overlay)
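
The fetch redirection described in this commit can be sketched as a thin wrapper around the original `fetch`. This is an illustration only: `rewriteToMock`, `wrapFetch`, and `FetchLike` are hypothetical names, the host `api.anthropic.com` and the mock origin are assumptions, and the sketch only handles string and `URL` inputs.

```typescript
import { URL } from "node:url";

// Narrow stand-in for the fetch signature so the sketch has no DOM type dependency.
type FetchLike = (input: string | URL, init?: unknown) => Promise<unknown>;

// Assumed API host to intercept.
const ANTHROPIC_HOST = "api.anthropic.com";

// Rewrite scheme, host, and port to the mock server; keep path and query intact.
function rewriteToMock(input: string | URL, mockOrigin: string): string {
  const url = new URL(input.toString());
  if (url.hostname !== ANTHROPIC_HOST) return url.toString();
  const mock = new URL(mockOrigin);
  url.protocol = mock.protocol;
  url.host = mock.host;
  return url.toString();
}

// Wrap an existing fetch so matching requests are sent to the mock server.
function wrapFetch(original: FetchLike, mockOrigin: string): FetchLike {
  return (input, init) => original(rewriteToMock(input, mockOrigin), init);
}

console.log(rewriteToMock("https://api.anthropic.com/v1/messages", "http://127.0.0.1:4010"));
```

Inside a sandbox, a wrapper like this would be assigned to `globalThis.fetch` before the agent code is loaded, so requests to other hosts pass through unchanged.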
… and fix until parity passes

Implement TCP socket bridge (net module) for the secure-exec sandbox,
enabling the pg library to connect to real PostgreSQL through the
sandbox's network bridge.

- Add net module with Socket class, connect(), createConnection(),
  isIP/isIPv4/isIPv6, and createServer (throws)
- Bridge architecture: guest _netSocketConnectRaw → host creates real
  net.Socket → events dispatched back via _netSocketDispatch
- Extend NetworkAdapter interface with TCP socket methods
- Forward TCP methods through wrapNetworkAdapter permission wrapper
- Remove net from deferred core module stubs
- Use trust auth for Postgres to bypass SCRAM-SHA-256 (subtle.deriveBits
  not yet available in sandbox)
- Update pg-connect fixture expectation from fail to pass
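
The bridge flow above (guest requests a connection, host opens a real `net.Socket`, events are dispatched back by socket id) can be sketched as follows. This is a simplified illustration, not the actual bridge: `GuestSocket`, `registerGuestSocket`, `netSocketDispatch`, and `hostConnect` are hypothetical stand-ins for `_netSocketConnectRaw` and `_netSocketDispatch`, and the serialization boundary between guest and host is omitted.

```typescript
import { EventEmitter } from "node:events";
import * as net from "node:net";

type BridgeEvent = "connect" | "data" | "end" | "close" | "error";
type Dispatch = (socketId: number, event: BridgeEvent, payload?: unknown) => boolean;

// Guest side: a lightweight socket object identified by a numeric id.
class GuestSocket extends EventEmitter {
  id: number;
  constructor(id: number) {
    super();
    this.id = id;
  }
}

const guestSockets = new Map<number, GuestSocket>();
let nextSocketId = 1;

function registerGuestSocket(): GuestSocket {
  const socket = new GuestSocket(nextSocketId++);
  guestSockets.set(socket.id, socket);
  return socket;
}

// Guest side: route a host event to the matching guest socket.
// Returns false when the id is unknown (for example, after "close").
function netSocketDispatch(socketId: number, event: BridgeEvent, payload?: unknown): boolean {
  const socket = guestSockets.get(socketId);
  if (socket === undefined) return false;
  // EventEmitter throws on an unhandled "error" event, so emit it only if someone listens.
  if (event !== "error" || socket.listenerCount("error") > 0) {
    socket.emit(event, payload);
  }
  if (event === "close") guestSockets.delete(socketId);
  return true;
}

// Host side: open the real TCP connection and forward its events by id.
function hostConnect(socketId: number, host: string, port: number, dispatch: Dispatch): net.Socket {
  const socket = net.connect({ host, port });
  socket.on("connect", () => dispatch(socketId, "connect"));
  socket.on("data", (chunk) => dispatch(socketId, "data", chunk));
  socket.on("end", () => dispatch(socketId, "end"));
  socket.on("error", (err) => dispatch(socketId, "error", err.message));
  socket.on("close", () => dispatch(socketId, "close"));
  return socket;
}
```

Keying everything by a numeric id means only primitive values and byte buffers need to cross the sandbox boundary; the real socket object never leaves the host.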
…r and fix until parity passes

Fix the mysql2-connect Docker fixture to connect to a real MySQL container
through the sandbox's net bridge. The core issue was that iconv-lite (used
by mysql2 for character encoding) lazily loads encoding modules on first use,
which happens inside net socket data callbacks dispatched via applySync.
Since applySyncPromise cannot pump the event loop inside applySync contexts,
module loading would crash with "This function may not be called from the
default thread".

Fixes:
- Add synchronous module resolution bridge (_resolveModuleSync) using
  Node.js require.resolve() for use inside applySync contexts
- Add synchronous file loading bridge (_loadFileSync) using fs.readFileSync
  for use inside applySync contexts
- Add JS-side resolution cache to avoid applySyncPromise for previously
  resolved modules
- Cache modules by raw name (not just resolved path) for faster lookup
- Skip polyfill check for path-based requires (these are never polyfills)
- Use mysql_native_password auth for MySQL container (avoids
  crypto.publicEncrypt needed by caching_sha2_password)
- Add iconv-lite as direct dependency and eagerly initialize encodings
- Update fixture expectation from fail to pass
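
A resolution cache in front of a synchronous resolver, as listed in the fixes above, might look like this sketch. It is not the actual `_resolveModuleSync` bridge: `createCachedResolver`, `ResolveSync`, and `nodeResolve` are hypothetical names, and the cache key (requesting directory plus raw request string) is an assumption about how the raw-name cache could be scoped.

```typescript
import { createRequire } from "node:module";
import * as path from "node:path";

type ResolveSync = (request: string, fromDir: string) => string;

// Host-side resolver built on Node's own algorithm. `fromDir` must be absolute;
// the file name passed to createRequire only anchors the lookup directory.
const nodeResolve: ResolveSync = (request, fromDir) =>
  createRequire(path.join(fromDir, "noop.js")).resolve(request);

// Wrap any synchronous resolver with a cache keyed by the raw request,
// so repeated requires never have to cross the bridge again.
function createCachedResolver(resolveSync: ResolveSync): ResolveSync {
  const cache = new Map<string, string>();
  return (request, fromDir) => {
    const key = `${fromDir}\0${request}`;
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const resolved = resolveSync(request, fromDir);
    cache.set(key, resolved);
    return resolved;
  };
}

const cachedResolve = createCachedResolver(nodeResolve);
```

The benefit in this context is that a lazily loaded module (such as an iconv-lite encoding table) that was resolved once can be found again from inside a socket data callback without any asynchronous work.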

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d fix until parity passes

Also fixes US-005 (ssh2-sftp-transfer) as the same underlying issues affected both.

Three key fixes to enable ssh2 in the sandbox:

1. Buffer prototype patching: Added internal V8 Buffer methods (latin1Slice,
   base64Slice, utf8Write, etc.) to both the bridge Buffer and the polyfill
   Buffer prototypes. ssh2 protocol parser uses these for fast binary access.

2. NetSocket._readableState.ended: ssh2's isWritable() checks
   _readableState.ended === false. Without this field, all socket writes
   were silently skipped, causing handshake timeout.

3. Stateful cipher/decipher bridge: Added _cryptoCipherivCreate/Update/Final
   bridge functions for streaming crypto. ssh2's AES-GCM path needs update() to
   return encrypted data immediately rather than buffering it until final().
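
To illustrate the third fix, a stateful cipher bridge can keep each host-side cipher in a session table so that every `update()` call returns ciphertext right away. The sketch below uses Node's `crypto` module; `cipherCreate`, `cipherUpdate`, `cipherFinal`, and `decryptAll` are hypothetical names loosely mirroring `_cryptoCipherivCreate/Update/Final`, and the algorithm is fixed to `aes-256-gcm` for brevity.

```typescript
import { createCipheriv, createDecipheriv } from "node:crypto";
import type { CipherGCM } from "node:crypto";

// Host-side session table: the guest only ever holds a numeric session id.
const cipherSessions = new Map<number, CipherGCM>();
let nextSessionId = 1;

function cipherCreate(key: Buffer, iv: Buffer): number {
  const id = nextSessionId++;
  cipherSessions.set(id, createCipheriv("aes-256-gcm", key, iv));
  return id;
}

// Returns the ciphertext for this chunk immediately instead of buffering it.
function cipherUpdate(id: number, chunk: Buffer): Buffer {
  const cipher = cipherSessions.get(id);
  if (cipher === undefined) throw new Error(`unknown cipher session ${id}`);
  return cipher.update(chunk);
}

// Finishes the stream, releases the session, and exposes the GCM auth tag.
function cipherFinal(id: number): { data: Buffer; authTag: Buffer } {
  const cipher = cipherSessions.get(id);
  if (cipher === undefined) throw new Error(`unknown cipher session ${id}`);
  cipherSessions.delete(id);
  const data = cipher.final();
  return { data, authTag: cipher.getAuthTag() };
}

// Helper used to check a round trip; throws if the auth tag does not match.
function decryptAll(key: Buffer, iv: Buffer, ciphertext: Buffer, authTag: Buffer): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

A one-shot design that encrypts everything in `final()` would produce the same bytes overall, but a protocol that sends each packet as soon as it is encrypted needs the per-call output shown here.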
…() inside the sandbox

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…a sandbox child_process bridge

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…a sandbox child_process bridge

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- examples/virtual-file-system-s3: S3-backed VirtualFileSystem using @aws-sdk/client-s3, with Docker Compose for MinIO
- examples/virtual-file-system-sqlite: SQLite-backed VirtualFileSystem using sql.js (WASM), no native deps
- Both tested end-to-end: sandbox writes/reads files, host verifies via storage API
- docs/features/virtual-filesystem.mdx: linked examples with use case descriptions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nShell() with opencode binary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… via sandbox child_process bridge

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lities

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lities

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implementation fixes:
- rename() now moves directory children (both S3 and SQLite)
- SQLite: escape LIKE wildcards in readDir queries (% and _ injection)
- SQLite: resolve symlinks in readFile/writeFile/exists
- SQLite: guard against removeDir("/")
- S3: removeDir throws ENOENT for nonexistent dirs
- S3: createDir checks parent exists
- S3: exists()/stat() only catch 404, not all errors
- S3: pin MinIO Docker image version

Test coverage expanded from 6/21 to 17/21 VFS methods:
- S3: 12 tests (write, read, stat, exists, rename, rename-dir,
  remove, rmdir, truncate, err-symlink, err-read, err-rmdir)
- SQLite: 15 tests (write, read, stat, exists, rename, rename-dir,
  remove, rmdir, chmod, truncate, symlink, link, err-read, err-rmdir,
  snapshot)
- Both exit non-zero on failure
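
The LIKE wildcard fix can be illustrated with a small escaping helper. This is a sketch rather than the code from the SQLite example: `escapeLike` and `childrenQuery` are hypothetical names, and the `entries` table and `path` column are assumed for illustration. The `ESCAPE` clause itself is standard SQLite syntax.

```typescript
// Escape the LIKE metacharacters (% and _) plus the escape character itself,
// so a directory name such as "data_1" only matches literally.
function escapeLike(value: string): string {
  return value.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Build a parameterized query that lists everything under a directory.
// Table and column names here are assumptions.
function childrenQuery(dirPath: string): { sql: string; params: string[] } {
  const prefix = dirPath.endsWith("/") ? dirPath : `${dirPath}/`;
  return {
    sql: "SELECT path FROM entries WHERE path LIKE ? ESCAPE '\\'",
    params: [`${escapeLike(prefix)}%`],
  };
}

console.log(childrenQuery("/data_1"));
```

Without the escaping, a pattern built from `/data_1/` would also match paths under `/dataX1/`, because `_` matches any single character.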

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
US-016: SSH key-based authentication
US-017: SSH port forwarding / tunneling
US-018: SFTP directory operations
US-019: SSH/SFTP error paths (auth fail, connect refused)
US-020: SFTP large file transfer, rename, and streaming

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Postgres gaps (from adversarial review):
- US-021: UPDATE, DELETE, transactions (BEGIN/COMMIT/ROLLBACK)
- US-022: pg.Pool connection pooling and concurrent queries
- US-023: Data types (JSON, JSONB, TIMESTAMPTZ, BYTEA, arrays, UUID)
- US-024: Error paths (bad SQL, constraint violations, connection errors)
- US-025: Named prepared statements (Parse/Bind/Execute wire protocol)
- US-026: Fix auth method discrepancy (local=trust, CI=scram-sha-256)

TLS gaps:
- US-027: TLS database connections (pg with ssl:true)
- US-028: HTTPS error handling (expired certs, hostname mismatch)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nsactions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NathanFlurry and others added 11 commits March 19, 2026 20:36
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ents

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement tls.connect() bridge for TLS socket upgrade, enabling SSL-encrypted
database connections through the sandbox. The pg-ssl fixture validates that
Postgres SSL connections work end-to-end via the sandbox's net + tls bridge.

Bridge changes:
- Add NetSocketUpgradeTlsRaw bridge contract for TLS socket upgrade
- Add tls module to sandbox (connect, TLSSocket, createSecureContext)
- Host-side wraps existing net.Socket with Node.js tls.TLSSocket
- Forward end/close events to wrapped raw socket (pg relies on this)
- Wire tls module into require-setup for require('tls') resolution

Infrastructure:
- Custom postgres-ssl Dockerfile with self-signed certificate
- Postgres container now runs with SSL enabled (all pg fixtures unaffected)
- New pg-ssl fixture: connects with ssl:{rejectUnauthorized:false}, queries
  pg_stat_ssl to verify encryption, runs CRUD operations through TLS
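
On the host side, upgrading an existing TCP socket to TLS can be done by passing it to `tls.connect()` through the `socket` option. The sketch below is an illustration under assumptions, not the actual `NetSocketUpgradeTlsRaw` implementation: `tlsUpgradeOptions`, `upgradeToTls`, and the event callback shape are hypothetical names.

```typescript
import * as net from "node:net";
import * as tls from "node:tls";

interface UpgradeOptions {
  servername?: string;
  rejectUnauthorized?: boolean;
}

type TlsEvent = "secureConnect" | "data" | "end" | "close" | "error";

// Build tls.connect() options that reuse an already-connected raw socket.
function tlsUpgradeOptions(raw: net.Socket, options: UpgradeOptions): tls.ConnectionOptions {
  const connectOptions: tls.ConnectionOptions = {
    socket: raw,
    // Certificate verification stays on unless the caller opts out explicitly
    // (the pg-ssl fixture opts out because the server certificate is self-signed).
    rejectUnauthorized: options.rejectUnauthorized ?? true,
  };
  if (options.servername !== undefined) connectOptions.servername = options.servername;
  return connectOptions;
}

// Wrap the raw socket in a TLSSocket and forward its events to the guest.
function upgradeToTls(
  raw: net.Socket,
  options: UpgradeOptions,
  dispatch: (event: TlsEvent, payload?: unknown) => void,
): tls.TLSSocket {
  const secure = tls.connect(tlsUpgradeOptions(raw, options));
  secure.on("secureConnect", () => dispatch("secureConnect"));
  secure.on("data", (chunk) => dispatch("data", chunk));
  secure.on("end", () => dispatch("end"));
  secure.on("error", (err) => dispatch("error", err.message));
  secure.on("close", () => dispatch("close"));
  return secure;
}
```

Forwarding `end` and `close` matters because, per the commit message, pg relies on those events to detect that the connection has gone away.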
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NathanFlurry NathanFlurry merged commit fa73599 into main Mar 20, 2026
2 of 3 checks passed