feat: proxy-aware transport (--use-env-proxy), backfill CLI, retry improvements#290
feat: proxy-aware transport (--use-env-proxy), backfill CLI, retry improvements#290
Conversation
Add labels to retry log messages for easier debugging, include error messages in retry output, make getAccount gracefully fall back on failure, and format HTTP trace logs as reproducible curl commands. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude
Sources mitmweb-env.sh to route Node/Bun/curl traffic through mitmweb on localhost:8080, auto-starting it with the internal upstream proxy if needed. mitmweb-env.test.sh verifies all three runtimes in parallel. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Node's built-in fetch (undici) silently bypasses HTTP_PROXY/HTTPS_PROXY without the --use-env-proxy flag. assertUseEnvProxy() throws at startup if a proxy is configured but the flag is absent, preventing silent proxy bypass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
… set" This reverts commit eb4e061.
Fails fast if HTTPS_PROXY/HTTP_PROXY is set without --use-env-proxy, preventing silent proxy bypass in Node's built-in fetch (undici). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Throws when HTTPS_PROXY/HTTP_PROXY is set but --use-env-proxy is absent (Node's built-in fetch silently bypasses the proxy without it). Bun is exempt since it always respects proxy env natively. Subprocess tests spawn real node/bun processes to verify the flag check works end-to-end, not just in unit logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
- Auto-install mitmproxy via pip if not found - Auto-detect upstream proxy from http_proxy/https_proxy env vars instead of hardcoding the Stripe egress proxy URL; falls back to direct mode in CI and clean environments - Add mitmweb proxy intercept test step to CI test job (httpbin.org) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Avoids adding latency to the main test job. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR introduces a new apps/engine sync engine implementation (HTTP API + CLI + connector resolution), adds a new apps/service package, and adds a React dashboard app—along with Docker/CI/repo-ops updates. The scope appears substantially larger than the PR title/description suggests.
Changes:
- Add a new streaming-based sync engine library/API/CLI in
apps/engine(NDJSON streaming, connector resolver, progress/state utilities, OpenAPI schema generation + validation). - Add a new
apps/dashboardReact UI with Playwright e2e coverage and typed API clients viaopenapi-fetch. - Update repo operations: new multi-target root Dockerfile, new audit workflow, docs/agent guidance changes, and various tooling configs.
Reviewed changes
Copilot reviewed 91 out of 699 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| apps/service/src/tests/openapi.test.ts | Adds OpenAPI 3.1 schema validation test for the Service spec. |
| apps/service/package.json | Introduces the @stripe/sync-service package manifest and scripts. |
| apps/engine/vitest.config.ts | Adds Vitest configuration (exclusions + timeouts). |
| apps/engine/tsconfig.json | Adds engine TS build config (dist output + excludes tests). |
| apps/engine/src/serve-command.ts | Adds a serveAction to run the engine server with connector resolution. |
| apps/engine/src/logger.ts | Adds pino logger config with redaction settings. |
| apps/engine/src/lib/state-store.ts | Adds state store abstraction + file/readonly implementations. |
| apps/engine/src/lib/source-test.ts | Adds test source connector implementation for engine tests. |
| apps/engine/src/lib/source-exec.ts | Adds subprocess-backed Source wrapper (exec-based connector). |
| apps/engine/src/lib/select-state-store.ts | Adds dynamic destination state-store selection via optional package import. |
| apps/engine/src/lib/resolver.ts | Adds connector resolution (specifier/bin) + resolver cache + schema extraction. |
| apps/engine/src/lib/resolver.test.ts | Adds tests for resolver behavior and caching. |
| apps/engine/src/lib/remote-engine.ts | Adds HTTP client implementing the Engine interface using streaming endpoints. |
| apps/engine/src/lib/remote-engine.test.ts | Adds integration tests for the remote engine client against an in-memory server. |
| apps/engine/src/lib/progress.ts | Adds progress tracking/enrichment and record counting for sync output. |
| apps/engine/src/lib/progress.test.ts | Adds tests for progress enrichment and state merging. |
| apps/engine/src/lib/pipeline.ts | Adds pipeline stages (catalog enforcement, state persistence, limits, piping). |
| apps/engine/src/lib/ndjson.ts | Adds NDJSON parsing/serialization helpers (string/chunks/streams). |
| apps/engine/src/lib/ndjson.test.ts | Adds tests for NDJSON parsing across chunk boundary scenarios. |
| apps/engine/src/lib/index.ts | Re-exports engine library surface area. |
| apps/engine/src/lib/exec.test.ts | Adds tests for exec-based connector wrappers using real connector binaries. |
| apps/engine/src/lib/exec-helpers.ts | Adds spawn helpers for streaming/collecting subprocess NDJSON. |
| apps/engine/src/lib/engine.ts | Adds the core in-process Engine implementation (streams everywhere). |
| apps/engine/src/lib/destination-test.ts | Adds test destination connector implementation for engine tests. |
| apps/engine/src/lib/destination-filter.ts | Adds catalog field selection pruning middleware for destinations. |
| apps/engine/src/lib/destination-filter.test.ts | Adds tests for destination field-selection pruning. |
| apps/engine/src/lib/destination-exec.ts | Adds subprocess-backed Destination wrapper (exec-based connector). |
| apps/engine/src/lib/default-connectors.ts | Defines default in-process connectors bundled with engine. |
| apps/engine/src/lib/createSchemas.ts | Builds typed connector config/input schemas for OpenAPI generation. |
| apps/engine/src/lib/backfill.ts | Adds retry-until-complete helper for paging/backfill via repeated sync calls. |
| apps/engine/src/lib/backfill.test.ts | Adds tests for backfill retry and state carry-forward. |
| apps/engine/src/index.ts | Exposes engine library + API factory + NDJSON response helper. |
| apps/engine/src/cli/sync.ts | Adds a sync CLI shortcut for Stripe→Postgres with optional state handling. |
| apps/engine/src/cli/supabase.ts | Adds Supabase install/uninstall CLI integration. |
| apps/engine/src/cli/index.ts | Adds CLI entrypoint. |
| apps/engine/src/cli/command.ts | Adds CLI program builder + shared connector discovery flags. |
| apps/engine/src/cli/backfill.ts | Adds backfill CLI that retries a remote engine until EOF complete. |
| apps/engine/src/api/openapi-utils.ts | Adds helper utilities for OpenAPI doc patching/table printing. |
| apps/engine/src/api/index.ts | Adds a runnable API server entrypoint with Node/Bun branching. |
| apps/engine/src/api/header-size.test.ts | Adds tests validating support for large X-Pipeline header sizes. |
| apps/engine/src/tests/sync.test.ts | Adds dockerized Postgres lifecycle test for sync & resume flow. |
| apps/engine/src/tests/stripe-to-postgres.test.ts | Adds Stripe-mock to Postgres integration tests for selective sync and resume. |
| apps/engine/src/tests/openapi.test.ts | Adds engine OpenAPI validation and schema/content-key assertions. |
| apps/engine/src/tests/docker.test.ts | Adds Docker image build/run smoke tests for engine CLI/server. |
| apps/engine/package.json | Introduces the @stripe/sync-engine package manifest and exports. |
| apps/dashboard/vitest.config.ts | Adds dashboard Vitest config with @ alias. |
| apps/dashboard/vite.config.ts | Adds Vite config including proxying to engine/service backends. |
| apps/dashboard/tsconfig.json | Adds strict TS config + workspace OpenAPI type paths. |
| apps/dashboard/src/pages/PipelineList.tsx | Adds pipeline list UI with delete action. |
| apps/dashboard/src/pages/PipelineDetail.tsx | Adds pipeline detail UI including polling and progress display. |
| apps/dashboard/src/pages/PipelineCreate.tsx | Adds pipeline creation wizard with connector config + stream selection. |
| apps/dashboard/src/main.tsx | Adds React app bootstrap. |
| apps/dashboard/src/lib/utils.ts | Adds Tailwind/clsx class merging helper. |
| apps/dashboard/src/lib/stream-groups.ts | Adds stream grouping/filtering heuristics for UI. |
| apps/dashboard/src/lib/stream-groups.test.ts | Adds unit tests for stream grouping/filtering heuristics. |
| apps/dashboard/src/lib/api.ts | Adds typed engine/service API clients and discovery helper. |
| apps/dashboard/src/index.css | Adds Tailwind import entry. |
| apps/dashboard/src/components/StreamSelector.tsx | Adds grouped stream selector component with search and bulk select. |
| apps/dashboard/src/components/JsonSchemaForm.tsx | Adds dynamic form renderer for JSON Schema connector configs. |
| apps/dashboard/src/App.tsx | Adds minimal client-side routing + top-level app layout. |
| apps/dashboard/playwright.config.ts | Adds Playwright e2e configuration using system Chrome. |
| apps/dashboard/package.json | Adds dashboard package manifest and scripts. |
| apps/dashboard/index.html | Adds dashboard HTML entry. |
| apps/dashboard/e2e/pipeline-create.test.ts | Adds e2e test for full pipeline creation flow. |
| apps/dashboard/e2e/global-teardown.ts | Adds e2e teardown that stops the engine server. |
| apps/dashboard/e2e/global-setup.ts | Adds e2e setup that starts an engine server for tests. |
| apps/dashboard/Dockerfile | Adds a dev Dockerfile for dashboard (pnpm + Vite). |
| apps/dashboard/.gitignore | Adds dashboard-specific gitignore. |
| README.md | Removes the root README content. |
| Dockerfile | Replaces root Dockerfile with multi-target build (engine + service) using pnpm deploy. |
| CHANGELOG.md | Adds changelog content including a 2026-04-11 entry. |
| AGENTS.md | Rewrites agent instructions and repo conventions. |
| .vscode/settings.json | Updates VS Code workspace settings (Deno path + Ruby LSP + explorer sort). |
| .vscode/extensions.json | Adds recommended VS Code extension list. |
| .verdaccio/config.yaml | Adds Verdaccio config for local/private registry workflows. |
| .ruby-version | Adds pinned Ruby version file. |
| .releaserc.json | Removes semantic-release config. |
| .prettierignore | Expands ignore list (generated outputs, terraform, etc.). |
| .npmrc | Adds configurable @stripe registry via env var. |
| .github/workflows/release.yml | Changes release workflow to “promote SHA” flow (manual fallback). |
| .github/workflows/docs.yml | Removes docs publish workflow. |
| .github/workflows/audit.yml | Adds a scheduled/manual repo audit workflow. |
| .githooks/pre-push | Adds pre-push hook to validate generated OpenAPI specs are up to date. |
| .eslintrc.js | Removes root ESLint config file. |
| .dockerignore | Updates dockerignore (no longer ignores dist). |
Files not reviewed (2)
- apps/engine/src/generated/openapi.json: Language not supported
- apps/service/src/generated/openapi.json: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Monorepo Dockerfile — build individual images with --target: | ||
| # docker build --target engine . → @stripe/sync-engine (stateless HTTP API) | ||
| # docker build --target service . → @stripe/sync-service (serve + worker CLI) |
There was a problem hiding this comment.
The PR title/description focuses on mitmweb proxy intercept tooling and a new CI job, but the diff is dominated by adding new apps (apps/engine, apps/service, apps/dashboard), Docker/CI/tooling changes, and numerous tests. Either the PR description needs to be updated to match the actual scope, or the unrelated changes should be split into separate PRs to keep reviewable boundaries and reduce merge risk.
| const resolver = createConnectorResolver({ | ||
| sources: { stripe: sourceStripe }, | ||
| destinations: { postgres: destinationPostgres, google_sheets: destinationGoogleSheets }, | ||
| }) | ||
|
|
||
| const app = createApp(resolver) |
There was a problem hiding this comment.
createConnectorResolver and createApp are async in this codebase (used with await elsewhere). Here they are called without await, so resolver/app will be Promises at runtime and serve({ fetch: app.fetch, ... }) will fail. Fix by awaiting both calls (and ensure any subsequent uses handle the resolved values).
| const resolver = createConnectorResolver({ | |
| sources: { stripe: sourceStripe }, | |
| destinations: { postgres: destinationPostgres, google_sheets: destinationGoogleSheets }, | |
| }) | |
| const app = createApp(resolver) | |
| const resolver = await createConnectorResolver({ | |
| sources: { stripe: sourceStripe }, | |
| destinations: { postgres: destinationPostgres, google_sheets: destinationGoogleSheets }, | |
| }) | |
| const app = await createApp(resolver) |
| const engine = createEngine(makeResolver()) | ||
| const catalogMsg = await collectFirst( | ||
| engine.source_discover({ | ||
| type: 'stripe', | ||
| stripe: { api_key: 'sk_test_fake', base_url: STRIPE_MOCK_URL }, | ||
| }), | ||
| 'catalog' | ||
| ) |
There was a problem hiding this comment.
createEngine is declared as async and returns Promise<Engine>, but this test calls it without await and then immediately calls engine.source_discover(...). This will throw at runtime because engine is a Promise. Update all occurrences in this file to const engine = await createEngine(...).
| } | ||
|
|
||
| // Scan PATH directories | ||
| const pathDirs = (process.env['PATH'] ?? '').split(':') |
There was a problem hiding this comment.
Splitting PATH with ':' is not portable (Windows uses ;). Use path.delimiter from node:path (or an equivalent helper) when splitting PATH to keep connector discovery cross-platform.
| export function splitCmd(cmd: string): [string, string[]] { | ||
| const parts = cmd.trim().split(/\s+/) | ||
| const [bin = cmd, ...baseArgs] = parts |
There was a problem hiding this comment.
The whitespace split does not respect shell quoting/escaping, so commands like node -e \"console.log('a b')\" or paths with spaces will be split incorrectly. Consider either (1) storing command + args as an array throughout the resolver/commandMap path, or (2) using a small shell-quote parser to correctly split strings when you must accept a single command string.
| const output = await validate( | ||
| 'https://spec.openapis.org/oas/3.1/schema-base', | ||
| serviceSpec as unknown as Json | ||
| ) |
There was a problem hiding this comment.
This test depends on fetching the OpenAPI 3.1 schema from the public internet at runtime. That can make CI flaky or fail in restricted environments. Prefer vendoring/pinning the schema locally (or using a resolver/cache supported by the validator) so tests are deterministic and do not require external network access.
| const output = await validate( | |
| 'https://spec.openapis.org/oas/3.1/schema-base', | |
| serviceSpec as unknown as Json | |
| ) | |
| const output = await validate(serviceSpec as unknown as Json) |
| db | ||
| test | ||
| dist | ||
| node_modules |
There was a problem hiding this comment.
The dockerignore no longer excludes dist/, which can significantly inflate the Docker build context and slow down builds (especially in CI). If dist/ is generated locally, it should typically be excluded from the build context unless there's a specific need to include it.
| node_modules | |
| node_modules | |
| dist |
| // Pipeline config and connector state are passed via the X-Pipeline header. | ||
| // Node.js defaults to 16 KB which caps state at ~250 entries — too small for | ||
| // connectors like google-sheets that carry row mappings. 50 MB is a conservative | ||
| // ceiling; typical headers remain small. See docs/engine/header-size-limits.md | ||
| serverOptions: { maxHeaderSize: 50 * 1024 * 1024 }, |
There was a problem hiding this comment.
Setting maxHeaderSize to 50 MB materially increases memory usage per connection and can amplify DoS risk (large headers are buffered/parsed before app handling). If large payloads are required, consider moving pipeline/state off headers (e.g., request body for state and an ID reference header), or at minimum make this limit configurable via env with a smaller default and document required reverse-proxy limits.
Fixes CI failure in the test job where bun isn't available yet. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Tests that ws connections route through mitmweb via explicit HttpsProxyAgent (ws ignores HTTP_PROXY and --use-env-proxy). CI mitmweb job now installs pnpm deps so ws/https-proxy-agent resolve. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Node 24 --use-env-proxy makes undici/fetch respect HTTPS_PROXY natively, so the manual ProxyAgent dispatcher injection in fetchWithProxy is redundant. Replace all call sites with tracedFetch (curl trace logging only, no proxy logic) and remove withFetchProxy/fetchWithProxy entirely. getHttpsProxyAgentForTarget is kept — ws does not respect --use-env-proxy and still needs an explicit HttpsProxyAgent for WebSocket connections. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
- Restore getAccount to use requestWithRetry (e0fb332 unintentionally removed retry by switching to request directly) - Update test that expected getAccount failure to halt backfill: since getAccountCreatedTimestamp now swallows errors (fault-tolerant), backfill proceeds with STRIPE_LAUNCH_TIMESTAMP fallback instead of failing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
…at startup - Add --use-env-proxy flag to both engine and service Docker ENTRYPOINT so fetch/undici respects HTTPS_PROXY env var in container deployments - Call assertUseEnvProxy() at startup in both CLI entry points so proxy misconfiguration (proxy set without --use-env-proxy) is caught immediately with a clear error rather than silently bypassing the proxy Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Committed-By-Agent: claude
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
- Resolve merge conflicts with v2 (combine retry label + signal, use tracedFetch with combineSignals/makeApiFetch from v2) - Replace ss -tlnp (Linux-only) with a portable _port_listening helper that tries ss → lsof → nc, so the script works on macOS too - Add TODO for logging in getAccountCreatedTimestamp silent catch Made-with: Cursor Committed-By-Agent: cursor
Summary
IR-readiness improvements across the proxy/transport layer, a new backfill CLI, and observability enhancements.
Proxy-aware transport refactor (Node 24
--use-env-proxy)undiciProxyAgentdispatchers — Node 24's--use-env-proxyflag makesfetchrespectHTTPS_PROXY/HTTP_PROXYnatively, replacing the customfetchWithProxy/withFetchProxyplumbing with a simplertracedFetchwrapperassertUseEnvProxy()guard (packages/ts-cli/src/env-proxy.ts) — throws at startup if proxy env vars are set without--use-env-proxy, preventing silent proxy bypass. Bun is exempt (respects proxy natively)--use-env-proxyto both engine and service containersmakeApiFetch(signal)+combineSignalsensure abort signals flow through to OpenAPI spec fetches and retries (withHttpRetrynow acceptssignal)mitmweb intercept tooling
scripts/mitmweb-env.sh— sources into a shell to route Node/Bun/curl through mitmweb; auto-installs mitmproxy if missing, auto-detects upstream proxy from env. Uses a portable_port_listeninghelper (ss → lsof → nc) so it works on both Linux and macOSscripts/mitmweb-env.test.sh— verifies curl, Node fetch, Bun fetch, and ws WebSocket all route through the proxymitmwebCI job runs the intercept test in parallelBackfill CLI command
pipelineSyncUntilComplete(apps/engine/src/lib/backfill.ts) — loopspipeline_syncuntileof:complete, carrying state forward acrossstate_limit/time_limitboundariesbackfillCLI subcommand (apps/engine/src/cli/backfill.ts) — calls a remote sync engine repeatedly with state persistence to a JSON fileRetry & observability improvements
withHttpRetrynow accepts alabeloption (e.g.GET /v1/customers) for easier debugging; error messages included in retry outputtracedFetchemits reproducible curl commands whenDANGEROUSLY_VERBOSE_LOGGING=truegetAccountCreatedTimestamp— catches errors and falls back toSTRIPE_LAUNCH_TIMESTAMPinstead of halting backfillgetAccountretry restored — fixed accidental removal of retry wrapper from an earlier commitHousekeeping
ts-climoved next to source files (co-located test pattern)Test plan
v2base branch