Multi-tenant refactor + Network integration (8 commits) #10
Merged
keithfawcett merged 11 commits into main on Apr 27, 2026
Three new migrations and matching type updates lay the groundwork for
multi-tenant deployments while keeping single-tenant self-host working
identically (just with tenantId='default' baked in).
20260507000000_multi_tenant
- New Tenant table; seeded 'default' tenant.
- tenantId column on every data table (Partner, Campaign, Link, Click,
Identity, Event, Attribution, Commission, Payout, ApiKey, Config,
Admin, MagicLinkToken, Session, WebhookEndpoint, WebhookDelivery).
- Backfill existing rows to the default tenant.
- Re-scope unique constraints to be per-tenant: Partner.email,
Admin.email, Link.linkKey, Config.(key→tenantId,key).
20260507010000_rls_policies
- PlatformAdmin table (cross-tenant Coherence support staff).
- RLS ENABLE + FORCE on every tenanted table.
- Policy: row visible iff tenantId matches `app.tenant_id` GUC OR
`app.platform_admin` GUC = 'on'.
- Tenant table: row visible iff its id matches app.tenant_id (or
platform admin). Same for PlatformAdmin.
- Policies use COALESCE / current_setting(..., true) so an unset GUC
returns 0 rows (default deny) instead of erroring.
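
The policy rules above could be generated per table along these lines. This is a minimal sketch, not the migration's actual SQL: the helper name `tenantRlsSql`, the policy name `tenant_isolation`, and the assumption that `tenantId` is a text column are all illustrative.

```typescript
// Hypothetical helper: builds the RLS statements for one tenanted table.
// The PR text describes the behavior only; names and exact SQL here are assumptions.
function tenantRlsSql(table: string): string[] {
  // Quote the identifier so mixed-case names such as "MagicLinkToken" survive.
  const t = `"${table.replace(/"/g, '""')}"`;
  // current_setting(name, true) yields NULL instead of raising when the GUC is unset.
  // A comparison against NULL is never true, so an unset GUC matches no rows.
  const predicate =
    `"tenantId" = current_setting('app.tenant_id', true) ` +
    `OR COALESCE(current_setting('app.platform_admin', true), 'off') = 'on'`;
  return [
    `ALTER TABLE ${t} ENABLE ROW LEVEL SECURITY`,
    `ALTER TABLE ${t} FORCE ROW LEVEL SECURITY`,
    // USING filters reads; WITH CHECK rejects writes that would land in another tenant.
    `CREATE POLICY tenant_isolation ON ${t} USING (${predicate}) WITH CHECK (${predicate})`,
  ];
}
```

Calling `tenantRlsSql("Partner")` returns three statements that a migration could execute in order.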
20260507020000_app_role
- Provisions a non-superuser openpartner_app role from
OPENPARTNER_APP_DB_PASSWORD. Postgres bypasses RLS for superusers
and BYPASSRLS roles regardless of FORCE, so RLS only protects when
the app connects as a constrained role.
- Grants DML (no DDL) on every tenanted table.
- Idempotent: rotates password if the role already exists.
- Skipped (with notice) when OPENPARTNER_APP_DB_PASSWORD is unset —
self-host installs that don't need RLS isolation can run the app as
the same role as migrations.
Migration runner sets `row_security = off` at session start so DDL
runs unrestricted.
Verified: connecting as openpartner_app, queries return 0 rows when
app.tenant_id is unset or mismatched, and only the in-scope tenant's
rows when set correctly. Platform-admin override works.
Types: every Row interface gained `tenantId: string`; new TenantRow,
PlatformAdminRow types and DEFAULT_TENANT_ID constant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two knex instances now:
db (admin pool, DATABASE_URL)
- migrations, signup, platform-admin tooling, jobs that need
cross-tenant access
- bypasses RLS (superuser/owner role)
appDb (app pool, DATABASE_URL_APP if set, else DATABASE_URL)
- normal request handling. When pointed at the openpartner_app role
every query is subject to RLS.
- per-request transaction in tenancy middleware sets
`app.tenant_id` (and optionally `app.platform_admin = 'on'`)
so RLS policies match correctly.
OPENPARTNER_TENANCY env (defaults 'single'):
single — every request runs as tenantId = DEFAULT_TENANT_ID. Self-host.
multi — path-based tenant resolution (/t/<slug>/...). Reserved
slugs (www, api, app, signup, etc.) reject.
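
The two modes above could be resolved roughly as follows. This is a sketch under stated assumptions: `resolveTenantPath` and `reservedSlugs` are hypothetical names, and the reserved set is only the subset the commit message lists before "etc.".

```typescript
type Tenancy = "single" | "multi";

// Illustrative subset: the commit names www, api, app, signup "etc.", not the full list.
const reservedSlugs = new Set(["www", "api", "app", "signup"]);

interface Resolution {
  slug: string | null; // null in single mode, where the default tenant applies
  restPath: string; // request path with the /t/<slug> prefix removed
}

// Hypothetical helper: returns null when the path cannot be mapped to a tenant.
function resolveTenantPath(mode: Tenancy, path: string): Resolution | null {
  if (mode === "single") return { slug: null, restPath: path };
  const match = /^\/t\/([^/]+)(\/.*)?$/.exec(path);
  if (!match) return null;
  const slug = match[1];
  if (reservedSlugs.has(slug)) return null;
  return { slug, restPath: match[2] ?? "/" };
}
```

In multi mode, `resolveTenantPath("multi", "/t/acme/partners")` yields slug `acme` with the remaining path `/partners`, while a reserved slug or a path without the `/t/` prefix yields `null`.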
tenantMiddleware:
- resolves tenantId for the request
- opens a transaction on appDb
- stamps req.db, req.tenantId, req.tenantSlug
- awaits response finish before committing/rolling back so handler
queries land in the right transaction context.
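
The per-request transaction described above can be sketched as below. `Trx`, `Pool`, and `withTenant` are stand-ins invented for illustration (the real code uses knex and Express middleware); the only Postgres feature relied on is `set_config(name, value, is_local)`, which with `is_local = true` behaves like `SET LOCAL` but accepts a bind parameter.

```typescript
// Minimal stand-ins for the parts of a knex transaction this sketch touches.
interface Trx {
  raw(sql: string, bindings: readonly unknown[]): Promise<unknown>;
  commit(): Promise<void>;
  rollback(): Promise<void>;
}
interface Pool {
  transaction(): Promise<Trx>;
}

// Hypothetical helper: runs `work` inside a transaction with the tenant GUCs pinned.
async function withTenant<T>(
  pool: Pool,
  tenantId: string,
  platformAdmin: boolean,
  work: (trx: Trx) => Promise<T>,
): Promise<T> {
  const trx = await pool.transaction();
  try {
    // is_local = true scopes the setting to this transaction only.
    await trx.raw("select set_config('app.tenant_id', ?, true)", [tenantId]);
    if (platformAdmin) {
      await trx.raw("select set_config('app.platform_admin', 'on', true)", []);
    }
    const result = await work(trx);
    await trx.commit();
    return result;
  } catch (err) {
    await trx.rollback();
    throw err;
  }
}
```

Because the settings are transaction-local, a pooled connection returns to the pool without carrying one tenant's id into the next request.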
Routes will switch from `db('Partner')...` to `req.db('Partner')...`
and add `tenantId: req.tenantId` to inserts. That refactor is the next
commit; this one just lays the wiring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture decisions, what's committed, file-by-file refactor plan,
test fixup plan, and how to resume. Read this first before continuing
the multi-tenant work on this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section A + B + C + E of the multi-tenant refactor: every route handler
now uses tenantOf(req) for a per-request transaction with app.tenant_id
pinned. Helpers (auth-sessions, auth.resolvePrincipal, config,
mail-settings, mailer, attribution, payouts, usage-billing,
webhook-dispatcher) take Knex + tenantId as parameters.
tenantMiddleware is mounted in app.ts; install + metrics stay public
above it. Stripe webhook resolves tenantId from event metadata and runs
each event in appDb.transaction with SET LOCAL app.tenant_id. Scheduler
iterates active tenants per tick. Typecheck passes.
What this leaves: section D (public /signup), F (test seed updates so
35 of 64 currently-failing tests go green), G (multi-tenant isolation
tests), H (env config + ops). Documented in
docs/multi-tenant-refactor.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section D + F of the multi-tenant refactor.
D — POST /signup creates a Tenant + first Admin and emails an activation
magic link. Public, IP rate-limited (10/min), gated by slug validation
(/^[a-z0-9-]{3,30}$/, not in RESERVED_SLUGS, not already taken). Mounted
before tenantMiddleware in app.ts and uses the privileged db. Multi-mode
only — single-mode operators use /install.
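
The slug gate in section D can be sketched as a pure function. The regex is the one quoted above; `checkSlug`, the `isTaken` callback, and the contents of `RESERVED_SLUGS` shown here are assumptions for illustration.

```typescript
const SLUG_PATTERN = /^[a-z0-9-]{3,30}$/; // pattern quoted in the commit message

// Illustrative contents only; the real RESERVED_SLUGS list is not shown in the PR.
const RESERVED_SLUGS = new Set(["www", "api", "app", "signup"]);

type SlugCheck =
  | { ok: true }
  | { ok: false; reason: "invalid_format" | "reserved" | "taken" };

// Hypothetical helper: `isTaken` stands in for a lookup against the Tenant table.
function checkSlug(slug: string, isTaken: (slug: string) => boolean): SlugCheck {
  if (!SLUG_PATTERN.test(slug)) return { ok: false, reason: "invalid_format" };
  if (RESERVED_SLUGS.has(slug)) return { ok: false, reason: "reserved" };
  if (isTaken(slug)) return { ok: false, reason: "taken" };
  return { ok: true };
}
```

Checking format before the reserved list and the database lookup keeps malformed input from ever reaching a query.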
F — every direct db().insert() in integration.test.ts, regressions.test.ts,
stripe-webhook.test.ts, and webhooks.test.ts now stamps tenantId:
DEFAULT_TENANT_ID. Test setups force OPENPARTNER_TENANCY=single. Cannot
verify against a live Postgres in this session; flagged as DONE BUT NOT
VALIDATED in docs/multi-tenant-refactor.md so the next pass runs the
suite first.
Handoff doc updated with current branch state and remaining work
(sections G, H + test validation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section H of the multi-tenant refactor.
- .env.example: OPENPARTNER_TENANCY, OPENPARTNER_APP_DB_PASSWORD,
DATABASE_URL_APP with explanatory comments.
- docker-compose.yml: mount docker/initdb so postgres provisions the
openpartner_app role on first boot. Role is NOLOGIN if no password set
so RLS isolation can still be exercised via SET ROLE in tests.
- .do/app.yaml: add OPENPARTNER_TENANCY=multi (default for hosted),
DATABASE_URL_APP + OPENPARTNER_APP_DB_PASSWORD secrets on the api
component.
- docs/deploy-production.md: rows for the new secrets in the env table;
new "Multi-tenant rollout" subsection covering URL routing, signup, RLS
engagement, Stripe webhook tenant resolution, reserved slugs, and the
migration path from single-tenant.
The route, helper, signup, and stripe-webhook refactors plus this ops
layer make the multi-tenant branch deployable. What's left in
docs/multi-tenant-refactor.md is section G (live-Postgres isolation
tests), which needs a real DB to write meaningfully.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dd isolation tests
Section G of the multi-tenant refactor + two real bugs the existing
suite surfaced once it ran against a real Postgres.
Bug 1: privileged db was subject to FORCE RLS. The migration role
owns the tenanted tables but FORCE RLS still gates the owner unless
row_security is explicitly off or app.tenant_id is set. Without
either, /metrics, /signup, the stripe-webhook tenant resolver, the
scheduler, and every test's direct cleanup query silently saw zero
rows. Fixed by adding bypassRls: true to createDb (sets row_security
= off in afterCreate) and turning it on for the privileged pool. The
appDb (tenant pool) keeps RLS engaged.
Bug 2: tenantMiddleware committed the per-request transaction on
res.on('finish'), which fires AFTER the response is sent. Tests doing
`await request(app).post(...)` then `await db(...).insert(...)` raced
the commit and got FK violations because the route's writes weren't
yet visible. Fixed by patching res.json/send/end so the trx commits
(or rolls back on 5xx) before any byte goes out. Belt-and-suspenders
res.on('close') still rolls back if the patched methods are bypassed.
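
The Bug 2 fix can be illustrated with a reduced model of the response object. `Res`, `Trx`, and `settleBeforeEnd` are hypothetical stand-ins, and only `end` is wrapped here; per the commit, the real middleware applies the same idea to `res.json` and `res.send` and keeps a `close` handler as a fallback.

```typescript
// Reduced stand-ins; the real objects are a knex transaction and an Express response.
interface Trx {
  commit(): Promise<void>;
  rollback(): Promise<void>;
}
interface Res {
  statusCode: number;
  end(body?: string): void;
}

// Hypothetical helper: settles the transaction before the original end() writes anything.
function settleBeforeEnd(res: Res, trx: Trx): void {
  const originalEnd = res.end.bind(res);
  let started = false;
  res.end = (body?: string): void => {
    if (started) return; // settle and send at most once
    started = true;
    // 5xx responses roll back; everything else commits.
    const settled = res.statusCode >= 500 ? trx.rollback() : trx.commit();
    settled.then(
      () => originalEnd(body),
      () => {
        // The commit failed, so the handler's success body would be misleading.
        res.statusCode = 500;
        originalEnd();
      },
    );
  };
}
```

With this ordering, a client that receives the response can immediately issue a follow-up query and see the committed rows, which is the property the racing tests depended on.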
Section G: apps/api/src/__tests__/multi-tenant.test.ts — 9 tests
that connect as openpartner_app via SET ROLE inside a privileged-pool
transaction (so RLS engages because openpartner_app has neither
BYPASSRLS nor superuser). Covers default deny, per-tenant visibility,
WITH CHECK rejection on cross-tenant inserts, platform_admin override,
session isolation, and the Tenant table self-policy. Suite skips
cleanly with a warning if the openpartner_app role isn't provisioned.
Stripe webhook tenant resolution: customer/invoice/charge events that
don't carry our metadata now fall back to a local Identity → Click
lookup so checkout-stitched customers still route to the right tenant
on subsequent invoice.paid / charge.refunded.
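
The metadata-first, lookup-second resolution described above might look like the sketch below. The `tenantId` metadata key, the `WebhookEvent` shape, and the `CustomerLookup` callback are assumptions; the commit describes the fallback as an Identity → Click lookup but does not show the query.

```typescript
// Narrow stand-in for the fields of a Stripe event this sketch reads.
interface WebhookEvent {
  type: string;
  data: { object: { metadata?: Record<string, string>; customer?: string | null } };
}

// Hypothetical lookup: maps a Stripe customer id to a tenant using local rows.
type CustomerLookup = (customerId: string) => Promise<string | null>;

async function resolveTenantId(
  event: WebhookEvent,
  lookup: CustomerLookup,
): Promise<string | null> {
  const obj = event.data.object;
  // Prefer metadata stamped at checkout ("tenantId" as the key is an assumption).
  const fromMetadata = obj.metadata?.tenantId;
  if (fromMetadata) return fromMetadata;
  // Later invoice/charge events may carry only the customer id.
  if (obj.customer) return lookup(obj.customer);
  return null;
}
```

A `null` result leaves the caller to decide how to handle an event that cannot be attributed to any tenant.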
Result: 73/73 tests pass against the docker-compose postgres.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three pieces, designed so the same code paths cover hosted multi-tenant
and self-host:
1. Public POST /partner-signup (apps/api/src/routes/partner-signup.ts).
Tenant-scoped, IP rate-limited, creates a Partner row + magic link.
Honors a per-tenant partner_signup config (auto_approve vs
require_review, with disabled override). On hosted multi-tenant the URL
is /t/<slug>/partner-signup; on self-host it's /partner-signup.
2. Vendor-side Network client (apps/api/src/network-client.ts) +
NetworkOutbox migration. Fire-and-forget POSTs to /partners/upsert on
creator events (signup, admin invite, revoke); failures persist to the
outbox and the scheduler drains them every 5 min with exponential
backoff (~24h max). vendorToken stored AES-GCM encrypted in Config
(network_membership), never returned by GET /config/network.
backfillPartners(...) reconciles a vendor's existing roster when they
enable Network membership later; the Network dedups on email and
returns alreadyExisted=true for creators who joined another vendor
first.
3. Network protocol spec (docs/network-protocol.md). Defines the
/vendors/register, /partners/upsert, /vendors/backfill-partners, and
/vendors/me/heartbeat surface that openpartner-network implements.
Spells out the identity model (vendorId, vendorPartnerId,
networkCreatorId), auth rotation, and the late-join reconciliation
behavior.
Wired into existing flows:
- POST /partners (admin invite) + /partners/:id/revoke push to Network
when membership is enabled. autoEnroll gates new-partner upserts;
revokes mirror unconditionally so a Network-known creator stops being
matched after the vendor cuts them off.
- Settings router exposes GET/POST /config/network, POST
/config/network/backfill, and GET/POST /config/partner-signup.
- Scheduler runs network-outbox-drain every 5 min per active tenant.
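
The outbox retry schedule mentioned in item 2 could be computed as below. This is a sketch only: the commit gives the 5-minute drain interval and "~24h max", and this example assumes the base delay equals the drain interval and reads the 24h figure as a cap on the delay (it could equally be a total retry window). `nextAttemptAt` and `isDue` are hypothetical names.

```typescript
const DRAIN_INTERVAL_MS = 5 * 60 * 1000; // scheduler tick from the commit message
const MAX_BACKOFF_MS = 24 * 60 * 60 * 1000; // assumption: "~24h max" caps the delay

// Hypothetical helper: timestamp (ms) before which a failed row should not be retried.
function nextAttemptAt(attempts: number, now: number): number {
  // Delay doubles per failed attempt: 5 min, 10 min, 20 min, ... up to the cap.
  const delay = Math.min(DRAIN_INTERVAL_MS * 2 ** attempts, MAX_BACKOFF_MS);
  return now + delay;
}

// The drain job would only pick up rows whose retry time has passed.
function isDue(row: { nextAttemptAt: number }, now: number): boolean {
  return row.nextAttemptAt <= now;
}
```

Under these assumptions the delay reaches the 24h cap after nine failed attempts, since 5 min × 2⁹ is already above 24 hours.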
Tests (apps/api/src/__tests__/network-and-signup.test.ts, 9 cases) spin
up an in-process HTTP receiver to act as the Network and verify:
- signup without Network is silent;
- with Network on stamps networkCreatorId on Partner.metadata.network;
- with Network down enqueues outbox;
- drain retries succeed;
- require_review still pushes status=pending;
- admin invite + revoke push;
- late-join backfill flips preExisting=true for emails the Network
already knew;
- GET /config/network never leaks the vendor token.
82/82 tests pass against the docker-compose postgres.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the openpartner vendor side to the openpartner-network
self-serve onboarding flow.
network-client.ts: signupWithNetwork() POSTs to /vendors/signup;
completeNetworkConnect() POSTs to /vendors/verify-and-issue-token.
Failures surface immediately to the admin (no outbox queueing — a
failed signup is something the admin retries by hand).
routes/settings.ts: POST /config/network/start-connect mints a fresh
scoped key with NETWORK_FEDERATION_SCOPES, calls signupWithNetwork
with inferred instanceUrl + portalCallbackUrl, stashes partial state
in network_membership Config (enabled=false until verify lands).
POST /config/network/complete-connect consumes the magic-link ntoken,
calls Network /vendors/verify-and-issue-token, saves the returned
vendorToken with enabled=true. Same shape works for hosted multi-
tenant tenants (slug-aware URL inference) and self-host (request host).
routes/signup.ts: hosted multi-tenant signup auto-calls
signupWithNetwork after Tenant/Admin creation when NETWORK_URL env
is set. Best-effort: a Network outage doesn't fail the signup; the
admin can finish later via Settings → Network → Connect button.
Returns network: { status, vendorId } in the signup response so the
portal can show the right next-step UI.
.env.example: NETWORK_URL added with explanatory comment.
82/82 vendor-side tests still pass (no regressions; the new endpoints
are additive).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap where vendors had backend wiring for Network membership
but no UI to use it. Without this, Network was invisible to vendor
admins on hosted multi-tenant + self-host.
Backend (apps/api):
- network-client.ts: NetworkProxyError + networkProxy.{listOfferings,
createOffering, updateOffering, deleteOffering, listRequests,
approveRequest, rejectRequest, whoami}. Decrypts the vendor token
from network_membership Config and proxies to Network endpoints
with the right bearer.
- routes/settings.ts: /admin/network/{me,offerings,offerings/:id,
requests,requests/:id/approve,requests/:id/reject}. Each is a thin
wrapper around networkProxy.* that turns NetworkProxyError into
the appropriate HTTP status. Required because the vendorToken is
a server-side secret — the portal can't hold it.
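
Since the vendor token is stored AES-GCM encrypted and decrypted only on the server, a round trip could look like the sketch below. The 256-bit key size, the 12-byte IV, the `iv.tag.ciphertext` storage format, and the function names are all assumptions; the PR states only that AES-GCM is used.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage format: base64(iv).base64(authTag).base64(ciphertext)
function encryptToken(plain: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh 96-bit IV per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((b) => b.toString("base64")).join(".");
}

function decryptToken(stored: string, key: Buffer): string {
  const [iv, tag, ciphertext] = stored.split(".").map((p) => Buffer.from(p, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  // final() throws if the ciphertext or tag was altered.
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

A proxy route would call something like `decryptToken` just before building the bearer header, so the plaintext token never reaches the portal.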
Portal (apps/portal):
- pages/admin/Network.tsx: connection status, contact-email/display-
name form for the Connect button, autoEnroll toggle, backfill
panel for late-join reconciliation.
- pages/admin/NetworkComplete.tsx: handles ?ntoken= callback from
the Network onboarding email; calls /config/network/complete-connect,
redirects to /admin/network on success. StrictMode-safe (one-shot
guard).
- pages/admin/NetworkOfferings.tsx: list + create + publish/unpublish
+ delete. Campaign dropdown pulls from /campaigns. Form fields:
title, description, productUrl, campaign, commission summary,
cookie window.
- pages/admin/NetworkRequests.tsx: pending requests list with creator
bio + pitch; approve dispatches federation (creates Partner +
Link on this instance); reject + status filter (pending /
approved / rejected / cancelled).
Wired into App.tsx routes + a new "Network" sidebar section
(Connection, Offerings, Requests).
Typecheck passes; portal builds (318 KB JS, 92 KB gzip).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The full multi-tenant refactor + the integration with the new
openpartner-network repo.
Multi-tenant (commits 1–6):
- `feat(db): multi-tenant foundation + RLS`: Tenant + tenantId on every data table; FORCE RLS policies + openpartner_app role.
- `feat(api): tenancy middleware + connection split`: privileged `db` (admin pool) + `appDb` (app pool, RLS-engaged).
- `feat(tenancy): add tenantOf(req) helper`: route ergonomics.
- `feat(api): route + helper refactor`: every handler uses `tenantOf(req)`; helpers (auth-sessions, auth.resolvePrincipal, attribution, payouts, usage-billing, webhook-dispatcher, mailer, mail-settings, config) take `(db, tenantId, ...)`.
- `feat(api): public /signup + test seed tenantId fixes`.
- `chore(ops): multi-tenant env + docker + DO + docs`.
- `fix(tenancy): bypass RLS on privileged db, commit trx pre-response, add isolation tests`: fixed two real bugs (privileged-pool RLS gating, trx-commit-after-response race) and added 9 isolation tests.
Network integration (commits 7–8):
- `feat(network): creator self-signup + vendor↔Network protocol`: new POST /partner-signup; network-client.ts (NetworkOutbox, push/upsert/revoke, backfill, drainOutbox); /config/network settings + backfill endpoint; partners.ts wired to push on admin-create + revoke; NETWORK_FEDERATION constants; openpartner-network repo's protocol layer + 9 round-trip tests.
- `feat(network): vendor-side onboarding integration`: signupWithNetwork + completeNetworkConnect helpers; POST /config/network/start-connect + /complete-connect; hosted /signup auto-registers when NETWORK_URL is set.
- `feat(portal): vendor admin Network UI`: Connection, Offerings, Requests pages + complete-connect callback. NetworkProxy backend routes that decrypt the vendorToken and proxy to Network.
Test plan