Skip to content

Feature: PostgreSQL LIST partitioning support for multi-tenant data isolation #232

@intel352

Description

@intel352

Problem

In multi-tenant applications, PostgreSQL LIST partitioning by tenant/affiliate ID is a common pattern for data isolation:

CREATE TABLE forms (
    id UUID PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    slug TEXT NOT NULL,
    ...
) PARTITION BY LIST (tenant_id);

CREATE TABLE forms_org_alpha PARTITION OF forms FOR VALUES IN ('org-alpha');
CREATE TABLE forms_org_beta  PARTITION OF forms FOR VALUES IN ('org-beta');

This provides strong data isolation guarantees (tenant A's queries physically cannot touch tenant B's partition) and enables independent per-tenant maintenance operations.

Currently, there is no engine-level support for this pattern. Applications must handle partition management themselves, and pipeline steps like step.db_query / step.db_exec require manual WHERE tenant_id = $X clauses in every query.

Proposed Solution

1. Partition-aware module type: database.partitioned

A module that automatically manages LIST partitions when new tenants are onboarded:

- name: db
  type: database.partitioned
  config:
    driver: pgx
    dsn: "${DB_DSN}"
    partition_key: tenant_id
    tables:
      - forms
      - form_submissions
      - follow_ups

When a new tenant is registered, the module creates partitions for all configured tables.

2. Automatic tenant scoping in step.db_query / step.db_exec

An optional tenant_key config that automatically appends AND tenant_id = $N to queries:

- name: fetch
  type: step.db_query
  config:
    database: db
    tenant_key: auth.tenant_id   # from JWT claim or pipeline context
    query: "SELECT * FROM forms WHERE active = true ORDER BY created_at DESC"

This would reduce boilerplate and eliminate the risk of forgetting the tenant filter on a query.

Use Case

  1. Multi-tenant SaaS where each organization's data must be physically isolated
  2. New tenants are onboarded dynamically — partitions must be created at runtime
  3. Every data query must be scoped to the requesting tenant's partition
  4. Currently requires custom step types for partition management and manual WHERE clauses in every pipeline query

Benefits

  • Security: Automatic tenant scoping eliminates accidental cross-tenant data leaks from missing WHERE clauses
  • Performance: PostgreSQL partition pruning ensures queries only scan the relevant tenant's data
  • Maintenance: Per-partition operations (VACUUM, index rebuilds) don't lock other tenants
  • Simplicity: Application developers don't need to remember tenant filtering on every query

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions