Skip to content

Comments

Stateless DDL replication#691

Merged
eminano merged 43 commits intov1-0-0-rcfrom
stateless-ddl-replication
Feb 3, 2026
Merged

Stateless DDL replication#691
eminano merged 43 commits intov1-0-0-rcfrom
stateless-ddl-replication

Conversation

@eminano
Copy link
Contributor

@eminano eminano commented Jan 26, 2026

Description

This branch implements a major architectural shift from the existing schemalog table based DDL replication to a stateless approach using DDL events emitted directly into the WAL.

DDL tracking

Instead of maintaining a separate schema log store/table to track DDL changes, the system now:

  • Captures DDL events as logical messages in the PostgreSQL WAL stream
  • Computes table metadata and attaches it to the logical message alongside the DDL query
  • Computes schema diffs directly from the DDL event during the processing pipeline

This means schema changes are sent through the same replication stream as data changes, preserving the order of data events, making the entire system stateless since there's no external schema log to maintain or query.

Injector

The injector relied on the pgstream.table_ids table to get the pgstream ids that would then be used by processors such as the search indexer. Before, it offered the option to select a version column for the event while using the LSN by default. The refactor was used to simplify this part, and use the LSN as version for all events, removing the added complexity of configuring the version.

The identity columns are still selected in the same way (primary keys or unique not null columns), and are still required for replication.

Migrations

The migrations have been simplified and are now split by functionality:

  • Core: this contains the basic migrations to make pgstream work (functions and event triggers for DDL replication)
  • Injector: this contains the migrations to inject pgstream metadata into events (including table_ids table), required for search targets to work.

This way, only the relevant migrations are applied for each specific usecase, reducing the footprint on the source database.

The initialization flow has been updated accordingly to be able to manage the new migrations structure, as well as the status checker.

Snapshots

The schema snapshot generator was simplified by replacing the dedicated schema log generator with a restoreToWAL function that parses DDL statements from pg_dump output and converts them directly into WAL DDL events. These snapshot DDL events now flow through the same processor pipeline as runtime DDL events, eliminating the need for the schema log table and creating a single unified processing path for all schema changes.

Search processor

With the new schemalog-less approach, the search processor had to be refactored. Before, it maintained a pgstream index with schemalog entries that was used to map the internal pgstream IDs used as field names with the table and column names. Now, there's no schemalog tracking on the search store, and instead aliases are created in the index mapping to map names to pgstream IDs.

  • When a column is renamed, a new alias is added while preserving the original storage
  • When a table is renamed, all column aliases are transferred to the new table name
    This makes the search integration much more user-friendly while maintaining the stability benefits of using immutable internal IDs for storage.

Commits have been split for ease of reviewing. Each commit can be reviewed independently to simplify the review process.

Related Issue(s)

Type of Change

  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔧 Refactoring (no functional changes)
  • 🧹 Code cleanup

Changes Made

  • Removed schema log dependency from all components.
  • Added support for logical messages in WAL data
  • Added WAL DDL event structure
  • Added WAL schema diff computation
  • Updated all processors to use DDL events instead of schema log
  • Added new migrator internal library
  • Added new injector migrations
  • Added new core migrations
  • Added CLI flags for injector migrations
  • Updated stream initialization to apply multiple migrations
  • Updated status checker to support multiple migrations
  • Fixed OpenSearch compatibility with latest version
  • Removed internal column version from WAL data
  • Updated integration tests for new architecture

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • All existing tests pass

Additional Notes

This PR will be part of a new major pgstream version, v1.0.0, since it contains structural changes that break backwards compatibility (removal of the schema_log and all previous pgstream state in the source database).

Functionally the overall behaviour should remain the same, but the underlying behaviour is inherently different, so by having a new major version we ensure enough care is put into the migration.

@eminano eminano requested a review from Copilot January 26, 2026 16:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a major architectural shift from schema log table-based DDL replication to a stateless approach using DDL events emitted directly into the WAL stream. Instead of maintaining a separate pgstream.schema_log table, the system now captures DDL events as logical messages in the PostgreSQL WAL, computes table metadata on-the-fly, and processes schema diffs directly from DDL events.

Changes:

  • Removed schema log dependency from all components and replaced with WAL-based DDL event processing
  • Split migrations into core (basic DDL replication) and injector (metadata injection with table_ids) for reduced database footprint
  • Updated initialization flow, status checker, and CLI to support the new multi-migration structure
  • Fixed OpenSearch compatibility issues and removed internal column version from WAL data

Reviewed changes

Copilot reviewed 132 out of 149 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pkg/wal/processor/search/store/search_adapter.go Updated to use versioned index names and removed schema log conversion method
pkg/wal/processor/search/store.go Changed interface from schema log entries to WAL schema diffs
pkg/wal/processor/search/search_store_retrier.go Updated method signature to use schema diffs instead of log entries
pkg/wal/processor/search/search_msg_batch.go Replaced schema change field with schema diff field
pkg/wal/processor/search/search_batch_indexer.go Updated to process schema diffs and removed schema log specific logic
pkg/wal/processor/search/errors.go Removed version-related error constant
pkg/wal/processor/postgres/postgres_writer.go Removed schema log store dependency from writer initialization
pkg/wal/processor/postgres/postgres_wal_adapter.go Updated to use DDL events instead of schema log events
pkg/wal/processor/postgres/postgres_schema_observer.go Refactored to process DDL events instead of schema log entries
pkg/wal/processor/postgres/postgres_bulk_ingest_writer.go Added flag to ignore DDL events
pkg/wal/processor/postgres/postgres_batch_writer.go Removed schema log store dependency and added feature not supported error handling
pkg/wal/processor/postgres/config.go Removed schema log store configuration
pkg/wal/processor/kafka/wal_kafka_batch_writer.go Updated message key extraction to use DDL events
pkg/wal/processor/filter/wal_filter.go Updated filtering logic to work with DDL events
pkg/stream/stream_status.go Changed migration status from singular to array to support multiple migrations
pkg/stream/stream_init.go Refactored to use new migrator library with support for multiple migration sets
pkg/stream/config.go Added helper methods for init configuration with injector migration flag
internal/searchstore/search_api.go Added JSON tags to mapping structs
internal/searchstore/opensearch/opensearch_client.go Fixed index mapping retrieval to work with aliases
internal/postgres/errors.go Added feature not supported error type
internal/migrator/migrator.go New internal library for managing multiple migration sets
migrations/postgres/core/* New core migrations for basic DDL replication functionality
migrations/postgres/injector/* New injector migrations for metadata injection functionality
docs/* Updated documentation to reflect stateless DDL replication approach
cmd/* Updated CLI commands to support injector migration flags

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@eminano eminano force-pushed the stateless-ddl-replication branch from 6ea2cd0 to 4c178f2 Compare January 27, 2026 13:28
@eminano eminano requested a review from exekias January 27, 2026 13:32
@eminano eminano force-pushed the stateless-ddl-replication branch from 4c178f2 to fcd936b Compare January 28, 2026 14:07
@github-actions
Copy link

Merging this branch changes the coverage (9 decrease, 8 increase)

Impacted Packages Coverage Δ 🤖
github.com/xataio/pgstream/cmd 0.00% (ø)
github.com/xataio/pgstream/cmd/config 84.22% (+0.19%) 👍
github.com/xataio/pgstream/internal/migrator 0.00% (ø)
github.com/xataio/pgstream/internal/postgres 30.11% (-0.14%) 👎
github.com/xataio/pgstream/internal/postgres/retrier 91.67% (+0.12%) 👍
github.com/xataio/pgstream/internal/searchstore 0.00% (ø)
github.com/xataio/pgstream/internal/searchstore/opensearch 0.00% (ø)
github.com/xataio/pgstream/migrations/postgres/core 0.00% (ø)
github.com/xataio/pgstream/migrations/postgres/injector 0.00% (ø)
github.com/xataio/pgstream/pkg/schemalog 0.00% (-54.14%) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/schemalog/instrumentation 0.00% (ø)
github.com/xataio/pgstream/pkg/schemalog/mocks 0.00% (ø)
github.com/xataio/pgstream/pkg/schemalog/postgres 0.00% (-86.67%) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore 89.09% (+0.99%) 👍
github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/schemalog 0.00% (-94.12%) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/stream 35.84% (-0.12%) 👎
github.com/xataio/pgstream/pkg/stream/integration 0.00% (ø)
github.com/xataio/pgstream/pkg/wal 71.31% (+71.31%) 🌟
github.com/xataio/pgstream/pkg/wal/listener/snapshot/builder 0.00% (ø)
github.com/xataio/pgstream/pkg/wal/processor 0.00% (-84.62%) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/wal/processor/filter 94.00% (+8.49%) 👍
github.com/xataio/pgstream/pkg/wal/processor/injector 76.24% (+2.77%) 👍
github.com/xataio/pgstream/pkg/wal/processor/kafka 54.24% (-7.88%) 👎
github.com/xataio/pgstream/pkg/wal/processor/postgres 78.29% (-5.21%) 👎
github.com/xataio/pgstream/pkg/wal/processor/search 82.56% (-2.11%) 👎
github.com/xataio/pgstream/pkg/wal/processor/search/instrumentation 0.00% (ø)
github.com/xataio/pgstream/pkg/wal/processor/search/mocks 0.00% (ø)
github.com/xataio/pgstream/pkg/wal/processor/search/store 71.52% (+1.22%) 👍
github.com/xataio/pgstream/pkg/wal/processor/transformer 82.42% (ø)
github.com/xataio/pgstream/pkg/wal/replication/postgres 69.34% (+1.00%) 👍

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/xataio/pgstream/cmd/config/config_env.go 90.50% (+0.23%) 221 (-5) 200 (-4) 21 (-1) 👍
github.com/xataio/pgstream/cmd/config/config_yaml.go 89.42% (+0.41%) 189 (-2) 169 (-1) 20 (-1) 👍
github.com/xataio/pgstream/cmd/init_cmd.go 0.00% (ø) 36 (+3) 0 36 (+3)
github.com/xataio/pgstream/cmd/root_cmd.go 0.00% (ø) 87 (+4) 0 87 (+4)
github.com/xataio/pgstream/cmd/run_cmd.go 0.00% (ø) 82 (+4) 0 82 (+4)
github.com/xataio/pgstream/cmd/status_cmd.go 0.00% (ø) 34 0 34
github.com/xataio/pgstream/internal/migrator/migrator.go 0.00% (ø) 45 (+45) 0 45 (+45)
github.com/xataio/pgstream/internal/postgres/errors.go 61.76% (-3.86%) 34 (+2) 21 13 (+2) 👎
github.com/xataio/pgstream/internal/postgres/retrier/pg_querier_retrier.go 91.67% (+0.12%) 72 (+1) 66 (+1) 6 👍
github.com/xataio/pgstream/internal/searchstore/opensearch/opensearch_client.go 0.00% (ø) 209 (+1) 0 209 (+1)
github.com/xataio/pgstream/internal/searchstore/search_api.go 0.00% (ø) 12 0 12
github.com/xataio/pgstream/migrations/postgres/core/migrations.go 0.00% (ø) 107 (+107) 0 107 (+107)
github.com/xataio/pgstream/migrations/postgres/injector/migrations.go 0.00% (ø) 121 (+121) 0 121 (+121)
github.com/xataio/pgstream/pkg/schemalog/instrumentation/instrumented_store.go 0.00% (ø) 0 (-16) 0 0 (-16)
github.com/xataio/pgstream/pkg/schemalog/log_entry.go 0.00% (ø) 0 (-53) 0 0 (-53)
github.com/xataio/pgstream/pkg/schemalog/mocks/store_mock.go 0.00% (ø) 0 (-14) 0 0 (-14)
github.com/xataio/pgstream/pkg/schemalog/postgres/pg_schemalog_store.go 0.00% (-86.67%) 0 (-45) 0 (-39) 0 (-6) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/schemalog/schema.go 0.00% (-21.48%) 0 (-149) 0 (-32) 0 (-117) 💀 💀
github.com/xataio/pgstream/pkg/schemalog/schema_diff.go 0.00% (-88.94%) 0 (-199) 0 (-177) 0 (-22) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/schemalog/store.go 0.00% (ø) 0 0 0
github.com/xataio/pgstream/pkg/schemalog/store_cache.go 0.00% (-90.91%) 0 (-22) 0 (-20) 0 (-2) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/pg_options_generator.go 100.00% (ø) 75 (-8) 75 (-8) 0
github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/pg_snapshot_wal_restore.go 95.33% (+95.33%) 107 (+107) 102 (+102) 5 (+5) 🌟
github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/snapshot_pg_dump_restore_generator.go 83.11% (-0.70%) 302 (-44) 251 (-39) 51 (-5) 👎
github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/schemalog/snapshot_schemalog_generator.go 0.00% (-94.12%) 0 (-34) 0 (-32) 0 (-2) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/stream/config.go 71.05% (+1.61%) 38 (+2) 27 (+2) 11 👍
github.com/xataio/pgstream/pkg/stream/stream.go 0.00% (ø) 126 0 126
github.com/xataio/pgstream/pkg/stream/stream_init.go 6.67% (+0.22%) 90 (-3) 6 84 (-3) 👍
github.com/xataio/pgstream/pkg/stream/stream_run.go 0.00% (ø) 88 0 88
github.com/xataio/pgstream/pkg/stream/stream_status.go 78.41% (+0.76%) 88 (+3) 69 (+3) 19 👍
github.com/xataio/pgstream/pkg/stream/stream_status_checker.go 83.72% (-3.88%) 129 108 (-5) 21 (+5) 👎
github.com/xataio/pgstream/pkg/wal/listener/snapshot/builder/config.go 0.00% (ø) 0 0 0
github.com/xataio/pgstream/pkg/wal/listener/snapshot/builder/wal_listener_snapshot_generator_builder.go 0.00% (ø) 42 (-9) 0 42 (-9)
github.com/xataio/pgstream/pkg/wal/processor/filter/wal_filter.go 94.00% (+8.49%) 50 (-19) 47 (-12) 3 (-7) 👍
github.com/xataio/pgstream/pkg/wal/processor/injector/wal_injector.go 76.24% (+2.77%) 101 (+3) 77 (+5) 24 (-2) 👍
github.com/xataio/pgstream/pkg/wal/processor/kafka/wal_kafka_batch_writer.go 54.24% (-7.88%) 59 (-7) 32 (-9) 27 (+2) 👎
github.com/xataio/pgstream/pkg/wal/processor/postgres/config.go 0.00% (ø) 3 0 3
github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_batch_writer.go 79.27% (+3.71%) 82 (-8) 65 (-3) 17 (-5) 👍
github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_bulk_ingest_writer.go 72.60% (-1.01%) 73 (+1) 53 20 (+1) 👎
github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_schema_observer.go 81.98% (-2.55%) 111 (+14) 91 (+9) 20 (+5) 👎
github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_wal_adapter.go 60.00% (-3.33%) 30 18 (-1) 12 (+1) 👎
github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_wal_ddl_adapter.go 100.00% (+7.12%) 10 (-285) 10 (-264) 0 (-21) 👍
github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_writer.go 37.14% (ø) 35 13 22
github.com/xataio/pgstream/pkg/wal/processor/search/errors.go 0.00% (ø) 4 0 4
github.com/xataio/pgstream/pkg/wal/processor/search/instrumentation/instrumented_search_store.go 0.00% (ø) 31 0 31
github.com/xataio/pgstream/pkg/wal/processor/search/mocks/mock_search_mapper.go 0.00% (ø) 2 0 2
github.com/xataio/pgstream/pkg/wal/processor/search/search_adapter.go 96.91% (-1.49%) 97 (-28) 94 (-29) 3 (+1) 👎
github.com/xataio/pgstream/pkg/wal/processor/search/search_batch_indexer.go 76.00% (-0.32%) 75 (-1) 57 (-1) 18 👎
github.com/xataio/pgstream/pkg/wal/processor/search/search_msg_batch.go 0.00% (ø) 2 0 2
github.com/xataio/pgstream/pkg/wal/processor/search/search_store_retrier.go 80.56% (ø) 72 58 14
github.com/xataio/pgstream/pkg/wal/processor/search/store.go 50.00% (ø) 8 4 4
github.com/xataio/pgstream/pkg/wal/processor/search/store/search_adapter.go 59.46% (+9.46%) 37 (-7) 22 15 (-7) 👍
github.com/xataio/pgstream/pkg/wal/processor/search/store/search_pg_mapper.go 67.26% (-0.29%) 113 (-1) 76 (-1) 37 👎
github.com/xataio/pgstream/pkg/wal/processor/search/store/search_store.go 77.08% (+0.25%) 144 (-20) 111 (-15) 33 (-5) 👍
github.com/xataio/pgstream/pkg/wal/processor/transformer/wal_transformer.go 71.70% (ø) 53 38 15
github.com/xataio/pgstream/pkg/wal/processor/wal_processor.go 0.00% (-84.62%) 0 (-13) 0 (-11) 0 (-2) 💀 💀 💀 💀 💀
github.com/xataio/pgstream/pkg/wal/replication/postgres/pg_replication_handler.go 68.70% (+1.03%) 131 (-2) 90 41 (-2) 👍
github.com/xataio/pgstream/pkg/wal/wal_data.go 0.00% (ø) 11 (-1) 0 11 (-1)
github.com/xataio/pgstream/pkg/wal/wal_ddl.go 68.18% (+68.18%) 44 (+44) 30 (+30) 14 (+14) 🌟
github.com/xataio/pgstream/pkg/wal/wal_schema_diff.go 85.07% (+85.07%) 67 (+67) 57 (+57) 10 (+10) 🌟

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/xataio/pgstream/cmd/config/config_env_test.go
  • github.com/xataio/pgstream/cmd/config/config_yaml_test.go
  • github.com/xataio/pgstream/cmd/config/helper_test.go
  • github.com/xataio/pgstream/pkg/schemalog/postgres/pg_schemalog_store_test.go
  • github.com/xataio/pgstream/pkg/schemalog/schema_diff_test.go
  • github.com/xataio/pgstream/pkg/schemalog/schema_test.go
  • github.com/xataio/pgstream/pkg/schemalog/store_cache_test.go
  • github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/helper_test.go
  • github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/pg_options_generator_test.go
  • github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/pg_snapshot_wal_restore_test.go
  • github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/pgdumprestore/snapshot_pg_dump_restore_generator_test.go
  • github.com/xataio/pgstream/pkg/snapshot/generator/postgres/schema/schemalog/snapshot_schemalog_generator_test.go
  • github.com/xataio/pgstream/pkg/stream/helper_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/helper_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/pg_kafka_integration_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/pg_pg_integration_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/pg_search_integration_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/pg_webhook_integration_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/setup_test.go
  • github.com/xataio/pgstream/pkg/stream/integration/snapshot_pg_integration_test.go
  • github.com/xataio/pgstream/pkg/stream/stream_status_checker_test.go
  • github.com/xataio/pgstream/pkg/stream/stream_status_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/filter/wal_filter_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/injector/helper_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/injector/wal_injector_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/kafka/wal_kafka_batch_writer_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/postgres/helper_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_batch_writer_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_schema_observer_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_wal_adapter_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_wal_ddl_adapter_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/search/helper_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/search/search_adapter_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/search/search_batch_indexer_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/search/store/helper_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/search/store/search_pg_mapper_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/search/store/search_store_test.go
  • github.com/xataio/pgstream/pkg/wal/processor/wal_processor_test.go
  • github.com/xataio/pgstream/pkg/wal/wal_ddl_test.go
  • github.com/xataio/pgstream/pkg/wal/wal_schema_diff_test.go

@eminano eminano force-pushed the stateless-ddl-replication branch from 535d77a to 62cba1b Compare February 3, 2026 10:23
@eminano eminano changed the base branch from main to v1-0-0-rc February 3, 2026 10:24
@eminano eminano merged commit 89df6e6 into v1-0-0-rc Feb 3, 2026
6 of 7 checks passed
@eminano eminano deleted the stateless-ddl-replication branch February 3, 2026 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Explore stateless DDL tracking

2 participants