Skip to content

Spoc 387: Prevent adding ignored-schema tables to replication sets#319

Merged
mason-sharp merged 5 commits intomainfrom
spoc-387
Feb 9, 2026
Merged

Spoc 387: Prevent adding ignored-schema tables to replication sets#319
mason-sharp merged 5 commits intomainfrom
spoc-387

Conversation

@danolivo
Copy link
Contributor

@danolivo danolivo commented Jan 20, 2026

  • Reject tables from ignored schemas (spock, lolor, snowflake) at replication_set_add_table()/replication_set_add_seq() time with clear error messages, instead of failing at replication runtime.
  • Refactor dump_structure() to use helpers for building --exclude-extension/--exclude-schema arguments, centralizing the global skip lists in spock_node.c.
  • Stabilize regression test output: deterministic ordering via ORDER BY ... COLLATE "C", use SELECT 1 FROM to hide implementation-dependent OIDs, and join on subscription names instead of internal IDs.
  • Add new excluded_schema regression test demonstrating the fix.

@danolivo danolivo requested a review from mason-sharp January 20, 2026 08:53
@danolivo danolivo self-assigned this Jan 20, 2026
@danolivo danolivo added the enhancement New feature or request label Jan 20, 2026
@danolivo danolivo changed the title Demo for the issue Spoc 387 + regression tests stabilisation Spoc 387 + regression tests stabilisation Jan 20, 2026
Copy link
Member

@mason-sharp mason-sharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just was not sure about a TODO comment.

- Add new excluded_schema regression test
- Reformat multi-parameter SQL function definitions for readability
  (sub_create, sub_show_status, repset_create, repset_drop)

This test demonstrates two issues:
1. table from an 'excluded' schema disables subscription.
2. Spock tests are unstable.
Previous commits exposed test instability due to non-deterministic ordering
in query results.

Changes made to stabilize output:

- Replace SELECT * FROM spock.sub_create() with SELECT 1 FROM spock.sub_create()
  to avoid returning implementation-dependent OIDs.
- Queries on spock.local_sync_status now join with spock.subscription to display
  subscription names instead of internal IDs, with proper ORDER BY on sub_name
  (using COLLATE "C") and additional columns (sync_kind, sync_nspname,
  sync_relname) where needed.
- Add ORDER BY subscription_name COLLATE "C" to spock.sub_show_status() calls.
Centralize extension and schema skipping logic into helper functions
build_exclude_extension_string() and build_exclude_schema_string().
Use dynamic list instead of fixed-size array for argument construction.

Global skip lists for schemas and extensions (spock, lolor, snowflake)
are now defined in spock_node.c and exported via spock_node.h.
Previously, tables in ignored schemas (spock, lolor, snowflake) could be
added to replication sets, which would then cause replication failures
at runtime. This change adds proactive validation that rejects such
tables at the time they are added to a replication set, providing clear
error messages with hints for resolution.

Add EnsureRelationNotIgnored() function that checks both the global
skip_schema and skip_extension lists. Call it from both
replication_set_add_table() and replication_set_add_seq().

Also fix PG17 compatibility by using CheckRelationOidLockedByMe()
instead of LockHeldByMe() with LOCKTAG, and move the test helper
function pg_current_xlog_location() to the spock schema to avoid
potential conflicts with user functions in public schema.
@coderabbitai
Copy link

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

Walkthrough

Add schema/extension skip lists and EnsureRelationNotIgnored; integrate exclusions into pg_dump argument construction; add PG-version-gated relation lock checks; reformat several SQL function declarations (and change one OUT type int→bigint); add regression tests and include new excluded_schema test target.

Changes

Cohort / File(s) Summary
Core exclusion declarations & impl
include/spock_node.h, src/spock_node.c
Declare skip_schema[], skip_extension[] and implement EnsureRelationNotIgnored(Relation); add extension/dependency includes. Review error messages and public symbols.
Replication-set & locking updates
src/spock_repset.c
Add PG-version-gated lock checks (pre/post PG17) and call EnsureRelationNotIgnored() in replication_set_add_table and replication_set_add_seq. Verify lock-tag handling and conditional compilation.
Sync/pg_dump refactor
src/spock_sync.c
Add build_exclude_schema_string() and build_exclude_extension_string(); switch to List-based pg_dump argv construction; include exclude args; adjust transaction boundaries; add commands/extension.h. Check argument quoting and transaction lifetimes.
SQL function declarations
sql/spock--6.0.0-devel.sql
Reformat several function signatures (multi-line params, LANGUAGE repositioned); change spock.get_apply_worker_status OUT worker_pid from int to bigint. Confirm SQL compatibility and binary interface.
Makefile test target
Makefile
Add excluded_schema to REGRESS test list.
New regression test
tests/regress/sql/excluded_schema.sql
Add end-to-end test exercising replication-set table moves, subscriber creation with skip options, status polling, and cleanup.
Test projection & function relocation
tests/regress/sql/init.sql, tests/regress/sql/init_fail.sql, tests/regress/sql/bidirectional.sql, tests/regress/sql/node_origin_cascade.sql
Replace SELECT * FROM spock.sub_create(...) with SELECT 1 FROM spock.sub_create(...); move helper function to spock schema where applicable.
Test status joins & ordering
tests/regress/sql/apply_delay.sql, tests/regress/sql/multiple_upstreams.sql, tests/regress/sql/sync_table.sql, tests/regress/sql/parallel.sql
Replace direct spock.local_sync_status selects with joins to spock.subscription, include sub_name, and add deterministic ORDER BY ... COLLATE "C".
Small test tweak
tests/regress/sql/functions.sql
Use spock.pg_current_xlog_location() instead of pg_current_xlog_location().

Poem

🐇 I nibble code where schemas hide,

I mark the lists where extensions bide.
I thump the locks that change with time,
And stitch pg_dump args line by line.
A hop, a guard — replication kept in stride.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: preventing tables from ignored schemas from being added to replication sets, which is the core objective of this PR.
Description check ✅ Passed The description is directly related to the changeset, outlining the four main aspects: schema validation, refactoring dump_structure, stabilizing test output, and adding a new test.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch spoc-387

No actionable comments were generated in the recent review. 🎉


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sql/spock--6.0.0-devel.sql (1)

669-675: ⚠️ Potential issue | 🔴 Critical

Add CREATE OR REPLACE FUNCTION for get_apply_worker_status to the upgrade script.

The function signature change from worker_pid int to worker_pid bigint is not reflected in the upgrade script sql/spock--5.0.4--6.0.0-devel.sql. Users upgrading from 5.0.4 to 6.0.0-devel will retain the old function signature unless this definition is included in the upgrade path. Add the updated function definition to the upgrade script to ensure the signature change is applied during extension upgrade.

🤖 Fix all issues with AI agents
In `@tests/regress/sql/excluded_schema.sql`:
- Around line 6-8: Replace the incorrect header comment "Test resynchronization"
with a short, accurate description for this test, e.g. "Test excluded schema
behavior (SPOC-387)"; locate the top comment block that currently contains "Test
resynchronization" and update it to reference excluded schema behavior and the
ticket number so the test purpose matches its contents.
- Around line 42-53: The loop polling spock.sub_show_status(subscription_name :=
'sub_387') busy-waits with no delay and only exits when status = 'down'; add a
short sleep (e.g., PERFORM pg_sleep(0.1); or 0.5) inside the LOOP to avoid 100%
CPU spinning, and change the exit logic to handle both terminal states: EXIT
WHEN cnt > 0 OR EXISTS(SELECT 1 FROM spock.sub_show_status(subscription_name :=
'sub_387') WHERE status = 'replicating'); alternatively, replace this custom
loop with the existing test helper (e.g., sub_wait_for_sync) if available and
assert explicitly if the subscription reaches 'replicating' instead of timing
out.

In `@tests/regress/sql/parallel.sql`:
- Line 10: Fix the typo in the SQL test comment: change "statment" to
"statement" in the commented line that reads "-- FIXME: The statment below is
commented out temporarily." so the comment becomes "-- FIXME: The statement
below is commented out temporarily."
🧹 Nitpick comments (6)
src/spock_node.c (1)

1266-1310: Error messages say "table" but this function is also called for sequences.

EnsureRelationNotIgnored is invoked from both replication_set_add_table and replication_set_add_seq (in spock_repset.c). The errmsg text always says "table %s cannot be added to any replication set", which is misleading when the caller is adding a sequence.

Consider using "relation" instead of "table" for accuracy, or include the relkind in the message.

Suggested fix
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("table %s cannot be added to any replication set",
+				 errmsg("relation %s cannot be added to any replication set",
 						RelationGetRelationName(rel)),

(Apply similarly to the extension-check ereport block on line 1303.)

src/spock_sync.c (1)

242-270: Verify safety of storing stack-local pg_dump pointer in the list.

pg_dump is a stack-allocated char[MAXPGPATH] (line 228). Its address is stored in args (line 242) and later passed to exec_cmd via cmdargv. This works because the entire execution completes within dump_structure's stack frame. However, this is fragile — if someone later refactors to return args or defer execution, the dangling pointer would cause UB.

The comment at lines 278–283 already notes that the list contains string literals (which can't be freed), but doesn't mention the stack pointer. Consider adding a note or using pstrdup(pg_dump) for defensive safety.

tests/regress/sql/apply_delay.sql (1)

34-37: Minor ORDER BY inconsistency with other test files.

Other files in this PR (e.g., multiple_upstreams.sql, sync_table.sql) order by sub_name,sync_kind,sync_nspname,sync_relname COLLATE "C", whereas this query orders only by sub_name COLLATE "C". For consistency and to ensure deterministic output even if additional tables are added to this test later, consider adding the extra sort columns.

Suggested fix
 SELECT sync_kind, sub_name, sync_nspname, sync_relname, sync_status IN ('y', 'r')
 FROM spock.local_sync_status l JOIN spock.subscription s
   ON (l.sync_subid = s.sub_id)
-ORDER BY sub_name COLLATE "C";
+ORDER BY sub_name,sync_kind,sync_nspname,sync_relname COLLATE "C";
tests/regress/sql/excluded_schema.sql (1)

1-1: TODO: sequence-in-excluded-schema coverage is noted but not yet implemented.

The TODO on line 23 flags an uncovered scenario. Would you like me to open a tracking issue for it?

Also applies to: 22-24

tests/regress/sql/init.sql (1)

86-86: Inconsistent with the join-based pattern used in parallel.sql.

This query still exposes sync_subid (an implementation-dependent OID) directly. Other test files in this PR (e.g., parallel.sql lines 34–37) join spock.local_sync_status with spock.subscription to display sub_name instead, and add ORDER BY ... COLLATE "C" for deterministic ordering. Consider applying the same pattern here for consistency and stability.

♻️ Suggested refactor to match the parallel.sql pattern
-SELECT sync_kind, sync_subid, sync_nspname, sync_relname, sync_status IN ('y', 'r') FROM spock.local_sync_status ORDER BY 2,3,4;
+SELECT sync_kind, sub_name, sync_nspname, sync_relname, sync_status IN ('y', 'r')
+FROM spock.local_sync_status l JOIN spock.subscription s
+  ON (l.sync_subid = s.sub_id)
+ORDER BY sub_name COLLATE "C", sync_nspname, sync_relname;
tests/regress/sql/parallel.sql (1)

34-37: Good improvement; consider extending ORDER BY for full determinism.

The join on spock.subscription to display sub_name instead of the raw OID is a solid stability improvement. However, the ORDER BY only covers sub_name. If multiple sync entries exist per subscription (e.g., from test_subscription created in init.sql), rows within the same subscription will be non-deterministic. Consider adding sync_nspname, sync_relname to the sort key, as done in the suggested refactor for init.sql.

♻️ Suggested extended ORDER BY
 SELECT sync_kind, sub_name, sync_nspname, sync_relname, sync_status IN ('y', 'r')
 FROM spock.local_sync_status l JOIN spock.subscription s
   ON (l.sync_subid = s.sub_id)
-ORDER BY sub_name COLLATE "C";
+ORDER BY sub_name COLLATE "C", sync_nspname, sync_relname;

@danolivo danolivo changed the title Spoc 387 + regression tests stabilisation Spoc 387: Prevent adding ignored-schema tables to replication sets Feb 9, 2026
@danolivo danolivo requested a review from mason-sharp February 9, 2026 11:43
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/spock_node.c`:
- Around line 1267-1312: In EnsureRelationNotIgnored, guard against nspname
being NULL after calling get_namespace_name(RelationGetNamespace(rel)) before
calling strcmp against skip_schema entries (return or treat as non-match if
NULL); then perform the skip_schema loop only if nspname is non-NULL. Also
harmonize the errmsg wording between the schema and extension branches (use the
same term, e.g., "relation %s") so RelationGetRelationName(rel) is described
consistently in both error messages; keep other errdetail/errhint text the same.

In `@tests/regress/sql/excluded_schema.sql`:
- Around line 8-12: Remove the redundant INSERT statement "INSERT INTO
spock.test_387 (x) VALUES (1);" that follows the CREATE TABLE for spock.test_387
in the test SQL; the test only needs the table definition and the calls to
spock.repset_create and spock.repset_add_table to validate the schema-exclusion
error, so delete that INSERT line and keep the CREATE TABLE and the two
spock.repset_* calls unchanged (no changes needed to the expected output which
already contains the ERROR/DETAIL/HINT).

@mason-sharp mason-sharp merged commit 3e58b7c into main Feb 9, 2026
10 checks passed
@mason-sharp mason-sharp deleted the spoc-387 branch February 9, 2026 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants