
Conversation

@iambriccardo
Contributor

@iambriccardo iambriccardo commented Oct 24, 2025

This PR adds partitioned table support by treating a partitioned table as a single entity to replicate.

Implemented Behavior

Partition detachment handling:

  • If a partition is detached, the downstream data will NOT be deleted, and replication of that table will stop.
  • If a partition is detached but the publication includes ALL TABLES (with or without schema selection), the detached table will be added as a standalone table when the pipeline restarts.

publish_via_partition_root handling:

  • If publish_via_partition_root=false, the system throws an error when the publication contains at least one partitioned table.
  • If publish_via_partition_root=true, the system behaves as expected, treating each partitioned table as one big table.

Testing

Several tests have been added to verify the behavior functions correctly.

Requirements

Note: FOR TABLES IN SCHEMA is only supported on Postgres 15+
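For example, a publication that covers a whole schema and publishes via the partition root can be created like this on Postgres 15+ (the publication name and schema are placeholders):

```sql
-- Requires Postgres 15+ for the FOR TABLES IN SCHEMA syntax.
-- Publish every table in the public schema, routing changes to
-- partitions through their root partitioned tables.
CREATE PUBLICATION my_publication
    FOR TABLES IN SCHEMA public
    WITH (publish_via_partition_root = true);
```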

@iambriccardo iambriccardo changed the title from "riccardobusetti/etl 268 partitioned tables do not work directly due to lack of pk" to "@iambriccardo feat(core): Add partitioned table support" Oct 24, 2025
@iambriccardo iambriccardo changed the title from "@iambriccardo feat(core): Add partitioned table support" to "feat(core): Add partitioned table support" Oct 24, 2025
@coveralls

coveralls commented Oct 24, 2025

Pull Request Test Coverage Report for Build 18839109103

Details

  • 262 of 271 (96.68%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 82.469%

Changes Missing Coverage:

  File                              Covered Lines    Changed/Added Lines    %
  etl-api/src/db/publications.rs    0                1                      0.0%
  etl/src/pipeline.rs               19               21                     90.48%
  etl/src/test_utils/event.rs       10               12                     83.33%
  etl/src/replication/client.rs     167              171                    97.66%

Totals Coverage Status:

  Change from base Build 18773857928: +0.2%
  Covered Lines: 15444
  Relevant Lines: 18727

💛 - Coveralls

@iambriccardo iambriccardo force-pushed the riccardobusetti/etl-268-partitioned-tables-do-not-work-directly-due-to-lack-of-pk branch from f5e3b11 to 12416f4 Compare October 24, 2025 13:32
@iambriccardo iambriccardo marked this pull request as ready for review October 24, 2025 14:52
@iambriccardo iambriccardo requested a review from a team as a code owner October 24, 2025 14:52
Contributor

@imor imor left a comment


A few questions:

  • When publish_via_partition_root is false, do we copy the root table as well as the partitions?
  • If publish_via_partition_root is true, and a new partition is added, do we replicate it successfully? We don't have tests for this case.


```sql
-- Create publication with partitioned table support
CREATE PUBLICATION my_publication FOR TABLE users, orders WITH (publish_via_partition_root = true);
```

Contributor


Nit: use lowercase SQL.

Contributor Author


Why would we want lowercase in user-facing docs? Most docs follow the uppercase convention, and I felt it would be better to follow their lead.

Contributor


Our existing docs have mostly lowercase SQL. You can check it out in the docs folder.

.await
.unwrap();

let _ = pipeline.shutdown_and_wait().await;
Contributor


To test for the absence of events, we should insert one row into an existing partition after detaching & dropping the partition table and wait for that event to arrive. In its current form, the test could be passing because we shut down the pipeline too quickly.
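On the Postgres side, the suggested sequence might look like this (table and partition names are hypothetical):

```sql
-- Detach and drop one partition of a hypothetical partitioned table.
alter table users detach partition users_2024;
drop table users_2024;

-- Insert into a partition that still exists; the test then waits for
-- this row to arrive downstream, proving the pipeline kept running and
-- that the detach itself produced no spurious events.
insert into users (id, created_at) values (1, now());
```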

Contributor Author


Yep, that's a fair comment. I will see what I can do.

.await
.unwrap();

let _ = pipeline.shutdown_and_wait().await;
Contributor


Same here.

@iambriccardo
Contributor Author

iambriccardo commented Oct 27, 2025

A few questions:

  • When publish_via_partition_root is false, do we copy the root table as well as the partitions?
  • If publish_via_partition_root is true, and a new partition is added, do we replicate it successfully? We don't have tests for this case.

  1. We copy only the root table, meaning all the data across all partitions. The difference is that messages in the stream are tagged with partition OIDs instead of the root OID, so they will be skipped.
  2. partitioned_table_copy_and_streams_new_data_from_new_partition already exists.
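For context, a new partition can be exercised in such a test with plain DDL (names and the range-partitioning scheme are hypothetical; with publish_via_partition_root = true, these rows are expected to stream through the root table):

```sql
-- Attach a brand-new partition to a hypothetical range-partitioned
-- table, then insert a row that lands in it.
create table users_2026 partition of users
    for values from ('2026-01-01') to ('2027-01-01');
insert into users (id, created_at) values (2, '2026-06-01');
```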

@iambriccardo iambriccardo requested a review from imor October 27, 2025 09:43
@iambriccardo
Contributor Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 409 to +466
    /// Retrieves the OIDs of all tables included in a publication.
    ///
    /// For partitioned tables with `publish_via_partition_root=true`, this returns only the parent
    /// table OID. The query uses a recursive CTE to walk up the partition inheritance hierarchy
    /// and identify root tables that have no parent themselves.
    pub async fn get_publication_table_ids(
        &self,
        publication_name: &str,
    ) -> EtlResult<Vec<TableId>> {
-       let publication_query = format!(
-           "select c.oid from pg_publication_tables pt
-           join pg_class c on c.relname = pt.tablename
-           join pg_namespace n on n.oid = c.relnamespace AND n.nspname = pt.schemaname
-           where pt.pubname = {};",
-           quote_literal(publication_name)
-       );
+       let query = format!(
+           r#"
+           with recursive pub_tables as (
+               -- Get all tables from publication (pg_publication_tables includes explicit tables,
+               -- ALL TABLES publications, and FOR TABLES IN SCHEMA publications)
+               select c.oid
+               from pg_publication_tables pt
+               join pg_class c on c.relname = pt.tablename
+               join pg_namespace n on n.oid = c.relnamespace and n.nspname = pt.schemaname
+               where pt.pubname = {pub}
+           ),
+           hierarchy(relid) as (
+               -- Start with published tables
+               select oid from pub_tables
+               union
+               -- Recursively find parent tables in inheritance hierarchy
+               select i.inhparent
+               from pg_inherits i
+               join hierarchy h on h.relid = i.inhrelid
+           )
+           -- Return only root tables (those without a parent)
+           select distinct relid as oid
+           from hierarchy
+           where not exists (
+               select 1 from pg_inherits i where i.inhrelid = hierarchy.relid
+           );
+           "#,
+           pub = quote_literal(publication_name)
+       );

-       let mut table_ids = vec![];
-       for msg in self.client.simple_query(&publication_query).await? {
+       let mut roots = vec![];
+       for msg in self.client.simple_query(&query).await? {
            if let SimpleQueryMessage::Row(row) = msg {
                // For the sake of simplicity, we refer to the table oid as table id.
                let table_id = Self::get_row_value::<TableId>(&row, "oid", "pg_class").await?;
-               table_ids.push(table_id);
+               roots.push(table_id);
            }
        }

-       Ok(table_ids)
+       Ok(roots)
    }


P1: Collapse child partitions even when publication publishes child OIDs

The new get_publication_table_ids query now always walks up pg_inherits and returns only root tables. When the publication was created with publish_via_partition_root = false (the PostgreSQL default), logical replication messages still contain the child partition OIDs. Because only the parent ID is returned here, the schema cache never contains entries for those child OIDs and handle_relation_message will raise MissingTableSchema as soon as a child relation message arrives (replication/apply.rs handle_relation_message). This turns what used to be a “no CDC after copy” scenario into a hard pipeline failure. Either avoid collapsing when pubviaroot is false or teach the apply loop to handle child OIDs gracefully.


Contributor Author

@iambriccardo iambriccardo Oct 27, 2025


I have updated the code to validate whether a publication contains partitioned tables; if publish_via_partition_root=false, the pipeline won't start.
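The startup check can be expressed as a catalog query along these lines (a sketch only; the actual implementation may differ, and the publication name is a placeholder):

```sql
-- Find publications that contain a partitioned table (relkind = 'p')
-- but do not publish via the partition root. For such publications
-- the pipeline refuses to start.
select p.pubname
from pg_publication p
join pg_publication_rel pr on pr.prpubid = p.oid
join pg_class c on c.oid = pr.prrelid
where p.pubname = 'my_publication'
  and c.relkind = 'p'
  and not p.pubviaroot;
```

Note that pg_publication_rel only lists explicitly published tables; an ALL TABLES or FOR TABLES IN SCHEMA publication would need a slightly different check, e.g. via pg_publication_tables.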

@imor
Contributor

imor commented Oct 27, 2025

A few questions:

  • When publish_via_partition_root is false, do we copy the root table as well as the partitions?
  • If publish_via_partition_root is true, and a new partition is added, do we replicate it successfully? We don't have tests for this case.

  1. We copy only the root table, meaning all the data across all partitions. The difference is that messages in the stream are tagged with partition OIDs instead of the root OID, so they will be skipped.

Okay, does this mean only the initial table copy will be done and no CDC on the root?

  1. partitioned_table_copy_and_streams_new_data_from_new_partition already exists.

Cool, that's great.

@iambriccardo
Contributor Author

A few questions:

  • When publish_via_partition_root is false, do we copy the root table as well as the partitions?
  • If publish_via_partition_root is true, and a new partition is added, do we replicate it successfully? We don't have tests for this case.

  1. We copy only the root table, meaning all the data across all partitions. The difference is that messages in the stream are tagged with partition OIDs instead of the root OID, so they will be skipped.

Okay, does this mean only the initial table copy will be done and no CDC on the root?

  1. partitioned_table_copy_and_streams_new_data_from_new_partition already exists.

Cool, that's great.

Exactly, that's the result. Since it's weird behavior, I have added an additional check on startup that fails the pipeline when the setting is false and there is at least one partitioned table.

@imor
Contributor

imor commented Oct 27, 2025

A few questions:

  • When publish_via_partition_root is false, do we copy the root table as well as the partitions?
  • If publish_via_partition_root is true, and a new partition is added, do we replicate it successfully? We don't have tests for this case.

  1. We copy only the root table, meaning all the data across all partitions. The difference is that messages in the stream are tagged with partition OIDs instead of the root OID, so they will be skipped.

Okay, does this mean only the initial table copy will be done and no CDC on the root?

  1. partitioned_table_copy_and_streams_new_data_from_new_partition already exists.

Cool, that's great.

Exactly, that's the result. Since it's weird behavior, I have added an additional check on startup that fails the pipeline when the setting is false and there is at least one partitioned table.

Alright, that should work. We can refine this behaviour in the future.

@iambriccardo iambriccardo merged commit 17b2e1d into main Oct 27, 2025
9 checks passed
@iambriccardo iambriccardo deleted the riccardobusetti/etl-268-partitioned-tables-do-not-work-directly-due-to-lack-of-pk branch October 27, 2025 11:45