Skip to content

Skip creating and renaming the temp inner table of non-append Refreshable MV DDLs in Replicated DBs#84858

Merged
tuanpach merged 2 commits intoClickHouse:masterfrom
tuanpach:skip-old-no-append-rmv-create-inner-tables-dlls
Aug 13, 2025
Merged

Skip creating and renaming the temp inner table of non-append Refreshable MV DDLs in Replicated DBs#84858
tuanpach merged 2 commits intoClickHouse:masterfrom
tuanpach:skip-old-no-append-rmv-create-inner-tables-dlls

Conversation

@tuanpach
Copy link
Copy Markdown
Member

@tuanpach tuanpach commented Aug 1, 2025

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Skip creating and renaming the old temp table of non-append RMV DDLs in Replicated DBs.

Refreshing in non-append Refreshable Materialized Views:

  1. Create a temporary table: .tmp_inner.
  2. Fill in the data from the SELECT clause
  3. Rename (with exchange) the temporary table to the target table name. The target table name can be the inner table of the view, or the table specified in the view create query.
  4. Drop the temporary table after exchanging. Note that the temporary table after exchanging is not the table created in step 1. It is the one exchanged in step 3.
  5. Update the coordination info in Keeper

Queries created in steps 1, 3, and 4 are replicated through the DDL logs.
If a replica is woken up after a long sleep, there are lots of creating, renaming and dropping DDLs generated by the RMVs.
To speed up the DDL processing, these old create DDLs are skipped.
Once the creating DDL is skipped, there is no table to rename in the renaming DDL, so the renaming DDL is skipped if the table name does not exist.
For the dropping DDLs, as the query contains IF EXISTS, we don't need to skip.

Enable this logic with DB replicated setting allow_skipping_old_temporary_tables_ddls_of_refreshable_materialized_views.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@tuanpach tuanpach added the can be tested Allows running workflows for external contributors label Aug 1, 2025
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Aug 1, 2025

Workflow [PR], commit [8f331e0]

Summary:

job_name test_name status info comment
Stateless tests (amd_tsan, sequential, 2/2) failure
03394_pr_insert_select_threads FAIL
Stateless tests (amd_ubsan, parallel) failure
02177_issue_31009 FAIL
AST fuzzer (amd_tsan) failure
SUMMARY: ThreadSanitizer: data race ci/tmp/build/./src/Common/PODArray.h:155:17 in void DB::PODArrayBase<1ul, 4096ul, Allocator<false, false>, 63ul, 64ul>::realloc<>(unsigned long) FAIL

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Aug 1, 2025
@tuanpach tuanpach force-pushed the skip-old-no-append-rmv-create-inner-tables-dlls branch from f3e5221 to 72251cc Compare August 1, 2025 04:13
@antaljanosbenjamin antaljanosbenjamin self-assigned this Aug 1, 2025
@tuanpach
Copy link
Copy Markdown
Member Author

tuanpach commented Aug 4, 2025

@tuanpach tuanpach changed the title Skip creating and renaming the old inner table of non-append RMV DDLs in Replicated DBs Skip creating and renaming the temp inner table of non-append RMV DDLs in Replicated DBs Aug 5, 2025
@tuanpach tuanpach force-pushed the skip-old-no-append-rmv-create-inner-tables-dlls branch from 72251cc to 6cbf8c9 Compare August 5, 2025 07:08
@tuanpach
Copy link
Copy Markdown
Member Author

tuanpach commented Aug 5, 2025

In the new push, I added allow_skipping_old_temporary_tables_ddls_of_refreshable_materialized_views to control the behavior. The default value is false.

Copy link
Copy Markdown
Member

@antaljanosbenjamin antaljanosbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review.

Comment on lines +495 to +542
if (coordination_znode.last_success_table_uuid == create_uuid)
return false;

LOG_TEST(log, "ddl_log_ctime {}, stats.mtime {}", ddl_log_ctime, stats.mtime);
return ddl_log_ctime < stats.mtime;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure I understood these conditions correctly.
I think the first one means we should always create the table that is the current table, because that is necessary.

The second one is more cryptic IMO. Skip creating table if the DDL log was created before the last update of the coordination info? I guess this means there will be some new tables.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments in the code file and updated the PR description. Please let me know if it needs to be clarified further.

@tuanpach tuanpach force-pushed the skip-old-no-append-rmv-create-inner-tables-dlls branch from 6cbf8c9 to 5390ad9 Compare August 6, 2025 12:20
…tables_ddls_of_refreshable_materialized_views. If enabled, when processing DDLs in DatabaseReplicatedDDLWorker, it skips creating and exchanging DDLs of the temporary tables of refreshable materialized views if possible.
@tuanpach tuanpach force-pushed the skip-old-no-append-rmv-create-inner-tables-dlls branch from 5390ad9 to 59eb898 Compare August 6, 2025 12:23
@tuanpach
Copy link
Copy Markdown
Member Author

tuanpach commented Aug 7, 2025

return false;

LOG_TEST(log, "ddl_log_ctime {}, stats.mtime {}", ddl_log_ctime, stats.mtime);
// It is possible the the temporary table is created and replciated before the the coordiation info is updated.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this can be fixed in another PR, you don't have to run a whole CI again.

Suggested change
// It is possible the the temporary table is created and replciated before the the coordiation info is updated.
// It is possible the the temporary table is created and replciated before the coordiation info is updated.

@tuanpach tuanpach enabled auto-merge August 11, 2025 02:36
@tuanpach tuanpach changed the title Skip creating and renaming the temp inner table of non-append RMV DDLs in Replicated DBs Skip creating and renaming the temp inner table of non-append Refreshable MV DDLs in Replicated DBs Aug 11, 2025
@tuanpach tuanpach added this pull request to the merge queue Aug 13, 2025
Merged via the queue into ClickHouse:master with commit a5f26fd Aug 13, 2025
121 of 124 checks passed
@tuanpach tuanpach deleted the skip-old-no-append-rmv-create-inner-tables-dlls branch August 13, 2025 08:46
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Aug 13, 2025
@azat
Copy link
Copy Markdown
Member

azat commented Aug 15, 2025

@tuanpach new test is flaky. Should we revert?

>           assert count_skip_ddls(node2, db_name, last_log_ts) > 0
E           AssertionError: assert 0 > 0
E            +  where 0 = count_skip_ddls(<helpers.cluster.ClickHouseInstance object at 0x7fccc1f4b070>, 'test_mbd6xuJe', '2025-08-15 14:07:32.744195')

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=85698&sha=528856cc8dba02ab2b6ecb7024ac6e23b0e4fc8a&name_0=PR&name_1=Integration%20tests%20%28amd_tsan%2C%204%2F6%29

@tuanpach
Copy link
Copy Markdown
Member Author

@tuanpach new test is flaky. Should we revert?

>           assert count_skip_ddls(node2, db_name, last_log_ts) > 0
E           AssertionError: assert 0 > 0
E            +  where 0 = count_skip_ddls(<helpers.cluster.ClickHouseInstance object at 0x7fccc1f4b070>, 'test_mbd6xuJe', '2025-08-15 14:07:32.744195')

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=85698&sha=528856cc8dba02ab2b6ecb7024ac6e23b0e4fc8a&name_0=PR&name_1=Integration%20tests%20%28amd_tsan%2C%204%2F6%29

Let me check it first.

@tuanpach
Copy link
Copy Markdown
Member Author

In the test, the DB is recovered, so no DDLs to skip. I will update the test to make it stable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants