Skip to content

Fix handling error when failing to drop the database completely on the cluster on the secondary node.#87802

Merged
tuanpach merged 2 commits intoClickHouse:masterfrom
tuanpach:rename-database-metadata-after-fail-to-drop
Oct 10, 2025
Merged

Fix handling error when failing to drop the database completely on the cluster on the secondary node.#87802
tuanpach merged 2 commits intoClickHouse:masterfrom
tuanpach:rename-database-metadata-after-fail-to-drop

Conversation

@tuanpach
Copy link
Copy Markdown
Member

@tuanpach tuanpach commented Sep 29, 2025

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix handling error when failing to drop the database completely on the cluster on the secondary node.

There are 2 problems in https://github.com/ClickHouse/support-escalation/issues/5792:

  1. When dropping a DatabaseReplicated in (DatabaseReplicated::drop), the process fails when trying to remove the Keeper path.
    In DatabaseCatalog::detachDatabase, we don't handle the exception, which leads to:
  • The database metadata file is still on disk. It will fail if users create a new database with the same name.
  • Database is not re-attached, so "DROP DATABASE IF EXISTS" will succeed but have no effect.
  1. When dropping the database with "ON CLUSTER", DDLWorker doesn't handle retry properly. In fact, it just writes down the status on Keeper. We can try to do the task if the task fails and is retriable.

The change in this PR:

  • Re-attach the database if it failed to drop the database completely
  • In DDLWorker, when the task is not finished, throw an ErrorCodes::UNFINISHED exception if the task is retriable to retry the task.

Closes https://github.com/ClickHouse/support-escalation/issues/5792

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Sep 29, 2025

Workflow [PR], commit [8c2cc4c]

Summary:

job_name test_name status info comment
Bugfix validation (integration tests) error

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Sep 29, 2025
@Avogar Avogar self-assigned this Sep 29, 2025
@tuanpach tuanpach force-pushed the rename-database-metadata-after-fail-to-drop branch from e490984 to d685a3d Compare September 30, 2025 07:19
@tuanpach tuanpach changed the title Rename the database metadata file and the database directory when it fails to drop the database completely. Fix handling error when failing to drop the database completely on the cluster on the secondary node. Sep 30, 2025
@tuanpach tuanpach force-pushed the rename-database-metadata-after-fail-to-drop branch 2 times, most recently from d4d18c6 to b71f7b8 Compare September 30, 2025 08:30
@tuanpach tuanpach requested a review from Avogar October 1, 2025 03:29
@tuanpach tuanpach force-pushed the rename-database-metadata-after-fail-to-drop branch from b71f7b8 to cc866b8 Compare October 1, 2025 05:23
- In DDLWorker, when the task is not finished, throw an ErrorCodes::UNFINISHED exception if the task is retriable to retry the task.
@tuanpach tuanpach force-pushed the rename-database-metadata-after-fail-to-drop branch from cc866b8 to 8c2cc4c Compare October 8, 2025 03:44
@tuanpach tuanpach added this pull request to the merge queue Oct 9, 2025
Merged via the queue into ClickHouse:master with commit 2246635 Oct 10, 2025
121 of 123 checks passed
@tuanpach tuanpach deleted the rename-database-metadata-after-fail-to-drop branch October 10, 2025 00:02
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Oct 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants