
Conversation

@pufit pufit commented Nov 1, 2024

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

All DDL ON CLUSTER queries now execute with the original query user context for better access validation.

Details

This PR adds two new fields to the DDL entry:

  • initiator_user - the user name from the original request.
  • initiator_roles - the user's roles from the original request.

If the initiator_user is not present on the cluster's instance, the request will fail. This behaviour can be controlled by the new server setting distributed_ddl_use_initial_user_and_roles (see the sketch below).
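For context, a minimal self-contained sketch of the behaviour described above. The types and flags are simplified stand-ins for illustration, not the real DDL entry and context classes touched by this PR:

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for the DDL queue entry; the real entry in the DDLTask
// sources carries many more fields.
struct DDLEntrySketch
{
    std::string query;
    std::optional<std::string> initiator_user;  // user name from the original request
    std::vector<std::string> initiator_roles;   // roles from the original request
};

// Models the decision each replica makes when it executes the entry:
// either run under the initiator's user/roles or fail if that user is unknown.
void applyInitiatorContext(const DDLEntrySketch & entry,
                           bool use_initial_user_and_roles,   // server setting from the description
                           bool user_exists_locally)
{
    if (!use_initial_user_and_roles || !entry.initiator_user)
        return;  // old behaviour: execute with the default (global) context

    if (!user_exists_locally)
        throw std::runtime_error("Initiator user '" + *entry.initiator_user
                                 + "' is not present on this instance");

    // Otherwise the query would be executed under initiator_user with initiator_roles applied.
}
```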

@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-improvement Pull request with some product improvements label Nov 1, 2024
robot-clickhouse-ci-2 commented Nov 1, 2024

This is an automated comment for commit f83a89a with a description of existing statuses. It is updated for the latest CI run.

❌ Click here to open a full report in a separate page

Check name | Description | Status
Flaky tests | Checks whether newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with the address sanitizer and additional randomization of thread scheduling. Integration tests are run up to 10 times. If a new test fails at least once, or runs for too long, this check will be red. We don't allow flaky tests; read the doc | ❌ failure
Integration tests | The integration tests report. The package type is given in parentheses, and the optional part/total tests in square brackets | ❌ failure

Successful checks

Check name | Description | Status
Builds | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Docs check | Builds and tests the documentation | ✅ success
Fast test | Normally this is the first check run for a PR. It builds ClickHouse and runs most of the stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here | ✅ success
Install packages | Checks that the built packages are installable in a clean environment | ✅ success
Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations: release, debug, with sanitizers, etc. | ✅ success
Style check | Runs a set of checks to keep the code style clean. If some of the tests failed, see the related log from the report | ✅ success
Unit tests | Runs the unit tests for different release types | ✅ success

@tavplubix tavplubix self-assigned this Nov 1, 2024
pufit commented Nov 1, 2024

  • initiator_user - the user's name from the original request.
  • access_hash - a hash of all access rights of the user.

If an initiator_user is not present on the cluster's instance or has different permissions, the request will fail.
You can change this behavior with the server setting validate_access_consistency_between_instances.

Also, @vitlibar, what do you think about this?

pufit added 2 commits November 8, 2024 02:51
# Conflicts:
#	src/Core/ServerSettings.cpp
#	src/Interpreters/executeDDLQueryOnCluster.cpp
@vitlibar vitlibar self-assigned this Nov 11, 2024
vitlibar commented Nov 11, 2024

> • initiator_user - the user's name from the original request.
> • access_hash - a hash of all access rights of the user.
>
> If an initiator_user is not present on the cluster's instance or has different permissions, the request will fail.
> You can change this behavior with the server setting validate_access_consistency_between_instances.
>
> Also, @vitlibar, what do you think about this?

For the asynchronous insert queue we store and later use the user id, the current roles, and the current settings. Maybe it's better to do the same for ON CLUSTER queries?

I'm not sure access_hash is useful. The access rights of any user or role can change dynamically, but even with ReplicatedAccessStorage there can be some delay between hosts reading those changes from ZooKeeper. It doesn't seem nice to make the whole ON CLUSTER command fail because something (perhaps not even related to the query) changed.
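As a hedged illustration of the "snapshot the caller's identity and settings" approach mentioned above for the asynchronous insert queue, something along these lines could be imagined; the type and function names below are invented for the sketch, not the real ClickHouse interfaces:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical snapshot of the caller's context, in the spirit of what the
// asynchronous insert queue keeps; the field names are illustrative only.
struct QueryContextSnapshot
{
    std::string user_id;                          // who issued the original query
    std::vector<std::string> current_roles;       // roles active at that moment
    std::map<std::string, std::string> settings;  // query-level settings to re-apply
};

// A minimal model of an execution context on the host that picks up the work.
struct ExecutionContextSketch
{
    std::string user;
    std::vector<std::string> roles;
    std::map<std::string, std::string> settings;
};

// Re-create the caller's context from the snapshot before running the query,
// so that access checks see the original user, roles and settings.
ExecutionContextSketch restoreContext(const QueryContextSnapshot & snapshot)
{
    return ExecutionContextSketch{snapshot.user_id, snapshot.current_roles, snapshot.settings};
}
```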

clickhouse-gh bot commented Dec 17, 2024

Dear @tavplubix, @vitlibar, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

pufit and others added 3 commits January 18, 2025 16:36
# Conflicts:
#	src/Core/ServerSettings.cpp
#	src/Interpreters/executeDDLQueryOnCluster.cpp
@pufit pufit requested a review from tavplubix January 19, 2025 00:27
clickhouse-gh bot commented Feb 7, 2025

Workflow [PR], commit [a329866]

@pufit pufit requested a review from vitlibar March 10, 2025 15:07
@tuanpach tuanpach self-assigned this Mar 25, 2025
clickhouse-gh bot commented Apr 29, 2025

Dear @tuanpach, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

pufit added 2 commits July 8, 2025 23:12
# Conflicts:
#	src/Interpreters/DDLTask.cpp
@pufit pufit requested review from nikitamikhaylov and tuanpach July 9, 2025 03:18
clickhouse-gh bot commented Jul 9, 2025

Workflow [PR], commit [1916d8e]

Summary:

job_name | test_name | status | info | comment

Stateless tests (amd_binary, ParallelReplicas, s3 storage, parallel) | failure
    00440_nulls_merge_tree | FAIL | cidb
Stateless tests (amd_ubsan, sequential) | failure
    03141_fetches_errors_stress | FAIL | cidb, flaky
Integration tests (amd_asan, old analyzer, 3/6) | failure
    test_s3_access_headers/test.py::test_custom_access_header[test_access_over_custom_header] | FAIL | cidb, flaky
Integration tests (amd_binary, 5/5) | failure
    test_s3_access_headers/test.py::test_custom_access_header[test_access_over_custom_header] | FAIL | cidb, flaky
Integration tests (arm_binary, distributed plan, 4/4) | failure
    test_s3_access_headers/test.py::test_custom_access_header[test_access_over_custom_header] | FAIL | cidb, flaky
Integration tests (amd_tsan, 3/6) | failure
    test_s3_access_headers/test.py::test_custom_access_header[test_access_over_custom_header] | FAIL | cidb, flaky

@vitlibar vitlibar self-assigned this Jul 25, 2025
entry.setSettingsIfRequired(context);
entry.tracing_context = OpenTelemetry::CurrentContext();
entry.initial_query_id = context->getClientInfo().initial_query_id;
entry.initiator_user = *context->getUserID();
Member

Can't context->getUserID() be nullopt?
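As a hedged illustration of this concern, a guard for the nullopt case might look roughly like the sketch below, using std::optional stand-ins rather than the real Context API:

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Stand-in for context->getUserID(), which may be empty (nullopt) for internal
// queries that are not executed on behalf of any user.
std::optional<std::string> fillInitiatorUser(const std::optional<std::string> & user_id)
{
    // Option 1: refuse to build the DDL entry without an initiator.
    // Option 2 (not shown): leave initiator_user unset and keep the old behaviour.
    if (!user_id)
        throw std::runtime_error("ON CLUSTER query has no initiator user");

    return *user_id;  // safe to dereference only after the check above
}
```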

if (!user)
LOG_INFO(getLogger("DDLTask"), "Initiator user is not present on the instance. Will use the global user for the query execution.");
else
query_context->setUser(entry.initiator_user, entry.initiator_user_roles);
Member

What if the initiator user just hasn't been replicated to the current shard yet?
According to the code the query will be executed on behalf of the global user in that case.
It seems it would be better to fail instead.

Member Author

I thought about this a lot. Since both the replicated access storage and the DDL queue use ZooKeeper, which is backed by a log-based consensus algorithm, it's technically impossible to reach a state where a user is not yet written to the log but DDL queries from that user already are.
On the other hand, clusters without access replication should keep working as they did before. Otherwise, the incompatibility would be devastating.

Member

At least DDLWorker's queue and ReplicatedAccessStorage's queue work in different threads; they don't have to do things in sync. Also, there can be a configuration that uses multiple ZooKeeper servers, so these queues can connect to different servers.

Member

A good enough solution to that seems to be a setting which specifies some waiting time for such cases. I mean, if a shard fails to find the initiator by its UUID, it could wait for a while before throwing an exception and thus failing the current query.
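A rough, self-contained sketch of that wait-before-failing idea; the lookup callback and timeout parameter are invented for illustration and are not actual ClickHouse code:

```cpp
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Wait up to `timeout` for the initiator user to appear locally (e.g. while the
// replicated access storage catches up with ZooKeeper), then give up and fail.
// `user_exists` is a hypothetical callback standing in for the access storage lookup.
void waitForInitiatorUser(const std::string & user_name,
                          const std::function<bool(const std::string &)> & user_exists,
                          std::chrono::milliseconds timeout)
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!user_exists(user_name))
    {
        if (std::chrono::steady_clock::now() >= deadline)
            throw std::runtime_error("Initiator user '" + user_name + "' did not appear in time");
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
}
```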

Member Author

User creation in replicated storage is an operation with a commit to the Keeper log, so there is no issue with the second concern.
For the first part, yes, indeed, it can potentially be a problem (mostly on paper: the attacker has to fully control the network to the target instance to exploit this potential vulnerability). An easy fix would be to throw an error only when ReplicatedAccessStorage is in use.

@vitlibar vitlibar Jul 25, 2025

> the attacker has to fully control the network to the target instance to exploit this potential vulnerability

If a client executes hundreds or thousands of queries (which is quite normal for our Cloud), then this problem will appear quite often. Throwing an error immediately is secure, but it is not nice to users, so they will complain.

@vitlibar vitlibar Jul 25, 2025

I've got another idea, but it requires changing the interface of IAccessStorage. I think we could modify ReplicatedAccessStorage so it checks more thoroughly that a specified user really doesn't exist (no node in ZooKeeper) before throwing an exception.

I mean that the exception-throwing functions ReplicatedAccessStorage::getID() and ReplicatedAccessStorage::read(const UUID & id, bool throw_if_not_exists), when there is no entry in memory_storage and throw_if_not_exists == true, could try to read the corresponding nodes from ZooKeeper immediately, without waiting for the queue, and only then throw the "User not found" exception.
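Schematically, that fallback-read idea could look like the sketch below (per the later update it was not pursued in this PR); the storage type here is invented for illustration and is not the real ReplicatedAccessStorage interface:

```cpp
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// Invented minimal stand-in for ReplicatedAccessStorage: an in-memory cache
// that can lag behind an authoritative store (ZooKeeper in the real system).
struct AccessStorageSketch
{
    std::map<std::string, std::string> memory_cache;    // what replication has delivered so far
    std::map<std::string, std::string> zookeeper_data;  // authoritative contents

    std::optional<std::string> readFromMemory(const std::string & id) const
    {
        auto it = memory_cache.find(id);
        return it == memory_cache.end() ? std::nullopt : std::optional<std::string>(it->second);
    }

    std::optional<std::string> readFromZooKeeper(const std::string & id) const
    {
        auto it = zookeeper_data.find(id);
        return it == zookeeper_data.end() ? std::nullopt : std::optional<std::string>(it->second);
    }
};

// The idea above: before declaring "not found", consult the authoritative store
// directly instead of waiting for the replication queue to deliver the entity.
std::string readEntity(const AccessStorageSketch & storage, const std::string & id, bool throw_if_not_exists)
{
    if (auto cached = storage.readFromMemory(id))
        return *cached;

    if (auto fresh = storage.readFromZooKeeper(id))
        return *fresh;

    if (throw_if_not_exists)
        throw std::runtime_error("Entity '" + id + "' not found");
    return {};
}
```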

@vitlibar vitlibar Nov 5, 2025

Upd:

> What if the initiator user just hasn't been replicated to the current shard yet?

Let's just ignore this problem in this PR. It seems it shouldn't be a very big problem in most configurations. If it becomes a big problem for some customer, they will be able to just turn off the server setting enabling this PR's improvements (see my comment) and ignore initiator_user & initiator_roles specified in DDLTasks.

> According to the code the query will be executed on behalf of the global user in that case. It seems it would be better to fail instead.

Let's just fail if initiator_user or initiator_roles don't exist. Using the global context seems very random; it's better not to do so. If a customer doesn't like this PR, they will have the server setting to turn it off.

> I've got another idea, but it requires changing the interface of IAccessStorage. I think we could modify ReplicatedAccessStorage so it checks more thoroughly that a specified user really doesn't exist (no node in ZooKeeper) before throwing an exception.

Let's not do that: any big changes to ReplicatedAccessStorage can cause new trouble, and we don't want to go there because of this PR.

std::optional<UUID> parent_table_uuid;

UUID initiator_user;
std::vector<UUID> initiator_user_roles;
Member

I suppose it's better to use names, just because it's more general (and for ReplicatedAccessStorage it's almost the same). Also, we need a way to enable/disable this feature, probably via some configuration setting, because not everyone will want it.

Member Author

This is currently disabled by default via distributed_ddl_entry_format_version.

@vitlibar vitlibar Jul 25, 2025

And what if someone wants to disable it later, after we have increased distributed_ddl_entry_format_version once again?

Member Author

Then they can set distributed_ddl_entry_format_version back to the previous version. That is also why I don't want to throw any exceptions during DDL execution: to keep these changes as backward compatible as possible, so there is no need to turn this feature off.

@vitlibar vitlibar Nov 5, 2025

Upd:

> And what if someone wants to disable it later, after we have increased distributed_ddl_entry_format_version once again?

Let's add a server setting which will allow enabling or disabling both sending initiator_user / initiator_roles and using them on replicas/shards.

clickhouse-gh bot commented Aug 26, 2025

Dear @vitlibar, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

@vitlibar vitlibar self-assigned this Nov 4, 2025
@vitlibar vitlibar enabled auto-merge November 11, 2025 12:34
@vitlibar vitlibar added this pull request to the merge queue Nov 11, 2025
Merged via the queue into master with commit 543ce91 Nov 11, 2025
124 of 131 checks passed
@vitlibar vitlibar deleted the pufit/fix-on-cluster-user branch November 11, 2025 14:49
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Nov 11, 2025