-
Notifications
You must be signed in to change notification settings - Fork 6.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store clusters from ClusterDiscovery in separate map #48795
Conversation
src/Interpreters/Context.cpp
Outdated
auto clusters = getClustersImpl()->getContainer(); | ||
if (shared->cluster_discovery) | ||
{ | ||
for (const auto & [name, cluster] : shared->cluster_discovery->getClusters()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh no, you take a lock each time and possibly get a different cluster map each time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside shared->cluster_discovery
, we have a mutex to protect cluster_impls
. It can be updated anytime (on the server added to/removed from the cluster).
As for shared->cluster_discovery
itself, it initialized once at Context::startClusterDiscovery
, but maybe it's worth protecting it with shared->clusters_mutex
also (as in tryGetCluster
), because I'm not really sure that getClusters
can't be called during server startup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be updated anytime (on the server added to/removed from the cluster).
Yes, but that's a problem when iterating no?
Each loop iteration you will end up with different possible iterators if map changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that
for (const auto & [name, cluster] : shared->cluster_discovery->getClusters())
{
...
is the same as
const auto & tmp = shared->cluster_discovery->getClusters();
for (const auto & [name, cluster] : tmp)
{
...
and getClusters
is just called once.
https://godbolt.org/z/sj6Ydc7zT
But we can change the code to be like the second snippet to ensure it.
for (const auto & [name, cluster] : shared->cluster_discovery->getClusters()) | |
const auto & cluster_discovery_map = shared->cluster_discovery->getClusters(); | |
for (const auto & [name, cluster] : cluster_discovery_map) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a cool tool for things like this, C++ Insights
your example
But yeah, I'm still afraid to have code that relies on such assumptions because it's not obvious
4fcd435
to
b31f696
Compare
Rerun CI to check integration tests broken by bad docker image |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Ref #47176
Context:
Clusters defined in
discovery
mode were stored inshared->clusters
with others. But the variableshared->clusters
is assigned in several places (e.g., on config reloading), and it's challenging to maintain it in the correct state. Current PR moves data structure to store auto-discovered clusters out ofshared->clusters
, similar to how it's done for the replicated database.