Make it possible to run Strimzi across all namespaces #1261
Conversation
@tombentley The code works, but I would appreciate it if you could review the changes before I write the tests etc., as I'm not sure this is the ideal way to implement this. Maybe you will have some improvement ideas.
The way we currently handle multiple namespaces is to use a separate verticle for each namespace. That has the advantage that each verticle has an independent context thread, so each Kafka cluster is somewhat isolated from the others. "Somewhat" because each verticle uses the same "kubernetes-ops-pool" worker pool, so while each has its own context thread, when monitoring a lot of namespaces there could be contention for threads from that worker pool (and right now we can't configure the size of that pool).
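For reference, a minimal Vert.x sketch of that per-namespace model. The class names (`PerNamespaceDeployment`, `NamespaceWatchVerticle`) are hypothetical and not the actual Strimzi code, and the pool size is only illustrative since, as noted, the real pool size is not currently configurable:

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;
import java.util.List;

public class PerNamespaceDeployment {

    // Hypothetical verticle standing in for the operator's per-namespace watcher.
    static class NamespaceWatchVerticle extends AbstractVerticle {
        private final String namespace;

        NamespaceWatchVerticle(String namespace) {
            this.namespace = namespace;
        }

        @Override
        public void start() {
            // Each deployed verticle runs on its own event-loop context, but any
            // blocking Kubernetes calls are dispatched to the shared, named worker
            // pool configured at deployment time, which is where contention can appear.
            vertx.executeBlocking(promise -> {
                // ... open watches / perform blocking Kubernetes calls for `namespace` ...
                promise.complete();
            }, result -> { });
        }
    }

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        // One verticle per watched namespace, all sharing a single named worker pool
        // (analogous to the "kubernetes-ops-pool" mentioned above).
        DeploymentOptions opts = new DeploymentOptions()
                .setWorkerPoolName("kubernetes-ops-pool")
                .setWorkerPoolSize(10);   // illustrative size only
        for (String ns : List.of("foo", "bar", "baz")) {
            vertx.deployVerticle(new NamespaceWatchVerticle(ns), opts);
        }
    }
}
```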
This change allows * to be used to watch all namespaces, but then they are all using the same context thread. I would expect that if the number of namespaces and Kafka clusters is very large, there would be contention for handling events on the single context thread.
It would not be ideal to have a difference in how we handle * and how we handle foo,bar,baz.
To make the handling for all namespaces the same as the handling for multiple namespaces we would need to watch the creation and deletion of namespaces (and regularly reconcile that), so we could have a verticle per namespace. I don't know whether that's allowed using the default RBAC rules, or whether we'd need some new ClusterRole, which would make the whole thing more difficult to install. The alternative (which I don't think is such a great idea) would be to change our handling for multiple namespaces to use a single verticle.
Finally, I think we need some validation of the config so that watched namespaces are either foo,bar,baz or *, but not foo,bar,*.
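A possible shape for that validation, as a hedged sketch rather than the actual ClusterOperatorConfig code: accept an explicit list such as foo,bar,baz or the single value *, and reject mixtures like foo,bar,*.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class NamespaceConfigValidator {

    /**
     * Parses a STRIMZI_NAMESPACE-style value: either "*" (all namespaces)
     * or a comma-separated list of namespace names, but not a mixture.
     */
    static Set<String> parseWatchedNamespaces(String value) {
        Set<String> namespaces = new LinkedHashSet<>();
        for (String item : value.trim().split("\\s*,+\\s*")) {
            if (!item.isEmpty()) {
                namespaces.add(item);
            }
        }
        if (namespaces.contains("*") && namespaces.size() > 1) {
            throw new IllegalArgumentException(
                    "Watched namespaces must be either a list like foo,bar,baz or *, not a mixture: " + value);
        }
        return namespaces;
    }

    public static void main(String[] args) {
        System.out.println(parseWatchedNamespaces("foo,bar,baz")); // [foo, bar, baz]
        System.out.println(parseWatchedNamespaces("*"));           // [*]
        System.out.println(parseWatchedNamespaces("foo,bar,*"));   // throws IllegalArgumentException
    }
}
```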
Interesting ... I see this exactly the other way around. In a cluster with a large number of namespaces, the approach with separate verticles would soon run us out of resources. I also do not think that this necessarily means that the CO running with
Watching namespaces just because of this is IMHO quite a lot of effort and could generate quite a lot of watches (imagine a cluster with 100 namespaces where there is just one actual Kafka cluster: we would have to watch and react to the namespaces and at the same time have the watches for each namespace, so 100 namespaces == 401 watches). Watching multiple specific namespaces with a single verticle seems like just a way to make our code more complex. There is IMHO no API call to watch just some namespaces, so there would be a lot of workarounds in the code just to handle this case, whereas for the
This definitely makes sense.
I suspect that vertx would be fine with many verticles and would distribute them across the available cores. But needing lots of watches is a possible problem. But I'm not sure I understand where you get 401 from. Just one watch for namespaces being created and deleted, and one for Kafka resources in each namespace = 101. But to be honest it's difficult to reason about the right way of doing this without knowing the sorts of scales people will be running at. My concern is more about problems arising from having two different ways of watching multiple namespaces.
We have 4 watches per namespace. Plus one for the namespaces. So with 100 namespaces, you have 401 watches.
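(For concreteness, the two counts being compared: 4 resource watches × 100 namespaces + 1 namespace watch = 401, versus the 1 Kafka watch per namespace + 1 namespace watch = 101 counted above.)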
One for each of But right now you're right, and that really wouldn't appear to scale well.
Well, I guess it is only a question of time until you run into a user who wants to run mirror maker in a separate namespace. Anyway, what would you do with it? Set up the MirrorMaker watch only in namespaces where Kafka runs? That sounds to me like a lot of effort with very little gain. And there is probably no reason to assume that the same would apply for Connect. So you end up with slightly fewer watches, but not significantly fewer. While I think this discussion is useful for the long term, we also need to think about the short term. IMHO short-term there is nothing better than what I drafted in this PR. The only question IMHO is whether the handling of
I guess I'm OK with using the single verticle for
- This pull request introduces a CSV and Package which can be bundled together and used by the [Operator Lifecycle Manager](https://github.com/operator-framework/operator-lifecycle-manager) to install, manage, and upgrade the strimzi-kafka operator in a cluster.
- The strimzi-kafka operator versions available to all Kubernetes clusters using OLM can be updated by submitting a pull request to the [Community Operators GitHub repo](https://github.com/operator-framework/community-operators) that includes the latest CSVs, CRDs, and Packages.
- The CSV was created based on the [documentation provided by OLM](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/design/building-your-csv.md) and the [documentation](https://github.com/operator-framework/community-operators/blob/master/docs/marketplace-required-csv-annotations.md) provided by [OperatorHub](https://github.com/operator-framework/operator-marketplace).

OLM integration could be improved by future code changes to the Operator:

- Adding additional information to the strimzi-kafka CSV based on the [CSV documentation provided by OLM](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/design/building-your-csv.md)
- Adding additional information to the strimzi-kafka CSV based on the [CSV documentation provided by OperatorHub](https://github.com/operator-framework/community-operators/blob/master/docs/marketplace-required-csv-annotations.md)

/cc @scholzj Operator Hub is currently not accepting operators that cannot watch all namespaces. However, I'm anticipating strimzi#1261 so that we can add the strimzi-kafka operator to Operator Hub.
A couple of questions ...
This PR uses a watcher which watches the whole cluster. It should not need to do anything when a new namespace is added, as there is always only one watch for all namespaces.
This can be done, but it would be a bit complicated and is largely unrelated to how this PR solves it, since it doesn't open any watches per namespace but per cluster.
Should the CO have the rights to do so? I mean, in the installation file, doesn't it need the right to watch namespaces in its ClusterRole?
Which cluster? OpenShift? Kafka? We have 4 watches per namespace as you mentioned before, right? Maybe I didn't get what you meant.
It doesn't need any special rights to watch namespaces. It doesn't watch namespaces. It just tells Kubernetes: Let me know if there is any change in
In the whole OpenShift / Kubernetes cluster. Of course, in total it is still 4 watches, one for each resource.
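To illustrate the distinction being discussed (one watch per watched namespace versus a single cluster-wide watch), here is a minimal sketch using the fabric8 Kubernetes client. ConfigMap stands in for the Strimzi custom resources, and the class and method names reflect current fabric8 versions rather than the client code Strimzi used at the time:

```java
import io.fabric8.kubernetes.api.model.ConfigMap;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.Watcher;
import io.fabric8.kubernetes.client.WatcherException;

public class WatchScopeSketch {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            Watcher<ConfigMap> watcher = new Watcher<ConfigMap>() {
                @Override
                public void eventReceived(Action action, ConfigMap resource) {
                    System.out.println(action + " " + resource.getMetadata().getNamespace()
                            + "/" + resource.getMetadata().getName());
                }

                @Override
                public void onClose(WatcherException cause) {
                    // A real operator would re-establish the watch here.
                }
            };

            // Namespace-scoped: one such watch per watched namespace (and per resource type).
            client.configMaps().inNamespace("foo").watch(watcher);

            // Cluster-scoped: a single watch per resource type covers every namespace.
            // This needs cluster-wide RBAC for the watched resource, but no permission
            // to watch Namespace objects themselves.
            client.configMaps().inAnyNamespace().watch(watcher);

            // (In a real operator the client and watches would stay open; this sketch
            // closes them when the try-with-resources block exits.)
        }
    }
}
```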
Force-pushed from c5c19ee to f41aca0
I added some more tests. Could you please review, @tombentley and @ppatierno? Thanks.
nits
Review comments (resolved) on:
cluster-operator/src/main/java/io/strimzi/operator/cluster/ClusterOperatorConfig.java
...or/src/main/java/io/strimzi/operator/cluster/operator/assembly/AbstractAssemblyOperator.java
cluster-operator/src/test/java/io/strimzi/operator/cluster/ClusterOperatorConfigTest.java
LGTM. Just a couple of nits.
Review comments (resolved) on:
cluster-operator/src/test/java/io/strimzi/operator/cluster/ClusterOperatorTest.java
Any more comments, @tombentley?
Type of change
Description
This PR adds the ability to run Strimzi across all namespaces by setting the STRIMZI_NAMESPACE option to * (a configuration sketch follows the checklist below).
Checklist
Please go through this checklist and make sure all applicable tasks have been done
./design
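Relating to the description above, a hedged sketch of how the STRIMZI_NAMESPACE value could select between the two modes discussed in the conversation. The class name and branching logic are illustrative, not the actual operator code:

```java
import java.util.Arrays;
import java.util.List;

public class NamespaceModeSketch {
    public static void main(String[] args) {
        // "*" means all namespaces; otherwise a comma-separated list such as "foo,bar,baz".
        String value = System.getenv().getOrDefault("STRIMZI_NAMESPACE", "*");

        if ("*".equals(value.trim())) {
            // Cluster-wide mode: one watch per resource type, across all namespaces.
            System.out.println("Watching all namespaces (one cluster-wide watch per resource type)");
        } else {
            // Namespace-list mode: one set of watches per listed namespace.
            List<String> namespaces = Arrays.asList(value.trim().split("\\s*,+\\s*"));
            for (String ns : namespaces) {
                System.out.println("Setting up watches for namespace " + ns);
            }
        }
    }
}
```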