Cache CR Reconciliation #747
Comments
@ryanemerson, can we utilize golang for this implementation?
Definitely. IMO it's probably the best way to go due to the following:
Also we should define a separate container build and run strategy: place code for
How about the Infinispan operator moving to a meta-operator (one that deploys other operators), so that when a new cluster is deployed, the "config-reconciler" is implemented as a cache-operator? Strimzi has a similar approach, with 3 operators: cluster (meta), topic and user.
Interesting approach, but not sure that simplifies things
I think the reason it was implemented that way is that the Strimzi operator was written in Java before Quarkus was released, and it consumes much more memory and resources than a golang implementation would.
With this approach, you have:
AFAIK, it was not related to memory and resources, but maybe @scholzj or @ppatierno can help with the reasons for Strimzi.
That was absolutely not the case and it had nothing to do with this decision. The Topic and User operators ultimately have a different scope than the cluster operator. They are bound to the Kafka cluster where they operate and to its lifecycle; without the Kafka cluster, they have little to do. Having them as separate operators and deploying them as part of the Kafka cluster itself simplifies the code a bit. It also makes them much easier to use with Kafka clusters run by something other than Strimzi - this was not planned, but turned out to be fairly popular as well.

That said, it also has some disadvantages - it is probably a bit harder for users to understand. You need to explain to them that they have to configure the additional operators etc. Also, in our case, we use a label to tell in which cluster the user or topic should be created. And if the CR is created with a wrong label, nobody sees it and nothing happens ... which again is a bit confusing to the user.

Overall ... I don't think this was a bad decision which we regret or which is causing problems. But at the same time, I'm not sure it gives us any huge advantages. When we later implemented the Connector operator for Kafka Connect, we did not follow the same approach and integrated it into the main operator.

The bidirectional feature mentioned by Ramon is used only in our Topic Operator. We did this because there are too many applications creating topics directly in Kafka, and if it worked only as a traditional operator I do not think it would have had much value or use. So we implemented the bidirectional sync. But it is quite complicated and brings a lot of issues, so I would recommend avoiding it unless you are in the same situation and really need it.
Thanks for the insights @rgordill and @scholzj.
Very interesting! A big win for us from the support perspective is how much the operator unifies cluster deployment and removes many of the variables. So while this flexibility sounds good for some users, in our case I think it could potentially cause more problems.
This was our original intention 🙂, however there is a lot of demand for this feature. Having an operator per resource is an interesting idea, however I have reservations about the following:
Just some clarifications ... not necessarily trying to convince you to do it one way or another.
Tough decision. I'm not sure exactly how Infinispan is being used or how it works. Just count on it being harder than it might seem at first glance :-)
This is not how it works with Strimzi either. We have about 9 custom resource types. Only two of them are handled by a separate operator; the rest are all managed by the main operator.
Again, it works a bit differently in Strimzi. There is only one OLM deployment - for the main Strimzi operator which manages the clusters (Kafka cluster, Kafka Connect cluster, Kafka Mirror Maker cluster etc.). It installs all the CRDs, including the user and topic CRDs. The topic and user operators are then (optionally) deployed as part of the Kafka cluster. So it just creates another deployment; it does not deploy another OLM operator or anything.
It wasn't clear from my original message, but all of the above resources that I referenced would require bidirectional sync. I didn't think that it would need to be one operator per CR, e.g. the Backup/Restore CRs we already have, just those that require bidirectional sync. My original reservation about 1 operator per bidirectional resource was misplaced anyway, as we could have the cluster operator and then a bidirectional operator (naming things is hard) that manages all of the bidirectional resources.
Thanks for clarifying, this makes a lot more sense now. So assuming we took a similar approach, we could have a Cluster Operator (Infinispan, Backup, Restore CRs) and a Resources Operator (Caches, Templates, Counters, Scripts). OLM installation would be the same as now. The Resources Operator is then deployed per Infinispan CR [1]. The purpose of this operator would be to bidirectionally reconcile the various CRs by consuming events from the cluster it's associated with and updating/creating the respective CRs.

This is actually similar to the "config-reconciler" pod idea I initially described, as it addresses scalability concerns by being per cluster, except it also provides the benefit of additional separation of concerns between the Cluster and resource objects.

[1] This should be configurable in the Infinispan CR, so that clusters which don't require resource CRs can avoid the overhead of an additional pod.
In our case, the fact that one of the separate operators is bidirectional is just a coincidence. That has nothing to do with it being separate. The bidirectional nature will be a PITA regardless :-/.
Yeah, that is how we have it as well. Some users just do not want to use it and prefer other tooling. So it can be enabled / disabled easily.
- ConfigListener Deployment added to consume server-side cache lifecycle events and create/update/delete the corresponding k8s Cache CR
- Server caches are now removed on Cache CR deletion
- Cache CR spec.Template can be updated, and changes are reflected on the server if runtime update of the configuration is possible
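For illustration, the event-to-CR mapping performed by the ConfigListener could be sketched roughly as below. The event type names and action strings here are hypothetical, not the actual operator types:

```go
package main

import "fmt"

// CacheEvent is a hypothetical representation of a server-side cache
// lifecycle event consumed by the ConfigListener.
type CacheEvent struct {
	Type  string // e.g. "cache-created", "cache-updated", "cache-removed"
	Cache string // name of the affected cache
}

// crAction maps a server event to the Cache CR operation the listener
// would perform against the Kubernetes API.
func crAction(e CacheEvent) string {
	switch e.Type {
	case "cache-created":
		return "create Cache CR " + e.Cache
	case "cache-updated":
		return "update Cache CR " + e.Cache
	case "cache-removed":
		return "delete Cache CR " + e.Cache
	default:
		return "ignore"
	}
}

func main() {
	for _, e := range []CacheEvent{
		{"cache-created", "books"},
		{"cache-removed", "books"},
	} {
		fmt.Println(crAction(e))
	}
}
```

A real implementation would issue the corresponding create/update/delete calls via the controller-runtime client rather than returning strings.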
The Cache CR allows Infinispan caches to be created by other k8s resources, e.g. another operator. However, in its current form a Cache CR does not reflect the current status of a cache on the server:
Proposal
Reconcile Cache CR state by consuming REST events (ISPN-12606) and updating/creating Cache CRs. This allows CRs to be updated when CRUD operations are performed on a cache, without polling.
Creation of new caches via any of the Infinispan clients will result in a corresponding Cache CR being created. Similarly, the removal of any caches will result in their CR being removed (or their status updated?).
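Because the sync is bidirectional (the reconciler both writes CRs and reacts to CR changes), it needs a way to recognise its own writes and avoid update loops. A minimal sketch of one possible approach, assuming we stamp each Cache CR with a hash of the last configuration applied to the server (all names here are hypothetical, not part of the operator):

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// configHash computes a stable fingerprint of a cache configuration.
// The reconciler would store this in an annotation on the Cache CR
// after every write it makes.
func configHash(cfg string) string {
	return fmt.Sprintf("%x", sha1.Sum([]byte(cfg)))
}

// needsServerUpdate reports whether a watched CR change originated from
// a user (config differs from what we last applied) or from the
// reconciler's own write (hashes match, so the event can be skipped).
func needsServerUpdate(crConfig, lastAppliedHash string) bool {
	return configHash(crConfig) != lastAppliedHash
}

func main() {
	h := configHash(`<distributed-cache/>`)
	fmt.Println(needsServerUpdate(`<distributed-cache/>`, h)) // false: our own write
	fmt.Println(needsServerUpdate(`<replicated-cache/>`, h))  // true: user changed the spec
}
```

This mirrors the kind of loop-avoidance problem @scholzj describes with the Strimzi Topic Operator; the annotation-hash approach is only one option and would need care around concurrent updates.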
Implementation
Deploy a stateless "config-reconciler" pod per Infinispan CR. This pod is responsible for consuming events from the Infinispan service and updating/creating the corresponding Cache CR. On startup/restart/failover, this pod must set includeCurrentState=true when connecting to the events endpoint to ensure that the latest state is reconciled. Utilising an independent pod has the following advantages:
Config Reconciler Pod
The config-reconciler should have as small a footprint as possible, so we should deploy this as a natively compiled executable based on a scratch image.
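A rough Go sketch of what the config-reconciler's event loop could look like. The REST endpoint path and query parameters below are assumptions based on the ISPN-12606 listener work, not a confirmed API:

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
)

// eventsURL builds the (assumed) REST events endpoint URL. The
// includeCurrentState flag asks the server to replay the existing cache
// state, so a restarted reconciler never misses events.
func eventsURL(service string, includeCurrentState bool) string {
	return fmt.Sprintf(
		"http://%s/rest/v2/container/config?action=listen&includeCurrentState=%t",
		service, includeCurrentState)
}

// listen consumes the server-sent event stream line by line; a real
// reconciler would parse each event and reconcile the matching Cache CR
// instead of printing it.
func listen(service string) error {
	resp, err := http.Get(eventsURL(service, true))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		fmt.Println("event:", scanner.Text())
	}
	return scanner.Err()
}

func main() {
	// On startup/restart/failover, always request the current state.
	fmt.Println(eventsURL("infinispan.ns.svc:11222", true))
}
```

Since the pod is stateless, crash recovery is just reconnecting with includeCurrentState=true and letting the replayed state converge the CRs.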
Reconciler per namespace
In order to reduce the total number of pods required by Infinispan, we could allow a single "config-reconciler" pod to consume events from multiple Infinispan CRs in the same namespace. This could be configurable in the Infinispan CR. If enabled, the config-reconciler pod spec would need to be updated and the pod restarted in order for a new cluster to be watched.
Future Work
We should extend the Operator and config-reconciler to support the following types: