[BUG] annotation last-applied-tolerations magically added in 1.1.0 #2120
How does Terraform react when Longhorn automatically adds the annotation? E.g. does it deploy/redeploy Longhorn repeatedly? Do Longhorn's normal functions still work? In other words, how bad is the problem?
Hey @PhanLe1010, Terraform will always show and apply a change, because Longhorn automatically re-adds the annotation if it does not exist. I can (and currently do) use a workaround: I simply add the automated annotation to my Terraform code, so it no longer displays a diff.

This is less a technical problem than a design flaw, one that in my opinion shows that Longhorn as a project is not as mature as it claims to be (version >1). To be honest, I really like this project, more than any of the other alternatives I've tried so far, and I'd like it to succeed and to keep using it in the future. But seeing changes like this worries me, because the storage layer is the single most important part of a cluster. If it breaks, most of the running workloads break, and so every design decision should be well thought through. Frankly, I don't think that was the case here. I had a deeper look at how this system works, and from the point of view of someone who has to deploy and manage it, it's as unusable as it gets. It's a shitty system in general, and I'd like to explain just why.
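The workaround described above, declaring the controller-added annotation in Terraform so the plan stays clean, might look roughly like this (a sketch only; the resource name, namespace, and annotation value are illustrative, and an alternative is `lifecycle { ignore_changes = [...] }` on the annotations):

```hcl
# Illustrative Terraform fragment (not the reporter's actual code):
# declare the annotation that longhorn-manager injects, so the live
# object matches the declared state and no diff is shown on plan.
resource "kubernetes_deployment" "longhorn_ui" {
  metadata {
    name      = "longhorn-ui"
    namespace = "longhorn-system"

    annotations = {
      # Longhorn re-adds this after every apply; declaring it here
      # keeps Terraform from trying to remove it.
      "longhorn.io/last-applied-tolerations" = "[]"
    }
  }
  # ... spec omitted ...
}
```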
The second thing is that it just does not make any sense from the point of view of whoever deploys this. Whoever deploys this knows which tolerations should be applied to the resources they control. For example, it could make sense to force the Longhorn UI to run on the same node as the longhorn-manager, as all UI responses are based on the API communication between the two. At the same time, that does not necessarily mean I need the same tolerations for them as for all the other resources that are deployed.

The third thing is that you guys should really start thinking about how users deploy things in the real world. It is currently not possible to deploy multiple independent Longhorn systems in a single cluster, and I don't even dare to try renaming the namespace it's deployed into. There are various reasons why one would want Longhorn deployed multiple times independently on the same cluster (or into a namespace with other resources like a general

My recommendation would be to add a label to the resources that are dynamically deployed. Perfectly suited for this would be app.kubernetes.io/managed-by=longhorn-manager, as it's a recommended label per the Kubernetes documentation. Then limit the UI configuration
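The label scheme suggested above comes from the Kubernetes recommended-labels convention. A sketch of what a dynamically created component could carry (the object name is hypothetical):

```yaml
# Illustrative metadata for a workload created at runtime by longhorn-manager.
# With this label, the controller could select and manage only the objects
# it owns, leaving user-deployed resources untouched.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: engine-image-ei-abc123        # hypothetical name
  namespace: longhorn-system
  labels:
    app.kubernetes.io/managed-by: longhorn-manager
```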
Hi @Bobonium, thanks for the write-up. Your suggestion makes sense.

The problem we were trying to solve before was to make it as easy as possible for users to set tolerations. And since we didn't expect users to update the YAML file, we decided to update the tolerations for the running workloads automatically. The decision we made at that time, as you noticed, has continuously bothered us since. So in the latest v1.1.0 release, we decided to use the annotation as the record for tolerations set by Longhorn. But even that doesn't solve all the problems, since the user or Kubernetes might set the same taint on the workloads too. Previously we were focusing too much on trying to get everything automated for the user, and we didn't realize the design would bring problems to users doing GitOps. If we don't consider automatically updating the existing Longhorn components from the deployment file (YAML or chart), then it's much easier for us to deal with, and easier for you to deploy. Now thinking about it, your suggestion makes perfect sense. I'd like to change the design as you suggested in the next release.

Regarding deploying multiple Longhorn clusters, I can see why it might be useful when there are different tiers, but the disk/node tag feature is designed to handle that situation. Also, since Longhorn is not multi-tenant by itself, there would be some use cases where users might want to deploy a separate Longhorn for themselves. We haven't seen strong demand for this yet, so we didn't prioritize it. But resolving #1844 can be a good first step in that direction.
Manual Tests:
Case 1: Existing Longhorn installation
Case 2: New installation by Helm
Case 3: Upgrading from Helm
Case 4: Upgrading from kubectl
Case 5: Node with taints
Case 6: Priority Class UI
Case 7: Priority Class Helm
… Driver, Manager — Now that we only watch/update managed components, we should allow users to specify Helm values for user-deployed components (manager, driver, UI). Longhorn longhorn#2120 Signed-off-by: Phan Le <phan.le@rancher.com>
Pre-merged Checklist
@PhanLe1010 For the upgrade scenario, users who have (or want) tolerations set for longhorn-manager, UI, and driver deployer with previous Longhorn versions will see the tolerations removed on upgrading to v1.1.1, and they will have to update the chart/YAML for longhorn-manager etc. with the tolerations. Do we want to mention this somewhere in the release notes?
Yeah, good point. I agree that we should mention this behavior in the release notes.
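For users hitting that upgrade scenario, re-declaring the tolerations in the deployment source might look like the fragment below. This is a sketch only: the value keys and toleration contents are hypothetical, and the exact keys depend on the Longhorn chart version, so check the chart's values.yaml.

```yaml
# Hypothetical Helm values fragment for the Longhorn chart.
# Declares tolerations for the user-deployed components so they
# survive an upgrade instead of being managed by the controller.
longhornManager:
  tolerations:
    - key: "storage"            # illustrative taint key
      operator: "Equal"
      value: "longhorn"
      effect: "NoSchedule"
longhornDriver:
  tolerations:
    - key: "storage"
      operator: "Equal"
      value: "longhorn"
      effect: "NoSchedule"
longhornUI:
  tolerations:
    - key: "storage"
      operator: "Equal"
      value: "longhorn"
      effect: "NoSchedule"
```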
Verified with Longhorn-master - Validation - Fail. There are two different behaviors in the scenarios below when Longhorn is deployed with Helm versus kubectl.
Also, the priority class doesn't get set for the recurring jobs when applied through the Longhorn UI. @PhanLe1010 Is this expected? Update:
Sorry, I used the wrong image to test the Helm installation. Updated #2120 (comment) as Validation - Pass. The behavior is as below.
We need to edit the manual test cases as per the behavior.
Thanks @khushboo-rancher for carefully testing.
Linking this issue with #2262 to track and validate with the Rancher chart changes.
Closing; will track integration test implementation in a separate issue, #2482.
Describe the bug
I manage my deployments with Terraform. As a result, I can see changes being made to the managed resources. I just upgraded to 1.1.0, and I have never configured any tolerations through the Longhorn UI.
After the upgrade the following change is visible for the driver, the manager and the ui:
The change shows that Terraform wants to remove the annotation, as it's not part of the definition. Removing it doesn't work, though, as Longhorn always magically re-appends the value.
This is, at least in theory, also a problem for non-Terraform users. Re-applying the same deployment YAMLs should not result in a change; with this behavior it always will.
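The reasoning above can be illustrated with a minimal sketch (hypothetical annotation sets, not Longhorn code): a GitOps tool compares the manifest it applied against the live object, so any key a controller injects afterwards shows up as a pending change on every plan.

```python
def diff_annotations(declared: dict, live: dict) -> dict:
    """Return annotations present on the live object but absent
    from the declared manifest -- these become a perpetual diff."""
    return {k: v for k, v in live.items() if k not in declared}

# What the user declared in their YAML/Terraform (illustrative):
declared = {"app": "longhorn-ui"}

# What the cluster reports after longhorn-manager patches the object:
live = {
    "app": "longhorn-ui",
    "longhorn.io/last-applied-tolerations": "[]",  # injected by the controller
}

extra = diff_annotations(declared, live)
print(extra)  # → {'longhorn.io/last-applied-tolerations': '[]'}
```

Since the controller re-adds the key after every apply, `extra` is never empty, and the deployment tool reports a change forever.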
To Reproduce
The annotation longhorn.io/last-applied-tolerations was magically added.

Expected behavior
Please don't dynamically adjust resources that are created by the deployment YAMLs. This has negative impacts for anyone who manages their source of truth in Git.
Log
irrelevant for the bug
Environment:
irrelevant for the bug
Additional context