Trying to edit/update the statefulset through the UI is not working #10041

Open
gaktive opened this issue Nov 16, 2023 · 10 comments
Labels: JIRA, kind/bug, QA/dev-automation, size/5

gaktive (Member) commented Nov 16, 2023

Internal reference: SURE-7109
Reported in 2.7.5, confirmed also in 2.8.0

Repro steps:

step 1: on Rancher 2.7.5, install monitoring from the apps marketplace in a downstream RKE1 cluster
step 2: go to workloads > statefulsets > "prometheus-rancher-monitoring-prometheus" > edit config
step 3: on the prometheus tab, go to resources and change the CPU reservation from 750 mCPUs to 700 mCPUs. Select save
step 4: observe that the statefulset redeploys but the CPU reservation stays at 750
step 5: run kubectl edit <statefulset> and change the prometheus CPU reservation from 750 to 700; the change takes effect and is reflected in both the Rancher UI and the cluster itself
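
For anyone scripting the check in step 4, here is a minimal Python sketch that reads the CPU reservation out of a StatefulSet manifest, for example one exported with `kubectl get statefulset prometheus-rancher-monitoring-prometheus -n cattle-monitoring-system -o json`. It assumes the UI's "CPU reservation" maps to `resources.requests.cpu` and that the container is named `prometheus`. The manifest below is a trimmed, hypothetical stand-in, not output captured from an affected cluster.

```python
def parse_cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "750m" or "1.5" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cpu_request_millicores(statefulset: dict, container_name: str) -> int:
    """Read resources.requests.cpu for one container of a StatefulSet manifest."""
    containers = statefulset["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] == container_name:
            return parse_cpu_millicores(container["resources"]["requests"]["cpu"])
    raise KeyError(f"container {container_name!r} not found")


# Hypothetical, trimmed-down manifest as it might look after the UI edit in step 4.
manifest = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "prometheus",  # assumed container name
                        "resources": {"requests": {"cpu": "750m"}},
                    }
                ]
            }
        }
    }
}

requested = 700  # the value entered in the UI
actual = cpu_request_millicores(manifest, "prometheus")
print(f"requested={requested}m actual={actual}m applied={actual == requested}")
# prints: requested=700m actual=750m applied=False
```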

Issue description:
A user is trying to update the CPU/memory resource limits for one of the statefulsets (monitoring installed from the apps marketplace) from the Rancher UI, and the values are reset to the old limits for an unknown reason. When the resources are updated from the CLI with kubectl edit, the change is applied and the Rancher UI then shows the update.

Support was able to duplicate this in their lab.

Business impact:
Editing the config from the UI should push the change down to the cluster. The user shouldn't have to do it from the kubectl CLI.

Troubleshooting steps:

Set Rancher logging to debug
Reproduced the issue
Found no clues in the logs

Workaround:
Use the kubectl CLI on the cluster to edit the resource. Not ideal, but it works.

Actual behavior:
Editing the config in Rancher does not actually change the statefulset

Expected behavior:
An edit from Rancher should change the value in the statefulset

Addenda:

  • @nwmac confirmed that this still happens in 2.8.0
  • The belief is that these edits should be possible, with the caveat that redeploying or updating the app via Helm would overwrite them

nwmac (Member) commented Nov 16, 2023

This doesn't seem to affect all statefulsets. We use PUT to apply changes, whereas kubectl uses PATCH. We should switch to doing the same. Note that this requires a backend change, as per rancher/rancher#40712.
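
To make the PUT versus PATCH distinction concrete, here is a minimal Python sketch. A PUT body carries the client's whole copy of the object, while a PATCH body carries only the fields to change. The sketch implements JSON Merge Patch (RFC 7386) on plain dicts. The object shape is a simplified, hypothetical stand-in rather than the real statefulset schema, and none of this is Rancher or kubectl code. As far as I know, kubectl usually sends a strategic merge patch for built-in types, which treats lists differently from what is shown here.

```python
import copy


def json_merge_patch(target, patch):
    """Apply an RFC 7386 JSON Merge Patch to target and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in a merge patch means "delete this key".
            result.pop(key, None)
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result


# Simplified stand-in for the object currently stored on the server (hypothetical shape).
server_object = {
    "spec": {
        "replicas": 2,
        "resources": {"requests": {"cpu": "750m", "memory": "750Mi"}},
    }
}

# PUT: the client sends its whole copy of the object as the request body.
put_body = copy.deepcopy(server_object)
put_body["spec"]["resources"]["requests"]["cpu"] = "700m"

# PATCH: the client sends only the field it wants to change.
patch_body = {"spec": {"resources": {"requests": {"cpu": "700m"}}}}

patched = json_merge_patch(server_object, patch_body)
print(patched["spec"]["resources"]["requests"])  # {'cpu': '700m', 'memory': '750Mi'}
print(patched == put_body)  # True
```

In this toy case both routes end in the same object. The difference is in what goes over the wire: the PUT body restates every field, including ones the client never meant to touch, while the patch names only the CPU request.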

gaktive (Member, Author) commented Dec 13, 2023

Let's sync with backend first to check if PUT or PATCH is the expected mechanism here.

gaktive added the size/5 label Dec 13, 2023

gaktive (Member, Author) commented Dec 13, 2023

Possibly related: #9577 touches code near this area, though this problem still happens in Chrome.

gaktive (Member, Author) commented Dec 13, 2023

Similar efforts around permissions: #7897

aalves08 (Collaborator) commented Dec 18, 2023

After a bit of investigation, there seems to be some problem with the updates we are getting via websockets...

Here is a video where I make the changes in the UI, then show each relevant part of the Network tab:

Screen.Recording.2023-12-18.at.16.10.48.mov

Following the repro steps, when we edit the prometheus-rancher-monitoring-prometheus statefulset, we do a PUT to /v1/apps.statefulsets/cattle-monitoring-system/prometheus-rancher-monitoring-prometheus. The PUT succeeds, and both the payload and the response contain the corresponding changes.

What happens next is that the resource gets overwritten on the UI side by the resource.change event coming via websockets (socket k8s/clusters/local/v1/subscribe?sockId=3), where we can clearly see that the resource data carries the old CPU and memory reservation values.

A full page refresh returns the old object (without the changes made by the PUT).
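
The sequence above can be sketched as a toy model in Python. This is not the dashboard's store code: `ResourceStore`, the `cpuRequest` field, and the event shape are hypothetical simplifications. Only the event name `resource.change` and the resource name come from the observations above.

```python
class ResourceStore:
    """Toy client-side cache keyed by resource id; the last write wins."""

    def __init__(self):
        self._by_id = {}

    def load(self, resource: dict) -> None:
        # Any incoming copy of a resource replaces the cached one.
        self._by_id[resource["id"]] = resource

    def get(self, resource_id: str) -> dict:
        return self._by_id[resource_id]


store = ResourceStore()
sts_id = "cattle-monitoring-system/prometheus-rancher-monitoring-prometheus"

# 1. The PUT succeeds and its response carries the new value.
put_response = {"id": sts_id, "cpuRequest": "700m"}
store.load(put_response)
after_put = store.get(sts_id)["cpuRequest"]

# 2. A resource.change event then arrives over the websocket with the old value.
event = {"name": "resource.change", "data": {"id": sts_id, "cpuRequest": "750m"}}
store.load(event["data"])
after_event = store.get(sts_id)["cpuRequest"]

print(after_put)    # 700m
print(after_event)  # 750m
```

Because a full refresh also returns the old object, the stale value is not only a client-side caching effect. The model only shows why the UI flips back right after a successful save.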

@richard-cox @nwmac Is this just a PUT vs PATCH thing? 🤔

aalves08 (Collaborator) commented Jan 5, 2024

Moved to backend-blocked following the creation of rancher/rancher#43916.

samjustus commented

@aalves08 so this is blocked by both rancher/rancher#43916 and rancher/rancher#40712?

aalves08 (Collaborator) commented Mar 5, 2024

@samjustus at least rancher/rancher#43916 for sure. Not entirely sure about rancher/rancher#40712, but I trust Neil's judgement.

I would have to repro it all again and check the implications.

samjustus commented

@aalves08 don't worry about that, we will take a look at them

aalves08 (Collaborator) commented Mar 5, 2024

Thank you @samjustus

nwmac added the QA/dev-automation label Mar 7, 2024
gaktive modified the milestones: v2.9.0, v2.10.0 May 24, 2024
nwmac modified the milestones: v2.10.0, v2.11.0 Jul 4, 2024
gaktive modified the milestones: v2.11.0, v2.10.0 Jul 4, 2024
gaktive modified the milestones: v2.10.0, v2.11.0 Oct 2, 2024
nwmac modified the milestones: v2.12.0, v2.11.0 Nov 1, 2024