
[Bug] Saving unchanged cluster leads to update status and cluster changes #6881

Closed
rmweir opened this issue Sep 13, 2022 · 20 comments · Fixed by #7140 or #7218

Comments

@rmweir

rmweir commented Sep 13, 2022

Setup

  • Rancher version:
    2.6.6 and 2.6.7 backend pointing at latest UI for dev.

Describe the bug

Saving an unchanged cluster leads to an update status and cluster changes.

To Reproduce

  1. Create a cluster.
  2. Wait for the cluster to become active.
  3. Edit the cluster.
  4. Switch to YAML (I think this is actually optional).
  5. Hit save.

Result
Fields are added and the cluster goes into updating.

Expected Result

The cluster should remain unchanged and should not go into updating.

Additional context

The diff between cluster before save and after save:
<  generation: 27
---
>  generation: 28
25c25
<  resourceVersion: "114966"
---
>  resourceVersion: "116054"
28c28,29
<  agentImageOverride: ""
---
>  agentEnvVars: []
>  agentImageOverride: null
30,32c31,36
<  description: ""
<  desiredAgentImage: ""
<  desiredAuthImage: ""
---
>  clusterTemplateRevisionName: null
>  defaultClusterRoleForProjectMembers: null
>  defaultPodSecurityPolicyTemplateName: null
>  description: null
>  desiredAgentImage: null
>  desiredAuthImage: null
50c54,57
<    nodelocal: {}
---
>    linearAutoscalerParams: {}
>    nodelocal:
>     updateStrategy: {}
>    updateStrategy: {}
98c105,113
<  scheduledClusterScan: {}
---
>  scheduledClusterScan:
>   enabled: false
>   scanConfig:
>    cisScanConfig:
>     overrideBenchmarkVersion: rke-cis-1.5
>     profile: permissive
>   scheduleConfig:
>    cronSchedule: 0 0 * * *
>    retention: 24
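For illustration only (values assumed, not copied from the cluster above): the diff boils down to fields flipping between empty values and nulls or populated defaults. Those are different values as far as the apiserver is concerned, so the save is a real spec change and the generation bumps:

// Sketch: re-submitting "" as null, or {} as a populated default, is not a no-op.
const before = { desiredAgentImage: '', scheduledClusterScan: {} };
const after  = { desiredAgentImage: null, scheduledClusterScan: { enabled: false } };

console.log(JSON.stringify(before) === JSON.stringify(after)); // false -> generation bumps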
rmweir added this to the v2.6.9 milestone Sep 13, 2022
@cbron
Contributor

cbron commented Sep 13, 2022

Potentially a blocker, adding label to discuss.

@catherineluse
Contributor

Just reproduced it. Also noticed the same thing happens if I save the cluster from the edit config form without switching to YAML.

@catherineluse
Contributor

catherineluse commented Sep 14, 2022

@rmweir When I go to edit a cluster, click Edit as YAML, and click Show Diff, I see these changes to hostNamePrefix. This is for a 4-node cluster with two node pools:

[Screenshot: Show Diff output with the changes to hostNamePrefix]

And this is the diff for the initial save of a single-node cluster:

[Screenshot: diff for the initial save of a single-node cluster]

Should the UI wait to apply that change until after the user has changed something? Or is the problem that these values were not appropriately applied when the cluster was first provisioned?

After saving the cluster that way, if I save it again, it doesn't add more values or put the cluster in an updating state, because the values are only added if they don't already exist.

zube bot added the team/area2 Hostbusters label Sep 15, 2022
zube bot added the QA/XS label Sep 15, 2022
@catherineluse
Contributor

catherineluse commented Sep 16, 2022

Gathered a little more information:

// Node pools get a default hostnamePrefix if one isn't already set:
if ( !entry.pool.hostnamePrefix ) {
  entry.pool.hostnamePrefix = `${ prefix }-`;
}

// The edit component writes a default localClusterAuthEndpoint into the spec when it loads:
data() {
  if ( isEmpty(this.value?.spec?.localClusterAuthEndpoint) ) {
    set(this.value, 'spec.localClusterAuthEndpoint', {
      enabled: false,
      caCerts: '',
      fqdn:    '',
    });
  }
}

// Defaults applied for the drain options:
const DEFAULTS = {
  deleteEmptyDirData:              false, // Show; Kill pods using emptyDir volumes and lose the data
  disableEviction:                 false, // Hide; false = evict pods, true = delete pods
  enabled:                         false, // Show; true = Nodes must be drained before upgrade; false = YOLO
  force:                           false, // Show; true = Delete standalone pods, false = fail if there are any
  gracePeriod:                     -1, // Show; Pod shut down time, negative value uses pod default
  ignoreDaemonSets:                true, // Hide; true = work, false = never work because there's always daemonSets
  ignoreErrors:                    false, // Hide; profit?
  skipWaitForDeleteTimeoutSeconds: 0, // Hide; If the pod deletion time is older than this > 0, don't wait, for some reason
  timeout:                         120, // Show; Give up after this many seconds
};
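
A minimal sketch (simplified, with a hypothetical object rather than the actual component) of why defaulting-on-load like the data() hook above turns a no-op save into a spec change:

// The edit view mutates the spec it was handed, so even an untouched cluster
// differs from what the server has by the time Save is clicked.
const cluster = { spec: {} }; // as returned by the API

function onEditOpened(value) {
  // mirrors the data() hook quoted above
  if ( !value.spec.localClusterAuthEndpoint ) {
    value.spec.localClusterAuthEndpoint = { enabled: false, caCerts: '', fqdn: '' };
  }
}

onEditOpened(cluster);
console.log(JSON.stringify(cluster.spec)); // no longer "{}", so saving sends an update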

@gaktive
Member

gaktive commented Sep 20, 2022

Taking the blocker label off but leaving this as priority/0. This has been around for a few versions, so alas it's not a true regression, but it's still weird.

@rmweir
Author

rmweir commented Sep 22, 2022

@catherineluse after waiting a while the fields get overwritten and saving does the same thing again. I'm not sure; we would have to compare against the behavior before this started happening. I think the UI should not be changing anything about the cluster; if there is an empty/nil value, it should be left as-is.
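
One way to get that behavior would be to compare the edited spec against the original and skip the request when nothing changed; this is a sketch only, assuming lodash's isEqual and an illustrative save() method, not the dashboard's actual code:

import isEqual from 'lodash/isEqual';

// Sketch: leave empty/nil values alone and skip no-op saves entirely.
function saveCluster(original, edited) {
  if ( isEqual(original.spec, edited.spec) ) {
    return Promise.resolve(original); // nothing changed: don't send an update
  }

  return edited.save(); // assumed model method; name is illustrative
}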

@catherineluse
Contributor

@rmweir What's the Rancher version where you remember it working properly?

@rmweir
Author

rmweir commented Sep 22, 2022

@catherineluse I don't know which Rancher version worked properly. I just know the issue is present in 2.6.6 and later.

@catherineluse
Contributor

catherineluse commented Sep 23, 2022

Here's the behavior for editing an RKE2 cluster in 2.6.3:
[Screenshot: edit diff for an RKE2 cluster in 2.6.3]

and 2.6.0:
[Screenshot: edit diff in 2.6.0]

I couldn't reproduce it for RKE1.

Also, after saving the RKE2 cluster, it sometimes went into updating mode when saving from the form view, but not from the YAML view if the YAML diff showed no changes.

gaktive modified the milestones: v2.6.9, v2.7.0 Sep 27, 2022
@gaktive
Member

gaktive commented Oct 4, 2022

@catherineluse indicated that the backend team couldn't reproduce this consistently, though they do see some issue. They've pushed their corresponding issue to 2.7.1, so we'll do the same.

gaktive removed the priority/1 label Oct 4, 2022
@mantis-toboggan-md
Member

Sure, I can look, @thaneunsoo; did you see this on RKE1, RKE2, and K3s?

@thaneunsoo

@mantis-toboggan-md yes, for RKE1, RKE2, and K3s.

@mantis-toboggan-md
Member

mantis-toboggan-md commented Oct 14, 2022

Interesting: RKE1 provisioning is done through the old UI, so either the same problem exists in both UIs or there is a separate backend problem with editing without changing anything. I'm not seeing the behaviour you are on my own setup, but it is older; I'm making a fresh one and looking into this further now.

@mantis-toboggan-md
Member

@thaneunsoo I was able to reproduce this behavior with RKE2 and K3s; a fix is now merged: #7218

What I saw specifically was that an RKE2 or K3s cluster would go into an updating state after being saved from a different view than it had previously been saved in. For example: provision a cluster without hitting 'view as yaml', then go to 'edit config', click 'view as yaml', and save; the cluster updates despite you not changing anything. Go to 'edit config' again, click 'view as yaml', and save; the cluster does not change state. Then go to 'edit config' and save without clicking 'view as yaml'; the cluster goes into updating again... you get the idea.

I'm having a harder time reproducing this consistently with RKE1. Could we file a separate issue for that, @thaneunsoo?
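
For illustration (shapes assumed, not taken from the code): if the form view and the YAML view serialize defaults differently, alternating between them always produces a "change" even though the user edited nothing:

// Sketch: two serializations of the same untouched cluster.
const savedFromForm = { agentEnvVars: [], scheduledClusterScan: { enabled: false } };
const savedFromYaml = { scheduledClusterScan: {} };

// false, so the next save from the other view puts the cluster into Updating
console.log(JSON.stringify(savedFromForm) === JSON.stringify(savedFromYaml));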

@thaneunsoo

Test Environment:

Rancher version: v2.7-head 9f1e043
Rancher cluster type: HA
Docker version: 20.10

Downstream cluster type: RKE2/K3s


Testing:

  1. Provision RKE2/K3s cluster
  2. Wait for cluster to become active
    a. Click Edit Config > Save
    b. Click Edit Config > View as YAML > Save

Results
Clusters do not go into an Updating state. I did notice that the first time I performed either step 2a or 2b, the cluster would go into In Progress for a split second and then return to Active; if the user took their eyes off the screen for a second, they would miss it. This seems to be okay, since an edit is in progress. Closing this issue as fixed.

zube bot closed this as completed Oct 19, 2022
@snasovich
Contributor

@gaktive @nwmac, we will need this fixed in 2.6.10 as well. I tried to create a backport, but it looks like the backport/forwardport bot is not available in this repo.
Could you please create a backport to ensure this is fixed in 2.6.10?
cc: @sowmyav27
