Updating the in-cluster autopilot config while a helm chart is being created results in the chart being created twice #4047
Comments
I would expect this to be resolved on single-node if we added a sync.Mutex to
|
I tested adding a mutex to the But I'm not sure how that would happen besides a race where |
I tried an alternate solution, that being to call upgrade if a I got |
I'm becoming less sure that this is triggered by making an update to the cluster config while the chart apply is still ongoing - I should look at the list of things that can trigger a reconcile 😄 |
This may be from running Edit: looks like both restarting the service and editing the config can trigger the same bug |
What follows is a theory for a possible cause of this race condition: An install of a Helm Chart only happens when the What would happen if something patched the Helm Chart spec section while the Helm install is ongoing? That would trigger a Conflict when attempting to save the Chart Status in the cluster, so the Status.ReleaseName would not be updated, and in the next reconcile the controller would attempt to install the chart once again. I inspected the k0s logs and noticed that this error happens quite often with different charts:
This seems to indicate that SOMETHING has changed the Chart object while the chart was being reconciled by the controller and the Looking at the logic used to update those chart objects, it seems like this situation is possible, as it just dumps them in a directory and something else loads them into the cluster. So, in other words:
Does this make sense? |
I managed to reproduce this by adding a new chart to the Cluster Config and then just after saving it I edited the manifest file for the chart in the
{
  "error": "can't install loadedChart `memcached`: cannot re-use a name that is still in use",
  "updated": "2024-02-15 13:54:58.207462269 +0000 UTC m=+386.995226471",
  "valuesHash": "201c25ad8c659602ec3934d0b3153586f112da4406a2b683587aef4a76390beb"
}
Inspecting the k0s log, this was logged:
It might be the case that the theory is actually right. |
@jnummelin I have raised a possible workaround for the problem on #4064, please let me know what you think. |
This is also reproducible with 1.29 and 1.28 by merely adjusting the chart values via the config object and applying k0s.yaml again. For example:
And then adding a value to the whoami-app chart (ingressClassName was missing above):
Applying this config does not work as I would expect (
|
Closing this, as #4064 has been merged and backported. Feel free to ping here or open another issue if the problem persists. |
Before creating an issue, make sure you've checked the following:
Platform
Version
v1.29.1+k0s.1
Sysinfo
k0s sysinfo
Machine ID: "ae13f7395464d07eda6ba68f7e94b56d64b68c4b3527b66ccb7ab4868f9c13b1" (from machine) (pass)
Total memory: 15.6 GiB (pass)
Disk space available for /var/lib/k0s: 24.3 GiB (pass)
Name resolution: localhost: [::1 127.0.0.1] (pass)
Operating system: Linux (pass)
Linux kernel release: 6.2.0-1019-azure (pass)
Max. file descriptors per process: current: 1048576 / max: 1048576 (pass)
AppArmor: active (pass)
Executable in PATH: modprobe: /usr/sbin/modprobe (pass)
Executable in PATH: mount: /usr/bin/mount (pass)
Executable in PATH: umount: /usr/bin/umount (pass)
/proc file system: mounted (0x9fa0) (pass)
Control Groups: version 2 (pass)
cgroup controller "cpu": available (is a listed root controller) (pass)
cgroup controller "cpuacct": available (via cpu in version 2) (pass)
cgroup controller "cpuset": available (is a listed root controller) (pass)
cgroup controller "memory": available (is a listed root controller) (pass)
cgroup controller "devices": available (device filters attachable) (pass)
cgroup controller "freezer": available (cgroup.freeze exists) (pass)
cgroup controller "pids": available (is a listed root controller) (pass)
cgroup controller "hugetlb": available (is a listed root controller) (pass)
cgroup controller "blkio": available (via io in version 2) (pass)
CONFIG_CGROUPS: Control Group support: no kernel config found (warning)
CONFIG_NAMESPACES: Namespaces support: no kernel config found (warning)
CONFIG_NET: Networking support: no kernel config found (warning)
CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: no kernel config found (warning)
CONFIG_PROC_FS: /proc file system support: no kernel config found (warning)
What happened?
A helm chart was 'created' twice, resulting in an error. I believe this happened because the clusterconfig object was updated while the helm chart was being initially created. (Dynamic config is being used here)
Steps to reproduce
Expected behavior
Charts are not applied twice in parallel.
Actual behavior
Charts are applied twice in parallel, resulting in helm errors.
A chart that has a status.error
can't install loadedChart `ingress-nginx`: cannot re-use a name that is still in use
cannot be further updated by k0s, because it will attempt to recreate the chart once again.
Screenshots and logs
the clusterconfig, with errors:
the installed helm charts (the ingress-nginx chart has the error):
Additional context
No response