HelmCharts red herring when k3s cluster supplies CRD first name conflict #831
Comments
I'm really confused about what these logs mean:
Why are 2 artifacts deleted for some charts? |
I do not think this is a source-controller issue, but possibly an issue with the helm-controller controlling the life-cycle of the chart. Can you provide a precise step-by-step instruction of the changes in combination with the changes to the |
It is reported on the HelmChart conditions, which users are often unaware of. Hence all users need to learn how to debug HelmReleases: https://fluxcd.io/docs/cheatsheets/troubleshooting/#how-to-debug-not-ready-errors
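As an illustration of reading those conditions, here is a minimal Python sketch, not something the Flux project ships. It assumes `kubectl` is on the PATH and that the object follows the usual `status.conditions` layout; the helper names `ready_summary` and `fetch_helmchart` are hypothetical. The resource is addressed by its fully qualified name so that the lookup does not depend on how a short name resolves.

```python
import json
import subprocess


def ready_summary(resource):
    """Return (status, reason, message) of the Ready condition, or None if absent."""
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status"), cond.get("reason"), cond.get("message")
    return None


def fetch_helmchart(name, namespace):
    """Fetch a Flux HelmChart as a dict via kubectl (needs cluster access)."""
    out = subprocess.run(
        ["kubectl", "get", "helmcharts.source.toolkit.fluxcd.io", name,
         "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

With cluster access, `ready_summary(fetch_helmchart("helm-oci-podinfo", "helm-oci"))` would summarize the chart that appears later in this thread. If the HelmChart object does not exist, `kubectl` exits non-zero and `fetch_helmchart` raises `subprocess.CalledProcessError`.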
So I could find the condition if I had access to a HelmChart, but it either hasn't been created or was garbage collected on either side of the error, making it impossible to read the conditions:
helmrelease:
This reproduces instantly without the "private repo step" on Flux 0.31.3+ oci dev work, Helm controller is at:
After installing those two resources on the cluster, first the helmrepository then the helmrelease, with a separate terminal running |
SC does not delete HelmCharts, this is a helm-controller bug. I'll let @hiddeco move this issue if that's the case. |
What are the helm-controller logs telling you? I find it unlikely the chart is instantly garbage collected, but rather think it is not created at all for some reason (which may have to do with source-controller being an RC?).
I can't reproduce this with the latest SC OCI build:

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: podinfo
  namespace: helm-oci
spec:
  interval: 30s
  timeout: 60s
  type: oci
  url: oci://ghcr.io/kingdonb/podinfo/helm
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: podinfo
  namespace: helm-oci
spec:
  chart:
    spec:
      chart: podinfo
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: podinfo
  interval: 1m0s
---
apiVersion: v1
kind: Namespace
metadata:
  name: helm-oci

The cluster was bootstrapped with the latest CLI from the OCI branch using

$ k -n helm-oci get hr
NAME      AGE   READY   STATUS
podinfo   11s   True    Release reconciliation succeeded
$ k -n helm-oci get helmchart
NAME               CHART     VERSION   SOURCE KIND      SOURCE NAME   AGE   READY   STATUS
helm-oci-podinfo   podinfo   *         HelmRepository   podinfo       21s   True    pulled 'podinfo' chart with version '6.1.14'
$ flux version
flux: v0.0.0-oci-09e7d00-1657707162
helm-controller: v0.22.1
image-automation-controller: v0.23.4
image-reflector-controller: v0.19.2
kustomize-controller: oci-7681bda9
notification-controller: v0.24.0
source-controller: oci-102e9a94
It seems to be an issue on k3s only, I'm not sure what causes this but I am using vcluster with host k8s at I can file it upstream with k3s if I can isolate it well enough to explain it in an issue for them, but it seems likely to be related to either k3s or the specific version of k3s that I'm using. I'll try some different permutations. |
I think it would be good to know what precisely appears to (not) happen in both controllers versus the others. The helm-controller logs would probably tell more, did you e.g. see a reconciliation error at the end of the HelmRelease reconciliation related to the chart? |
k3s has its own helm-controller. Maybe it's interfering.
This is a bare k3s (vcluster) with no services other than coredns I've captured a clean set of logs with the full history when the OCI Repository is private:
and I'll attach another set of clean logs based on when the chart OCI repo is public (next comment) – the state with the missing helmchart when reconciliation fails:
When the repository is public, this is the log:
In neither case (public or private, success or failure) do I actually see any HelmChart resources created on the cluster |
Just a note after reviewing these logs, this time the notice about garbage collecting was not posted. (But the HelmChart resources are still not shown as created on the cluster.) |
It did not get any tag, so not having a chart makes sense. But then the helmrepository is fine, with the same token? What version of SC are you running?
@kingdonb Could it be something with It is kinda weird that you never see the HelmChart even before any garbage collection is done. It should be present at least when the repository is private and it has the error. Could something else be deleting the helm chart? |
The normal behavior is to populate a HelmChart resource with the reason for failure in its conditions. This is the behavior on a regular kind cluster, it's different for some reason on this k3s cluster. If the HelmChart is not created then the user cannot read failed conditions on it. There is no token, this is a repo that was accidentally private. It works when marked public, but again the HelmChart is not created on the cluster for some reason. This is a build from the |
Sorry to waste so much time on this, there is no issue:
@souleb @somtochiama you had it right, this is a conflict with the helm controller provided by k3s, they have installed some CRDs on my vcluster that I didn't ask for, and when I Sorry for the noise everyone. |
Might still be good to have this noted somewhere as a K3s specific gotcha around |
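A check for this kind of name conflict can be sketched in a few lines. The following Python is a minimal illustration, not part of Flux or k3s: it groups the output of `kubectl api-resources -o name` by plural resource name and reports any name served by more than one API group. The function names are hypothetical, and the `helm.cattle.io` group mentioned in the comment is my understanding of what the k3s helm controller registers, so treat it as an assumption.

```python
import subprocess
from collections import defaultdict


def find_name_collisions(resource_names):
    """Map each plural resource name to its API groups, keeping only
    names that are served by more than one group."""
    groups = defaultdict(set)
    for line in resource_names:
        line = line.strip()
        if not line:
            continue
        # "helmcharts.helm.cattle.io" -> ("helmcharts", "helm.cattle.io")
        # Core resources such as "pods" have an empty group.
        plural, _, group = line.partition(".")
        groups[plural].add(group)
    return {p: sorted(g) for p, g in groups.items() if len(g) > 1}


def list_api_resources():
    """Return resource names as printed by kubectl (needs cluster access)."""
    out = subprocess.run(
        ["kubectl", "api-resources", "-o", "name"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()
```

With cluster access, `find_name_collisions(list_api_resources())` lists the ambiguous names. When `helmcharts` shows up there, querying the fully qualified name, for example `kubectl get helmcharts.source.toolkit.fluxcd.io -A`, avoids depending on which group the short name resolves to.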
Follow-up to cover common likely issue noticed while attempting to debug a red-herring issue in fluxcd/source-controller#831 Signed-off-by: Kingdon Barrett <kingdon@weave.works>
I was testing an OCI Chart Repository:
and noticed I forgot to make it public. The errors were confusing:
There is no indication that there was an auth failure until I check the logs of source-controller:
OK, there's the error. (An aside: I think this should be reported somewhere as a condition)
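Finding that one error line in a busy controller log can be scripted. This is a minimal Python sketch under the assumption that the controller writes one JSON object per line with a `level` field, which is my understanding of the default Flux log format; `error_entries` is a hypothetical helper name.

```python
import json


def error_entries(log_lines):
    """Yield parsed log entries whose level is "error"; skip non-JSON lines."""
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict) and entry.get("level") == "error":
            yield entry
```

One way to use it is to pipe `kubectl -n flux-system logs deploy/source-controller` into a script that passes `sys.stdin` to `error_entries`; the `flux-system` namespace is the usual install location and is assumed here.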
I made the chart public. I'm describing the sequence of events I tested because I am not certain which steps will be important to reproduce the error.
The release succeeded, but I was waiting to see the HelmChart get created and never did observe it. The garbage collector has deleted it already. Not sure why; it seems like a bug! The chart does work, but it has apparently been garbage collected in error. Subsequent log messages do not show the chart being recreated and garbage collected again, but it is apparently remembered by the source controller: