error GitRepository/flux-system.flux-system #87
Comments
Hi @Analect, I think I may have an idea of why you ran into this situation, but I need to confirm first. Let me try to reproduce this issue and get back to you. Thanks. |
@Analect Found the issue. A fix is pending via digitalocean/container-blueprints#18. |
Thanks @mtiutiu-heits . I was probably wrong when I said the flux-system components were not created. On checking
However the other Also, I tried running
|
Thanks @mtiutiu-heits ... are there any manual steps I can take to fix and reload/apply to the cluster? |
@Analect I don't know if it's OK to use both methods, meaning bootstrapping Flux CD via Terraform and then via the flux CLI. Did you uninstall Flux CD first via: For fixing it manually, please follow the steps below:
The above value for
Let me know if it helps. P.S.: Sorry for replying with both GitHub accounts (I'm a contractor, and one account is associated with the company that I work for). I forgot to switch accounts, chrome profiles, etc. Too many things to do sometimes 😄 . |
Thanks @v-ctiutiu
|
@Analect You can also override the Terraform module value for that variable in your main.tf file, like this (notice the last line):
|
I suspect that you now have a Flux CD environment with mixed components, including CRDs left over from the old installation. I see that it complains about the CRD version. What works best, if it's not a big issue for you, is to uninstall Flux CD completely via Let me do this first in my current setup, and see if overriding the Thanks. |
@Analect Can you try this and let me know if it works:
|
Tried
So you suggest running
Does that require me to tear down the existing cluster? |
OK. Ran:
It recreated the flux-system pods, but on running
I ran
|
To start fresh, and without deleting the whole cluster you need to:
Terraform should see the differences and re-create the missing parts only, meaning Flux CD components (if you still have the state file in your working directory, or on the S3 bucket). Let me know how it goes and if it fixes your issue. Thanks. |
@v-ctiutiu followed your instructions:
The namespace
This appeared to work. I notice this at the end of the output on running that script.
I reran:
... but running
|
Ok I reproduced your issue, and it seems that the main TF module from the I assume that you have locally the latest version for the flux CLI, right ? (or at least a very recent version) If so, the So, I uninstalled
After planning again and then applying, I got both resources:
And the output is:
Please test and let me know if it works for you as well. If it does, then I will create another PR for the Thanks. |
@v-ctiutiu . Thanks for your efforts with this. Having gone through all your steps above, unfortunately I still can't get this kustomization/flux-system to 'show up'.
Back in my github repo, The upgrade of the fluxcd provider to 0.8.1 required me to run Sorry, I realise I'm fumbling around a bit blindly here, but would be good to get this running as per starter-kit demo. Tks. |
I think this might be relevant to what is going wrong. "When using the Terraform provider for Flux, you have to manually remove the v1beta1 Kustomization from the TF state" with: I got:
When I run
I can see |
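For anyone following along, here is a minimal sketch of how the Terraform state could be inspected before removing anything from it. It only lists state addresses that mention a Kustomization. The helper name `filter_kustomization_addresses` is made up for illustration, and the exact address to remove (if any) depends on the module in use, so it is not guessed here.

```shell
# Keep only the state addresses that mention a Kustomization (case-insensitive).
# Hypothetical helper, not part of Terraform or Flux.
filter_kustomization_addresses() {
  grep -i 'kustomization' || true
}

# Only query Terraform when the CLI is installed; run this from the
# working directory that holds the state (or its backend configuration).
if command -v terraform >/dev/null 2>&1; then
  terraform state list 2>/dev/null | filter_kustomization_addresses
fi
```

Any address printed this way is a candidate to compare against what `kubectl get kustomizations -A` reports on the cluster.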
First of all - Great job! These are my latest notes and findings, after doing some more debugging and re-reading your replies. Before moving on with other explanations, let me emphasize two important things:
So far so good, but not quite. Sometimes I hate state machines, especially when more than one is present and they need to be kept in sync. The problem is that if you use some other tool to alter one of the two state machines externally, the other one is not aware of the changes. In your case, Terraform is not aware of the fact that you bootstrapped Flux CD again via the CLI ( Before moving further, what I did was to re-create the initial scenario that you ran into:
I reproduced your issue - great! Now, what I did was to list the supported API versions for Flux CD:
kubectl api-versions | grep flux
And I got:
Looking at the above, you can see that
kubectl get kustomizations -A
# The actual result:
NAMESPACE     NAME          READY   STATUS                                                              AGE
flux-system   flux-system   True    Applied revision: main/1b43faf02da567e415aae57a7ecda865fd5b8063   4m46s
But then the question remains: why What I did next was to run Before I move on, let me quote the command and the output that you pasted in a previous reply:
What happens after you run the above is that the flux client (or the CLI counterpart of Flux CD) creates new API definitions for the Flux components in your cluster, alongside the existing ones. In your case, the On the other hand, Terraform is not aware of this change, and it thinks that So, after running
flux get all
# The actual result:
NAME READY MESSAGE REVISION SUSPENDED
gitrepository/flux-system True Fetched revision: main/685e... main/685ea... False

NAME READY MESSAGE REVISION SUSPENDED
kustomization/flux-system False apply failed: The CustomResourceDefinition "kustomizations.kustomize.toolkit.fluxcd.io" is invalid: status.storedVersions[1]: Invalid value: "v1beta2": must appear in spec.versions main/1b43faf02da567e415aae57a7ecda865fd5b8063 False
Nothing new so far, but if I run
Looking at the above output, you can see that now I have two versions for Before moving further, I'm curious what the output of running the command below is in your environment:
kubectl get kustomizations -A
To stay consistent with the Starter Kit, you need to downgrade the flux client (or the CLI counterpart). As you already pointed out in your last reply, by mentioning the Upgrade Flux to the v1beta2 API discussion from the official Flux CD repo, this is an upgrade scenario issue. Currently, Getting back to the main issue, the only viable solution that I see now is to downgrade the Flux client. I still don't know why it doesn't create the new On our end, I should add a note about this in the prerequisites section for the affected chapter (meaning to use an older flux client version). On the other hand, we plan to upgrade all the Starter Kit components very soon, so an upgrade section for each chapter is necessary after all. To fix your current installation this time (hopefully), please follow the steps below:
After I ran the above steps it started to work immediately. Let me know if it does the same for you. Although I don't have a final answer to your last question, I hope that I was able to give you some hints about why it behaves the way it does now. Thanks a lot for your patience and time. |
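The `storedVersions` error quoted above can be checked directly against the CRD. A minimal sketch, assuming `kubectl` access to the affected cluster: it compares the versions recorded in `status.storedVersions` with the versions declared in `spec.versions` and prints any stored version that is no longer declared. The helper name `missing_versions` is made up for illustration.

```shell
# Print every version from the first list that is absent from the second.
# Both arguments are space-separated version names. Hypothetical helper.
missing_versions() {
  stored="$1"
  served="$2"
  for v in $stored; do
    case " $served " in
      *" $v "*) ;;        # version is still declared in spec.versions
      *) echo "$v" ;;     # stored but no longer declared
    esac
  done
}

# Only query the cluster when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  crd="kustomizations.kustomize.toolkit.fluxcd.io"
  stored=$(kubectl get crd "$crd" -o jsonpath='{.status.storedVersions[*]}' 2>/dev/null || true)
  served=$(kubectl get crd "$crd" -o jsonpath='{.spec.versions[*].name}' 2>/dev/null || true)
  missing_versions "$stored" "$served"
fi
```

If this prints `v1beta2`, the CRD currently applied declares fewer versions than the cluster has already stored objects under, which matches the "must appear in spec.versions" message.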
@v-ctiutiu . I greatly appreciate your efforts to explain what might be going on. I can see the power of the TF/flux combination, in terms of managing complexities in a Kubernetes cluster, but using these tools can also introduce a whole new set of complications! Ahead of running your steps above, I ran these commands, as per your explanation. It seems kustomizations resources were absent from the cluster ... and maybe that was down to me over-riding things with my
On running those steps above and downgrading the Flux CLI to version 0.17, I now get this on calling
And now I see:
I suppose what intrigues me a bit is that with this latest uninstall/init/plan & apply does result in an update to the If one gets into a tangle like this in future, is it ever a solution to delete either the flux state in the github repo or the TF state in the digitalocean spaces, as a means of resetting? |
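Since the fix in this thread came down to the flux CLI version, a small guard like the one below could be run before touching the cluster. This is only a sketch: the helper `version_has_prefix` is made up for illustration, `0.17` is the series mentioned above, and the parsing assumes `flux --version` prints a line of the form `flux version X.Y.Z`.

```shell
# Succeed when a version string such as "0.17.2" belongs to the wanted
# major.minor series such as "0.17". Hypothetical helper.
version_has_prefix() {
  case "$1" in
    "$2".*|"$2") return 0 ;;
    *) return 1 ;;
  esac
}

wanted="0.17"   # series mentioned in this thread; adjust as needed

# Only check when the flux CLI is installed.
if command -v flux >/dev/null 2>&1; then
  # Assumption: the third field of the output is the version number.
  installed=$(flux --version | awk '{print $3}')
  if version_has_prefix "$installed" "$wanted"; then
    echo "flux CLI $installed is in the $wanted.x series"
  else
    echo "flux CLI $installed is not in the $wanted.x series"
  fi
fi
```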
@Analect To be honest, I don't have a real answer for it. What has happened here in the end, is more or less a migration issue, I assume. What I don't have an answer for yet is (because we have not verified these scenarios - was a little bit out of scope for the
What I found on the Maybe @stefanprodan, who is one of the main contributors to Flux CD, can give us some hints about what happened, or how to prevent this from happening in the future? To summarize:
My final conclusion and to avoid all the above, is that the As a side note, we
flux = {
  source  = "fluxcd/flux"
  version = "~> 0.2.0"
}
@stefanprodan - If someone runs into this situation in the future, is it possible to Thanks a lot. |
I have been following the automation tutorial. https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers/tree/main/15-automate-with-terraform-flux
I've re-run it a few times (recreating clusters), but it seems to get stuck on creating all the necessary flux-system components.
When I run `flux get all`, then I get:
And `flux logs` gives:
It seems the various git credentials added in the `main.tf` file are right, since files got added to the `git_repository_sync_path` that I supplied. However, the logs above suggest a related problem, where it can't access the GitRepository for other purposes.
In the GitHub PAT, I granted these permissions in scope. Maybe that's not sufficient?
If I look in `.terraform/modules/create-doks-with-terraform-flux/provider.tf` I see:
There is no base64 encoding/decoding suggested here.
Googling here suggests that if a GitHub user is a person rather than an org, then the `--personal` flag should be passed. I'm not sure if that's relevant here and if that is handled in this starter kit. It also suggests checking the content of the flux-system secret on the cluster, which should equate to an encoded GitHub PAT supplied in the `main.tf`. It's not clear to me how that is best done.
Any thoughts on how I might get over this stumbling block? Tks
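On the question of how to check the content of the flux-system secret: Kubernetes stores Secret values base64-encoded, so one possible approach is to list the keys the Secret holds and then decode the one of interest. This is a sketch only. The key name `password` is an assumption (the actual keys depend on whether the setup uses a token or an SSH deploy key), and `decode_secret_value` is a made-up helper name.

```shell
# Decode a base64 value read from stdin.
# Hypothetical helper wrapping the standard base64 tool.
decode_secret_value() {
  base64 -d
}

# Only query the cluster when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  # Show the Secret's data map so the available key names are visible.
  kubectl -n flux-system get secret flux-system -o jsonpath='{.data}' 2>/dev/null || true
  echo
  # Assumption: "password" is one of the keys; replace it with a key listed above.
  kubectl -n flux-system get secret flux-system -o jsonpath='{.data.password}' 2>/dev/null | decode_secret_value || true
  echo
fi
```

The decoded value can then be compared by eye with what was supplied in `main.tf`. Avoid pasting it into the issue, since it is a credential.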