Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

forceGenerateTLS does not work for an upgrade #51

Closed
victornoel opened this issue Jan 27, 2020 · 3 comments · Fixed by #53
Closed

forceGenerateTLS does not work for an upgrade #51

victornoel opened this issue Jan 27, 2020 · 3 comments · Fixed by #53
Assignees

Comments

@victornoel
Copy link
Contributor

When doing an upgrade, if I set the config.forceGenerateTLS value to true, I get the following error:

Error: UPGRADE FAILED: secrets "pomerium-authenticate-tls" already exists

I suppose a workaround is to delete manually the secret beforehand, but we should:

  • at least document it
  • at best provide a safer upgrade path

I'm not exactly clear how to do that though.

I think the problem is maybe that when config.forceGenerateTLS is set to false, then maybe the resource should be generated so that helm continue to keep track of it? I'm not sure…

@travisgroth travisgroth self-assigned this Jan 28, 2020
@travisgroth
Copy link
Contributor

I thought we had that particular state documented, but it doesn't seem that way.

@victornoel Out of curiosity, are you using helm2 or helm3? There does appear to be a hook setting that takes care of this, which is something I learned about recently. It should be the default behavior in helm3. https://helm.sh/docs/topics/charts_hooks/#hook-deletion-policies

@victornoel
Copy link
Contributor Author

@travisgroth interesting!

So with helm 3 this should fix the problem : I am currently using helm 2 but we are in the process of migrating to helm 3.

There is another problem with using hooks for those resources generated via the template (i.e., each run of helm will change the resource): once you disabled forceGenerateTLS, when you re-run helm, it cleans up its "memory" of the resources it manages (basically it will remove the resources from its knowledge, but not from the cluster).

I think it's not very great it terms of declarativeness, another solution is the one used in some charts: they use a job resource hook that generates the certs from within the cluster. For example using bash, openssl and kubectl can be enough…

It's more complex to write, but then you don't have this strange situation where resources are to be un-generated by helm: if it's a job, once it is run, it will stay here. Then you need to find the right way to detect what triggers regeneration: via manual job deletion, manual job restart, or even the job is run everytime (and deleted on success by helm) but certs only generated if not already present, etc, etc, etc. I don't know what is the best solution though.

@travisgroth
Copy link
Contributor

So with helm 3 this should fix the problem : I am currently using helm 2 but we are in the process of migrating to helm 3.

We are as well but it's been...rough. Good luck! :)

re-run helm, it cleans up its "memory" of the resources it manages (basically it will remove the resources from its knowledge, but not from the cluster

Yes. It's very funny looking in the diff. I looked into that and diff blames helm core, and helm core blames diff. It is cosmetic in day to day use but I agree that it isn't ideal.

another solution is the one used in some charts: they use a job resource hook that generates the certs from within the cluster

I'm not sure that offloading to a job even addresses the whole problem set you're observing. The case we're facing during this upgrade path is going to need you to take some additional step for the upgrade. The closest helm gets to "create if doesn't exist" is the hooks we're using now.

Punting to a non-helm resource does allow for less confusing ergonomics around the conditional...but if you're regenerating, it'll still need to be informed. Cleanup is still a problem as well, since you're creating objects outside tiller. I haven't played around with directly setting ownerReferences in things like Secret but maybe that's part of the solution.

Short term, I'd like to at least put the hook annotation in to make this slightly less annoying to use with helm 2.

Medium term, I think the if-not-exists logic can be moved into the operator, if it doesn't show up in helm 3. I anticipate more logic going into code over time. We could probably also add in either docs or explicit flags for cert-manager, which might be good for a non-trivial portion of the user base.

Remaining concerns

  • Regeneration. I haven't deeply thought about how this might work internally, but a more declarative method could be found by using a version or generationID option rather than a simple true/false. Simplistically, you set a flag to current epoch-second or some other monotonic number of your choice. Easy to see if you've already done the current work. If you need it redone, increase the number.

  • Garbage collection. I haven't fully done the diligence on the kubernetes garbage collection and finalization ecosystem, but I feel like there's a solution in there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants