Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix no RESOURCE with the name NAME found #4871

Merged
merged 4 commits into from
Apr 5, 2019

Conversation

distorhead
Copy link
Contributor

@distorhead distorhead commented Nov 1, 2018

This is the fix for only one particular, but important case.

The case when a new resource has been added to the chart and there is an error in the chart, which leads to release failure. In this case after first failed release upgrade new resource will be created in the cluster. On the next release upgrade there will be the error: no RESOURCE with the name NAME found for this newly created resource from the previous release upgrade.

The root of this problem is in the side effect of the first release process. Release invariant says: if resouce exists in the kubernetes cluster, then it should exist in the release storage. But this invariant has been broken by helm itself -- because helm created new resources as side effect and not adopted them into release storage.

To maintain release invariant for such case during release upgrade operation all newly successfully created resources will be deleted in the case of an error in the subsequent resources update.

@helm-bot helm-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 1, 2018
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from a740989 to 368f85d Compare November 1, 2018 16:36
@helm-bot helm-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 1, 2018
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from 368f85d to 000296d Compare November 1, 2018 16:37
@helm-bot helm-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 1, 2018
@bacongobbler
Copy link
Member

This seems like a good approach to the problem. I'll start testing this with a few charts tomorrow. Thanks for the fix, @distorhead!

Copy link
Member

@bacongobbler bacongobbler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one small change, otherwise works as expected when tested manually.

pkg/kube/client.go Outdated Show resolved Hide resolved
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from 000296d to e675db2 Compare November 12, 2018 19:06
@helm-bot helm-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 12, 2018
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from e675db2 to d01441b Compare November 12, 2018 19:29
@helm-bot helm-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 12, 2018
@bacongobbler
Copy link
Member

I'll bring this ticket up in today's dev call since I'm not entirely sure if there may be some cases where we do not want to delete newly created resources.

@mattfarina
Copy link
Collaborator

Note, AppVeyor is currently always failing. We can ignore this as it's unrelated to the PR.

@bacongobbler
Copy link
Member

^^ yes thank you :) currently working on that in #4897

Copy link
Member

@michelleN michelleN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change. It is incredibly valuable functionality but I propose we put this functionality behind a feature flag on helm upgrade

@bacongobbler
Copy link
Member

bacongobbler commented Nov 15, 2018

@michelleN I think that's a great idea. I'm also feeling incredibly wary about introducing anything related to resource deletion by default, especially on upgrade. Hiding this functionality behind a feature flag would allow users to opt into this (destructive) behaviour with little risk for users who aren't hitting this edge case. Good suggestion. 👍

@distorhead
Copy link
Contributor Author

@bacongobbler @michelleN

Ok, I see your point. What option name do you propose? Does cleanupOnFail bool sound good?

Some thoughts:

  1. Validation of resources before applying changes will solve the problem. This way user errors in templates will not result in inconsistent state of release.
  2. Implemented in this PR cleanupOnFail option will be useless for users, who already hit this error, because this option only prevents inconsistent state. But it will not resolve already existing inconsistent state, when it occurred suddenly.
  3. Option like recreateIfNotFoundInRelease for upgrade to delete resources instead of error No RESOURCE with the NAME found will also solve the problem. This way already existing inconsistent state will be resolved. But there will be a problem with this approach: resource may exists in kubernetes not because helm created it, but because it has been created by user or another software. So it is more dangerous option actually.
  4. Simply ignoring No RESOURCE with the NAME found as in https://github.com/jaredallard/helm/commit/79ac6a60f9d7c69bb18e010f54c7f5964498c06b may result in inconsistent state. Because already existing resouce in kubernetes may be in some random state and we should generate a patch, which will bring this resource to the state, that has been declared in the chart manifest. Also helm currently does not implement 3-way merge. It is better to implement 3-way merge then, which will solve not only this case.

In total:

  • cleanupOnFail option is the fastest way for now to solve the problem (despite the indicated problem in (2)).
  • Validation is the most safest way, but I do not found a fast way to implement it yet. I think validation is safe enough to be enabled by default.
  • 3-way merge is the most cool way to solve this and many other problems. But it requires more radical changes. Will 3-way merge be implemented in helm 3?

@bacongobbler
Copy link
Member

bacongobbler commented Nov 16, 2018

* Will 3-way merge be implemented in helm 3?

We're currently looking at the resource upgrade strategy for Helm 3. It may be a 3-way merge patch, or it could be something different. Either way we are certainly looking into ways to improve the current upgrade strategy, but we don't have any news or reports to share at this time.

helm upgrade --cleanup-on-fail=true definitely sounds like the right approach. That would be a good step forward for this PR. Let us know if you run into any bumps and we can take a look. :)

Copy link
Member

@bacongobbler bacongobbler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing "approved" status for the time being

@helm-bot helm-bot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Nov 19, 2018
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from 13af1a1 to d6fa8da Compare March 22, 2019 18:16
@helm-bot helm-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 22, 2019
@distorhead
Copy link
Contributor Author

@fbcbarbosa Merged your addition, thanks!

Also rebased whole branch on the upstream master.

Signed-off-by: Timofey Kirillov <timofey.kirillov@flant.com>
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from b871cc6 to b252376 Compare March 22, 2019 18:27
@helm-bot helm-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 22, 2019
Copy link
Member

@bacongobbler bacongobbler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs another maintainer to review, but code LGTM.

@thomastaylor312
Copy link
Contributor

I am working on reviewing this and testing manually

Copy link
Contributor

@thomastaylor312 thomastaylor312 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! Just one nit that won't block merging. I'll come back and give it a green check once I do some manual testing with it

// and creates resources that don't already exists, updates resources that have been modified
// in the target configuration and deletes resources from the current configuration that are
// not present in the target configuration.
//
// Namespace will set the namespaces.
func (c *Client) Update(namespace string, originalReader, targetReader io.Reader, force bool, recreate bool, timeout int64, shouldWait bool) error {
func (c *Client) UpdateWithOptions(namespace string, originalReader, targetReader io.Reader, opts UpdateOptions) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads much cleaner and avoids breaking API changes when we need to add other things. Thanks for the great refactor!

pkg/kube/client.go Show resolved Hide resolved
…update

Signed-off-by: Timofey Kirillov <timofey.kirillov@flant.com>
Signed-off-by: Timofey Kirillov <timofey.kirillov@flant.com>
@distorhead distorhead force-pushed the fix-no-resource-with-the-name-found branch from b252376 to a8145d6 Compare April 4, 2019 18:50
Copy link
Contributor

@thomastaylor312 thomastaylor312 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, manual testing on this looks good! Thanks for all of the hard work

@bacongobbler bacongobbler merged commit 53d432f into helm:master Apr 5, 2019
thomastaylor312 added a commit to thomastaylor312/helm that referenced this pull request Sep 23, 2019
This ports the functionality of cleanup on fail to v3 as introduced in helm#4871. This has been tested manually
and would be a good candidate for a new acceptance test.

Signed-off-by: Taylor Thomas <taylor.thomas@microsoft.com>
@thomastaylor312 thomastaylor312 added the v3 port complete For completed v2->v3 ports label Sep 24, 2019
@distorhead distorhead deleted the fix-no-resource-with-the-name-found branch March 9, 2021 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs v3 fix size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. v3 port complete For completed v2->v3 ports
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet