Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update/create timing window #10

Open
curtisr7 opened this issue May 6, 2020 · 3 comments
Open

Update/create timing window #10

curtisr7 opened this issue May 6, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@curtisr7
Copy link

curtisr7 commented May 6, 2020

There is a problem with this plugin as it depends on two successful etcd backed operations to correctly map a release (update, create). If the process dies after the old release is updated to StatusSuperseded [1] but before creating the new revision[2], this release is toast as there aren't any Deployed revisions. This can also happen if etcd is having problems and the cfg.Releases.Create call fails.

I wonder if this command could be updated to check for a release with no Deployed revisions and handle accordingly? That would solve both cases without having too much special case handling.

[1] https://github.com/hickeyma/helm-mapkubeapis/blob/master/pkg/v3/release.go#L67
[2] https://github.com/hickeyma/helm-mapkubeapis/blob/master/pkg/v3/release.go#L82

@hickeyma hickeyma added the bug Something isn't working label May 6, 2020
@hickeyma
Copy link
Collaborator

hickeyma commented May 6, 2020

Thanks for raising this @curtisr7. You are correct, the operations are not atomic and therefore fragile to a small timing window.

From a robust solution perspective, I don't think there is a way with the Kubernetes APIs and etcd to make the 2 operations transactional. Maybe someone can shed more light on this?

This is also a similar issue in Helm with an upgrade (as far as I can tell): https://github.com/helm/helm/blob/master/pkg/action/upgrade.go#L130 and https://github.com/helm/helm/blob/master/pkg/action/upgrade.go#L137
The difference with Helm is that is adds the new release object first and then updates the previous release status. Would that order work better in this situation?

I wonder if this command could be updated to check for a release with no Deployed revisions and handle accordingly?

This would add overhead in the plugin as it would have to get all release versions and iterate over them. At the moment, the plugin just needs to get the latest with 1 API call to Helm.

Alternatively, you can change the status of the release in question back to "Deployed" before running the plugin again. You would use kubectl edit on the Secret/ConfigMap release version to change it.

@curtisr7
Copy link
Author

curtisr7 commented May 6, 2020

This would add overhead in the plugin as it would have to get all release versions and iterate over them. At the moment, the plugin just needs to get the latest with 1 API call to Helm.

I wonder if you could follow that same pattern? If the latest is superseded, map and create new? That said, I don't know what the plugin does if you run it against a release that is in this weird state.

Full disclosure, I'm not actually using this plugin. Though I'm doing something very similar in my codebase and ran into this problem.

@bacongobbler
Copy link
Member

One way to workaround this might be to record the in-cluster metadata somewhere persistent, like on your local hard drive. That way you at least have a physical copy which can be used to recover back to your prior state, or try again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants