Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP for ability to switch-off Secrets/ConfigMaps updates for running Pods #1369

Merged
merged 1 commit into from Dec 6, 2019

Conversation

- introduce protection mechanism to avoid outages due to accidental
updates of existing Secrets/ConfigMaps
- improve cluster performance by reducing load on Kubernetes control
plance (mostly kube-apiserver) consumed by not-utilized feature

This comment has been minimized.

Copy link
@msau42

msau42 Nov 18, 2019

Member

With watch based secrets, do we actually have apiserver load issues?

I've actually seen that when you have a few hundred secrets, the node itself incurs significant cpu just maintaining steady state to diff between old and new secrets. There probably could be optimizations there by caching last updated time so that we only diff on a change

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 18, 2019

Author Member

With watch based secrets, do we actually have apiserver load issues?

We do (different ones, but still).
The issue is that with 5k nodes and 150k pods, we are getting to O(hundreds of thousands) open watches. And that is still causing some issues on apiserver (mostly connected to memory allocations at the go level).
So that would actually visibly improve our situation too.

rolling upgrades. For Secrets/ConfigMaps this is translating to
creating a new object and updating PodTemplate with the reference
to it. However, that doesn't protect users from outages caused by
accidental bad updates of existing Secrets and/or ConfigMaps.

This comment has been minimized.

Copy link
@yujuhong

yujuhong Nov 21, 2019

Member

To truly prevent this and follow the upgrade model you described, users should not update secrets/configmaps directly at all. Locking down access of that seems more straightforward?

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

The comment was more about proposal, then motivation. I think motivation is actually fine. Resolving.

In this KEP, we are proposing introducing an ability to specify that
particular Secrets/ConfigMaps should not be updated at runtime for a
given Pods.
Given there are a lot of users not really taking advantage of automatic

This comment has been minimized.

Copy link
@yujuhong

yujuhong Nov 21, 2019

Member

Having immutable secrets/configmaps and/or locking down access to those objects still looks to me a better approach for preventing bad pushes by accident.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

done.

We propose to extend `ConfigMapVolumeSource` and `SecretVolumeSource`
types with an additional field:
```go
DisableAutoUpdates bool

This comment has been minimized.

Copy link
@yujuhong

yujuhong Nov 21, 2019

Member

Hmm......this seems a little bit weird. The field will be added in the pod spec to deal with, essentially a performance issue. It's also hard to tell which version of the secret is in use by the pod. A more extreme example would be the node reboot, in which case, the pod would get the latest secret after the restart.

If the configmap/secret is immutable, I think we can optimize kubelet to not watch it. It'd also help prevent bad pushes.

This comment has been minimized.

Copy link
@lavalamp

lavalamp Nov 21, 2019

Member

I agree w/ Yu-Ju, it would make a lot more sense (to me, at least) if the not-watching were actually an "immutable" flag on the config map. That way, you can also prevent users from mistakenly changing a config map that won't be updated.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

done

- perform updates of files mounted to a Pod based on updates of
the Kubernetes object

TODO: Note that Kubelet may be restarted in the meantime, so we need to

This comment has been minimized.

Copy link
@yujuhong

yujuhong Nov 21, 2019

Member

Yep, this could further complicate kubelet's pod syncing flow, and the semantics in the API since we'd also need to decide when the secret/configmap may receive updates.

I think immutable secret/configmap is cleaner in the API, and would also justify the complexity in the implementation to support/optimize this. WDYT?

This comment has been minimized.

Copy link
@lavalamp

lavalamp Nov 21, 2019

Member

Yeah, that's a 2nd strong reason to make this a fact about the configmap/secret and not a fact about the mount.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 21, 2019

Author Member

What I was initially thinking is that I don't want to make the Secret/Configmap immutable, to enable e.g. changing finalizers, or OwnerReferences or sth like that.
But I think I agree with pointing my own TODO as an argument against.

What I think though is that we need to make the "conceptual spec" of it immutable and allow mutating of anything in ObjectMeta.
I will update the KEP when I find a bit time (though if you disagree with the above, please shout).

This comment has been minimized.

Copy link
@lavalamp

lavalamp Nov 21, 2019

Member

Yes, you can't make metadata immutable. You can make the data immutable, though--that's what I had in mind.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

Updated

@wojtek-t wojtek-t force-pushed the wojtek-t:noupdatable_volumes branch from 1bff33e to d51975c Nov 22, 2019
Copy link
Member Author

wojtek-t left a comment

Updated the KEP
@lavalamp @yujuhong - PTAL

rolling upgrades. For Secrets/ConfigMaps this is translating to
creating a new object and updating PodTemplate with the reference
to it. However, that doesn't protect users from outages caused by
accidental bad updates of existing Secrets and/or ConfigMaps.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

The comment was more about proposal, then motivation. I think motivation is actually fine. Resolving.

In this KEP, we are proposing introducing an ability to specify that
particular Secrets/ConfigMaps should not be updated at runtime for a
given Pods.
Given there are a lot of users not really taking advantage of automatic

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

done.

We propose to extend `ConfigMapVolumeSource` and `SecretVolumeSource`
types with an additional field:
```go
DisableAutoUpdates bool

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

done

- perform updates of files mounted to a Pod based on updates of
the Kubernetes object

TODO: Note that Kubelet may be restarted in the meantime, so we need to

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 22, 2019

Author Member

Updated

@wojtek-t wojtek-t force-pushed the wojtek-t:noupdatable_volumes branch 2 times, most recently from 128ee17 to f0d3f11 Nov 22, 2019
@saad-ali

This comment has been minimized.

Copy link
Member

saad-ali commented Nov 22, 2019

/assign

Copy link
Member

yujuhong left a comment

I like the new direction. We can run this through sig-node once it's ready.

In this KEP, we are proposing to introduce an ability to specify
that contents of a particular Secret/ConfigMap should be immutable
for its whole lifetime. For those Secrets/ConfigMap, Kubelets will
not be trying to watch/poll for changes to updated mouns for their

This comment has been minimized.

Copy link
@yujuhong

yujuhong Nov 22, 2019

Member

s/mouns/mounts

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 25, 2019

Author Member

done

Pods.
Given there are a lot of users not really taking advantage of automatic
updates of Secrets/ConfigMaps due to consequences described above, this
will allow then to:

This comment has been minimized.

Copy link
@yujuhong

yujuhong Nov 22, 2019

Member

s/then/them

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 25, 2019

Author Member

done

@wojtek-t wojtek-t force-pushed the wojtek-t:noupdatable_volumes branch from f0d3f11 to af750a2 Nov 25, 2019
Copy link
Member Author

wojtek-t left a comment

@yujuhong - thanks!

In this KEP, we are proposing to introduce an ability to specify
that contents of a particular Secret/ConfigMap should be immutable
for its whole lifetime. For those Secrets/ConfigMap, Kubelets will
not be trying to watch/poll for changes to updated mouns for their

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 25, 2019

Author Member

done

Pods.
Given there are a lot of users not really taking advantage of automatic
updates of Secrets/ConfigMaps due to consequences described above, this
will allow then to:

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 25, 2019

Author Member

done

Copy link
Member

thockin left a comment

This seems OK to me.

Initially I wondered if we wanted this more generally (e.g. in metadata) but I don't think it applies to most resources.

We propose to extend `ConfigMap` and `Secret` types with an additional
field:
```go
Immutable bool

This comment has been minimized.

Copy link
@answer1991

answer1991 Nov 27, 2019

Could be:

    Immutable *bool

Then we can develop a webhook admission to set users' Configmap/Secrets to be immutable as default

This comment has been minimized.

Copy link
@answer1991

answer1991 Nov 27, 2019

*bool can differ user specified value or the golang default bool value. And in Kubernetes, we can treat nil as false or set it to be false as default.

In our environment, if user does not set, our admission webhook will set Immutable to be true as default.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Nov 27, 2019

Author Member

Yes - it should be pointer changed.

Regarding webhook - yes we may do that, though it's more an extension, so I wouldn't put it into this proposal explicitly.

@wojtek-t wojtek-t force-pushed the wojtek-t:noupdatable_volumes branch from af750a2 to 364f630 Nov 27, 2019
@saad-ali

This comment has been minimized.

Copy link
Member

saad-ali commented Nov 28, 2019

This seems like a good idea from the storage perspective. It will allow for scaling ConfigMap and Secret volumes by eliminating the need for kubelet volume manager to periodically fetch Secret and ConfigMap objects from the Kubernetes API server.

/lgtm
/approve

Putting a hold on this to allow other approvers to also sign off. @wojtek-t feel free to remove hold when all approvals are in.

/hold

@@ -0,0 +1,196 @@
---
title: Immutable ephemeral volumes

This comment has been minimized.

Copy link
@yujuhong

yujuhong Dec 3, 2019

Member

Hmm...I think the title can just be "Immutable Secrets and ConfigMaps"

The secrets and configmaps won't be mutable, whether they are consumed as ephemeral volumes.
(It's also slightly unclear what the scope of ephemeral volumes is)

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Dec 6, 2019

Author Member

done

- introduce protection mechanism to avoid outages due to accidental
updates of existing Secrets/ConfigMaps
- improve cluster performance by reducing load on Kubernetes control
plance (mostly kube-apiserver) consumed by not-utilized feature

This comment has been minimized.

Copy link
@yujuhong

yujuhong Dec 3, 2019

Member

s/plance/plane

This comment has been minimized.

Copy link
@smarterclayton

smarterclayton Dec 5, 2019

Contributor

Hold on. I don't think it's the case that this is a not used feature. I would say at least 10% of operators and component integrations today depend on this. We should really try to clarify how many people use this before we make this statement.

Might be better to say "Many people would be willing to make this tradeoff for better scale", which is more generally true and avoids having to find out the exact numbers.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Dec 5, 2019

Author Member

Yeah - this sentence is unfortunate - if people wouldn't be using that, I would simply suggest deprecating this feature.
Some people depend on it (at least for some percentage of their secrets/configmaps).

Will update to what you suggest.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Dec 6, 2019

Author Member

done

@wojtek-t wojtek-t force-pushed the wojtek-t:noupdatable_volumes branch from 364f630 to 7dafabb Dec 6, 2019
@k8s-ci-robot k8s-ci-robot removed the lgtm label Dec 6, 2019
@lavalamp

This comment has been minimized.

Copy link
Member

lavalamp commented Dec 6, 2019

/lgtm

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Dec 6, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, saad-ali, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wojtek-t

This comment has been minimized.

Copy link
Member Author

wojtek-t commented Dec 6, 2019

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit 16170da into kubernetes:master Dec 6, 2019
2 of 3 checks passed
2 of 3 checks passed
tide Not mergeable. Should not have do-not-merge/hold label.
Details
cla/linuxfoundation wojtek-t authorized
Details
pull-enhancements-verify Job succeeded.
Details
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Dec 6, 2019
@derekwaynecarr

This comment has been minimized.

Copy link
Member

derekwaynecarr commented Dec 10, 2019

Can we clarify that the secret and/or configmap is writable via the API server but only its contents (i.e. keys and values) are immutable? For example, I should be able to label/annotate a secret even if immutable.

@wojtek-t

This comment has been minimized.

Copy link
Member Author

wojtek-t commented Dec 10, 2019

Can we clarify that the secret and/or configmap is writable via the API server but only its contents (i.e. keys and values) are immutable? For example, I should be able to label/annotate a secret even if immutable.

Sure - will add to #1397 tomorrow.
BTW - that is already tested in the e2e test that I have in my POC, so that works.

@wojtek-t

This comment has been minimized.

Copy link
Member Author

wojtek-t commented Dec 11, 2019

Clarified that in #1397 (opened for review).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.