
ConfigMap / Secret Orchestration #948

Open
wants to merge 3 commits into master

Conversation

@kfox1111 commented Apr 10, 2019

This change adds a KEP for ConfigMap / Secret Orchestration support.


Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
@k8s-ci-robot (Contributor) commented Apr 10, 2019

Hi @kfox1111. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot (Contributor) commented Apr 10, 2019

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kfox1111
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: mattfarina

If they are not already assigned, you can assign the PR to them by writing /assign @mattfarina in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
@justaugustus (Member) commented Apr 28, 2019

/ok-to-test

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
@kfox1111 (Author) commented May 21, 2019


Both will default to false for compatibility.

snapshot can be either true or false. Watch can only be true if snapshot is also true. This ensures immutability.
@hongchaodeng (Member) May 26, 2019

Can you elaborate a bit more on the "snapshot" field? What scenarios would use it? Why not just use the given ConfigMap?

@kfox1111 (Author) Feb 24, 2020

You may have things that should be snapshotted and some things that should not. For example, say you have two secrets. One contains immutable, sensitive configuration (say, a MySQL connection string with a password); this should be marked snapshot and should not be dynamically updated out to the pod. There may be another secret, though, that must be updatable (for example, a cert-manager managed certificate) where you always want the newest version, as the older version is no longer valid. Having a flag at the volume level allows choosing.

The only consideration in choosing between a configmap and a secret is how sensitive the material is.
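
To make the distinction concrete, here is a rough sketch of how the two proposed volume-level fields might look in a pod template (the volume names, secret names, and exact field placement are illustrative assumptions, not final API):

volumes:
- name: db-config
  secret:
    secretName: mysql-conn        # sensitive, effectively immutable config
    snapshot: true                # proposed field: freeze a copy at rollout time
- name: tls-cert
  secret:
    secretName: cert-manager-tls  # must always be the live, newest version
    # snapshot/watch left unset (default false): kubelet keeps the mount updated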

@alvaroaleman (Member) Jul 17, 2020

Why does watch require snapshot to be true? Also, wouldn't it make more sense to opt into this at the workload level (Deployment, DS, StatefulSet), because if immutability is wanted, we already have the immutable configmap/secret feature?

@kfox1111 (Author) Oct 6, 2020

> I'm not super excited about this, I left some comments. This needs a better API design and a GC plan to have a chance.
>
> I'm not excited by the temporal dependencies this introduces -- the system will do different things depending on the update order between a deployment and a configmap. If you update both in a short amount of time it's particularly unclear what would happen.
>
> I feel like the controller changes to implement this would be fairly complex. I recommend dropping the "watch" flag completely; users can touch the object with a no-op PATCH for a rollout on-demand, add that feature in a separate step.

It really was a first stab at a KEP. Open to changes.

Maybe watch in addition to snapshot is biting off too much at once. It was a primary use case of the initial issue, though, so I thought I should include it. It feels like KEPs often don't go through unless they cover a lot of use cases. I think if I removed it, someone would probably ask for it back.

@fejta-bot commented Aug 24, 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@kfox1111 (Author) commented Aug 26, 2019

/remove-lifecycle stale

@fejta-bot commented Nov 24, 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@Bessonov commented Nov 24, 2019

/remove-lifecycle stale

@fejta-bot commented Feb 22, 2020

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@Bessonov commented Feb 23, 2020

/remove-lifecycle stale

Yeah, everyone should be notified that nobody has cared about this issue for the last 90d.

@Bessonov commented Jun 7, 2020

/remove-lifecycle stale

## Proposal

The ConfigMapVolumeSource and SecretVolumeSource objects will get two new, optional fields:
* snapshot: boolean
@alvaroaleman (Member) Jul 17, 2020

@kfox1111 (Author) Jul 17, 2020

It does not.

The 1.18 feature is about making a secret immutable.

This feature is about taking a snapshot at the time of a change.

For example, say the secret name was foo. At the time of the snapshot, foo would be copied to a new secret (with a name managed by the controller) containing the content of foo at that time. This means foo can easily change after a deployment update, but the deployment itself does not see the changes. The snapshots themselves should probably be marked immutable, now that that feature exists.
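
A rough sketch of what such a controller-managed snapshot could look like (the generated suffix and the annotation key are illustrative assumptions, not part of the KEP text):

apiVersion: v1
kind: Secret
metadata:
  name: foo-6b9c4d7f                  # "foo" plus a controller-chosen suffix (hypothetical)
  annotations:
    snapshot.example.io/source: foo   # hypothetical pointer back to the original secret
immutable: true                       # uses the existing immutable Secrets feature
data: {}                              # contents copied verbatim from secret "foo" at snapshot time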

@alvaroaleman (Member) Jul 17, 2020

It does overlap. Just start off with an immutable secret rather than creating an immutable copy of a mutable secret. As mentioned elsewhere that also has the advantage of not raising scalability or debuggability concerns as we never have to copy anything.

@alvaroaleman (Member) Jul 17, 2020

I agree, btw, that the whole idea of having mutable configmaps/secrets somewhat short-circuits the idea of rolling out, because you cannot roll back. Having a server-side solution for generating immutable copies on change from mutable configmaps/secrets, rather than doing this client-side via kustomize or helm, might make sense, but it is IMHO an orthogonal problem from the other one (or this one from the other, depending on your perspective).

@kfox1111 (Author) Oct 6, 2020

The idea here is more about orchestration. Yes, you could manually orchestrate the upgrade yourself using immutable configmaps/secrets. We're trying to automate it, though, so the user doesn't have to get into the details if they don't want to. So I think it really is complementary.


The ConfigMapVolumeSource and SecretVolumeSource objects will get two new, optional fields:
* snapshot: boolean
* watch: boolean
@alvaroaleman (Member) Jul 17, 2020

+1 on the name change, maybe rolloutOnChange? The pod spec change is more of a symptom than the actual goal.


Both will default to false for compatibility.

snapshot can be either true or false. Watch can only be true if snapshot is also true. This ensures immutability.
@alvaroaleman (Member) Jul 17, 2020

Why does watch require snapshot to be true? Also, wouldn't it make more sense to opt into this at the workload level (Deployment, DS, StatefulSet), because if immutability is wanted, we already have the immutable configmap/secret feature?


The fields will be ignored by all objects other than Deployment, DaemonSet and StatefulSet.

DaemonSet/StatefulSet controllers will be modified such that:
@alvaroaleman (Member) Jul 17, 2020

I personally think it might make more sense to:

  • Have a dedicated controller
  • Make that controller update an annotation on the podTemplate based e.g. on the configmap/secret resourceVersion, and leave the actual rollout to the existing controllers

And as mentioned before, it will be easier to not create new configmaps/secrets, and it also raises fewer scalability/debuggability concerns.

@kfox1111 (Author) Jul 17, 2020

So a new controller that also watches Deployments/DaemonSets/StatefulSets and triggers a rolling upgrade via annotation? That could be made to work, I think. Wouldn't that put 2x the load on the apiserver though?

Yes, it would be easier not to create new configmaps/secrets, but that sidesteps the actual user's issue.
Users need to be able to easily update config. That is easiest done in place by editing the existing configmap. Sure, you can make everyone upload immutable configmaps, uniquely named, and then use those to roll forward/backwards. But that requires the user to: 1. keep track of unique names, 2. update the deployment when the config changes, and 3. manually garbage collect the configmap/secret when it's done with.

How does watch work if you force the secret/configmap to be immutable? There's no reason to watch then, I think?

@alvaroaleman (Member) Jul 17, 2020

> That could be made to work, I think. Wouldn't that put 2x the load on the apiserver though?

Why would it put double the load on the apiserver? Caches are shared within the kube-controller-manager.

> How does watch work if you force the secret/configmap to be immutable? There's no reason to watch then, I think?

If you used this with immutable configmaps/secrets it would never do anything, yes.

> Yes, it would be easier not to create new configmaps/secrets, but that sidesteps the actual user's issue.
> Users need to be able to easily update config.

As mentioned before, yes, having managed, immutable configmaps/secrets that are created whenever a configmap is updated would also be nice, but it is an orthogonal issue. I guess if you were to solve the managed configmaps/secrets you wouldn't need the watching anymore. But it's a lot more debatable.

@kargakis (Member) Oct 15, 2020

Agreed, the proposed API should be accommodated by its own controller.
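
A minimal sketch of the annotation-based alternative discussed above: a dedicated controller stamps the observed resourceVersion of the watched configmap onto the pod template, and the existing Deployment controller performs the rollout because the template changed (the annotation key and values are made-up examples):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
      annotations:
        # written by the hypothetical controller whenever configmap "foo" changes;
        # any pod template change triggers a normal rolling update
        config.example.io/foo-resource-version: "184352"
    spec:
      containers:
      - name: app
        image: example/app:1.0
        volumeMounts:
        - name: config
          mountPath: /etc/foo
      volumes:
      - name: config
        configMap:
          name: foo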


The Deployment controller will be modified such that:
* During a "change", on seeing a snapshot=true flag on a configmap/secret, a copy of the configmap/secret will be made with a unique name. This unique name will be stored in the corresponding ReplicaSet.
* When a ReplicaSet is deleted, a deletion of the corresponding configmap/secret snapshots will also be issued.
@alvaroaleman (Member) Jul 17, 2020

It cannot be made to work on pods; pod volumes are immutable. I would also argue it should not be done on pods, because that doesn't allow any control in case the new revision causes issues.

I think more or less the same is true for Jobs; you should use immutable configmaps/secrets for them to get deterministic results.

@kfox1111 (Author) commented Jul 17, 2020

I don't know why, but GitHub is letting me respond to some comments and not others... responding to the rest here:

> Why does watch require snapshot to be true? Also, wouldn't it make more sense to opt into this at the workload level (Deployment, DS, StatefulSet), because if immutability is wanted, we already have the immutable configmap/secret feature?

The way watch works in this context: without a snapshot, a watch wouldn't have anything to trigger off of in the described implementation, since there would be no snapshotted configmap name to push into the new replicaset.

> It cannot be made to work on pods; pod volumes are immutable. I would also argue it should not be done on pods, because that doesn't allow any control in case the new revision causes issues.
>
> I think more or less the same is true for Jobs; you should use immutable configmaps/secrets for them to get deterministic results.

I agree on pods and jobs. That's why it was stated it only works on deployments, statefulsets and daemonsets. Only those three objects have a concept of a "version" of the pod template where you can roll forwards and backwards. It's these "versions" that need immutable configmaps/secrets during the life of that "version". The simplest thing for the user I can think of is to version the configmap at the same time the pod template is versioned and keep the lifecycles the same: the ReplicaSet and configmaps get created at the same time, and get deleted at the same time.

Maybe an example will help.
The user uploads configmap foo and then deployment foo. It's set as watched and snapshotted.
When the deployment is created, configmap foo is copied to immutable configmap foo-1, and replicaset foo-1 is created pointing at configmap foo-1 in the pod template.

The user then edits configmap foo.
The deployment controller notices it, copies configmap foo to immutable configmap foo-2, and creates replicaset foo-2 pointing at configmap foo-2.

All the pods in replicaset foo-1 are then always consistent, and all the pods in replicaset foo-2 are always consistent. You can roll forwards and back between foo-1 and foo-2 and it always works consistently. This doesn't work consistently today unless you are very careful and add a lot of manual, error-prone steps.

Then if the user deletes replicaset foo-1 because they are done with it (or the system does it for them), configmap foo-1 gets garbage collected too.
If the user deletes the deployment, all the snapshotted configmaps associated with the deployment go away too.

So, for the user, the cognitive burden is just one configmap with their config and one deployment for orchestrating the app. Rolling forward/back just works. That is how they think it works when they first come to Kubernetes, and then they find out it's much more complicated today than that.
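
A condensed sketch of the example above, assuming the proposed behavior (the exact field placement is illustrative):

# What the user writes in deployment foo's pod template:
volumes:
- name: config
  configMap:
    name: foo
    snapshot: true    # proposed field
    watch: true       # proposed field

# What the controller would generate in replicaset foo-1's pod template:
volumes:
- name: config
  configMap:
    name: foo-1       # immutable snapshot copy of "foo"; shares the ReplicaSet's lifecycle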

@lavalamp (Member) left a comment

I'm not super excited about this, I left some comments. This needs a better API design and a GC plan to have a chance.

I'm not excited by the temporal dependencies this introduces -- the system will do different things depending on the update order between a deployment and a configmap. If you update both in a short amount of time it's particularly unclear what would happen.

I feel like the controller changes to implement this would be fairly complex. I recommend dropping the "watch" flag completely; users can touch the object with a no-op PATCH for a rollout on-demand, add that feature in a separate step.


snapshot can be either true or false. Watch can only be true if snapshot is also true. This ensures immutability.

The fields will be ignored by all objects other than Deployment, DaemonSet and StatefulSet.
@lavalamp (Member) Oct 6, 2020

I don't think we should pollute actual pod objects with fields only applicable to templates.

@lavalamp (Member) Oct 6, 2020

The easy alternative is for things that embed templates to include a list of secrets and a list of configmaps for this behavior.

A harder alternative is to duplicate the pod types so that there's a separate set of template types.

@kfox1111 (Author) Oct 6, 2020

Yeah. That would work too. For the user, having the flag right along with the volume feels like the easiest thing to do API-wise. But you're right, it does make the API less clean, as it shows up in a bunch of places where it wouldn't be used.

I could get behind embedding a list of configmaps/secrets for the behavior. What API do you think that should be?
Using the old names (can change to whatever):

snapshots:
  configMaps: []
  secrets: []
watch:
  configMaps: []
  secrets: []

@kargakis (Member) Oct 15, 2020

Agreed, we shouldn't be polluting the pod spec with things that only make sense for a subset of the workload APIs. We could have an object that includes a list of config map and secret refs and share that across the workload APIs that need to use it. FWIW, in OpenShift, this type of object is called a trigger.
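
A rough sketch of that alternative, with the lists hoisted out of the pod spec and into the workload object (the field names below are placeholders, not an agreed API):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  # hypothetical workload-level trigger lists; the pod spec itself stays unchanged
  snapshots:
    configMaps: ["foo"]
    secrets: ["foo-tls"]
  watch:
    configMaps: ["foo"]
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      containers:
      - name: app
        image: example/app:1.0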

The fields will be ignored by all objects other than Deployment, DaemonSet and StatefulSet.

DaemonSet/StatefulSet controllers will be modified such that:
* During a "change", on seeing a snapshot=true flag on a configmap/secret, a copy of the configmap/secret will be made with a unique name. This unique name will be stored in the corresponding ControllerRevision.
@lavalamp (Member) Oct 6, 2020

How will garbage collection be set up?

@lavalamp (Member) Oct 6, 2020

I recommend making the name quasi-deterministic by e.g. hashing together the contents of the deployment and configmap.

It needs to explain how rollback/rollforward either finds or reconstructs the prior configmaps.

The config map copy needs to somehow help users find the original or something.

@kfox1111 (Author) Oct 6, 2020

Garbage collection would follow the lifecycle of the ReplicaSet. If the ReplicaSet gets deleted, so would the ConfigMap/Secret snapshot.

@kfox1111 (Author) Oct 6, 2020

Yeah, hashing would work, I think.

The ReplicaSets are retained, so rolling between them just updates the count.

We could put in an annotation pointing at the original configmap name. But it isn't guaranteed to be there.

The fields will be ignored by all objects other than Deployment, DaemonSet and StatefulSet.

DaemonSet/StatefulSet controllers will be modified such that:
* During a "change", on seeing a snapshot=true flag on a configmap/secret, a copy of the configmap/secret will be made with a unique name. This unique name will be stored in the corresponding ControllerRevision.
@kargakis (Member) Oct 15, 2020

Why store it in the ControllerRevision? This seems antithetical to how the high-level workloads are designed. We shouldn't be mucking with child objects (ControllerRevisions, ReplicaSets); these are managed by the controllers. The new configmap/secret snapshot should be set back into the Deployment/DS/StatefulSet so the rollout can be automatically triggered, right?

* During a "change", on seeing a snapshot=true flag on a configmap/secret, a copy of the configmap/secret will be made with a unique name. This unique name will be stored in the corresponding ControllerRevision.
* All pods created will reference the unique configmap/secret snapshot name, not the base name.
* When a ControllerRevision is deleted, a deletion of the corresponding configmap/secret snapshots will also be issued.
* When an object with a watch flag of true is created/updated, watches on the specified configmap/secret will be added. Any watch triggered will be treated as a "change" of the object.
@kargakis (Member) Oct 15, 2020

watch seems redundant to me; the new controller that fulfills this proposal should be able to watch changes to configmaps/secrets that need to be snapshotted.


The Deployment controller will be modified such that:
* During a "change", on seeing a snapshot=true flag on a configmap/secret, a copy of the configmap/secret will be made with a unique name. This unique name will be stored in the corresponding ReplicaSet.
* When a ReplicaSet is deleted, a deletion of the corresponding configmap/secret snapshots will also be issued.
@kargakis (Member) Oct 15, 2020

How exactly is the delete issued? I think this proposal should leverage owner references and the existing garbage collector so we don't have to deal with GC on our own.
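
For example, the snapshot copy could carry an ownerReference to the ReplicaSet that uses it, so the existing garbage collector cascades the deletion (a sketch; the hash-style name, UID placeholder, and empty data are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: foo-7f86bd4c9d            # e.g. a content-hash-derived name, as suggested above
  ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: foo-7f86bd4c9d
    uid: <uid-of-replicaset>      # placeholder; set to the owning ReplicaSet's UID
    blockOwnerDeletion: true
immutable: true
data: {}                          # contents copied from configmap "foo"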

@fejta-bot commented Jan 13, 2021

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@alvaroaleman (Member) commented Jan 13, 2021

For posterity, one workaround for "I want my pod to rotate when config changes" is to put the config in a pod annotation and then use the downward API to get the content of that annotation into an env var or a file. I am very certain this was never intended to be used like that, but it's the least bad workaround I am aware of.
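
For reference, a minimal sketch of that workaround (the annotation key and paths are arbitrary): the config lives in a pod-template annotation, and a downwardAPI volume projects it into a file, so changing the annotation both rolls the Deployment and delivers the new content.

spec:
  template:
    metadata:
      annotations:
        app-config: |
          key1=value1
          key2=value2
    spec:
      containers:
      - name: app
        image: example/app:1.0
        volumeMounts:
        - name: config
          mountPath: /etc/app
      volumes:
      - name: config
        downwardAPI:
          items:
          - path: app.conf
            fieldRef:
              fieldPath: metadata.annotations['app-config']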

@dharmab commented Jan 13, 2021

Wouldn't it then be simpler to put the config into an environment variable?
@alvaroaleman (Member) commented Jan 13, 2021

> Wouldn't it then be simpler to put the config into an environment variable?

Yeah, but many applications do not support reading their config from an env var; the downward API also allows getting it into a file.

@fejta-bot commented Feb 12, 2021

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@kfox1111 (Author) commented Feb 15, 2021

Still a problem.

/remove-lifecycle rotten

@fejta-bot commented May 16, 2021

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@kfox1111 (Author) commented May 17, 2021

Still a problem.

/remove-lifecycle rotten

@fejta-bot commented Jun 16, 2021

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@kfox1111 (Author) commented Jun 16, 2021

Still a problem.

/remove-lifecycle rotten

@pigletfly (Member) commented Jul 13, 2021

/cc

@k8s-ci-robot requested a review from pigletfly Jul 13, 2021