Add a method to generate a strategic merge patch #13007

jackgr · 2015-08-21T01:37:59Z

This is the first step in implementing #1702. It adds a method named CreateStrategicMergePatch to pkg/util/strategicmergepatch/patch.go that generates a strategic merge patch by taking the difference between before and after YAML or JSON strings. It will be used by apply (aka reconcile) to generate patches that can be used to update live configuration based on the difference between the current and previous configurations of a resource.

cc @bgrant0607, @ghodss

ghodss · 2015-08-21T01:58:46Z

pkg/util/strategicpatch/patch.go

+// If two map[string]interface{} are given, all elements must match.
+func matchesValue(av, bv interface{}) bool {
+	switch at := av.(type) {
+	case string:


I don't think you need explicit primitive casting - see http://play.golang.org/p/1xOcMJJ5fO.

You should be able to collapse this to

case string, float64, bool: return bt == at

k8s-bot · 2015-08-21T03:17:45Z

GCE e2e build/test passed for commit d9568b7bd5c5c3c2ab0d6616fff8767961ec8308.

ghodss · 2015-08-21T03:22:58Z

pkg/util/strategicpatch/patch.go

+var errBadJSONDoc = fmt.Errorf("Invalid JSON document")
+var errNoListOfLists = fmt.Errorf("Lists of lists are not supported")
+
+func CreateStrategicMergePatch(a, b []byte, dataStruct interface{}) ([]byte, error) {


I would prefer the variable names original, modified vs a, b to keep consistent with StrategicMergePatch's original, patch, but that's just a personal preference and up to you if you want to change it.

+1 for original, modified as var names

ghodss · 2015-08-21T03:35:09Z

This is awesome. Reconcile functionality and deployment objects together are going to enable us to switch from updating our clusters with kubectl to using git push + autoreconcile. Very very excited.

brendandburns · 2015-08-21T04:47:31Z

@kubernetes/kubectl

bgrant0607 · 2015-08-25T22:02:44Z

@jlowdermilk Do you have time to take a look?

bgrant0607 · 2015-08-25T22:03:46Z

cc @sparkprime

bgrant0607 · 2015-08-25T22:13:05Z

pkg/util/strategicpatch/patch.go

@@ -31,7 +31,352 @@ import (
 // lists should be merged or replaced.
 //
 // For more information, see the PATCH section of docs/api-conventions.md.
+
+// CreateStrategicMergePatch creates a patch that can be passed to StrategicMergePatch.


The idea here is that we'd diff the original config (stored in an annotation) and updated config, and then apply the patch to the live state?

That's a reasonable starting point.

We'll also want to be able to detect where the live state has diverged from the original (and perhaps new) configuration.

We'll need 2 diff modes:

One that detects all changes: changes, deletions, additions. Used to generate the patch, and maybe to determine whether anything has changed.

One that detects just changes and deletions. Additions would be ignored. This would be used to detect divergence. Initially we could barf if divergence were detected. Eventually we'll want --force to override the changes in the live state.

Export (#13038) could be the means by which a new config could be generated from the live state, if that's what the user wanted.

Agreed that we need a mode that detects changes and deletions, while ignoring additions. I've added a boolean parameter to CreateStrategicMergePatch called ignoreAdditions. The check for divergence can be made by calling it with this parameter set to true, and then looking at the result. If and only if it's empty, then there were no changes or deletions between original and modified.

Rather than bail if we detect any divergence between the original configuration to the live state, I think we want to bail only if we detect conflicts between such a divergence and the patch that the user now wants to apply. That's really the point of generating the strategic merge patch, right? We can apply the changes that the user is requesting on top of changes made by automated processes, if and only if there are no conflicts between the two change sets.

Ignoring for the moment the possibility of conflicts between different but related properties, we should bail if cases where the same properties are changed in conflicting ways. In other words, we should bail if there are any properties in both change sets that have different values. We're fine if one change set changed a property that the other change set didn't touch, and we're fine if both change sets changed the same property to the same value.

The OpenShift json merge implementation detects these simple conflicts by generating the patch required to go from original to live, and then comparing it with the patch being applied. However, it doesn't attempt to ignore additions, since they don't generate conflicts anyway. In other words, we didn't really the flag to ignore additions. If we don't have a compelling use case for it, then I think it would be cleaner to back it out and always generate a complete patch that includes additions, as well as deletions and modifications.

The OpenShift json merge implementation also provides a mechanism for detecting more complex conflicts involving incompatible changes to related properties: it allows the caller to register a set of preconditions and then plays them against any given patch before applying it. If the caller knows about constraints that span two or more properties, then they can write preconditions to look for violations of those constraints, and register them with the library. I'm inclined to include this feature, since it's small and simple, and provides significant power.

Let's walk through an example.

Original object stored as annotation: containers: - name: a image: x - name: b image: y New local object to reconcile: containers: - name: b image: y settings: flag1: true Current server object: containers: - name: a image: x - name: b image: y settings: flag2: true - name: c image: z Patch we should generate: containers: - name: a $patch: delete - name: b image: y settings: flag1: true

I am going to assume that, not only is the above correct, but that the above situation would not cause a conflict. flag1 and flag2 would simply merge together and both be set to "true". There are two situations I can think of where the data would be such that we could throw an error, if we want it to:

If "Current server object" had some information added to container "a", then we could decide whether that throws a conflict. I think it would be safe to say that it doesn't. It also would be hard to throw an error because it can't be deduced just by looking at the patch and the server object.

If "Current server object" didn't have flag1 set to true, but instead flag2 set to false. That would represent a true irreconcilable conflict. So we could either force-clobber, or we could try to detect and warn. If we do want to detect, as @jackgr says, I think we could do so without needing a non-additions API and instead detect by generating the patch and comparing it to the live server object.

In other words I think I'm agreeing with @jackgr that we don't need the additions API and a comparison of the live server object with the patch should be sufficient. @bgrant0607 can you outline a scenario where that is not sufficient?

@jackgr re: incompatible change constraints, should that kind of thing not just be implemented in the server-side validation of the final state of the objects post-patch, instead of being re-implemented and re-checked in this library?

Just had a long chat with @jackgr. It looks like we may be on the wrong track with this approach. Based on what is currently implemented in this PR, here is how kubectl apply would work:

Get the live object from the server along with the original annotation.

Compare the local object to the original annotation and create a patch of the differences between the two. Then send that patch to the server as a strategic merge patch.

The issue with this approach is the use case for when I run kubectl apply on a resource that I have not changed since the last configuration run, but the live server state has modified fields that are explicitly defined in my local object. This should not be a conflict - any and all fields specified in our local object should always simply override all the server object fields. The above approach does not accomplish this because it has no way to detect such changes to the server object. Therefore it does not give us true consistent reconciliation of the local config object to the server every time we run kubectl apply.

The alternative approach (originally outlined in #1702 (comment)) is as follows:

Get the live object from the server along with the original annotation.

Compare the local object with the original annotation to generate a new JSON object which is equivalent to the current local object but decorated with null's (or $patch: delete's) needed to remove any fields that were removed in the current local object but existed in the original annotation.

Submit that new JSON as a Strategic Merge patch.

In other words, the local config object IS the patch itself sans deletions of removed fields. So the only thing we need to add is a method like AddDeletionsToPatch, which would take the local object and the original annotation as parameters and return the local object decorated with the right deletions (e.g. mapKey: null and $patch: delete).

@bgrant0607 can you confirm that the second approach is what we actually want and we should go in that direction instead?

Actually, that's almost correct. The current approach can detect changes to the server object, since it computes the divergence in the live state from the original. However, it chooses to fail the update with a conflict if the live state contains changes that are not consistent with the local object.

Without this conflict detection, there isn't much point in an automated process ever making changes to the live state that deviate from the annotation, since they'll simply be discarded the next time the user runs kubectl apply.

Of course, that approach begs a question of timing. How often should the user run kubectl apply to ensure that the live state still conforms to their local configuration? Once an hour? Once a minute? Once a second?

Whatever the answer, it defines a window during which the user willingly lets the configuration deviate from their intent. Perhaps they should watch for changes, and quickly update the object if they detect any divergences? Of course, I'm being facetious.

If what we really is for the configuration to never diverge from what the user specified, then we should simply prevent callers from making conflicting changes in the first place. The annotation contains the configuration requested by the user, and the state is never allowed to deviate from it, other than through additions. This policy can be easily enforced on the server side.

To implement kubectl apply, we compare the live state with the annotation to compute the additions made by automated processes, add them to the local object, and then PUT it to the server.

If you haven't already, please read:
https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api-conventions.md#spec-and-status (the spec part)
https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api-conventions.md#defaulting
https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api-conventions.md#late-initialization

"The specification is a complete description of the desired state, including configuration settings provided by the user, default values expanded by the system, and properties initialized or otherwise changed after creation by other ecosystem components (e.g., schedulers, auto-scalers), and is persisted in stable storage with the API object."

Examples of fields that are automatically set:

Default values: pod.spec.restartPolicy, pod.spec.containers[].imagePullPolicy

Admission control: pod.spec.containers[].resources.limits.cpu

Late initialization: pod.spec.nodeName, service.spec.clusterIP

Horizontal auto-scaling and rolling update: replicationcontroller.spec.replicas

Vertical auto-sizing: pod.spec.containers[].resources.requests.cpu

Some of these fields are immutable once set, and there are readonly fields in the metadata that can't be set by the user, either.

Due to all of these processes, it is not possible for a user to edit their original config file and feed it to kubectl replace, which expects a fully specified resource, at least not without using --force, which is very disruptive, since it deletes and recreates the resource.

THAT is the workflow that I want to enable: edit your configs and do kubectl apply -f myconfigs/. Everything else, such as conflict detection, is gravy.

Everything not explicitly specified by the user in their configs should be reserved for operational tweaking, both manual (kubectl scale, kubectl label, kubectl patch, etc.) and automated (auto-scaling, auto-sizing, automated deployments, etc.).

As I mentioned elsewhere, the user should not be re-running kubectl apply just to ensure conformance of the live state to their desired state. If the live state diverges, it's either user error, such as misconfiguring an auto-scaler, or intentional, which is where conflict detection upon the next update they try to do would be nice.

Also read http://kubernetes.io/v1.0/docs/user-guide/managing-deployments.html

I don't want users to have to figure out whether to use patch or replace, how to construct a patch, whether to use --force or not, etc.

And we can't expect users to explicitly (esp. manually) specify every field they don't care about. That would break forward compatibility -- we wouldn't be able to add new fields to the API without forking a new API version.

Please write all this up in a design doc, btw.

k8s-github-robot · 2015-08-27T22:59:51Z

Labelling this PR as size/XL

k8s-bot · 2015-08-31T18:23:46Z

GCE e2e build/test failed for commit b38bdecd56828c750117e74f2db92b3d523feb34.

k8s-bot · 2015-08-31T20:07:50Z

GCE e2e build/test passed for commit 9b8447bb8ce80a14fd38d0969655e5b62aea1a12.

jackgr · 2015-08-31T20:17:24Z

The changes that I just pushed in commit 9b8447b address all of the issues raised in the comments. However, per this reply, I think we should back out the flag to ignore additions, and add the logic from the OpenShift json merge implementation that checks for conflicts. It should be the last change to this PR, modulo bug fixes.

k8s-github-robot · 2015-09-01T00:07:29Z

Labelling this PR as size/XXL

smarterclayton · 2015-09-03T01:16:20Z

I have a question - I may have misunderstood parts of the ongoing discussion, so apologies in advance if this is clearly covered.

We're racing on a lot of client commands with controllers - so much so that repeatable scripting is becoming difficult. For example, a command that creates an RC and then a subsequent command that adds a label fails frequently because they are conflicting over update.

Will SMP be useable in the client commands to record enough of a stable diff that we can safely merge on the server? I assume so, but I've neglected this thread and others. So kubectl label should be reading, generating a patch (with before values included?) and sending that to PATCH, so that as long as another process doesn't modify the label field, the update will succeed?

ghodss · 2015-09-03T01:23:07Z

PATCH does not need the before values included (for maps by default and for lists because of SMP), and by default will overwrite. So as long as the specific label key you're setting isn't defined in the config that's getting reconciled, you should not have any issues.

jackgr · 2015-09-10T17:29:06Z

Changed, rebased, squashed and ready to merge if no further comments. This PR now just adds a method to create a strategic merge patch from the original and modified files.

cc @bgrant0607, @ghodss, @janetkuo

k8s-bot · 2015-09-10T17:34:14Z

GCE e2e build/test passed for commit 377fbf2c26dc1f1d0a05e2b92035a4048e06c3f4.

k8s-bot · 2015-09-10T18:13:02Z

GCE e2e build/test failed for commit 1cf66d4a574b8259cdaef3d692e61ef64da618d1.

janetkuo · 2015-09-10T18:59:58Z

@jackgr Looks like you need gofmt.

k8s-bot · 2015-09-10T21:59:20Z

GCE e2e build/test failed for commit 3ecb1ba.

janetkuo · 2015-09-10T22:06:58Z

@k8s-bot test this please
test failure looks irrelevant.

k8s-bot · 2015-09-10T22:31:27Z

GCE e2e build/test failed for commit 3ecb1ba.

janetkuo · 2015-09-10T23:33:02Z

@k8s-bot test this please

k8s-bot · 2015-09-11T00:23:44Z

GCE e2e build/test passed for commit 3ecb1ba.

janetkuo · 2015-09-11T00:34:42Z

@jlowdermilk @ghodss since the reconcile part is moved to the other PR, this one should be ready to merge now. I have another PR depending on this one so if you all agree we can merge it.

j3ffml · 2015-09-11T00:41:54Z

lgtm

k8s-github-robot · 2015-09-11T20:04:43Z

The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label?

k8s-github-robot · 2015-09-11T21:58:50Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-bot · 2015-09-11T22:27:01Z

GCE e2e build/test passed for commit 3ecb1ba.

k8s-github-robot · 2015-09-11T22:27:14Z

Automatic merge from SubmitQueue

Auto commit by PR queue bot

googlebot added the cla: yes label Aug 21, 2015

ghodss reviewed Aug 21, 2015
View reviewed changes

k8s-github-robot assigned brendandburns Aug 21, 2015

bgrant0607 assigned j3ffml and unassigned brendandburns Aug 25, 2015

bgrant0607 reviewed Aug 25, 2015
View reviewed changes

k8s-github-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 27, 2015

jackgr force-pushed the diff_and_patch branch from d9568b7 to b38bdec Compare August 31, 2015 17:46

jackgr force-pushed the diff_and_patch branch from b38bdec to 9b8447b Compare August 31, 2015 19:44

k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 1, 2015

jackgr force-pushed the diff_and_patch branch from 377fbf2 to 1cf66d4 Compare September 10, 2015 17:26

jackgr mentioned this pull request Sep 10, 2015

Add method to apply strategic merge patch #13820

Merged

Add method to create strategic merge patch.

3ecb1ba

jackgr force-pushed the diff_and_patch branch from 1cf66d4 to 3ecb1ba Compare September 10, 2015 21:01

janetkuo added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 11, 2015

k8s-github-robot added the needs-ok-to-merge label Sep 11, 2015

janetkuo added ok-to-merge and removed needs-ok-to-merge labels Sep 11, 2015

k8s-github-robot pushed a commit that referenced this pull request Sep 11, 2015

Merge pull request #13007 from jackgr/diff_and_patch

5bafcc2

Auto commit by PR queue bot

k8s-github-robot merged commit 5bafcc2 into kubernetes:master Sep 11, 2015

This was referenced Oct 19, 2015

Configuration reconciliation (aka kubectl apply) #1702

Closed

UpdateApplyAnnotation stores large annotation on every kubectl operation #15878

Closed

kubectl apply umbrella issue #15894

Closed

jackgr deleted the diff_and_patch branch October 26, 2015 20:56

Add a method to generate a strategic merge patch #13007

Add a method to generate a strategic merge patch #13007

Conversation

jackgr commented Aug 21, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

k8s-bot commented Aug 21, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghodss commented Aug 21, 2015

brendandburns commented Aug 21, 2015

bgrant0607 commented Aug 25, 2015

bgrant0607 commented Aug 25, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

k8s-github-robot commented Aug 27, 2015

k8s-bot commented Aug 31, 2015

k8s-bot commented Aug 31, 2015

jackgr commented Aug 31, 2015

k8s-github-robot commented Sep 1, 2015

smarterclayton commented Sep 3, 2015

ghodss commented Sep 3, 2015

jackgr commented Sep 10, 2015

k8s-bot commented Sep 10, 2015

k8s-bot commented Sep 10, 2015

janetkuo commented Sep 10, 2015

k8s-bot commented Sep 10, 2015

janetkuo commented Sep 10, 2015

k8s-bot commented Sep 10, 2015

janetkuo commented Sep 10, 2015

k8s-bot commented Sep 11, 2015

janetkuo commented Sep 11, 2015

j3ffml commented Sep 11, 2015

k8s-github-robot commented Sep 11, 2015

k8s-github-robot commented Sep 11, 2015

k8s-bot commented Sep 11, 2015

k8s-github-robot commented Sep 11, 2015