Let the garbage collector use json merge patch when SMP is not supported #63386
/retest |
@@ -39,21 +39,27 @@ func deleteOwnerRefStrategicMergePatch(dependentUID types.UID, ownerUIDs ...type | |||
return []byte(patch) | |||
} | |||
|
|||
// TODO: remove this function when we can use strategic merge patch |
Do we not plan on supporting SMP on CRs? Or do we just plan on always allowing JSON merge patch in the GC, even if we start supporting SMP on CRs?
// unavailable. | ||
return nil, fmt.Errorf("doesn't have a local cache for %s", gvr) | ||
// If local cache doesn't exist for mapping.Resource, send a GET request to API server | ||
apiResource, _, err := gc.apiResource(apiVersion, kind) |
why not use mapping.Resource again? looks like it does the same thing as the beginning of this function https://github.com/roycaihw/kubernetes/blob/ebb54562d1789f0e198cad35ad34a0ae1175deb4/pkg/controller/garbagecollector/operations.go#L43-L50
Nice catch! I guess I was copy-pasting something and didn't notice the redundancy.
return | ||
} | ||
_, err = gc.patchObject(dependent.identity, patch, types.MergePatchType) | ||
if err != nil { |
Why not "if err != nil && !errors.IsNotFound(err)" like at line 534?
JSON merge patch cannot manipulate individual array elements (the owner references in this case). It replaces the existing array with the entire array we send (which we construct in deleteOwnerRefJSONMergePatch). So we don't need to handle a not-found error here if the target owner reference doesn't exist.
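To make the array behavior described above concrete, here is a minimal sketch of JSON merge patch semantics (RFC 7386) using only the Go standard library. The helpers `mergePatch` and `applyMergePatch` are illustrative names and are not part of the garbage collector; the point is only that an array in the patch replaces the target array wholesale, so an already-missing owner reference cannot cause an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies an RFC 7386 JSON merge patch to target.
// Objects are merged key by key, a null value deletes the key, and any
// other value (including an array) replaces the target value wholesale.
func mergePatch(target, patch interface{}) interface{} {
	patchObj, ok := patch.(map[string]interface{})
	if !ok {
		return patch // arrays and scalars replace the target entirely
	}
	targetObj, ok := target.(map[string]interface{})
	if !ok {
		targetObj = map[string]interface{}{}
	}
	for k, v := range patchObj {
		if v == nil {
			delete(targetObj, k)
			continue
		}
		targetObj[k] = mergePatch(targetObj[k], v)
	}
	return targetObj
}

// applyMergePatch is a hypothetical convenience wrapper working on raw JSON.
func applyMergePatch(doc, patch string) (string, error) {
	var d, p interface{}
	if err := json.Unmarshal([]byte(doc), &d); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(patch), &p); err != nil {
		return "", err
	}
	out, err := json.Marshal(mergePatch(d, p))
	return string(out), err
}

func main() {
	doc := `{"metadata":{"name":"dep","ownerReferences":[{"uid":"a"},{"uid":"b"}]}}`
	// The patch carries the full desired array, not a per-element deletion.
	patch := `{"metadata":{"ownerReferences":[{"uid":"b"}]}}`
	out, err := applyMergePatch(doc, patch)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the patch states the complete desired list, applying it twice, or applying it when owner "a" is already gone, yields the same result.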
} | ||
_, err = gc.patchObject(item.identity, patch, types.MergePatchType) | ||
return err | ||
} |
Can you factor the "try SMP, if it fails then try JSONMergePatch" logic into a function? Something like:
type jsonMergePatchFunc func() []byte
func (gc *GarbageCollector) patch(identity objectReference, smp []byte, jmp jsonMergePatchFunc) (*Unstructured, error)
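A rough, self-contained sketch of the fallback shape suggested above follows. Everything here is a stand-in: `sendFunc`, `patchWithFallback`, `customResourceServer`, and `errUnsupportedMediaType` are hypothetical names that replace the real dynamic client and the API server's 415 response, so this shows only the control flow, not the PR's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupportedMediaType stands in for the API server's HTTP 415 response.
var errUnsupportedMediaType = errors.New("415 unsupported media type")

type patchType string

const (
	strategicMergePatch patchType = "application/strategic-merge-patch+json"
	jsonMergePatch      patchType = "application/merge-patch+json"
)

// sendFunc is a hypothetical stand-in for the call that submits a patch.
type sendFunc func(body []byte, pt patchType) error

// jsonMergePatchFunc builds the fallback patch lazily, so the extra work is
// only done when the strategic merge patch is rejected.
type jsonMergePatchFunc func() ([]byte, error)

// patchWithFallback tries SMP first and falls back to a JSON merge patch
// only when the server reports that SMP is unsupported.
func patchWithFallback(send sendFunc, smp []byte, jmp jsonMergePatchFunc) (patchType, error) {
	err := send(smp, strategicMergePatch)
	if !errors.Is(err, errUnsupportedMediaType) {
		return strategicMergePatch, err // success, or an unrelated error
	}
	body, err := jmp()
	if err != nil {
		return jsonMergePatch, err
	}
	return jsonMergePatch, send(body, jsonMergePatch)
}

// customResourceServer mimics an endpoint that rejects SMP.
func customResourceServer(_ []byte, pt patchType) error {
	if pt == strategicMergePatch {
		return errUnsupportedMediaType
	}
	return nil
}

func main() {
	used, err := patchWithFallback(customResourceServer, []byte(`{}`), func() ([]byte, error) {
		return []byte(`{"metadata":{"ownerReferences":[]}}`), nil
	})
	fmt.Println(used, err)
}
```

Passing the merge patch as a function rather than as bytes keeps the common path (SMP accepted) free of the extra read and marshaling.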
Thanks for the advice! It's much more readable now. One nit is that I have to add an unnamed parameter to unblockOwnerReferencesJSONMergePatch to implement the function interface: https://github.com/roycaihw/kubernetes/blob/2117b46c1ae5bb4565bbce1eb0c88ed3fb9ac568/pkg/controller/garbagecollector/patch.go#L155
mapping, err := gc.restMapper.RESTMapping(fqKind.GroupKind(), fqKind.Version) | ||
if err != nil { | ||
return nil, newRESTMappingError(kind, apiVersion) | ||
} |
These can be replaced by gc.apiResource(). And we can remove the call to apiResource at line 53.
return nil, err | ||
} | ||
// Unstructed implements metav1 interface | ||
return resource, nil |
Why not just return gc.dynamicClient.Resource(apiResource).Namespace(namespace).Get(name, metav1.GetOptions{})
/lgtm /test pull-kubernetes-e2e-kops-aws |
If we could build a jsonmerge patch, why would we ever try to submit a strategic merge patch? |
jsonmerge must include resourceVersion to safely remove a single item from finalizers, so it is more subject to conflicts the client has to retry |
Realistically, how many conflicts do you think you'll get on deleted objects? Not that it isn't a concern, but it does seem rather dubious. I don't feel strongly, it just seems like it's optimizing for an edge case. |
I had similar thoughts (start with jsonmerge only and client retry and measure whether it matters in practice). @lavalamp suggested SMP falling back to jsonmerge at #56348 (comment) |
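The trade-off discussed above (a merge patch pinned to a resourceVersion can hit conflicts that the client must retry) can be sketched with a toy in-memory store. The types `store` and `object` and the function `removeFinalizer` are made up for illustration and do not model the real API server; the `beforeWrite` hook exists only to simulate a concurrent writer.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("409 conflict: resourceVersion mismatch")

// object is a toy stand-in for a stored API object.
type object struct {
	resourceVersion int
	finalizers      []string
}

// store is a hypothetical in-memory API server holding one object.
type store struct{ obj object }

// get returns a copy of the stored object.
func (s *store) get() object {
	o := s.obj
	o.finalizers = append([]string(nil), s.obj.finalizers...)
	return o
}

// replaceFinalizers mimics a merge patch that carries resourceVersion: the
// whole list is replaced, and a stale resourceVersion is rejected.
func (s *store) replaceFinalizers(rv int, finalizers []string) error {
	if rv != s.obj.resourceVersion {
		return errConflict
	}
	s.obj.finalizers = finalizers
	s.obj.resourceVersion++
	return nil
}

// removeFinalizer re-reads and retries on conflict. It returns the number of
// attempts used.
func removeFinalizer(s *store, name string, maxAttempts int, beforeWrite func(attempt int)) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		cur := s.get()
		kept := make([]string, 0, len(cur.finalizers))
		for _, f := range cur.finalizers {
			if f != name {
				kept = append(kept, f)
			}
		}
		if beforeWrite != nil {
			beforeWrite(attempt)
		}
		err := s.replaceFinalizers(cur.resourceVersion, kept)
		if !errors.Is(err, errConflict) {
			return attempt, err
		}
	}
	return maxAttempts, errConflict
}

func main() {
	s := &store{obj: object{resourceVersion: 1, finalizers: []string{"orphan", "keep"}}}
	attempts, err := removeFinalizer(s, "orphan", 3, func(attempt int) {
		if attempt == 1 { // another writer sneaks in before our first write
			_ = s.replaceFinalizers(s.obj.resourceVersion, append(s.get().finalizers, "other"))
		}
	})
	fmt.Println(attempts, err, s.obj.finalizers)
}
```

Without the resourceVersion check, the first write would have silently dropped the concurrently added "other" finalizer; with it, the cost is one extra read and write.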
_, err = gc.patchObject(item.identity, patch) | ||
ownerUIDs := append(ownerRefsToUIDs(dangling), ownerRefsToUIDs(waitingForDependentsDeletion)...) | ||
patch := deleteOwnerRefStrategicMergePatch(item.identity.UID, ownerUIDs...) | ||
_, err = gc.patch(item, patch, gc.deleteOwnerRefJSONMergePatch, ownerUIDs...) |
To get rid of the variadic parameter, you can do:
type jsonMergePatchFunc func(*node) ([]byte, error)
gc.patch(item, patch, func(n *node) ([]byte, error) {
	return gc.deleteOwnerRefJSONMergePatch(n, ownerUIDs...)
})
patch := deleteOwnerRefPatch(dependent.identity.UID, owner.UID) | ||
_, err := gc.patchObject(dependent.identity, patch) | ||
patch := deleteOwnerRefStrategicMergePatch(dependent.identity.UID, owner.UID) | ||
_, err := gc.patch(dependent, patch, gc.deleteOwnerRefJSONMergePatch, owner.UID) |
We should export the number of conflicts as a metric if we don't already.
Chao, can you make an issue?
…On Wed, May 30, 2018 at 1:37 PM Chao Xu ***@***.***> wrote:
@liggitt <https://github.com/liggitt> @deads2k
<https://github.com/deads2k> I don't know if there exists realistic
workload tests to measure the number conflicts. Last time I checked, the
kubemark load test only has replicasets and pods.
As a first step, maybe we can check how many conflicts there are when
running the entire e2e suite.
if err != nil { | ||
return nil, err | ||
} | ||
expectedObjectMeta := ObjectMetaForPatch{} |
@deads2k @liggitt do you have better ideas for constructing the json merge patch? We used ObjectMetaForPatch to hold the patch contents instead of using ObjectMeta directly because some optional fields in ObjectMeta are not pointers, so they are serialized as <field name>: null instead of being omitted, and the resulting json merge patch would be wrong.
"so they are serialized as <field name>: null instead of being omitted, so the resulting json merge patch would be wrong."
seems like you need to ensure those fields are never null, e.g. expectedOwners := []metav1.OwnerReference{}
which fields were problematic? non-pointer slice fields can be set to null
Actually only one problematic field: CreationTimestamp.
meta := v1.ObjectMeta{}
meta.OwnerReferences = []v1.OwnerReference{
{UID: "freaf"},
}
meta.ResourceVersion = "1234"
data, err := json.Marshal(meta)
if err != nil {
panic(err)
}
fmt.Printf("data=%s\n", data)
Output is
data={"resourceVersion":"1234","creationTimestamp":null,"ownerReferences":[{"apiVersion":"","kind":"","name":"","uid":"freaf"}]}
Maybe we can manually remove the creationTimestamp part, and add a unit test to make sure the patch only contains resourceVersion and ownerReferences.
I'd probably use the unstructured object like this:
u := &unstructured.Unstructured{Object: map[string]interface{}{}}
u.SetResourceVersion(...)
u.SetOwnerReferences(...)
data, err := json.Marshal(u)
The code mentioned above doesn’t use that reflective conversion path. Please don’t use Sprintf to form structured patches that contain any user controlled data.
SetOwnerReferences uses the conversion path
kubernetes/staging/src/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/unstructured.go
Lines 181 to 184 in b5d21a9
func (u *Unstructured) SetOwnerReferences(references []metav1.OwnerReference) { | |
newReferences := make([]interface{}, 0, len(references)) | |
for _, reference := range references { | |
out, err := runtime.DefaultUnstructuredConverter.ToUnstructured(&reference) |
Is the concern about using Sprintf that we may change the json field name (ownerReferences) in future?
Ah. I was considering the marshaling, not the SetOwnerReferences call. I'd benchmark the difference before doing anything too complicated in the code over that performance concern.
The concern about sprintf is that data won't be escaped or formatted properly. Sprintf isn't a proper tool for marshaling/escaping structured data. I'd take a single-use typed struct like the current state of the PR over Sprintf
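The escaping concern above can be shown with a small, standard-library-only sketch. The value used here is contrived and the helpers `sprintfPatch`, `marshalPatch`, and `hasKey` are hypothetical; the point is only that Sprintf pastes data into the document unescaped while json.Marshal escapes it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchBody is a hypothetical single-field patch used only for this demo.
type patchBody struct {
	ResourceVersion string `json:"resourceVersion"`
}

// sprintfPatch builds JSON by string formatting; the value is not escaped.
func sprintfPatch(rv string) []byte {
	return []byte(fmt.Sprintf(`{"resourceVersion":"%s"}`, rv))
}

// marshalPatch builds the same document through the JSON encoder.
func marshalPatch(rv string) ([]byte, error) {
	return json.Marshal(patchBody{ResourceVersion: rv})
}

// hasKey reports whether doc is a JSON object containing a top-level key.
func hasKey(doc []byte, key string) bool {
	var m map[string]interface{}
	if err := json.Unmarshal(doc, &m); err != nil {
		return false
	}
	_, ok := m[key]
	return ok
}

func main() {
	// A value containing quotes smuggles an extra field into the Sprintf output.
	tricky := `1","injected":"yes`

	bad := sprintfPatch(tricky)
	good, err := marshalPatch(tricky)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(bad), hasKey(bad, "injected"))
	fmt.Println(string(good), hasKey(good, "injected"))
}
```

With the Sprintf version the contrived value becomes a second top-level key, while the marshaled version keeps it inside the resourceVersion string.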
Did a benchmark for marshaling the patch but it's hard to tell how it will affect GC performance:
func BenchmarkReflective(b *testing.B) {
owners := []v1.OwnerReference{
{UID: "freaf"},
}
resourceVersion := "1234"
for i := 0; i < b.N; i++ {
u := &unstructured.Unstructured{Object: map[string]interface{}{}}
u.SetResourceVersion(resourceVersion)
u.SetOwnerReferences(owners)
_, err := json.Marshal(u)
if err != nil {
panic(err)
}
}
}
func BenchmarkSprintf(b *testing.B) {
owners := []v1.OwnerReference{
{UID: "freaf"},
}
resourceVersion := "1234"
for i := 0; i < b.N; i++ {
expectedOwnersJSON, err := json.Marshal(owners)
if err != nil {
panic(err)
}
_ = []byte(fmt.Sprintf(`{"metadata":{"resourceVersion":"%s","ownerReferences":%s}}`, resourceVersion, expectedOwnersJSON))
}
}
Result:
BenchmarkReflective-12 200000 7368 ns/op
BenchmarkSprintf-12 2000000 894 ns/op
Withdrew the last commit to go with the single-use typed struct
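For readers following along, a single-use typed struct of the kind discussed here might look like the sketch below. It is an assumption-laden illustration: `ownerReference` is a trimmed-down local stand-in for metav1.OwnerReference, and the lower-case `objectForPatch` and `objectMetaForPatch` types and the `buildPatch` helper are not the PR's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ownerReference is a local stand-in for metav1.OwnerReference (assumption:
// only the four fields visible in the output quoted earlier are modeled).
type ownerReference struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	UID        string `json:"uid"`
}

// objectMetaForPatch carries only the fields the patch is allowed to touch,
// so nothing like creationTimestamp can leak into the output.
type objectMetaForPatch struct {
	ResourceVersion string           `json:"resourceVersion"`
	OwnerReferences []ownerReference `json:"ownerReferences"`
}

type objectForPatch struct {
	Metadata objectMetaForPatch `json:"metadata"`
}

// buildPatch marshals a JSON merge patch containing exactly resourceVersion
// and ownerReferences.
func buildPatch(resourceVersion string, owners []ownerReference) ([]byte, error) {
	if owners == nil {
		owners = []ownerReference{} // marshal as [] rather than null
	}
	return json.Marshal(objectForPatch{
		Metadata: objectMetaForPatch{
			ResourceVersion: resourceVersion,
			OwnerReferences: owners,
		},
	})
}

func main() {
	patch, err := buildPatch("1234", []ownerReference{{UID: "freaf"}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", patch)
}
```

The struct is escaped by the encoder like any other value, and a unit test can assert on the exact set of keys it produces.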
@liggitt @lavalamp I still prefer using the single-use typed ObjectMetaForPatch struct. If the current implementation looks good I can squash.
Some alternatives:
- Use json patch instead of jsonmerge patch. The benefit is that the patch will be smaller when we delete some owner references: the json patch only contains the owner references we want to delete, instead of the entire owner references array as in the json merge patch. For unblocking owner references, json patch still has to include the entire array. If we want to avoid allocation, we can use Sprintf to construct the json patch. The downside is still that Sprintf isn't a proper tool for marshaling/escaping structured data. And json patch may require more careful implementation and testing around deleting array elements by index.
- Allocate a map[string]interface{} and marshal it, e.g.:
metadata := map[string]interface{}{
	"metadata": map[string]interface{}{
		"resourceVersion": resourceVersion,
		"ownerReferences": owners,
	},
}
patch, err := json.Marshal(metadata)
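To illustrate why the first alternative above needs care with indexes, here is a hedged sketch of building a JSON Patch (RFC 6902) that deletes owner references by position. The function `deleteOwnerRefsJSONPatch` and the `op` type are hypothetical and were not part of the PR. Operations are emitted from the highest index down so that earlier removals do not shift later ones, and each removal is guarded by a `test` operation on the UID.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// op is one RFC 6902 operation.
type op struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// deleteOwnerRefsJSONPatch builds a JSON Patch removing the owner references
// whose UIDs are listed in toDelete. currentUIDs must reflect the order of
// metadata.ownerReferences as last observed by the caller.
func deleteOwnerRefsJSONPatch(currentUIDs []string, toDelete ...string) ([]byte, error) {
	doomed := make(map[string]bool, len(toDelete))
	for _, uid := range toDelete {
		doomed[uid] = true
	}
	ops := []op{}
	// Walk backwards: removing index i never changes indexes below i.
	for i := len(currentUIDs) - 1; i >= 0; i-- {
		if !doomed[currentUIDs[i]] {
			continue
		}
		ref := fmt.Sprintf("/metadata/ownerReferences/%d", i)
		ops = append(ops,
			// Fail the whole patch if the array changed since it was read.
			op{Op: "test", Path: ref + "/uid", Value: currentUIDs[i]},
			op{Op: "remove", Path: ref},
		)
	}
	return json.Marshal(ops)
}

func main() {
	patch, err := deleteOwnerRefsJSONPatch([]string{"a", "b", "c"}, "a", "c")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", patch)
}
```

The `test` operations play the role that resourceVersion plays in the merge patch: if another writer reordered or changed the array, the patch is rejected instead of deleting the wrong element.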
Can you not construct the patch manually? It only needs to do one thing, no?
…On Wed, May 30, 2018 at 8:44 PM Chao Xu ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pkg/controller/garbagecollector/patch.go
<#63386 (comment)>
:
> @@ -52,3 +147,24 @@ func (n *node) patchToUnblockOwnerReferences() ([]byte, error) {
dummy.ObjectMeta.UID = n.identity.UID
return json.Marshal(dummy)
}
+
+// Generate a JSON merge patch that unsets the BlockOwnerDeletion field of all
+// ownerReferences of node.
+// NOTE: The unnamed list of UID parameter is needed to implement jsonMergePatchFunc type. The input is not used.
+// This function will operate on all owner references.
+func (gc *GarbageCollector) unblockOwnerReferencesJSONMergePatch(n *node, _ ...types.UID) ([]byte, error) {
+ accessor, err := gc.getCachedMetadata(n.identity.APIVersion, n.identity.Kind, n.identity.Namespace, n.identity.Name)
+ if err != nil {
+ return nil, err
+ }
+ expectedObjectMeta := ObjectMetaForPatch{}
@deads2k <https://github.com/deads2k> @liggitt
<https://github.com/liggitt> do you have better ideas to construct json
merge patch? We used ObjectMetaForPatch to hold patch contents instead of
using ObjectMeta directly because some optional fields in ObjectMeta is
not pointer, so they are serialized as <field name>: null instead of
being omitted, so the resulting json merge patch would be wrong.
/approve
For the test.
…On Tue, Jun 5, 2018, 5:53 PM Jordan Liggitt ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In test/e2e/apimachinery/garbage_collector.go
<#63386 (comment)>
:
> @@ -998,6 +998,109 @@ var _ = SIGDescribe("Garbage collector", func() {
}
})
+ It("should support orphan deletion of custom resources", func() {
+ config, err := framework.LoadConfig()
+ if err != nil {
+ framework.Failf("failed to load config: %v", err)
+ }
+
+ apiExtensionClient, err := apiextensionsclientset.NewForConfig(config)
+ if err != nil {
+ framework.Failf("failed to initialize apiExtensionClient: %v", err)
+ }
+
+ // Create a random custom resource definition and ensure it's available for
+ // use.
+ definition := apiextensionstestserver.NewRandomNameCustomResourceDefinition(apiextensionsv1beta1.ClusterScoped)
Maybe GC works for namespaced owner and cluster-scoped dependents in most
cases.
Specifying incorrect ownerref data and racing the uid index is definitely
not safe, reliable, or supported. In cases where GC falls back to a live
lookup of the owner before deleting, I think it would get a 404 from the
incorrect (no namespace) reference, properly assume the pod with namespace
"" and name "foo" did not exist, and delete the cluster scoped object.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: caesarxuchao, jennybuckley, lavalamp, roycaihw The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Review the full test history for this PR. Silence the bot with an |
I added some labels. If this misses the release, we will cherrypick. I think it's important that GC works on CRs. |
/priority critical-urgent |
/remove-priority important-soon |
[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process @caesarxuchao @jennybuckley @roycaihw Pull Request Labels
|
/retest |
/test all [submit-queue is verifying that this PR is safe to merge] |
/test pull-kubernetes-node-e2e |
@roycaihw: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Automatic merge from submit-queue (batch tested with PRs 63386, 64624, 62297, 64847). If you want to cherry-pick this change to another branch, please follow the instructions here. |
What this PR does / why we need it:
Let garbage collector fallback to use json merge patch when strategic merge patch returns 415. This enables orphan delete on custom resources.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #56348
Special notes for your reviewer:
This PR is developed based on #56595. Ref #56606 for more information.
Release note:
/sig api-machinery