Switch to empty PATCH when migrating resources #65

Closed
dprotaso opened this issue Apr 5, 2020 · 12 comments
@dprotaso (Contributor) commented Apr 5, 2020

Looking here:

func (m *migrator) get(ctx context.Context, namespace, name string) (*unstructured.Unstructured, error) {
    // if namespace is empty, .Namespace(namespace) is ineffective.
    return m.client.
        Resource(m.resource).
        Namespace(namespace).
        Get(ctx, name, metav1.GetOptions{})
}

func (m *migrator) put(ctx context.Context, namespace string, obj *unstructured.Unstructured) (*unstructured.Unstructured, error) {
    // if namespace is empty, .Namespace(namespace) is ineffective.
    return m.client.
        Resource(m.resource).
        Namespace(namespace).
        Update(ctx, obj, metav1.UpdateOptions{})
}

I notice that the migrator uses Updates/PUTs to perform the storage migration. Is it necessary to do the full PUT? This presentation makes it seem like an empty patch would suffice.
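
For illustration, here's a minimal sketch of what a PATCH-based counterpart to put() could look like. This is an assumption on my part, not the migrator's actual code; it uses the dynamic client's Patch method with an empty merge patch body, and needs the types package from k8s.io/apimachinery/pkg/types:

func (m *migrator) patch(ctx context.Context, namespace, name string) (*unstructured.Unstructured, error) {
    // An empty merge patch ("{}") changes nothing in the object, but should
    // still cause the apiserver to re-encode and re-store it at the current
    // storage version.
    return m.client.
        Resource(m.resource).
        Namespace(namespace).
        Patch(ctx, name, types.MergePatchType, []byte("{}"), metav1.PatchOptions{})
}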

@caesarxuchao (Member)

I can't remember why we chose to do a full GET-UPDATE cycle instead of an empty patch. @roycaihw @lavalamp do you remember?

@lavalamp

Would an empty patch actually trigger the server to do a write?

@dprotaso (Contributor, Author)

I've manually tested it with the tool I wrote for Knative here.

My assumption is that you can only remove entries from the CRD's status.storedVersions list if there are no resources still stored at that version.

After the tool patches all the resources, I'm able to drop the older storage versions, so I believe PATCHing causes a server write. If not then...
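
For reference, a rough sketch (my own, not the migrator's or the Knative tool's code) of how the older entries could be dropped from status.storedVersions once everything has been rewritten. It assumes the apiextensions clientset (k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset); the function and parameter names are placeholders:

// Hypothetical helper: trim status.storedVersions down to the current
// storage version. Only safe after every object has been rewritten at
// that version, since the apiserver does not appear to check the stored
// CRs for you (see the comment below).
func dropOldStoredVersions(ctx context.Context, client clientset.Interface, crdName, keep string) error {
    crds := client.ApiextensionsV1().CustomResourceDefinitions()
    crd, err := crds.Get(ctx, crdName, metav1.GetOptions{})
    if err != nil {
        return err
    }
    crd.Status.StoredVersions = []string{keep}
    _, err = crds.UpdateStatus(ctx, crd, metav1.UpdateOptions{})
    return err
}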

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Sep 9, 2020
@roycaihw (Member)

you can only remove entries from the CRD's status.storedVersions list if there are no resources still stored at that version.

I don't think the apiserver checks/tracks all the CRs before updating the CRD

@roycaihw (Member)

I cannot think of a reason for not doing an empty patch either. The only short-circuit that I'm aware of happens after encoding (and conversion). I tried empty-patching a stale CR (no-op conversion), and the write didn't get short-circuited. I assume that's because the encoding sets the apiVersion of the new object to be different from the old data.

@dprotaso (Contributor, Author)

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label on Sep 16, 2020
@roycaihw (Member)

/assign

Let me add some tests and do the switch.

@roycaihw (Member) commented Oct 5, 2020

Talked with @caesarxuchao. The reason we did UPDATE instead of PATCH is that UPDATE is more general: there can be aggregated apiservers that don't implement patchers (e.g. jsonPatcher) and therefore don't support PATCH.

We also discussed the existing UPDATE code and found that we can improve performance by avoiding the retry on conflict (#78; note that we already skip the initial GET before UPDATE). The throughput for UPDATE vs. PATCH would then be:

  • UPDATE: list all items, one update request for each item
  • PATCH: list all items, one patch request for each item

So the performance would be similar, while UPDATE is easier for aggregated apiservers to support.
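
For illustration, a rough sketch of that "one LIST, then one UPDATE per item with no retry on conflict" flow, written against the dynamic client. This is my own code, not the migrator's implementation, and it assumes the usual client-go imports (context, k8s.io/client-go/dynamic, k8s.io/apimachinery/pkg/runtime/schema, metav1, and apierrors = k8s.io/apimachinery/pkg/api/errors):

func migrateByUpdate(ctx context.Context, client dynamic.Interface, gvr schema.GroupVersionResource) error {
    // One LIST for all items (an empty namespace lists across all namespaces).
    list, err := client.Resource(gvr).Namespace(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
    if err != nil {
        return err
    }
    for i := range list.Items {
        item := &list.Items[i]
        // One UPDATE per item, no retry: a conflict means someone else wrote
        // the object after our LIST, which already re-stored it at the current
        // storage version, so it can be skipped (likewise for objects deleted
        // in the meantime).
        _, err := client.Resource(gvr).Namespace(item.GetNamespace()).Update(ctx, item, metav1.UpdateOptions{})
        if err != nil && !apierrors.IsConflict(err) && !apierrors.IsNotFound(err) {
            return err
        }
    }
    return nil
}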

/close

@k8s-ci-robot (Contributor)

@roycaihw: Closing this issue.

In response to the /close in the comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dprotaso (Contributor, Author) commented Oct 6, 2020

Doesn't UPDATE require clients to send the entire resource?

@caesarxuchao (Member)

Doesn't UPDATE require clients to send the entire resource?

Yes. However, because the list operation already downloads the entire resource, the UPDATE only roughly doubles the bandwidth cost; it's not an order-of-magnitude difference.

Converting to PATCH does improve performance, but it also means we need to handle more corner cases like the one mentioned in #65 (comment), so converting to PATCH is not our current priority.

That said, if using UPDATE is causing a performance bottleneck in production, please let us know; we are willing to improve the performance.
