Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ListPager.EachListItem util #75849

Merged
merged 1 commit into from Apr 11, 2019

Conversation

@jpbetz
Copy link
Contributor

commented Mar 28, 2019

Introduce a ListPager.EachListItem convenience utility for incrementally processing chunked List results. This makes it easy for a client to only request list chunks from the apiserver that it can actually keep up with processing. EachListItem buffers up to PageBufferSize (default: 10) pages in the background to minimize foreground wait time.

Example usage:

podPager = pager.New(pager.SimplePageFunc(func(opts metav1.ListOptions) (runtime.Object, error) {                                                                                                                 
  return client.CoreV1().Pods().List(opts)                                                                                                                                                   
}))
podPager.EachListItem(ctx, metav1.ListOptions{}, func(obj runtime.Object) error {                                                                                                       
  if pod, ok := obj.(*v1.Pod); ok {
    doSomethingWithThePod(pod)
  } else { ...}
  ...
}

There are some warts with this approach that I'd like to find a way of smoothing over, namely: (1) having to creating the pager (2) having to cast each item to the correct type. But that should be possible to add via code generators in the future.

The name and function signature were picked to be consistent with the existing meta.EachListItem function.

Add ListPager.EachListItem utility function to client-go to enable incremental processing of chunked list responses

/kind feature
/priority important-longterm
/sig api-machinery

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Mar 28, 2019

@smarterclayton if you get a minute would you glance at the ListPager.EachListItem function I'm proposing here?

@jpbetz jpbetz changed the title Paginate node List in PodGC controller, add pager.EachListItem util Paginate node List in PodGC controller, add ListPager.EachListItem util Mar 28, 2019

@jpbetz jpbetz changed the title Paginate node List in PodGC controller, add ListPager.EachListItem util [WIP] Paginate node List in PodGC controller, add ListPager.EachListItem util Mar 29, 2019

@jpbetz jpbetz changed the title [WIP] Paginate node List in PodGC controller, add ListPager.EachListItem util Add ListPager.EachListItem util Apr 1, 2019

@jpbetz jpbetz force-pushed the jpbetz:pagination-podgc branch from 3cdd565 to 5de2b33 Apr 1, 2019

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 1, 2019

/cc @jingyih

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Apr 1, 2019

@jpbetz: GitHub didn't allow me to request PR reviews from the following users: jingyih.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @jingyih

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jpbetz jpbetz force-pushed the jpbetz:pagination-podgc branch 2 times, most recently from 295fa6a to 6708a75 Apr 1, 2019

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 2, 2019

/retest

1 similar comment
@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 2, 2019

/retest

@jpbetz jpbetz force-pushed the jpbetz:pagination-podgc branch from 6708a75 to bb6a508 Apr 2, 2019

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 2, 2019

/retest

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 2, 2019

/cc @smarterclayton @jingyih @wojtek-t Looking for reviewers for this, would any of you have time to give it a look?

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 4, 2019

Friendly nudge for review

@jingyih
Copy link
Contributor

left a comment

I think I generally understand the approach. Added a few comments. Thanks!

@@ -115,3 +127,86 @@ func (p *ListPager) List(ctx context.Context, options metav1.ListOptions) (runti
options.Continue = m.GetContinue()
}
}

// EachListItem invokes fn on each runtime.Object in the list. Any error immediately terminates the

This comment has been minimized.

Copy link
@jingyih

jingyih Apr 5, 2019

Contributor

It might be helpful to add more details here.

Function description says it invokes fn on each object in the list. In function signature, there is no list. So it has to be generated (retrieved) internally in this function.

The listing inside this function is similar but not identical to (*ListPager) List(). In the sense that it does not fall back to full list on resource expire error. But I feel user may assume inside this function, listpager is using the same mechanism to list? You already mentioned a "Expired" error may be returned. I just feel a little more clarification on the internal listing mechanism might be helpful.

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

Thanks, I've rewritten this with more detail.

return nil
})
if err == stoppedErr {
err = nil // stoppedErr is an internal signal

This comment has been minimized.

Copy link
@jingyih

jingyih Apr 5, 2019

Contributor

I understand what this line does, returning due to been signaled to stop is not actually an error.

Do we need to define an internal error? Can we use context cancel function to signal eachListChunk() to stop? Upon closing stopC, fn can return nil (line 202), but the loop inside eachListChunk() will be stopped by ctx.Done().

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

You're right. A nested context works here. Since the context error should be return iff the caller cancels the context, reasoning through when/how cancel error get's handled is a bit subtle and I've added a comment explaining it. I've also added test coverage for cancelation, both for then a fn error results in an early exit and when the caller calls cancel.

bgResultC <- err
}()

for o := range chunkC {

This comment has been minimized.

Copy link
@jingyih

jingyih Apr 5, 2019

Contributor

If listing chunk in the background fails (such as due to resource expired), we will still finish processing the existing chunks in the buffer before return?

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

Yes, I think from an API contract perspective, when a error occurs listing chunks, i think it's valid to either (1) stop calling fn as soon as possible, or (2) call fn on as many items as have been successfully retrieved. I went with #1 since it was trivial to implement. I imagine in different situations a client might benefit more from one of these and in other situations the other. Thoughts?


stopC := make(chan struct{})
bgResultC := make(chan error, 1)
go func() {

This comment has been minimized.

Copy link
@liggitt

liggitt Apr 5, 2019

Member

defer runtime.HandleCrash() at the top of the goroutine?

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

Good catch, I should do something with panics. I've gone with runtime.RecoverFromPanic here. Hopefully it's appropriate.

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

runtime.RecoverFromPanic didn't work as I had expected (and from a quick grep, doesn't look like we use it in the k8s codebase), so I did a more direct recover.

This comment has been minimized.

Copy link
@wojtek-t

wojtek-t Apr 10, 2019

Member

Why you can't use runtime.HandleCrash() - it's pretty widely used in the codebase and allows to use additional custom handler too.

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 10, 2019

Author Contributor

I had initially misunderstood it to catch the panic and swallow it. But it does still crash. Updating the PR to use it now.

// PageBufferSize chunks of list items concurrently in the background. If the chunk attempt fails a "Expired"
// error may be returned.
func (p *ListPager) eachListChunkBuffered(ctx context.Context, options metav1.ListOptions, fn func(obj runtime.Object) error) error {
chunkC := make(chan runtime.Object, p.PageBufferSize)

This comment has been minimized.

Copy link
@liggitt

liggitt Apr 5, 2019

Member

error if p.PageBufferSize <= 0?

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

Yes, set it to < since 0 size buffer is supported (tests demo this).


// eachListChunkBuffered invokes fn on each runtimeObject list chunk. It buffers up to
// PageBufferSize chunks of list items concurrently in the background. If the chunk attempt fails a "Expired"
// error may be returned.

This comment has been minimized.

Copy link
@liggitt

liggitt Apr 5, 2019

Member

need to document the return value ... I think if fn returns an error, processing stops and that error is returned. if fn does not return an error, any error encountered while fetching the list (including timeout or cancelled errors from the context) is returned

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 5, 2019

Author Contributor

Thanks, yes, I've rewritten this with full details on errors.

@jpbetz jpbetz force-pushed the jpbetz:pagination-podgc branch from bb6a508 to b351760 Apr 5, 2019

@jpbetz jpbetz force-pushed the jpbetz:pagination-podgc branch 2 times, most recently from 5079ed7 to 0c80995 Apr 5, 2019

@jpbetz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2019

@jingyih @liggitt Feedback applied. PTAL!

@jpbetz jpbetz force-pushed the jpbetz:pagination-podgc branch from 0c80995 to 6a64ee6 Apr 10, 2019

@liggitt

This comment has been minimized.

Copy link
Member

commented Apr 11, 2019

/lgtm
/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Apr 11, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit d5dbc00 into kubernetes:master Apr 11, 2019

17 checks passed

cla/linuxfoundation jpbetz authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-godeps Job succeeded.
Details
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
@@ -48,6 +50,9 @@ type ListPager struct {
PageFn ListPageFunc

FullListIfExpired bool

// Number of pages to buffer
PageBufferSize int32

This comment has been minimized.

Copy link
@tedyu

tedyu Apr 11, 2019

Contributor

I wonder if the buffer should be defined by the memory consumption instead of number of pages.

We can evaluate the size of a few chunks to dynamically determine the number of chunks to buffer, based on desired buffer size (in terms of memory).

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 11, 2019

Author Contributor

Interesting. Maybe wait and see how this approach works out and then optimize as needed from there? We previously had quite a bit of code doing full lists, and so as we transition to paginated lists and this sort of incremental processing my expectation is we'll reduce memory usage, particularly for object kinds that have large counts. If we still hit scalability/performance limits once this is in use, that would seem like a good time to look into optimizing this further .

This comment has been minimized.

Copy link
@tedyu

tedyu Apr 11, 2019

Contributor

Sounds good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.