Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding RemainingItemCount to ListMeta #75993

Merged
merged 2 commits into from May 7, 2019

Conversation

@caesarxuchao
Copy link
Member

commented Apr 1, 2019

Adding ListMeta.RemainingItemCount. If the server has more data available for a LIST request, the ListOptions.RemainingItemCount is set to the number of remaining objects.

This doesn't add any additional traffic to etcd. The etcd server already included the count in the GET response.

Without this "count" feature, clients need to list through all pages to get a count.

Alternatives:

  1. etcd v3 API supports --count-only, and the Kubernetes storage layer does support Count(), so alternatively, we can add a COUNT verb to Kubernetes REST API, but that sounds like an overkill. With the API in this PR, clients can get a cheap count by setting listOptions.Limit=1.

/sig api-machinery
/kind api-change
/assign @jpbetz @lavalamp

-->

Adding ListMeta.RemainingItemCount. When responding a LIST request, if the server has more data available, and if the request does not contain label selectors or field selectors, the server sets the ListOptions.RemainingItemCount to the number of remaining objects.
@@ -42,8 +42,9 @@ type Versioner interface {
// UpdateList sets the resource version into an API list object. Returns an error if the object
// cannot be updated correctly. May return nil if the requested object does not need metadata
// from database. continueValue is optional and indicates that more results are available if
// the client passes that value to the server in a subsequent call.
UpdateList(obj runtime.Object, resourceVersion uint64, continueValue string) error
// the client passes that value to the server in a subsequent call. count is optional, and

This comment has been minimized.

Copy link
@lavalamp

lavalamp Apr 1, 2019

Member

count doesn't seem optional?

@@ -88,6 +88,8 @@ var _ = SIGDescribe("Servers with support for API chunking", func() {
lastRV = list.ResourceVersion
}
Expect(list.ResourceVersion).To(Equal(lastRV))
framework.Logf("CHAO: limit is %d, found is %d", opts.Limit, found)

This comment has been minimized.

Copy link
@lavalamp

lavalamp Apr 1, 2019

Member

debug statement :)


// the number of total matching objects in the storage starting from
// the continue key in the listOptions (not the continue in this list
// metadata). If the continue key in the listOptions is empty, count is

This comment has been minimized.

Copy link
@lavalamp

lavalamp Apr 1, 2019

Member

If no pagination was requested, then there's no need for the server to set this, is there?

Consider calling this field "RemainingItems". The description can then say "If this list is partial (due to pagination), then this field counts how many more items there are in the collection."

We can do math on the count etcd gives us to make sure the items sent in the list are not included in the remainingItems field. Mostly I just think it's difficult to understand at the moment.

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 2, 2019

Contributor

Given either a total count or a remaining count value, a client can calculate the other. So probably best to only introduce one of them. Remaining could be implemented like @lavalamp suggests, a total count could be implemented by getting the count from etcd on the initial list call (with or without a limit) and then remembering somehow (e.g. hiding it in the continue token) since subsequent etcd list calls will start at a range offset and won't provide the same total. Having a total count seems simple and consistent to me, although it's not actually needed when no limit is provided since it can be calculated.

This comment has been minimized.

Copy link
@lavalamp

lavalamp Apr 2, 2019

Member

I think remaining is more useful, and more convenient, since we don't have to make additional etcd calls.

Since it's possible to start in the middle of a collection, the total count isn't very helpful: you can't take a total count and subtract the number of items you've seen/processed to get the remaining number of items, because items could have been added or removed prior to the page you're getting.

So, the useful thing for clients is to track how many items they have processed, and let the server tell them how many more items there are at the moment. You can use this to e.g. estimate a completion time.

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 2, 2019

Contributor

Continue provides a point-in-time view of a list; any items added or removed once the paginated list is initiated are not visible, so the item count stay the same across the entire paginated listing operation. No additional etcd calls are needed. But remaining will certainly work, count just feels like a more fundamental base fact to me.

This comment has been minimized.

Copy link
@lavalamp

lavalamp Apr 2, 2019

Member

No-- it's not clear in the documentation above, but if your continue key has expired, the system will give you a new one along with the error message. You can use this feature to resume in the middle of a long list.

@fejta-bot

This comment has been minimized.

Copy link

commented Apr 2, 2019

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.


// the total number of matching keys in the storage. Count equals to
// the number of items in the list if Continue is empty.
optional int64 count = 4;

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 2, 2019

Contributor

Count might not be the best name for this given that it has a different meaning when continue is set. If we do use 'count', should clearly document what it means when continue is set here too.


// the number of total matching objects in the storage starting from
// the continue key in the listOptions (not the continue in this list
// metadata). If the continue key in the listOptions is empty, count is

This comment has been minimized.

Copy link
@jpbetz

jpbetz Apr 2, 2019

Contributor

Given either a total count or a remaining count value, a client can calculate the other. So probably best to only introduce one of them. Remaining could be implemented like @lavalamp suggests, a total count could be implemented by getting the count from etcd on the initial list call (with or without a limit) and then remembering somehow (e.g. hiding it in the continue token) since subsequent etcd list calls will start at a range offset and won't provide the same total. Having a total count seems simple and consistent to me, although it's not actually needed when no limit is provided since it can be calculated.

@caesarxuchao caesarxuchao force-pushed the caesarxuchao:add-count branch from afd0efd to 47e6551 Apr 2, 2019

@caesarxuchao caesarxuchao changed the title [PoC] Adding Count to ListMeta Adding RemainingItemCount to ListMeta Apr 2, 2019

@caesarxuchao caesarxuchao force-pushed the caesarxuchao:add-count branch from 47e6551 to 49bc55c Apr 2, 2019

@caesarxuchao

This comment has been minimized.

Copy link
Member Author

commented Apr 2, 2019

Thanks for the feedback. I've updated the field to "RemainingItemCount", implementing the semantics lavalamp had suggested.

@caesarxuchao

This comment has been minimized.

Copy link
Member Author

commented Apr 2, 2019

/retest

@caesarxuchao caesarxuchao force-pushed the caesarxuchao:add-count branch from 49bc55c to e4613f0 Apr 3, 2019

@k8s-ci-robot k8s-ci-robot added size/L and removed size/M labels Apr 3, 2019

@caesarxuchao

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

/retest

@caesarxuchao

This comment has been minimized.

Copy link
Member Author

commented Apr 8, 2019

/retest

// database. continueValue is optional and indicates that more results are available if the client
// passes that value to the server in a subsequent call. remainingItemCount indicates the number
// of remaining objects if the list is partial.
UpdateList(obj runtime.Object, resourceVersion uint64, continueValue string, remainingItemCount int64) error

This comment has been minimized.

Copy link
@lavalamp

lavalamp May 6, 2019

Member

I guess a 0 value for remainingItemCount will result in the field being not set at all?

This comment has been minimized.

Copy link
@caesarxuchao

caesarxuchao May 6, 2019

Author Member

Right. Added to the comment.

@caesarxuchao caesarxuchao force-pushed the caesarxuchao:add-count branch from dc581c3 to 27a8bb5 May 6, 2019

@caesarxuchao caesarxuchao force-pushed the caesarxuchao:add-count branch from 27a8bb5 to 32cc5e4 May 6, 2019

@caesarxuchao caesarxuchao force-pushed the caesarxuchao:add-count branch from 32cc5e4 to d001a0f May 6, 2019

@lavalamp

This comment has been minimized.

Copy link
Member

commented May 6, 2019

/approve
/lgtm

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm label May 6, 2019

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 6, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@caesarxuchao

This comment has been minimized.

Copy link
Member Author

commented May 6, 2019

/retest

@k8s-ci-robot k8s-ci-robot merged commit 1fb8ed1 into kubernetes:master May 7, 2019

19 of 20 checks passed

pull-kubernetes-kubemark-e2e-gce-big Job triggered.
Details
cla/linuxfoundation caesarxuchao authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details

@liggitt liggitt added this to Completed, 1.15 in API Reviews May 7, 2019

@liggitt

This comment has been minimized.

Copy link
Member

commented May 8, 2019

this broke chunk paging in kubectl

before this PR:

$ kubectl get ns --chunk-size=1 --no-headers
default           Active   5m53s
kube-node-lease   Active   5m57s
kube-public       Active   5m57s
kube-system       Active   5m57s

after:

$ kubectl get ns --chunk-size=1 --no-headers
default   Active   7m37s
@liggitt

This comment has been minimized.

Copy link
Member

commented May 8, 2019

fixed in #77580

@smarterclayton

This comment has been minimized.

Copy link
Contributor

commented May 13, 2019

Why did this not have a KEP? I am really, really, really surprised at that.

I have some comments inline.

// list response. If the list request contained label or field selectors, then the number of
// remaining items is unknown and this field will be unset. If the list is complete (either
// because it is unpaginated or because this is the last page), then there are no more remaining
// items and this field will also be unset. Servers older than v1.15 do not set this field.

This comment has been minimized.

Copy link
@smarterclayton

smarterclayton May 13, 2019

Contributor

This should go through at least one version as an alpha field and be flag gated.

You also need to describe what "do not set this field means".

You used the wrong field name in the godoc, it should be remainingItemCount.

You need to describe that this field is only set when chunking is requested (do not calling it pagination, that is not the name we use in doc).

This comment has been minimized.

Copy link
@smarterclayton

smarterclayton May 13, 2019

Contributor

This should be alpha in 1.15, and I'd like to see a KEP for this and updates to the public doc that describe how this is used (probably update the chunking KEP).

This comment has been minimized.

Copy link
@smarterclayton

smarterclayton May 13, 2019

Contributor

I'm not 100% sold on the name, remaining is novel in our use.

@lavalamp

This comment has been minimized.

Copy link
Member

commented May 13, 2019

Sorry, I thought this was mentioned in one of the storage migration KEPs (https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery). I agree adding it to the chunking KEP would be nice.

Since this is basically 100% an API change, I think the API review process is better than the KEP process. Requiring a KEP for every API change seems a little heavy.

I don't know that I agree that it needs to start out alpha, what will alpha teach us that we don't already know?

@caesarxuchao

This comment has been minimized.

Copy link
Member Author

commented May 14, 2019

@smarterclayton I sent kubernetes/community#3710 to update the original api chunking proposal and incorporated your suggestions regarding the API documentation. Sorry for not updating the proposal first.

I'll update the k/k code once kubernetes/community#3710 is merged.

@smarterclayton

This comment has been minimized.

Copy link
Contributor

commented May 14, 2019

@lavalamp

This comment has been minimized.

Copy link
Member

commented May 14, 2019

@smarterclayton

This comment has been minimized.

Copy link
Contributor

commented May 14, 2019

@lavalamp

This comment has been minimized.

Copy link
Member

commented May 15, 2019

I agree that APIs are forever, but if our alpha process doesn't have a chance of affecting the beta, it is just responsibility theater. But, I have no time to argue this and we can put on a show if it makes folks feel better. I do understand that sometimes the appearance of responsibility is needed along with actual responsibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.