Adjust LIST work estimator to match current code #104599

Merged
merged 1 commit into from
Sep 3, 2021
@@ -59,7 +59,7 @@ func (e *listWorkEstimator) estimate(r *http.Request) WorkEstimate {
}
isListFromCache := !shouldListFromStorage(query, &listOptions)

count, err := e.countGetterFn(key(requestInfo))
numStored, err := e.countGetterFn(key(requestInfo))
switch {
case err == ObjectCountStaleErr:
// object count going stale is indicative of degradation, so we should
@@ -82,23 +82,23 @@ func (e *listWorkEstimator) estimate(r *http.Request) WorkEstimate {
return WorkEstimate{Seats: maximumSeats}
}

// TODO: For resources that implement indexes at the watchcache level,
// we need to adjust the cost accordingly
limit := numStored
if utilfeature.DefaultFeatureGate.Enabled(features.APIListChunking) && listOptions.Limit > 0 &&
listOptions.Limit < numStored {
limit = listOptions.Limit
Member

This isn't true for listing from cache - see my comment in the doc.

Member Author

Oh wow, I hadn't noticed. That seems like pretty strange behavior to me. Is it a bug? Given that we are struggling with the costs of long list results, maybe the cacher should limit its response in this case?

Member Author

Revised estimator to match current code.

Member

> Is it a bug? Given that we are struggling with the costs of long list results, maybe the cacher should limit its response in this case?

It was intentional (although I agree it isn't intuitive).
The reasoning behind it is that in large clusters, when node agents were listing stuff from etcd (e.g. to restore after a control-plane outage, when those lists were somewhat coordinated across many nodes), it was completely killing etcd.
And given that listing from the watchcache is an explicit opt-in, ignoring the limit is slightly less risky: people need to explicitly opt in to that behavior, so hopefully they understand the consequences.

With P&F giving us better protection (once we have good list support) [and a couple of other things that happened over the last few years], we may be able to revise it, but that requires much deeper testing and should be its own effort.

[FTR - once solved, we would be able to actually graduate https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/365-paginated-lists to stable, as somehow solving the pagination aspect for the watchcache is currently the only blocker IIRC.]

}

var estimatedObjectsToBeProcessed int64

switch {
case isListFromCache:
// if we are here, count is known
estimatedObjectsToBeProcessed = count
// TODO: For resources that implement indexes at the watchcache level,
// we need to adjust the cost accordingly
estimatedObjectsToBeProcessed = numStored
case listOptions.FieldSelector != "" || listOptions.LabelSelector != "":
estimatedObjectsToBeProcessed = numStored + limit
default:
// Even if a selector is specified and we may need to list and go over more objects from etcd
// to produce the result of size <limit>, each individual chunk will be of size at most <limit>.
// As a result, the work estimate of the request should be computed based on <limit> and the actual
// cost of processing more elements will be hidden in the request processing latency.
estimatedObjectsToBeProcessed = listOptions.Limit
if estimatedObjectsToBeProcessed == 0 {
// limit has not been specified, fall back to count
estimatedObjectsToBeProcessed = count
}
estimatedObjectsToBeProcessed = 2 * limit
}

// for now, our rough estimate is to allocate one seat to each 100 objects that
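The arithmetic behind the revised switch can be sketched in isolation. This is a minimal, hedged sketch and not the estimator's actual code: `estimateListSeats`, `objectsPerSeat`, and `maxSeats` are hypothetical names, the `APIListChunking` feature gate is assumed to be enabled, the cap of 10 is only a placeholder for `maximumSeats` (whose value is not shown in this diff), and rounding up is inferred from the expected seat counts in the test cases of this change.

```go
package main

import "fmt"

// Assumed constants. objectsPerSeat follows the "one seat to each 100
// objects" comment in the diff; maxSeats is a placeholder for maximumSeats.
const (
	objectsPerSeat int64 = 100
	maxSeats       int64 = 10
)

// estimateListSeats mirrors the shape of the revised switch:
//   - from cache:          every stored object is processed (limit is ignored)
//   - with a selector:     numStored + limit
//   - otherwise (storage): 2 * limit
//
// requestedLimit == 0 means the client did not set a limit.
func estimateListSeats(numStored, requestedLimit int64, fromCache, hasSelector bool) int64 {
	// The effective limit is the requested one only when it is set and
	// smaller than the number of stored objects.
	limit := numStored
	if requestedLimit > 0 && requestedLimit < numStored {
		limit = requestedLimit
	}

	var objects int64
	switch {
	case fromCache:
		objects = numStored
	case hasSelector:
		objects = numStored + limit
	default:
		objects = 2 * limit
	}

	// One seat per objectsPerSeat objects, rounded up and capped.
	seats := (objects + objectsPerSeat - 1) / objectsPerSeat
	if seats > maxSeats {
		seats = maxSeats
	}
	return seats
}

func main() {
	// limit=399 served from storage, 699 stored: 2*399 = 798 objects.
	fmt.Println(estimateListSeats(699, 399, false, false))
	// No limit, served from cache, 699 stored: 699 objects.
	fmt.Println(estimateListSeats(699, 0, true, false))
	// limit=299 served from cache, 399 stored: limit ignored, 399 objects.
	fmt.Println(estimateListSeats(399, 299, true, false))
}
```

Under these assumptions the three calls yield 8, 7, and 4 seats, which matches the `seatsExpected` values of the corresponding test cases in this change.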
@@ -61,84 +61,97 @@ func TestWorkEstimator(t *testing.T) {
seatsExpected: maximumSeats,
},
{
name: "request verb is list, resource version not set",
requestURI: "http://server/apis/foo.bar/v1/events?limit=499",
name: "request verb is list, has limit and resource version is 1",
requestURI: "http://server/apis/foo.bar/v1/events?limit=399&resourceVersion=1",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
counts: map[string]int64{
"events.foo.bar": 799,
"events.foo.bar": 699,
},
seatsExpected: 5,
seatsExpected: 8,
},
{
name: "request verb is list, continuation is set",
requestURI: "http://server/apis/foo.bar/v1/events?continue=token&limit=499&resourceVersion=1",
name: "request verb is list, limit not set",
requestURI: "http://server/apis/foo.bar/v1/events?resourceVersion=1",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
counts: map[string]int64{
"events.foo.bar": 799,
"events.foo.bar": 699,
},
seatsExpected: 5,
seatsExpected: 7,
},
{
name: "request verb is list, has limit",
requestURI: "http://server/apis/foo.bar/v1/events?limit=499&resourceVersion=1",
name: "request verb is list, resource version not set",
requestURI: "http://server/apis/foo.bar/v1/events?limit=399",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
counts: map[string]int64{
"events.foo.bar": 799,
"events.foo.bar": 699,
},
seatsExpected: 5,
seatsExpected: 8,
},
{
name: "request verb is list, resource version is zero",
requestURI: "http://server/apis/foo.bar/v1/events?resourceVersion=0",
name: "request verb is list, no query parameters, count known",
requestURI: "http://server/apis/foo.bar/v1/events",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
counts: map[string]int64{
"events.foo.bar": 799,
"events.foo.bar": 399,
},
seatsExpected: 8,
},
{
name: "request verb is list, no query parameters, count known",
name: "request verb is list, no query parameters, count not known",
requestURI: "http://server/apis/foo.bar/v1/events",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
countErr: ObjectCountNotFoundErr,
seatsExpected: maximumSeats,
},
{
name: "request verb is list, continuation is set",
requestURI: "http://server/apis/foo.bar/v1/events?continue=token&limit=399",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
counts: map[string]int64{
"events.foo.bar": 799,
"events.foo.bar": 699,
},
seatsExpected: 8,
},
{
name: "request verb is list, no query parameters, count not known",
requestURI: "http://server/apis/foo.bar/v1/events",
name: "request verb is list, resource version is zero",
requestURI: "http://server/apis/foo.bar/v1/events?limit=299&resourceVersion=0",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
countErr: ObjectCountNotFoundErr,
seatsExpected: maximumSeats,
counts: map[string]int64{
"events.foo.bar": 399,
},
seatsExpected: 4,
},
{
name: "request verb is list, resource version match is Exact",
requestURI: "http://server/apis/foo.bar/v1/events?resourceVersion=foo&resourceVersionMatch=Exact&limit=499",
name: "request verb is list, resource version is zero, no limit",
requestURI: "http://server/apis/foo.bar/v1/events?resourceVersion=0",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
@@ -147,7 +160,20 @@ func TestWorkEstimator(t *testing.T) {
counts: map[string]int64{
"events.foo.bar": 799,
},
seatsExpected: 5,
seatsExpected: 8,
},
{
name: "request verb is list, resource version match is Exact",
requestURI: "http://server/apis/foo.bar/v1/events?resourceVersion=foo&resourceVersionMatch=Exact&limit=399",
requestInfo: &apirequest.RequestInfo{
Verb: "list",
APIGroup: "foo.bar",
Resource: "events",
},
counts: map[string]int64{
"events.foo.bar": 699,
},
seatsExpected: 8,
},
{
name: "request verb is list, resource version match is NotOlderThan, limit not specified",
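The seat counts in these cases depend on whether a request is served from the watch cache or from storage, which the estimator decides through `shouldListFromStorage` (not shown in this diff). The sketch below is an assumption about that decision, reconstructed only from the query parameters used in the test URIs and the expected seat counts; `listFromStorage` and `servedFromCache` are hypothetical helpers, and the `APIListChunking` feature gate is assumed to be enabled.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// listFromStorage is a hypothetical stand-in for shouldListFromStorage.
// Assumed rules: a request goes to storage when it has no resourceVersion,
// carries a continue token, sets a limit with a resourceVersion other
// than "0", or asks for resourceVersionMatch=Exact.
func listFromStorage(q url.Values) bool {
	rv := q.Get("resourceVersion")
	// A missing or malformed limit is treated as "no limit" in this sketch.
	limit, _ := strconv.ParseInt(q.Get("limit"), 10, 64)
	hasContinuation := q.Get("continue") != ""
	hasLimit := limit > 0 && rv != "0"
	exactMatch := q.Get("resourceVersionMatch") == "Exact"
	return rv == "" || hasContinuation || hasLimit || exactMatch
}

// servedFromCache parses a request URI and reports whether, under the
// assumptions above, the list would be answered from the watch cache.
func servedFromCache(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	return !listFromStorage(u.Query()), nil
}

func main() {
	uris := []string{
		"http://server/apis/foo.bar/v1/events?limit=399&resourceVersion=1",
		"http://server/apis/foo.bar/v1/events?resourceVersion=1",
		"http://server/apis/foo.bar/v1/events?limit=299&resourceVersion=0",
		"http://server/apis/foo.bar/v1/events",
	}
	for _, uri := range uris {
		cached, err := servedFromCache(uri)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Println(cached, uri)
	}
}
```

Read this way, `limit=299&resourceVersion=0` is an explicit opt-in to the watch cache, so the limit is ignored and all 399 stored objects are counted, which is the behavior discussed in the review thread.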