
Started drafting the skeleton of the request management filter #78966

Conversation

@MikeSpreitzer (Member) commented Jun 13, 2019

What type of PR is this?

/kind feature

What this PR does / why we need it:
This PR introduces the top level of the new filter.

Which issue(s) this PR fixes:

This PR does not fix any issue by itself; it contributes to kubernetes/enhancements#1040.

Special notes for your reviewer:
This PR builds on #77048

Does this PR introduce a user-facing change?:

The full feature will definitely require a release note.
@k8s-ci-robot k8s-ci-robot requested review from wojtek-t, yue9944882, lavalamp, adohe and deads2k and removed request for caesarxuchao Jun 13, 2019

@k8s-ci-robot (Contributor) commented Jun 13, 2019

@MikeSpreitzer: GitHub didn't allow me to request PR reviews from the following users: aaron-prindle, KevinWang15, wu8685.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/uncc @caesarxuchao
/uncc @brendanburns
/cc @yue9944882
/cc @lavalamp
/cc @deads2k
/cc @aaron-prindle
/cc @wojtek-t
/cc @wu8685
/cc @adohe
/cc @KevinWang15

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fedebongio (Contributor) commented Jun 13, 2019

/assign @aaron-prindle

@k8s-ci-robot (Contributor) commented Jun 13, 2019

@fedebongio: GitHub didn't allow me to assign the following users: aaron-prindle.

Note that only kubernetes members and repo collaborators can be assigned and that issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @aaron-prindle

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aaron-prindle (Contributor) commented Jun 13, 2019

for reference: #76846

@MikeSpreitzer (Member, Author) commented Jun 14, 2019

@aaron-prindle: yes, I know that the skeleton work is officially on your plate, and would be happy to see you take it from here. I see that you have been focused on doing some bottom-up work, and this PR actually originated as an attempt at a particularly constructive review of that. I wanted to work out my opinion of how we should organize our queuing and dispatching concepts to fit into the context of making an apiserver filter. That is the spirit in which this is offered.

@aaron-prindle (Contributor) commented Jun 23, 2019

@MikeSpreitzer this is great! My original comment was just for keeping tabs on all the work being done in the Umbrella Tracking issue.

@MikeSpreitzer MikeSpreitzer force-pushed the MikeSpreitzer:feat/rate-limiting-skel branch from e7e6052 to bff37f2 Jul 16, 2019

@k8s-ci-robot k8s-ci-robot added size/L and removed size/XXL labels Jul 16, 2019

@k8s-ci-robot (Contributor) commented Jul 16, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MikeSpreitzer

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@MikeSpreitzer (Member, Author) commented Jul 16, 2019

/test pull-kubernetes-integration

@MikeSpreitzer (Member, Author) commented Jul 17, 2019

My last push was rebasing after the API object types PR (#77048) merged.

handler http.Handler,
loopbackClientConfig *restclient.Config,
serverConcurrencyLimit int,
requestWaitLimit time.Duration,

@yue9944882 (Member) commented Jul 18, 2019

requestWaitLimit: is this meant as a timeout for requests pending in the queues? There is already a timeout handler filter; we probably don't want the two to overlap.

@MikeSpreitzer (Member, Author) commented Jul 18, 2019

This parameter is introduced in https://github.com/kubernetes/enhancements/blob/79119ce549cda9cdfdb532273475ea5090fa5415/keps/sig-api-machinery/20190228-priority-and-fairness.md#latency-protection . In particular, I think it must be set to something substantially shorter than the limit imposed by the timeout filter. It does not make sense to allow a request to wait in a queue for longer than the total time allowed for waiting+executing. In fact, I think the wait duration limit should be short enough that the majority of the overall duration limit is available for request execution.

@yue9944882 (Member) commented Jul 22, 2019

I think it must be set to something substantially shorter than the limit imposed by the timeout filter

I see; we can check that when validating the flags. What do you think the default value for this timeout should be?

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

I suggest scaling it with the upstream timeout: set the wait duration limit to 1/4 of the upstream timeout.
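For concreteness, a minimal sketch of that suggestion in Go; the function name is illustrative, not something this PR defines:

```go
package main

import (
	"fmt"
	"time"
)

// deriveRequestWaitLimit returns the suggested queue-wait limit: one quarter
// of the upstream request timeout, leaving the other three quarters of the
// overall duration limit available for request execution.
func deriveRequestWaitLimit(upstreamTimeout time.Duration) time.Duration {
	return upstreamTimeout / 4
}

func main() {
	// kube-apiserver's --request-timeout defaults to 1m0s.
	fmt.Println(deriveRequestWaitLimit(60 * time.Second)) // prints 15s
}
```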

func WithRequestManagement(
handler http.Handler,
loopbackClientConfig *restclient.Config,
serverConcurrencyLimit int,

@yue9944882 (Member) commented Jul 18, 2019
 --max-mutating-requests-inflight int           The maximum number of mutating requests in flight at a given time. When the server exceeds this, it rejects requests. Zero for no limit. (default 200)
 --max-requests-inflight int                    The maximum number of non-mutating requests in flight at a given time. When the server exceeds this, it rejects requests. Zero for no limit. (default 400)

If I understand correctly, the limit value is read from these flags, i.e. serverConcurrencyLimit = max-mutating-requests-inflight + max-requests-inflight. Is that right?
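If that reading is right, the wiring could look like the following sketch; the function name is illustrative and not necessarily how the PR computes it:

```go
package main

import "fmt"

// serverConcurrencyLimit folds the two existing in-flight limits into the
// single concurrency budget handed to the request-management filter.
func serverConcurrencyLimit(maxRequestsInFlight, maxMutatingRequestsInFlight int) int {
	return maxRequestsInFlight + maxMutatingRequestsInFlight
}

func main() {
	// Using the defaults quoted in the flag help above: 400 + 200.
	fmt.Println(serverConcurrencyLimit(400, 200)) // prints 600
}
```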


// TypedConfigObjectReference is a reference to a relevant config API object.
// No namespace is needed because none of these objects is namespaced.
type TypedConfigObjectReference struct {

@yue9944882 (Member) commented Jul 18, 2019

Does it have to be a public/exported type?

@MikeSpreitzer (Member, Author) commented Jul 18, 2019

I think the division of this stuff among packages is still TBD, so I have not worried about making the right decision for each name yet.

configQueue workqueue.RateLimitingInterface

// numConfigWorkers is the configured number of workers for the config queue
numConfigWorkers int

@yue9944882 (Member) commented Jul 18, 2019

I think this is a bit of overkill. It is a good idea to load config (i.e., PL) changes asynchronously, but do we really need a configurable number of workers in the system? One worker should be sufficient, I presume, because we do not expect many PLs or frequent PL changes.

@MikeSpreitzer (Member, Author) commented Jul 18, 2019

Yes, actually I think it is positively good to have just one worker; that eliminates the need for synchronization among workers. I will update.
There will still be the need for some synchronization between the config queue worker and the handler for empty notifications from FairQueuingSystems, but that is a relatively narrow issue.
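A minimal sketch of the single-worker arrangement, using client-go's workqueue; the type and method names are illustrative rather than the exact ones in this PR:

```go
package filters

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/util/workqueue"
)

type requestManagement struct {
	configQueue workqueue.RateLimitingInterface
}

// syncOne would re-read the desired config and apply it; stubbed here.
func (reqMgmt *requestManagement) syncOne() bool { return true }

// Run starts exactly one config worker, so no synchronization among config
// workers is needed, and blocks until stopCh is closed.
func (reqMgmt *requestManagement) Run(stopCh <-chan struct{}) {
	defer reqMgmt.configQueue.ShutDown()
	go wait.Until(reqMgmt.configWorker, time.Second, stopCh)
	<-stopCh
}

// configWorker drains the queue one item at a time, retrying failed syncs.
func (reqMgmt *requestManagement) configWorker() {
	for {
		item, quit := reqMgmt.configQueue.Get()
		if quit {
			return
		}
		if reqMgmt.syncOne() {
			reqMgmt.configQueue.Forget(item)
		} else {
			reqMgmt.configQueue.AddRateLimited(item)
		}
		reqMgmt.configQueue.Done(item)
	}
}
```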

}

// Skip tracking long running events.
if longRunningRequestCheck != nil && longRunningRequestCheck(r, requestInfo) {

@yue9944882 (Member) commented Jul 18, 2019

We don't need to completely skip long-running requests. We can start by processing them in a simple way: have a long-running request consume one unit of quota from the PL (and return it on finishing) without going through the queues (sharding + fair-queuing). I remember we agreed that part of the point of introducing the concurrency limit is to address long-running calls, at least starting in a simple way.
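Here is a hedged sketch of that idea, treating the priority level's quota as a counted semaphore; the types and field names are made up for illustration:

```go
package filters

import "net/http"

// priorityLevelState is illustrative; the quota channel acts as a counted
// semaphore sized to the level's share of the server concurrency limit.
type priorityLevelState struct {
	quota chan struct{}
}

// serveLongRunning lets a long-running request hold one unit of the priority
// level's quota for its whole duration while bypassing the fair queues.
func serveLongRunning(pl *priorityLevelState, w http.ResponseWriter, r *http.Request, inner http.Handler) {
	pl.quota <- struct{}{}        // acquire one concurrency unit
	defer func() { <-pl.quota }() // return it when the request finishes
	inner.ServeHTTP(w, r)
}
```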


close(finished)
}()
select {
case <-timedOut:

@yue9944882 (Member) commented Jul 18, 2019

Let's strip the timeout from this system; another filter will deal with it.

@MikeSpreitzer (Member, Author) commented Jul 18, 2019

Yes, the upstream timeout filter will return after the timeout expires. But in our filter we want an upper bound on execution time; my design analyses were predicated on that. The logic here ensures that there is such an upper bound, as far as this filter is concerned.
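A minimal sketch of the bound being described here; the helper below is illustrative and not the exact skeleton code:

```go
package filters

import (
	"net/http"
	"time"
)

// executeBounded runs the inner handler but stops waiting for it once this
// filter's own execution bound expires, so the filter's concurrency
// accounting for the request has a guaranteed upper bound.
func executeBounded(inner http.Handler, w http.ResponseWriter, r *http.Request, bound time.Duration) {
	timedOut := time.After(bound)
	finished := make(chan struct{})
	go func() {
		defer close(finished)
		inner.ServeHTTP(w, r)
	}()
	select {
	case <-timedOut:
		// Release this request's concurrency as far as this filter is
		// concerned; the upstream timeout filter answers the client.
	case <-finished:
	}
}
```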

// that entry. If the priority level becomes desired again before
// this is called, the filter cancels by calling `DoOnEmpty(nil)`.
// The callback is invoked with no mutexes locked.
DoOnEmpty(func())

@yue9944882 (Member) commented Jul 18, 2019

If the priority level becomes desired again before
this is called, the filter cancels by calling DoOnEmpty(nil).

If a PL gets removed and created again immediately, it should get a fresh priorityLevelState instance. Toggling a PL between desired and undesired makes the system complex, doesn't it? What do you think about renaming this to Destroy(callback func()) to clarify the usage?


// priorityLevelStates maps the PriorityLevelConfiguration object
// name to the state for that level
priorityLevelStates map[string]*PriorityLevelState

@yue9944882 (Member) commented Jul 18, 2019

Would it be better to use the UID as the key here, to avoid confusion when a PL with the same name is removed and re-created immediately?

@MikeSpreitzer (Member, Author) commented Jul 22, 2019

Yes, that would be part of the implementation if we agree on that behavior. We have not agreed on what the desired behavior is. https://github.com/kubernetes/enhancements/pull/1098/files#r300193486

@MikeSpreitzer (Member, Author) left a comment

In addition to the inline comments: In kubernetes/enhancements#1098 we are debating how to handle delete and re-create of a priority level. Let's reach agreement on that, then we can make sure the code is consistent with our decision.

@MikeSpreitzer MikeSpreitzer force-pushed the MikeSpreitzer:feat/rate-limiting-skel branch from 774fad0 to dd9d906 Jul 18, 2019

A little more thought about concurrency in the reqmgmt filter
(1) removed the idea of multiple config queue workers; it will be
simpler and adequate if there is only 1.

(2) attempted to clarify the comment on FairQueuingSystem::DoOnEmpty
to (a) make the division of responsibilities clearer, (b) make a
precise statement of the semantics, and (c) be less prescriptive of
how the filter uses this method.

(this is the third edition of this change)

@MikeSpreitzer MikeSpreitzer force-pushed the MikeSpreitzer:feat/rate-limiting-skel branch from dd9d906 to dbe43fb Jul 18, 2019

@k8s-ci-robot k8s-ci-robot added size/XL and removed size/L labels Jul 23, 2019

@MikeSpreitzer MikeSpreitzer force-pushed the MikeSpreitzer:feat/rate-limiting-skel branch from 0e70ffe to 0e360cd Jul 23, 2019

Resolved (outdated) review thread: staging/src/k8s.io/apiserver/pkg/server/filters/reqmgmt-config.go

// OnAdd handles notification from an informer of object creation
func (reqMgmt *requestManagement) OnAdd(obj interface{}) {
reqMgmt.configQueue.Add(new(int))

@yue9944882 (Member) commented Jul 23, 2019

Note that OnAdd will be called (#priority-levels + #flow-schemas) times when launching the system; the informer calls this method for every existing object in the first round of list-watching.

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

Add(new(int)) was a mistake; I have corrected it to Add(0), meaning that there will be at most one entry in the queue at any given time.
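A sketch of why the constant key collapses those startup notifications; the workqueue deduplicates identical items, so at most one entry is pending regardless of how many informer callbacks fire (illustrative code):

```go
package filters

import "k8s.io/client-go/util/workqueue"

type requestManagement struct {
	configQueue workqueue.RateLimitingInterface
}

// Every notification enqueues the same constant key; the workqueue keeps at
// most one copy of a given item, so a burst of OnAdd calls during the initial
// list collapses into a single config-sync pass.
func (reqMgmt *requestManagement) OnAdd(obj interface{})               { reqMgmt.configQueue.Add(0) }
func (reqMgmt *requestManagement) OnUpdate(oldObj, newObj interface{}) { reqMgmt.configQueue.Add(0) }
func (reqMgmt *requestManagement) OnDelete(obj interface{})            { reqMgmt.configQueue.Add(0) }
```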

reqMgmt.plInformer.AddEventHandler(reqMgmt)
reqMgmt.fsInformer.AddEventHandler(reqMgmt)
stopCh := make(chan struct{})
go reqMgmt.plInformer.Run(stopCh)

@yue9944882 (Member) commented Jul 23, 2019

We don't need to start the informer manually; a post-start-hook in the kube-apiserver already does that:

genericApiServerHookName := "generic-apiserver-start-informers"
if c.SharedInformerFactory != nil && !s.isPostStartHookRegistered(genericApiServerHookName) {
	err := s.AddPostStartHook(genericApiServerHookName, func(context PostStartHookContext) error {
		c.SharedInformerFactory.Start(context.StopCh)
		return nil
	})
	if err != nil {
		return nil, err
	}
}

klog.Error("Unable to do initial config sync")
return false
}
go reqMgmt.Run(stopCh)

@yue9944882 (Member) commented Jul 23, 2019

The system should be launched using a post-start-hook and shut down using a pre-shutdown-hook, but we can refactor that in a follow-up.

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

Where would I find how to register those hooks?

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

I do not see how

handler = genericfilters.WithMaxInFlightLimit(handler, c.MaxRequestsInFlight, c.MaxMutatingRequestsInFlight, c.LongRunningFunc)
can add those hooks; it does not have access to the server.

if oldPriorities[plName] != nil {
problem = "undesired"
}
klog.Warningf("FlowSchema %s references %s priority level %s and will thus not match any requests", fsName, problem, plName)

@yue9944882 (Member) commented Jul 23, 2019

I think we should probably set a condition in the status subresource of the PL in this case. We can do that in a follow-up.

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

Remember there are multiple apiservers. I am dubious about each posting status.

Resolved (outdated) review thread: staging/src/k8s.io/apiserver/pkg/server/filters/reqmgmt-config.go
}
ok := reqMgmt.syncOne()
if !ok {
klog.Error("Unable to do initial config sync")

@yue9944882 (Member) commented Jul 23, 2019

As a Go convention (https://github.com/golang/go/wiki/Errors#errors), error messages should not start with a capital letter.

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

There is a difference between (a) the string in an error and (b) a message written with glog or klog. Yes, the string in an error should start lower case, and that is what the cited doc says. What we have here is (b), where I think a normal English sentence is appropriate.
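A small illustration of the distinction being drawn; both messages are just examples:

```go
package filters

import (
	"errors"

	"k8s.io/klog"
)

// (a) The string inside an error value: lower case and unpunctuated, because
// callers may wrap it into a longer sentence.
var errInitialConfigSync = errors.New("unable to do initial config sync")

// (b) A message logged directly with klog: a normal English sentence is the
// whole line a human reads, so starting with a capital letter is fine.
func reportInitialConfigSyncFailure() {
	klog.Error("Unable to do initial config sync")
}
```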

}
reqMgmt.curState.Store(rms)
reqMgmt.plInformer.AddEventHandler(reqMgmt)
reqMgmt.fsInformer.AddEventHandler(reqMgmt)

@yue9944882 (Member) commented Jul 23, 2019

We should move the event-handler registration to after waiting for the caches to sync; the reason is explained in #78966 (comment).

if ok := cache.WaitForCacheSync(stopCh, reqMgmt.plInformer.HasSynced, reqMgmt.fsInformer.HasSynced); !ok {

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

With the notification handlers fixed there is no performance problem anymore.
The handlers need to get registered before the informers are started.

@MikeSpreitzer MikeSpreitzer force-pushed the MikeSpreitzer:feat/rate-limiting-skel branch from 0e360cd to dbe43fb Jul 23, 2019

@k8s-ci-robot k8s-ci-robot added size/L and removed size/XL labels Jul 23, 2019

@MikeSpreitzer (Member, Author) commented Jul 23, 2019

/retest

Simplified FairQueuingSystem interface
.. and introduced a tiny bit of starting to work with config objects.
@MikeSpreitzer (Member, Author) commented Jul 25, 2019

/retest

@yue9944882 (Member) commented Jul 25, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jul 25, 2019

@k8s-ci-robot k8s-ci-robot merged commit 48fdc31 into kubernetes:feature-rate-limiting Jul 25, 2019

22 of 23 checks passed:

tide: Not mergeable. Needs lgtm label.
cla/linuxfoundation: MikeSpreitzer authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-conformance-image-test: Skipped.
pull-kubernetes-cross: Skipped.
pull-kubernetes-dependencies: Job succeeded.
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Skipped.
pull-kubernetes-e2e-gce-csi-serial: Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gce-iscsi: Skipped.
pull-kubernetes-e2e-gce-iscsi-serial: Skipped.
pull-kubernetes-e2e-gce-storage-slow: Skipped.
pull-kubernetes-godeps: Skipped.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped.
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-node-e2e-containerd: Skipped.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
pull-publishing-bot-validate: Skipped.
return
}

for {

@aaron-prindle (Contributor) commented Jul 26, 2019

I'm not sure I understand the purpose of the for loop here. Additionally I'm not sure I understand from the skeleton how code exits this loop.

@aaron-prindle (Contributor) commented Jul 26, 2019

I think we might want to return after the execute select cases and the reject cases.

@MikeSpreitzer (Member, Author) commented Jul 26, 2019

Oops! Right!
The purpose of the loop is just to be able to do a retry in the quiescing case.
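A hedged sketch of the control flow being agreed on here: return after the execute and reject cases, and loop only to retry when the chosen priority level is quiescing. The outcome type and helper below are invented for illustration:

```go
package filters

import "net/http"

type dispatchOutcome int

const (
	admitted dispatchOutcome = iota
	rejected
	quiescing
)

// dispatchOnce stands in for queuing the request on its priority level and
// waiting until it may execute, is rejected, or the level quiesces away.
func dispatchOnce(r *http.Request) dispatchOutcome { return admitted }

// serve returns after the execute and reject cases; only the quiescing case
// loops, re-resolving the request's FlowSchema and priority level.
func serve(w http.ResponseWriter, r *http.Request, inner http.Handler) {
	for {
		switch dispatchOnce(r) {
		case admitted:
			inner.ServeHTTP(w, r)
			return
		case rejected:
			http.Error(w, "too many requests", http.StatusTooManyRequests)
			return
		case quiescing:
			// The chosen priority level emptied and went away while this
			// request waited; pick again and retry.
		}
	}
}
```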
