Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove presumptions about what decision has been made #105729

Merged

Conversation

MikeSpreitzer
Copy link
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

This PR updates queueSet to recognize that a request's decision might have already been made asynchronously due to context cancellation/timeout.

Which issue(s) this PR fixes:

Fixes #105338

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/sig api-machinery
/cc @tkashem
/cc @wojtek-t
@deads2k
@lavalamp

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 18, 2021
// It returns the request removed from the list.
type removeFromFIFOFunc func() *request
// The returned bool indicates whether a change was made.
type removeFromFIFOFunc func() bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't personally change this - the fact that it returns a request might be quite useful.

And you don't need it to do the changes you want to do - instead of checking if it is true, you just need to check if it is !=nil.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this PR, a removeFromFIFOFunc returns the *request in question regardless of whether it was in the fifo.

Copy link
Contributor

@tkashem tkashem Oct 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i overlooked it - i opened a PR keeping the same signature type removeFromFIFOFunc func() *request - #105738. let me know if you want it, otherwise i will close it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am folding it into this PR, thanks.

@@ -592,14 +593,16 @@ func (qs *queueSet) removeTimedOutRequestsFromQueueLocked(queue *queue, fsName s
return false
})

for _, req := range rejects {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of moving it here and not handling it above on per-request basis?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"walkFunc must not remove the given request from the fifo"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have a more efficient code base if we change fifo::Walk to allow removal of the current *request and no other fifo mutations; this would exchange the slice that escapes with a stack-allocated variable holding the next *request to enumerate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The force-push to 62eba5f54bc makes this change, and removes the redundant func enumerate_fifo.

@wojtek-t wojtek-t self-assigned this Oct 18, 2021

// walkFunc is called for each request in the list in the
// oldest -> newest order.
// ok: if walkFunc returns false then the iteration stops immediately.
// walkFunc must not remove the given request from the fifo.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, since there are no comments about other fifo mutations, the careful programmer will be unsure whether any are allowed.

@MikeSpreitzer
Copy link
Member Author

The force-push to e290a6ababf makes a removeFromFIFOFunc return the *request if it was removed and nil otherwise.

@MikeSpreitzer
Copy link
Member Author

/retest

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 19, 2021
@fedebongio
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 19, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 20, 2021
@MikeSpreitzer
Copy link
Member Author

The force-push to 0bc77f6b709 is a rebase onto master.

@MikeSpreitzer
Copy link
Member Author

The force-push to 06b1658a72d adds a missing period in the comment on removeFromFIFOFunc.

@shivanshuraj1333
Copy link
Contributor

/retest

metrics.AddRequestsInQueues(req.ctx, qs.qCfg.Name, req.fsName, -1)
req.NoteQueued(false)
if arrivalLimit.After(req.arrivalTime) {
if req.decision.Set(decisionReject) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

if req.decision.Set(decisionReject) && req.removeFromQueueLocked() != nil {

if request == nil { // This should never happen. But if it does...
return false
dequeued := false
for qs.totRequestsWaiting != 0 && qs.totSeatsInUse < qs.dCfg.ConcurrencyLimit {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH - I much more preferred the previous pattern - i.e.

  • this function actually dispatching at most one item
  • having additional function doing a for-loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively - can just switch this to a single if:

if qs.totRequestWaiting == 0 || qs.totSeatsInUse >= qs.dCfg.ConcurrencyLimit {
  return false
}

// The rest of the code comes here and is also not indented. 

@MikeSpreitzer
Copy link
Member Author

The force-push to e70999b addresses the latest review.

@wojtek-t
Copy link
Member

/lgtm
/approve

/retest

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 20, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MikeSpreitzer, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit c269494 into kubernetes:master Oct 20, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Oct 20, 2021
@MikeSpreitzer MikeSpreitzer deleted the do-not-assume-decision branch October 21, 2021 03:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

queueSet.findDispatchQueueLocked() possibly panics
6 participants