Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Couple code cleanups for APF code #103820

Merged
merged 2 commits into from
Aug 5, 2021

Conversation

wojtek-t
Copy link
Member

NONE

/kind cleanup
/priority important-longterm
/sig api-machinery

/assign @MikeSpreitzer @tkashem

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 21, 2021
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver labels Jul 21, 2021
go func() {
defer runtime.HandleCrash()
qs.goroutineDoneOrBlocked()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing I want to do in subsequent PR is to ensure that production code isn't really tracking goroutines at all (it doesn't use this information) by using non-tracking promise in prod and tracking one (build on top of non-tracking one) in tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The production code already is not really tracking goroutines --- see https://github.com/kubernetes/apiserver/blob/b94b411ffc0d129451a850a78bd39b9457a15518/pkg/util/flowcontrol/apf_filter.go#L85

By removing the updates through the counter interface, this makes it impossible to test queueset with a fake clock. It is important to be able to test with a fake clock:

  • so that we can control timing precisely, and
  • so that we can run tests faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - I wasn't clear exactly.
I'm aware we're using no-op tracker in production code.

But I'm really not a fan of having test-only related code in production code. This complicates production code.
All the test-related code should really be separate out from the code that is running in production.

By removing the updates through the counter interface, this makes it impossible to test queueset with a fake clock. It is important to be able to test with a fake clock:

I disagree with this one. Currently none of the tests is exercising this. But even if at some point we extend our testing to take advantage of it, what we should do is:

  • make a class function from this goroutine [and leave it without counter]
  • in tests, define our own TestingQueuset (or whatever name we choose) that will be basically inheriting everything from the QueueSet, we will just redefing this single function

I can live with complicate tests, but I really want to have production code as simple as possible [it's not just about P&F of course].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we factor out this goroutine plus old line 298, the resulting surface seems odder (less simple) to me than the one we have now. What is the meaning of this goroutine plus old line 298? It's just a random bit of implementation. Counting active goroutines, on the other hand, is a simple self-contained semantic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/kubernetes/apiserver/blob/b94b411ffc0d129451a850a78bd39b9457a15518/pkg/util/flowcontrol/fairqueuing/queueset/queueset_test.go#L653 does exercise this.

It exercises that code path, but it doesn't exercise the counter:

  • clk is not used anywhere after initialization
  • counter is only updated through the text but never check
    [as the test shows - it passed consistently after removing it]

If we factor out this goroutine plus old line 298, the resulting surface seems odder (less simple) to me than the one we have now. What is the meaning of this goroutine plus old line 298? It's just a random bit of implementation. Counting active goroutines, on the other hand, is a simple self-contained semantic.

I disagree with this. When I'm reading the code I would like to be able to quickly understand what exactly is happening in production. And mixing test code with production code and just initializing it differently is not the best practice.


// WriteOnceOnly represents a variable that is initially not set and
// can be set once.
type WriteOnceOnly interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case anyone is not aware, the Go language has improved in a relevant way here since these interfaces were defined. In the bad old days, Go did not allow a diamond-shaped inheritance graph among interfaces. That is why this is written in mixin-style. Now, Go supports diamon-shaped inheritance. It would be possible to maintain the producer/consumer distinction here with fewer interface definitions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't be introducing interfaces for the sake of introducing them. Let's not overcomplicate stuff and not introduce things that we don't even use/need anywhere.

}
}

func TestLockingWriteOnce(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why delete this test code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of couse it shouldn't be deleted - reverted. Thanks for catching.

go func() {
defer runtime.HandleCrash()
qs.goroutineDoneOrBlocked()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The production code already is not really tracking goroutines --- see https://github.com/kubernetes/apiserver/blob/b94b411ffc0d129451a850a78bd39b9457a15518/pkg/util/flowcontrol/apf_filter.go#L85

By removing the updates through the counter interface, this makes it impossible to test queueset with a fake clock. It is important to be able to test with a fake clock:

  • so that we can control timing precisely, and
  • so that we can run tests faster.

@fedebongio
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 22, 2021
@@ -125,8 +125,7 @@ type configController struct {
requestWaitLimit time.Duration

// This must be locked while accessing flowSchemas or
// priorityLevelStates. It is the lock involved in
// LockingWriteMultiple.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remark

It is the lock involved in LockingWriteMultiple

got misplaced in past refactoring. It belongs on the lock in queueSet and refers to the promise in every request in the queueSet. It wouldn't hurt to reiterate the point in the comment on the decision field of request.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went step further and simplified it further (we don't really take advantage of "locking" part of the interface anywhere).

So to make it clear for everyone - I updated the interface itself to not be thread-safe [which is how it is used - we care about protecting it ourselves in queueset]

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 23, 2021
@wojtek-t
Copy link
Member Author

@MikeSpreitzer - PTAL

lock.Lock()
defer lock.Unlock()
if !wr.IsSet() {
t.Error("IsSet()==false after second Set")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/second Set/second Get/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

decision promise.LockingWriteOnce
//
// The field is NOT thread-safe and should be protected by the
// external lock.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/external/queueset's/
(this type is not designed or available for use outside this package)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@MikeSpreitzer MikeSpreitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK except for a couple of minor comments.

Copy link
Member Author

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MikeSpreitzer - PTAL

lock.Lock()
defer lock.Unlock()
if !wr.IsSet() {
t.Error("IsSet()==false after second Set")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

decision promise.LockingWriteOnce
//
// The field is NOT thread-safe and should be protected by the
// external lock.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@wojtek-t
Copy link
Member Author

/retest

Copy link
Member

@MikeSpreitzer MikeSpreitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/LGTM

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 30, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MikeSpreitzer, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@MikeSpreitzer
Copy link
Member

/retest

@k8s-ci-robot k8s-ci-robot merged commit e3b01a6 into kubernetes:master Aug 5, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Aug 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants