Revert "Surface more information about plugin scores in scheduler" #99914
This reverts commit d09a841.
/priority critical-urgent
/assign @damemi
Can you explain the regression?
Which issue?
Yes, I referred to #99913. Sorry, there was a typo.
Here is the magic: kubernetes/test/integration/framework/master_utils.go, lines 131 to 136 in 90851a0.
But which logging level is used is not the key point: #99411 is beneficial only when the desired logging level is enabled, and when that level is enabled, performance degrades.
The performance drop is unexpected, though I would like to continue working on this to find a more efficient way to surface information like this.
I think we made some decisions in the original PR that have affected this, such as duplicating the calculation loop in order to separate out the logging information into its own function.
Due to that, we are essentially doing much of this work twice (which would explain the ~50% drop in throughput). I think that some reasonable drop in efficiency is acceptable because, as @ahg-g said, this is not a default logging level. A drop this large is not acceptable.
Thanks for catching this and opening the reversion @Huang-Wei. I'll work to get a new PR with performance testing. Any other suggestions for how to improve this are welcome.
/lgtm
Thanks @damemi! As code freeze is around the corner, reverting may be the most viable option to unblock other PRs. And since the perf drop is easy to reproduce, reverting buys us more time to find the root cause.
/retest
I don't think the performance drop is unexpected. But I don't think it's related to the calculation. It's probably about IO. We have 5000 nodes in those tests. That's a lot of writes. I just didn't think that tests would run at level 4 :(
What type of PR is this?
/kind regression
/sig scheduling
What this PR does / why we need it:
This PR reverts #99411.
I verified locally that changing v(4) to v(5) works, but that doesn't fix the root cause: users who set the logging level to 5 or greater can still encounter this issue. So I propose reverting #99411 and investigating why it regresses.
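A small sketch of why moving the gate from v(4) to v(5) only shifts the problem: whenever the configured verbosity reaches the gate, the per-node output is emitted again. The `linesLogged` helper and the one-line-per-node assumption are hypothetical and only serve this illustration; the 5000-node figure comes from the tests mentioned in this thread.

```go
package main

import "fmt"

// linesLogged returns how many per-node score lines a verbosity gate would
// emit in one scheduling cycle, assuming one line per node (an assumption
// of this sketch, not a measurement of the real scheduler).
func linesLogged(configured, gate, nodes int) int {
	if configured >= gate {
		return nodes
	}
	return 0
}

func main() {
	const nodes = 5000
	for _, gate := range []int{4, 5} {
		for _, configured := range []int{2, 4, 5} {
			fmt.Printf("gate=v(%d) configured=%d lines=%d\n",
				gate, configured, linesLogged(configured, gate, nodes))
		}
	}
}
```

With the gate at v(5), a run at level 4 emits nothing, but a run at level 5 produces the same volume as before.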
Which issue(s) this PR fixes:
Fixes #99913