koord-scheduler: support debug scores #604

Merged


@eahydra eahydra commented Sep 8, 2022

koord-scheduler: support debug scores

Users can enable debug scores via flags -s, --debug-scores or set on demand via the following commands:

curl -X PUT <leaderSchedulerIP>:<port>/debug/flags/s --data '100'

Signed-off-by: Joseph <joseph.t.lee@outlook.com>

Ⅰ. Describe what this PR does

Users can enable debug scores at startup with the flag -s (long form --debug-scores), or set the value on demand with the following command:

curl -X PUT <leaderSchedulerIP>:<port>/debug/flags/s --data '100'
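
For callers that prefer Go over curl, the same request could be sketched roughly as follows. This is a minimal sketch and not code from this PR: the names newDebugScoresRequest and setDebugScores are hypothetical, and the base URL is read from the command line because the PR does not state the scheduler's address or port. Only the PUT method and the /debug/flags/s path come from the command above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strconv"
	"strings"
	"time"
)

// newDebugScoresRequest builds the same request as the curl command:
// PUT <baseURL>/debug/flags/s with the value as a plain-text body.
// (Hypothetical helper, named for illustration only.)
func newDebugScoresRequest(baseURL string, value int) (*http.Request, error) {
	url := strings.TrimRight(baseURL, "/") + "/debug/flags/s"
	return http.NewRequest(http.MethodPut, url, strings.NewReader(strconv.Itoa(value)))
}

// setDebugScores sends the request and treats any non-2xx status as an error.
func setDebugScores(client *http.Client, baseURL string, value int) error {
	req, err := newDebugScoresRequest(baseURL, value)
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("send request: %w", err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status %s: %s", resp.Status, strings.TrimSpace(string(body)))
	}
	return nil
}

func main() {
	// Usage: <program> http://<leaderSchedulerIP>:<port> 100
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: <program> <base-url> <value>")
		os.Exit(2)
	}
	value, err := strconv.Atoi(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, "value must be an integer:", err)
		os.Exit(2)
	}
	client := &http.Client{Timeout: 5 * time.Second}
	if err := setDebugScores(client, os.Args[1], value); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

As with the curl command, the request should be sent to the leader scheduler instance.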

Once enabled, the scheduler logs the scores as a Markdown-style table:

| # | Pod | Node | Score | ImageLocality | InterPodAffinity | LoadAwareScheduling | NodeAffinity | NodeNUMAResource | NodeResourcesBalancedAllocation | NodeResourcesFit | PodTopologySpread | Reservation | TaintToleration |
| --- | --- | --- | ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:|
| 0 | default/curlimage-545745d8f8-rngp7 | cn-hangzhou.10.0.4.51 | 577 | 0 | 0 | 87 | 0 | 0 | 96 | 94 | 200 | 0 | 100 |
| 1 | default/curlimage-545745d8f8-rngp7 | cn-hangzhou.10.0.4.50 | 574 | 0 | 0 | 85 | 0 | 0 | 96 | 93 | 200 | 0 | 100 |
| 2 | default/curlimage-545745d8f8-rngp7 | cn-hangzhou.10.0.4.19 | 541 | 0 | 0 | 55 | 0 | 0 | 95 | 91 | 200 | 0 | 100 |
| 3 | default/curlimage-545745d8f8-rngp7 | cn-hangzhou.10.0.4.18 | 487 | 0 | 0 | 15 | 0 | 0 | 90 | 82 | 200 | 0 | 100 |
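
In the sample above, the Score column equals the sum of the per-plugin columns (for the first row, 87 + 96 + 94 + 200 + 100 = 577), and rows are listed from highest to lowest score. A table of this shape could be produced along the following lines. This is only an illustrative sketch under those two observations, not the PR's implementation: the nodeScores type and the renderScoreTable function are hypothetical names, and sorting the plugin columns alphabetically is an assumption that happens to match the sample.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nodeScores holds the per-plugin scores of one candidate node.
// (Hypothetical type, used for illustration only.)
type nodeScores struct {
	Node    string
	Plugins map[string]int64
}

// total sums the plugin scores, matching the Score column of the sample.
func (n nodeScores) total() int64 {
	var sum int64
	for _, s := range n.Plugins {
		sum += s
	}
	return sum
}

// renderScoreTable formats the scores of one pod as a Markdown table,
// with nodes ordered from highest to lowest total score.
func renderScoreTable(pod string, nodes []nodeScores) string {
	// Collect the plugin names and sort them so the columns are stable.
	seen := map[string]bool{}
	var plugins []string
	for _, n := range nodes {
		for name := range n.Plugins {
			if !seen[name] {
				seen[name] = true
				plugins = append(plugins, name)
			}
		}
	}
	sort.Strings(plugins)

	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]nodeScores(nil), nodes...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].total() > sorted[j].total()
	})

	var b strings.Builder
	b.WriteString("| # | Pod | Node | Score |")
	for _, p := range plugins {
		fmt.Fprintf(&b, " %s |", p)
	}
	b.WriteString("\n| --- | --- | --- | ---:|")
	for range plugins {
		b.WriteString(" ---:|") // right-align the numeric columns
	}
	b.WriteString("\n")
	for i, n := range sorted {
		fmt.Fprintf(&b, "| %d | %s | %s | %d |", i, pod, n.Node, n.total())
		for _, p := range plugins {
			fmt.Fprintf(&b, " %d |", n.Plugins[p]) // a missing plugin prints 0
		}
		b.WriteString("\n")
	}
	return b.String()
}

func main() {
	// Non-zero plugin scores of the first two rows in the sample.
	nodes := []nodeScores{
		{Node: "cn-hangzhou.10.0.4.50", Plugins: map[string]int64{
			"LoadAwareScheduling": 85, "NodeResourcesBalancedAllocation": 96,
			"NodeResourcesFit": 93, "PodTopologySpread": 200, "TaintToleration": 100}},
		{Node: "cn-hangzhou.10.0.4.51", Plugins: map[string]int64{
			"LoadAwareScheduling": 87, "NodeResourcesBalancedAllocation": 96,
			"NodeResourcesFit": 94, "PodTopologySpread": 200, "TaintToleration": 100}},
	}
	fmt.Print(renderScoreTable("default/curlimage-545745d8f8-rngp7", nodes))
}
```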

Ⅱ. Does this pull request fix one issue?

implements #462

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

Ⅴ. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test


codecov bot commented Sep 8, 2022

Codecov Report

Merging #604 (780357d) into main (d1811e5) will decrease coverage by 0.28%.
The diff coverage is 77.19%.

@@            Coverage Diff             @@
##             main     #604      +/-   ##
==========================================
- Coverage   68.95%   68.67%   -0.29%     
==========================================
  Files         178      180       +2     
  Lines       20847    20997     +150     
==========================================
+ Hits        14376    14419      +43     
- Misses       5478     5582     +104     
- Partials      993      996       +3     
| Flag | Coverage Δ | |
| --- | --- | --- |
| unittests | 68.67% <77.19%> (-0.29%) | ⬇️ |


| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| pkg/scheduler/frameworkext/framework_extender.go | 0.00% <0.00%> (ø) | |
| pkg/scheduler/frameworkext/debug_scores.go | 86.27% <86.27%> (ø) | |
| ...eduler/plugins/coscheduling/controller/podgroup.go | 68.96% <0.00%> (-1.98%) | ⬇️ |
| pkg/koordlet/resmanager/cpu_suppress.go | 70.30% <0.00%> (-0.50%) | ⬇️ |
| pkg/util/system/common.go | 52.38% <0.00%> (ø) | |
| pkg/util/system/resctrl.go | 31.93% <0.00%> (ø) | |
| pkg/util/system/common_linux.go | 62.80% <0.00%> (ø) | |
| pkg/util/system/util_test_tool.go | 54.94% <0.00%> (ø) | |
| pkg/runtimeproxy/server/docker/handler.go | 37.23% <0.00%> (ø) | |
| ... and 1 more | | |


eahydra added this to the v0.7 milestone on Sep 8, 2022
eahydra linked an issue on Sep 8, 2022 that may be closed by this pull request

eahydra commented Sep 9, 2022

/hold


eahydra commented Sep 9, 2022

/hold cancel

Review thread on pkg/util/routes/flags.go (outdated, resolved)

@jasonliu747 jasonliu747 left a comment


/lgtm


hormes commented Sep 13, 2022

Should this be put into a separate file, so that it may be possible to read this file through the http interface in the future?


eahydra commented Sep 13, 2022

> Should this be put into a separate file, so that it may be possible to read this file through the http interface in the future?

Good suggestion. I considered this as well, but for now I want to provide a simple method that covers the most basic debug-score needs. We will look at the community's feedback and requirements before deciding how to persist and query the results.


hormes commented Sep 13, 2022

/approve

@koordinator-bot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes, jasonliu747, saintube


koordinator-bot merged commit b7f57f1 into koordinator-sh:main on Sep 13, 2022

Successfully merging this pull request may close these issues.

[proposal] koord-scheduler should support debugging API