Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

koordlet:move audit log to resource executor after resource update #1055

Merged
merged 4 commits into from
Mar 13, 2023

Conversation

Re-Grh
Copy link
Contributor

@Re-Grh Re-Grh commented Feb 27, 2023

Ⅰ. Describe what this PR does

to solve the problem of duplicate audit logs by moving it to the executor module after resource update

Ⅱ. Does this pull request fix one issue?

fixes #1018
The audit log redundancy issue has been addressed by leveraging the executor's built-in cache, which has changed the recording frequency from 10 seconds to 60 seconds. Meanwhile, the adjustbecpuset policy execution shows frequent CPU rebinding , both klog and audit log will print.

截屏2023-03-10 10 15 44

and klog shows here

截屏2023-03-10 10 44 33

and i will start a new issue since it looks like we need to modify the function applyCPUSetWithNonePolicy(beCPUSet, oldCPUSet) to improve the stability of CPUs for the BE Group's calculation policy.

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

@Re-Grh Re-Grh changed the title move audit log to resource executor after resource update koordlet:move audit log to resource executor after resource update Feb 27, 2023
@codecov
Copy link

codecov bot commented Feb 27, 2023

Codecov Report

Patch coverage: 44.18% and no project coverage change.

Comparison is base (7fafbe3) 66.98% compared to head (8329606) 66.98%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1055   +/-   ##
=======================================
  Coverage   66.98%   66.98%           
=======================================
  Files         263      263           
  Lines       28975    29036   +61     
=======================================
+ Hits        19409    19450   +41     
- Misses       8206     8226   +20     
  Partials     1360     1360           
Flag Coverage Δ
unittests 66.98% <44.18%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...oordlet/runtimehooks/protocol/container_context.go 5.33% <0.00%> (-0.04%) ⬇️
.../koordlet/runtimehooks/protocol/kubeqos_context.go 0.00% <0.00%> (ø)
pkg/koordlet/runtimehooks/protocol/pod_context.go 6.95% <0.00%> (ø)
pkg/koordlet/runtimehooks/protocol/protocol.go 3.57% <0.00%> (ø)
pkg/koordlet/util/system/common.go 58.92% <0.00%> (-3.34%) ⬇️
pkg/util/cpuset/cpuset.go 70.90% <0.00%> (-2.34%) ⬇️
pkg/koordlet/resourceexecutor/resctrl_updater.go 44.09% <11.11%> (-1.81%) ⬇️
pkg/koordlet/resourceexecutor/cgroup.go 60.44% <46.15%> (-2.84%) ⬇️
.../koordlet/runtimehooks/hooks/groupidentity/rule.go 81.30% <57.14%> (-0.30%) ⬇️
pkg/koordlet/resourceexecutor/updater.go 45.03% <60.00%> (+1.60%) ⬆️
... and 4 more

... and 3 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

pkg/koordlet/resmanager/cpu_suppress.go Outdated Show resolved Hide resolved
pkg/koordlet/resmanager/cpu_suppress.go Outdated Show resolved Hide resolved
pkg/koordlet/resourceexecutor/updater.go Outdated Show resolved Hide resolved
pkg/koordlet/resourceexecutor/updater.go Outdated Show resolved Hide resolved
pkg/koordlet/resourceexecutor/updater.go Outdated Show resolved Hide resolved
@@ -227,6 +256,7 @@ type NewResourceUpdaterFunc func(resourceType sysutil.ResourceType, parentDir st
type ResourceUpdaterFactory interface {
Register(g NewResourceUpdaterFunc, resourceTypes ...sysutil.ResourceType)
New(resourceType sysutil.ResourceType, parentDir string, value string) (ResourceUpdater, error)
NewWithAudit(resourceType sysutil.ResourceType, parentDir string, value string, e *audit.EventHelper) (ResourceUpdater, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the origin New() can be abandoned and rename NewWithAudit with New
Is there anyone still using New()?

pkg/koordlet/resmanager/cpu_suppress.go Show resolved Hide resolved
func CommonCgroupUpdateFunc(resource ResourceUpdater) error {
c := resource.(*CgroupResourceUpdater)
_ = audit.V(5).Reason(ReasonUpdateCgroups).Message("update %v to %v", resource.Path(), resource.Value()).Do()
return sysutil.CgroupFileWriteIfDifferent(c.parentDir, c.file, c.value)
dirty, err := cgroupFileWriteIfDifferent(c.parentDir, c.file, c.value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dirty->updated

@@ -227,6 +256,7 @@ type NewResourceUpdaterFunc func(resourceType sysutil.ResourceType, parentDir st
type ResourceUpdaterFactory interface {
Register(g NewResourceUpdaterFunc, resourceTypes ...sysutil.ResourceType)
New(resourceType sysutil.ResourceType, parentDir string, value string) (ResourceUpdater, error)
NewWithAudit(resourceType sysutil.ResourceType, parentDir string, value string, e *audit.EventHelper) (ResourceUpdater, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is still some V5 logs in other pkg, which need to be modified.

@zwzhang0107
Copy link
Contributor

problem about applyCPUSetWithNonePolicy?

@@ -151,7 +151,7 @@ func UpdateResctrlTasksFunc(resource ResourceUpdater) error {
// 122
// 123
// 124
_ = audit.V(5).Reason(ReasonUpdateResctrl).Message("update %v to %v", resource.Key(), resource.Value()).Do()
_ = audit.V(3).Reason(ReasonUpdateResctrl).Message("update %v to %v", resource.Key(), resource.Value()).Do()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V5

if err != nil {
klog.Warningf("suppressBECPU failed to apply be cpu suppress policy, err: %s", err)
return
if !compareIntSlice(beCPUSet, oldCPUSet) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this

f.lock.RLock()
defer f.lock.RUnlock()
g, ok := f.registry[resourceType]
if !ok {
return nil, fmt.Errorf("resource type %s not registered", resourceType)
}
return g(resourceType, parentDir, value)
u, err := g(resourceType, parentDir, value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass EventHelper to g()

} else if updated {
_ = audit.V(3).Reason(ReasonUpdateCgroups).Message("update %v to %v", resource.Path(), resource.Value()).Do()
}
return nil
}

func CommonDefaultUpdateFunc(resource ResourceUpdater) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewCommonDefaultUpdater

@@ -81,7 +81,7 @@ func caculateMemoryConfig(strategy *slov1alpha1.SystemStrategy, nodeMemory int64
if sysutil.ValidateResourceValue(&minFreeKbytes, "", sysutil.MinFreeKbytes) {
valueStr := strconv.FormatInt(minFreeKbytes, 10)
file := sysutil.MinFreeKbytes.Path("")
resource, err := resourceexecutor.NewCommonDefaultUpdater(file, file, valueStr)
resource, err := resourceexecutor.NewCommonDefaultUpdater(file, file, valueStr, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should pass eventHelper, not nil

r, err := sysutil.GetCgroupResource(resourceType)
if err != nil {
return nil, err
}
if e == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need this

func NewCommonDefaultUpdaterWithUpdateFunc(key string, file string, value string, updateFunc UpdateFunc) (ResourceUpdater, error) {
func NewCommonDefaultUpdaterWithUpdateFunc(key string, file string, value string, updateFunc UpdateFunc, e *audit.EventHelper) (ResourceUpdater, error) {
if e == nil {
e = &audit.EventHelper{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm this

if err != nil {
return err
}
if updated && c.eventHelper.Event.Reason != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check nil

@Re-Grh Re-Grh force-pushed the audit branch 2 times, most recently from 2734589 to 663cf6d Compare March 9, 2023 09:31
@zwzhang0107
Copy link
Contributor

check the final result

Signed-off-by: Re-Grh <1271013391@qq.com>
Signed-off-by: Re-Grh <1271013391@qq.com>
Signed-off-by: Re-Grh <1271013391@qq.com>
@zwzhang0107
Copy link
Contributor

/approve
/lgtm

@zwzhang0107
Copy link
Contributor

/assign @FillZpp

@hormes
Copy link
Member

hormes commented Mar 13, 2023

/approve

@koordinator-bot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes, zwzhang0107

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@koordinator-bot koordinator-bot bot merged commit 4c2d190 into koordinator-sh:main Mar 13, 2023
@zwzhang0107 zwzhang0107 added this to the v1.2 milestone Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[proposal]move audit log to resource executor
5 participants