Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: correct CPUShares and CPUWeight value #8341

Merged
merged 1 commit into from Dec 18, 2023

Conversation

jongwu
Copy link
Contributor

@jongwu jongwu commented Oct 31, 2023

If cgroup driver is systemd, CPUShares, for cgroup v1, should be at least 2 [1] and CPUWeight for cgroup v2, should be at least 1 [2].

Fixes: #8340
Signed-off-by: Jianyong Wu jianyong.wu@arm.com

[1] https://github.com/systemd/systemd/blob/d19434fbf81db04d03c8cffa87821f754a86635b/src/basic/cgroup-util.h#L122
[2] https://github.com/systemd/systemd/blob/d19434fbf81db04d03c8cffa87821f754a86635b/src/basic/cgroup-util.h#L91

@katacontainersbot katacontainersbot added the size/tiny Smallest and simplest task label Oct 31, 2023
Copy link
Contributor

@cmaf cmaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @jongwu

@bergwolf
Copy link
Member

bergwolf commented Nov 2, 2023

/test

Copy link
Member

@bergwolf bergwolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@bergwolf
Copy link
Member

bergwolf commented Nov 8, 2023

/test

@amshinde amshinde added the backport Change from a newer branch / repository that was backported label Nov 9, 2023
@jongwu
Copy link
Contributor Author

jongwu commented Nov 29, 2023

Can we merge this? @bergwolf

@jongwu
Copy link
Contributor Author

jongwu commented Dec 8, 2023

/test

@jongwu
Copy link
Contributor Author

jongwu commented Dec 11, 2023

Hello, can we get this merged?

// Minimum value of CPUShares should be 2, see https://github.com/systemd/systemd/blob/d19434fbf81db04d03c8cffa87821f754a86635b/src/basic/cgroup-util.h#L122
if shares >= 2 {
properties.push(("CPUShares", Value::U64(shares)));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is basically ignoring an invalid value received from the runtime... not correcting anything unlike the title seems to suggest. Is it appropriate to continue execution ? FWIW the memory subsystem bails out when receiving an invalid value. See https://github.com/kata-containers/kata-containers/blob/main/src/agent/rustjail/src/cgroups/systemd/subsystem/memory.rs#L42 .

Copy link
Contributor

@cmaf cmaf Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jongwu I agree with @gkurz here, could you update? Bailing would probably be appropriate here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, cgroupsv2 treats 0 as a hint to use the default value. Maybe better to do the same in cgroupsv1 for consistency ? WDYT @jongwu @cmaf ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the default value of cpuShares? 100?

Copy link
Member

@gkurz gkurz Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the default value of cpuShares? 100?

See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/sched.h#n409

/*
 * A weight of 0 or 1 can cause arithmetics problems.
 * A weight of a cfs_rq is the sum of weights of which entities
 * are queued on this cfs_rq, so a weight of a entity should not be
 * too large, so as the shares value of a task group.
 * (The default weight is 1024 - so there's no practical
 *  limitation from this.)
 */
#define MIN_SHARES		(1UL <<  1)
#define MAX_SHARES		(1UL << 18)

You can :

  • convert runtime input of 0 to 1024
  • bail out if value is not in the MIN_SHARES...MAX_SHARES range

@@ -81,7 +84,10 @@ impl Cpu {
) -> Result<()> {
if let Some(shares) = cpu_resources.shares {
let weight = shares_to_weight(shares);
properties.push(("CPUWeight", Value::U64(weight)));
// CPUWeigth can not be 0. See https://github.com/systemd/systemd/blob/d19434fbf81db04d03c8cffa87821f754a86635b/src/basic/cgroup-util.h#L91
Copy link
Member

@gkurz gkurz Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT shares_to_weight() already takes care of never returning 0.

It even converts an input of 0 shares to 100, which is the default value for weight.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to cpu shares, cpu weight has a validity range.

See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/cgroup-v2.rst#n646

All weights are in the range [1, 10000] with the default at 100.  This
allows symmetric multiplicative biases in both directions at fine
enough granularity while staying in the intuitive range.

Maybe bail out if value is out of range, which happens if passed cpu shares is > 262144, for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Changed.

Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking this PR because it has a lot of approvals already but I'd really like to understand why invalid values are ignored instead of being corrected or at least logged.

@jongwu @bergwolf @cmaf @amshinde ?

@jongwu
Copy link
Contributor Author

jongwu commented Dec 13, 2023

Thanks @gkurz -, Updated.

If cgroup driver is systemd, CPUShares, for cgroup v1, should be at
least 2 [1] and CPUWeight for cgroup v2, should be at least 1 [2].

Fixes: kata-containers#8340
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>

[1] https://github.com/systemd/systemd/blob/d19434fbf81db04d03c8cffa87821f754a86635b/src/basic/cgroup-util.h#L122
[2] https://github.com/systemd/systemd/blob/d19434fbf81db04d03c8cffa87821f754a86635b/src/basic/cgroup-util.h#L91
@katacontainersbot katacontainersbot added size/small Small and simple task and removed size/tiny Smallest and simplest task labels Dec 15, 2023
Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @jongwu !

@gkurz
Copy link
Member

gkurz commented Dec 18, 2023

/test

@gkurz gkurz merged commit 2987d3e into kata-containers:main Dec 18, 2023
165 of 242 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport Change from a newer branch / repository that was backported ok-to-test size/small Small and simple task
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"CPUShares is out of range" when start kata using docker
6 participants