-
Notifications
You must be signed in to change notification settings - Fork 604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
operator: set smp based on resource request #1416
operator: set smp based on resource request #1416
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. Lot's of questions, though.
) []string { | ||
var smp int64 = 1 | ||
if !limits.Cpu().IsZero() { | ||
limits.Cpu().RoundUp(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the CPU request is non-integer, or, at least, below 1, we should run overprovisioned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do that for developer mode. Are you thinking of doing so for both modes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g., if CPU request is set to 200m in non-developerMode, then overprovisioned might help it play nicely with other processes on the machine. It's not a recommended config. I don't think this is a showstopper.
@@ -477,6 +489,7 @@ func overprovisioned(developerMode bool, limits corev1.ResourceList) []string { | |||
"--default-log-level=info", | |||
"--reserve-memory 0M", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought not having reserve-memory meant TLS didn't work? Or has memory
already accounted for that? It's different to developerMode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, they have been different. I know reserve-memory
is set when memory
isn't, to reserve memory for OS use. For the case where memory
is set I don't specify reserve-memory
, but I'm not sure what happens if memory
is set to the full memory capacity of the machine.
("reserve-memory", bpo::value<std::string>(), "memory reserved to OS (if --memory not specified)")
https://github.com/vectorizedio/seastar/blob/master/src/core/reactor.cc#L3431-L3432
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. If non-developerMode works fine with TLS then all good. I never got to the bottom of it, but when I was testing I recall that we added --reserve-memory when rpk was taught to pass in TLS. My guess what that rpk consumes some memory before forking Redpanda. I think we tested with 1 core and 1GiB - 1GiB being the minimum per core (2GiB is recommended), and Redpanda failed to start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I remember, seastar have some hiccup in our CI where we are constraint to only 100Mb in resource.request.memory
and DeveloperMode on. In such low constraint the seastar initialization failed.
@dimitriscruz this should take into account 2GB/core and side down. So the logic should be
or smth like that. |
@@ -468,7 +480,7 @@ func overprovisioned(developerMode bool, limits corev1.ResourceList) []string { | |||
"--overprovisioned", | |||
// sometimes a little bit of memory is consumed by other processes than seastar | |||
"--reserve-memory " + redpandav1alpha1.ReserveMemoryString, | |||
"--smp=1", | |||
"--smp=" + strconv.FormatInt(smp, 10), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we make sure is at least 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the updated version when in developer mode, smp is either positive or not passed as an argument.
requests.Cpu().RoundUp(0) | ||
smp = requests.Cpu().Value() | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smp should be at least 1 here. some assertion or normalization is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
further cpu tunning based on available Memory is needed too.
47c6316
to
ea1b29e
Compare
92e1d00
to
9de74b7
Compare
ed922cb
to
8b7a65b
Compare
requests := r.Spec.Resources.Requests.DeepCopy() | ||
requests.Cpu().RoundUp(0) | ||
requestedCores := requests.Cpu().Value() | ||
if !r.Spec.Configuration.DeveloperMode && r.Spec.Resources.Requests.Memory().Value() < requestedCores*MinimumMemoryPerCore { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@senior7515 it's possible to run with 1GiB/core - should this allow 1GiB/core or require 2GiB/core?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible. we can downgrade to 1G/core, but not recommended for prod. It's very little memory - around 500MB free for request flow.
I'm flexible i think we need to change the 'check' in syschecks.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memory check I see is for a minimum of 1GiB (https://github.com/vectorizedio/redpanda/blob/dev/src/v/syschecks/syschecks.cc#L40)
Do we want to change that to 2GiB and then require the same through the operator, or the other way round (1GiB everywhere)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think 2G/core is recommended. We'd need to test the alternative to have a different recommendation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're bumping up against what a good default is vs. allowing users to override specific things. If we had more escape hatches for allowing the user to override a specific setting, this would be less of a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have developerMode
that ignores the memory/cpu ratio and now converts the cpu request to an smp value.
We could introduce minimumMemoryCPURatio: 2 (default)
and if one really wants production mode with a non-recommended ratio they could adjust it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced that shadowing the actual configuration params with slightly different implications is useful. Can we support this: #1388 in the operator?
I'm happy with this PR as it stands (possibly weaken the webhook to 1GiB/core), and to make more direct configuration possible in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8b7a65b
to
64e3711
Compare
|
||
requests.Cpu().RoundUp(0) | ||
requestedCores := requests.Cpu().Value() | ||
requestedMemory := requests.Memory().Value() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this PR needs some fixes to pass CI, could you subtract from requestedMemory
around 10% of the requestedMemory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're asking to subtract 10% so the --memory
doesn't end up taking the whole machine. Instead, the caller can ensure the passed requested memory is 90% (or other percent they prefer) of the machine capacity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm asking for add another layer of cushion. Because in the container there are more processes than Redpanda under heavy lead Redpanda will be oom killed. If we set request and limit for the stateful to 10 Gi I would like to have Redpanda memory
argument to be 10% than 10Gi. I'm ok with doing this in the next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we set request and limit for the stateful to 10 Gi
With this PR we're not setting a Limit anymore, only Request and then --memory
I would like to have Redpanda memory argument to be 10% than 10Gi.
- It's hard to decide on the value of "10%", why not "15%". In that case we'd need to not hardcode it and expose a parameter.
- Most importantly, we already have "requests", so the user can enter a request that is 90% (or similar) of their capacity. We already do this when using the operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It's hard to decide on the value of "10%", why not "15%". In that case we'd need to not hardcode it and expose a parameter.
The values was calculated by me when the OOM Killed event was reported via dmesg. User should not provide this value, because this is Redpanda container problem.
- Most importantly, we already have "requests", so the user can enter a request that is 90% (or similar) of their capacity. We already do this when using the operator.
The requests on the cgroup level is different knob than Redpanda --memory
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR passes the requested memory as the --memory
The smp is set such that each core has at least 2GB of memory. Use request instead of limit, so setting a limit is not required.
64e3711
to
b4204f1
Compare
In redpanda-data#1416 some changes were made, but I don't think it was intended to remove `--memory` flag in `developerMode`. This PR changes the behaviour: * When `requests.memory` or `redpanda.memory` is set, `--memory` is passed regardless of `developerMode` * When `--memory` is set, also set `--reserve-memory=0M`. There is already a 10% buffer calculated in `requests.RedpandaMemory()` Setting both `--memory` and `--reserve-memory` is undocumented, and shouldn't be used. However, there is insufficient information to correctly calculate what `--reserve-memory` should be. By setting just `--memory`, seastar will reserve at least 1.5Gi for the OS, which doesn't make sense in a container.
In redpanda-data#1416 some changes were made, but I don't think it was intended to remove `--memory` flag in `developerMode`. This PR changes the behaviour: * When `requests.memory` or `redpanda.memory` is set, `--memory` is passed regardless of `developerMode` * When `--memory` is set, also set `--reserve-memory=0M`. There is already a 10% buffer calculated in `requests.RedpandaMemory()` Setting both `--memory` and `--reserve-memory` is undocumented, and shouldn't be used. However, there is insufficient information to correctly calculate what `--reserve-memory` should be. By setting just `--memory`, seastar will reserve at least 1.5Gi for the OS, which doesn't make sense in a container.
In redpanda-data#1416 some changes were made, but I don't think it was intended to remove `--memory` flag in `developerMode`. This PR changes the behaviour: * When `requests.memory` or `redpanda.memory` is set, `--memory` is passed regardless of `developerMode` * When `--memory` is set, also set `--reserve-memory=0M`. There is already a 10% buffer calculated in `requests.RedpandaMemory()` Setting both `--memory` and `--reserve-memory` is undocumented, and shouldn't be used. However, there is insufficient information to correctly calculate what `--reserve-memory` should be. By setting just `--memory`, seastar will reserve at least 1.5Gi for the OS, which doesn't make sense in a container. (cherry picked from commit 25f1055)
In redpanda-data#1416 some changes were made, but I don't think it was intended to remove `--memory` flag in `developerMode`. This PR changes the behaviour: * When `requests.memory` or `redpanda.memory` is set, `--memory` is passed regardless of `developerMode` * When `--memory` is set, also set `--reserve-memory=0M`. There is already a 10% buffer calculated in `requests.RedpandaMemory()` Setting both `--memory` and `--reserve-memory` is undocumented, and shouldn't be used. However, there is insufficient information to correctly calculate what `--reserve-memory` should be. By setting just `--memory`, seastar will reserve at least 1.5Gi for the OS, which doesn't make sense in a container.
…esource-to-smp operator: set smp based on resource request
In redpanda-data#1416 some changes were made, but I don't think it was intended to remove `--memory` flag in `developerMode`. This PR changes the behaviour: * When `requests.memory` or `redpanda.memory` is set, `--memory` is passed regardless of `developerMode` * When `--memory` is set, also set `--reserve-memory=0M`. There is already a 10% buffer calculated in `requests.RedpandaMemory()` Setting both `--memory` and `--reserve-memory` is undocumented, and shouldn't be used. However, there is insufficient information to correctly calculate what `--reserve-memory` should be. By setting just `--memory`, seastar will reserve at least 1.5Gi for the OS, which doesn't make sense in a container.
…-resource-to-smp operator: set smp based on resource request
In redpanda-data#1416 some changes were made, but I don't think it was intended to remove `--memory` flag in `developerMode`. This PR changes the behaviour: * When `requests.memory` or `redpanda.memory` is set, `--memory` is passed regardless of `developerMode` * When `--memory` is set, also set `--reserve-memory=0M`. There is already a 10% buffer calculated in `requests.RedpandaMemory()` Setting both `--memory` and `--reserve-memory` is undocumented, and shouldn't be used. However, there is insufficient information to correctly calculate what `--reserve-memory` should be. By setting just `--memory`, seastar will reserve at least 1.5Gi for the OS, which doesn't make sense in a container.
Cover letter
Currently, the operator reads the CPU limit from the Cluster CR but the only way to affect the
smp
redpanda argument is throughdeveloperMode
: if set to truesmp
is set to 1, otherwise it's empty (consuming all cores).This change
smp
depending on the resources requested in the CR.--memory
, such that the provided amount is guaranteed on the nodeDeveloper mode : true
As before, but
smp
is computed based on the requested cores (instead of always being set to 1). If the CPU request is empty,smp
remains empty, and Redpanda uses all cores.Developer mode : false
smp
is then set to the minimum of that and the requested number of cores.request
instead oflimit
because we have to guarantee the resources are available when providing the arguments to Redpanda.Fixes #1378
Example
Gives
Setting cpu to 2 (or 2000m), sets
smp
to 2.Release notes
Provided resource requests in CR determine Redpanda's
smp
value