operator: set smp based on resource request #1416

Merged: 2 commits, Jun 7, 2021

Contributor

@dimitriscruz dimitriscruz commented May 18, 2021

Cover letter

Currently, the operator reads the CPU limit from the Cluster CR, but the only way to affect Redpanda's smp argument is through developerMode: if it is set to true, smp is set to 1; otherwise smp is left empty and Redpanda consumes all cores.

This change

  • sets the smp depending on the resources requested in the CR.
  • uses the requested memory instead of the limit to set --memory, so that the provided amount is guaranteed on the node
  • continues passing the resources/limits to the StatefulSet resource configuration (cgroup resources) as provided in the Cluster CR

Developer mode : true
As before, but smp is computed based on the requested cores (instead of always being set to 1). If the CPU request is empty, smp remains empty, and Redpanda uses all cores.

Developer mode : false

  • Given the requested memory, we compute the maximum number of cores - we want at least 2GB per core. The smp is then set to the minimum of that and the requested number of cores.
  • We use request instead of limit because we have to guarantee the resources are available when providing the arguments to Redpanda.
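
The non-developer rule above can be sketched as follows. This is a minimal illustration, not the operator's code: `productionSMP`, `ceilCores` and `minMemoryPerCore` are hypothetical names, plain `int64` values stand in for the Kubernetes quantity types, "2GB" is taken to mean 2GiB, and returning 0 for "do not pass --smp" when no CPU is requested is an assumption.

```go
package main

import "fmt"

// minMemoryPerCore is the 2GB-per-core floor from the description,
// taken here as 2GiB (assumption).
const minMemoryPerCore int64 = 2 << 30

// ceilCores rounds a CPU request in millicores up to whole cores.
func ceilCores(milliCPU int64) int64 {
	return (milliCPU + 999) / 1000
}

// productionSMP returns min(requested cores, memory / minMemoryPerCore).
// It returns 0 when no CPU is requested (assumed to mean "omit --smp")
// and also when the memory request is below one core's worth.
func productionSMP(milliCPU, memoryBytes int64) int64 {
	if milliCPU <= 0 {
		return 0
	}
	cores := ceilCores(milliCPU)
	maxByMemory := memoryBytes / minMemoryPerCore
	if maxByMemory < cores {
		return maxByMemory
	}
	return cores
}

func main() {
	fmt.Println(productionSMP(100, 2<<30))  // 1
	fmt.Println(productionSMP(2000, 4<<30)) // 2
	fmt.Println(productionSMP(4000, 4<<30)) // 2
}
```

The third call shows memory acting as the cap: four requested cores with 4GiB are reduced to two.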

Fixes #1378

Example

  resources:
    requests:
      cpu: 100m
      memory: 2Gi [4Gi]
...
    developerMode: false

Gives

    Args:
      redpanda
      start
      --check=false
      --advertise-rpc-addr=..
      --default-log-level=info
      --reserve-memory 0M
      --smp=1
      --memory=2147483648 [4294967296]
    Requests:
      cpu:     100m
      memory:  2Gi [4Gi]

Setting cpu to 2 (or 2000m) sets smp to 2.

Release notes

Provided resource requests in CR determine Redpanda's smp value

Member

@BenPope BenPope left a comment

This is great. Lots of questions, though.

) []string {
var smp int64 = 1
if !limits.Cpu().IsZero() {
limits.Cpu().RoundUp(0)
Member

I wonder whether we should run overprovisioned if the CPU request is non-integer or, at least, below 1?

Contributor Author

We do that for developer mode. Are you thinking of doing so for both modes?

Member

E.g., if CPU request is set to 200m in non-developerMode, then overprovisioned might help it play nicely with other processes on the machine. It's not a recommended config. I don't think this is a showstopper.
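
The suggestion in this thread could look roughly like the sketch below. `shouldOverprovision` is a hypothetical helper, not part of the operator, and the thread did not settle on this behaviour; it only shows one way to detect a fractional CPU request expressed in millicores.

```go
package main

import "fmt"

// shouldOverprovision reports whether a positive CPU request (in
// millicores) is not a whole number of cores. Any request below one
// core, such as 200m, falls into that case.
func shouldOverprovision(milliCPU int64) bool {
	return milliCPU > 0 && milliCPU%1000 != 0
}

func main() {
	fmt.Println(shouldOverprovision(200))  // true
	fmt.Println(shouldOverprovision(1000)) // false
	fmt.Println(shouldOverprovision(1500)) // true
}
```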

@@ -477,6 +489,7 @@ func overprovisioned(developerMode bool, limits corev1.ResourceList) []string {
"--default-log-level=info",
"--reserve-memory 0M",
Member

@BenPope BenPope May 18, 2021

I thought not having reserve-memory meant TLS didn't work? Or has memory already accounted for that? It's different to developerMode.

Contributor Author

@dimitriscruz dimitriscruz May 19, 2021

Yeah, they have been different. I know reserve-memory is set when memory isn't, to reserve memory for OS use. For the case where memory is set I don't specify reserve-memory, but I'm not sure what happens if memory is set to the full memory capacity of the machine.

 ("reserve-memory", bpo::value<std::string>(),  "memory reserved to OS (if --memory not specified)")

https://github.com/vectorizedio/seastar/blob/master/src/core/reactor.cc#L3431-L3432

Member

@BenPope BenPope May 19, 2021

I don't know. If non-developerMode works fine with TLS then all good. I never got to the bottom of it, but when I was testing I recall that we added --reserve-memory when rpk was taught to pass in TLS. My guess was that rpk consumes some memory before forking Redpanda. I think we tested with 1 core and 1GiB - 1GiB being the minimum per core (2GiB is recommended), and Redpanda failed to start.

Contributor

As far as I remember, seastar had a hiccup in our CI, where we are constrained to only 100Mb in resource.request.memory with DeveloperMode on. Under such a low constraint, seastar initialization failed.

@emaxerrno
Contributor

@dimitriscruz this should take into account 2GB/core and size down. So the logic should be

  1. Do all the CPU adjustment
  2. Figure out memory
  3. downscale CPU if > 2GB
  4. if < 2GB; drop to developer mode

or something like that.
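
One possible reading of the four steps above, as a sketch only. The step wording is loose, so the interpretation is an assumption: step 3 is read as "reduce the core count until every core has 2GB" and step 4 as "fall back to developer mode when the total is under 2GB". `plan`, `planResources` and `perCore` are hypothetical names, and 2GB is taken as 2GiB.

```go
package main

import "fmt"

// perCore is the 2GB-per-core target, taken as 2GiB (assumption).
const perCore int64 = 2 << 30

// plan is a hypothetical result type for this sketch.
type plan struct {
	smp           int64
	developerMode bool
}

func planResources(milliCPU, memoryBytes int64) plan {
	// 1. CPU adjustment: round the request up to whole cores, at least 1.
	cores := (milliCPU + 999) / 1000
	if cores < 1 {
		cores = 1
	}
	// 2. Memory: how many cores the request can back at 2GiB each.
	maxCores := memoryBytes / perCore
	// 4. Less than 2GiB in total: drop to developer mode.
	if maxCores < 1 {
		return plan{smp: cores, developerMode: true}
	}
	// 3. Downscale the core count so every core keeps its 2GiB.
	if cores > maxCores {
		cores = maxCores
	}
	return plan{smp: cores}
}

func main() {
	fmt.Println(planResources(4000, 4<<30)) // {2 false}
	fmt.Println(planResources(1000, 1<<30)) // {1 true}
}
```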

@@ -468,7 +480,7 @@ func overprovisioned(developerMode bool, limits corev1.ResourceList) []string {
  "--overprovisioned",
  // sometimes a little bit of memory is consumed by other processes than seastar
  "--reserve-memory " + redpandav1alpha1.ReserveMemoryString,
- "--smp=1",
+ "--smp=" + strconv.FormatInt(smp, 10),
Contributor

Should we make sure it is at least 1?

Contributor Author

In the updated version when in developer mode, smp is either positive or not passed as an argument.
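
A small sketch of the "positive or not passed" behaviour described here, assuming the CPU request is available in millicores. `developerSMPArgs` is a hypothetical helper; the operator itself works on Kubernetes quantities (`RoundUp(0)` followed by `Value()`), which this sketch replaces with integer arithmetic.

```go
package main

import (
	"fmt"
	"strconv"
)

// developerSMPArgs returns the --smp argument for developer mode, or nil
// when no CPU was requested, so that Redpanda falls back to all cores.
func developerSMPArgs(milliCPU int64) []string {
	if milliCPU <= 0 {
		return nil
	}
	// Rounding up means any positive request yields smp >= 1.
	smp := (milliCPU + 999) / 1000
	return []string{"--smp=" + strconv.FormatInt(smp, 10)}
}

func main() {
	fmt.Println(developerSMPArgs(0))    // []
	fmt.Println(developerSMPArgs(100))  // [--smp=1]
	fmt.Println(developerSMPArgs(2500)) // [--smp=3]
}
```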

requests.Cpu().RoundUp(0)
smp = requests.Cpu().Value()
}

Contributor

smp should be at least 1 here. Some assertion or normalization is needed.

Contributor

Further CPU tuning based on available memory is needed too.

@dimitriscruz dimitriscruz force-pushed the cpu-resource-to-smp branch 2 times, most recently from 47c6316 to ea1b29e Compare May 19, 2021 02:15
@dimitriscruz dimitriscruz changed the title operator: set smp depending on resource request/limit operator: set smp depending on resource request May 19, 2021
@dimitriscruz dimitriscruz changed the title operator: set smp depending on resource request operator: set smp based on resource request May 19, 2021
@dimitriscruz dimitriscruz force-pushed the cpu-resource-to-smp branch 2 times, most recently from 92e1d00 to 9de74b7 Compare May 19, 2021 03:06
@dimitriscruz dimitriscruz force-pushed the cpu-resource-to-smp branch 2 times, most recently from ed922cb to 8b7a65b Compare May 19, 2021 04:17
@dimitriscruz dimitriscruz marked this pull request as ready for review May 19, 2021 06:14
@dimitriscruz dimitriscruz requested a review from a team as a code owner May 19, 2021 06:14
requests := r.Spec.Resources.Requests.DeepCopy()
requests.Cpu().RoundUp(0)
requestedCores := requests.Cpu().Value()
if !r.Spec.Configuration.DeveloperMode && r.Spec.Resources.Requests.Memory().Value() < requestedCores*MinimumMemoryPerCore {
Member

@senior7515 it's possible to run with 1GiB/core - should this allow 1GiB/core or require 2GiB/core?

Contributor

It is possible. We can downgrade to 1G/core, but it's not recommended for prod. It's very little memory - around 500MB free for request flow.

I'm flexible. I think we need to change the 'check' in syschecks.h.

Contributor Author

The memory check I see is for a minimum of 1GiB (https://github.com/vectorizedio/redpanda/blob/dev/src/v/syschecks/syschecks.cc#L40)

Do we want to change that to 2GiB and then require the same through the operator, or the other way round (1GiB everywhere)?

Contributor

I think 2G/core is recommended. We'd need to test the alternative to have a different recommendation.

Member

We're bumping up against what a good default is vs. allowing users to override specific things. If we had more escape hatches for allowing the user to override a specific setting, this would be less of a problem.

Contributor Author

@dimitriscruz dimitriscruz May 27, 2021

We have developerMode that ignores the memory/cpu ratio and now converts the cpu request to an smp value.

We could introduce minimumMemoryCPURatio: 2 (default) and if one really wants production mode with a non-recommended ratio they could adjust it.
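
To make the proposal concrete, here is a sketch of the webhook-style check with the per-core minimum passed in. It follows the shape of the condition quoted earlier in this thread (memory below requested cores times `MinimumMemoryPerCore`), but `validateMemoryPerCore` and its `ratioGiB` parameter are hypothetical; `ratioGiB` plays the role of the suggested `minimumMemoryCPURatio`, which is only a proposal at this point.

```go
package main

import (
	"errors"
	"fmt"
)

const gib int64 = 1 << 30

// validateMemoryPerCore rejects a request whose memory is below
// ratioGiB GiB per requested core. Developer mode skips the check.
func validateMemoryPerCore(developerMode bool, milliCPU, memoryBytes, ratioGiB int64) error {
	if developerMode {
		return nil // developer mode ignores the memory/CPU ratio
	}
	cores := (milliCPU + 999) / 1000 // round up to whole cores
	if memoryBytes < cores*ratioGiB*gib {
		return errors.New("requested memory is below the per-core minimum")
	}
	return nil
}

func main() {
	// 2 cores with 2GiB: rejected at 2GiB/core, accepted at 1GiB/core.
	fmt.Println(validateMemoryPerCore(false, 2000, 2*gib, 2))
	fmt.Println(validateMemoryPerCore(false, 2000, 2*gib, 1))
	// Developer mode accepts any ratio.
	fmt.Println(validateMemoryPerCore(true, 2000, 1*gib, 2))
}
```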

Member

@BenPope BenPope May 28, 2021

I'm not convinced that shadowing the actual configuration params with slightly different implications is useful. Can we support this: #1388 in the operator?

I'm happy with this PR as it stands (possibly weaken the webhook to 1GiB/core), and to make more direct configuration possible in another PR.

Contributor Author

> Can we support this: #1388 in the operator?

I just sent a PR with a simpler version of 1388: #1491

@dimitriscruz dimitriscruz force-pushed the cpu-resource-to-smp branch from 8b7a65b to 64e3711 Compare May 26, 2021 04:12

requests.Cpu().RoundUp(0)
requestedCores := requests.Cpu().Value()
requestedMemory := requests.Memory().Value()
Contributor

As this PR needs some fixes to pass CI, could you subtract from requestedMemory around 10% of the requestedMemory?

Contributor Author

I think you're asking to subtract 10% so the --memory doesn't end up taking the whole machine. Instead, the caller can ensure the passed requested memory is 90% (or other percent they prefer) of the machine capacity.

Contributor

I'm asking to add another layer of cushion. Because the container runs more processes than just Redpanda, under heavy load Redpanda will be OOM-killed. If we set the request and limit for the StatefulSet to 10Gi, I would like the Redpanda memory argument to be 10% less than 10Gi. I'm OK with doing this in the next PR.
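
The cushion being asked for amounts to the arithmetic below. This is a sketch only: `memoryArg` and `cushionPercent` are hypothetical names, and the 10% figure is the one suggested in this thread rather than a value taken from the operator at the time of this PR.

```go
package main

import "fmt"

// memoryArg returns the value to pass as --memory: the container's
// memory request minus a cushion for other processes in the container.
func memoryArg(requestBytes, cushionPercent int64) int64 {
	return requestBytes - requestBytes*cushionPercent/100
}

func main() {
	request := int64(10) << 30          // 10Gi request on the StatefulSet
	fmt.Println(memoryArg(request, 10)) // 9663676416, i.e. 9Gi
}
```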

Contributor Author

@dimitriscruz dimitriscruz Jun 4, 2021

> If we set the request and limit for the StatefulSet to 10Gi

With this PR we're not setting a Limit anymore, only Request and then --memory.

> I would like the Redpanda memory argument to be 10% less than 10Gi.

  1. It's hard to decide on the value of "10%", why not "15%". In that case we'd need to not hardcode it and expose a parameter.
  2. Most importantly, we already have "requests", so the user can enter a request that is 90% (or similar) of their capacity. We already do this when using the operator.

Contributor

> 1. It's hard to decide on the value of "10%", why not "15%". In that case we'd need to not hardcode it and expose a parameter.

I calculated the value when the OOM-killed event was reported via dmesg. The user should not have to provide this value, because this is a Redpanda container problem.

> 2. Most importantly, we already have "requests", so the user can enter a request that is 90% (or similar) of their capacity. We already do this when using the operator.

The request at the cgroup level is a different knob from Redpanda's --memory.

Contributor Author

This PR passes the requested memory as the --memory argument.

The smp is set such that each core has at least 2GB of memory.
Use request instead of limit, so setting a limit is not required.
@dimitriscruz dimitriscruz force-pushed the cpu-resource-to-smp branch from 64e3711 to b4204f1 Compare June 4, 2021 15:09
@dimitriscruz dimitriscruz merged commit edf8031 into redpanda-data:dev Jun 7, 2021
@dimitriscruz dimitriscruz deleted the cpu-resource-to-smp branch June 7, 2021 19:17
BenPope added a commit to BenPope/redpanda that referenced this pull request Feb 23, 2022
In redpanda-data#1416 some changes were made, but I don't think it was intended
to remove `--memory` flag in `developerMode`.

This PR changes the behaviour:
* When `requests.memory` or `redpanda.memory` is set, `--memory` is
  passed regardless of `developerMode`
* When `--memory` is set, also set `--reserve-memory=0M`. There is
  already a 10% buffer calculated in `requests.RedpandaMemory()`

Setting both `--memory` and `--reserve-memory` is undocumented, and
shouldn't be used. However, there is insufficient information to
correctly calculate what `--reserve-memory` should be. By setting
just `--memory`, seastar will reserve at least 1.5Gi for the OS,
which doesn't make sense in a container.
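
The behaviour described in this commit message can be sketched as below. `memoryArgs` is a hypothetical helper, not the operator's function. It deliberately takes no `developerMode` parameter, since the message says `--memory` is passed regardless of it; returning no arguments when no memory value is set is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// memoryArgs returns the memory-related arguments: whenever a memory
// value is set, --memory is passed together with --reserve-memory=0M.
func memoryArgs(memoryBytes int64) []string {
	if memoryBytes <= 0 {
		return nil // assumption: nothing is passed when memory is unset
	}
	return []string{
		"--memory=" + strconv.FormatInt(memoryBytes, 10),
		"--reserve-memory=0M",
	}
}

func main() {
	fmt.Println(memoryArgs(2 << 30)) // [--memory=2147483648 --reserve-memory=0M]
	fmt.Println(memoryArgs(0))       // []
}
```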
BenPope added a commit to BenPope/redpanda that referenced this pull request Feb 24, 2022
(cherry picked from commit 25f1055)
daisukebe pushed a commit to daisukebe/redpanda that referenced this pull request Mar 4, 2022
joejulian pushed a commit to joejulian/redpanda that referenced this pull request Mar 10, 2023
…esource-to-smp

operator: set smp based on resource request
joejulian pushed a commit to joejulian/redpanda that referenced this pull request Mar 10, 2023
joejulian pushed a commit to joejulian/redpanda that referenced this pull request Mar 24, 2023
…-resource-to-smp

operator: set smp based on resource request
joejulian pushed a commit to joejulian/redpanda that referenced this pull request Mar 24, 2023
Successfully merging this pull request may close these issues.

Consider using ceil(cpu_request) to set the smp