resources.cpu.requests + GOMAXPROCS #501
Comments
As mentioned on the mailing list, I'm not sure we're actually able to do this, as we would need to know that the machine that the Prometheus pod is scheduled onto is going to have a certain number of cores. Aside from that, I'm not sure deriving GOMAXPROCS from the limit is the right way to go; my feeling is that parallelization and CPU shares should be treated distinctly.
I wrote on the mailing list once more. I'm still questioning whether we should use the same field to configure CPU utilization and parallelization. I feel we might want to configure them separately. Wdyt? Also I understand the desire to set these kinds of things, but setting …
Hmm makes sense to me that
(As my strawman argument, Prometheus Operator uses ….) I'm running two Prometheuses, and based on observed usage on 16 vCPU machines, I eventually set:

I want my Prometheuses to run efficiently on these machines, but I also want to accurately request CPU for efficient scheduling and cluster utilization. As a user, I'd like for Prometheus Operator to set …. Is there a set of properties that includes …? Or is …?
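For illustration only (the concrete values from this comment were lost in extraction, so these numbers are hypothetical), the kind of request/limit split being discussed looks like:

```yaml
# Hypothetical values, not the ones from the original comment:
# a lower request for scheduling, a higher limit for bursting.
resources:
  requests:
    cpu: "2"    # what the scheduler reserves on the node
  limits:
    cpu: "4"    # CFS quota; the proposal is to derive GOMAXPROCS from this
```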
What problem does this pull request address? While I see the theoretical points, it would be great to get some profiling data to determine whether this added complexity has practical benefit. Did you run into any problems without setting GOMAXPROCS?
I'm pretty sure GOMAXPROCS is not the right tool to solve the problem. As this issue has gone pretty stale, I'd say we can re-discuss the actual problem at hand and see how we can solve it. Closing here, but feel free to re-open or continue the discussion.
Have a look at this thread from Monzo: golang/go#19378. An easy approach for the discussion here would be to use a ResourceFieldRef based on limits.cpu in the manifest (https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/#use-container-fields-as-values-for-environment-variables). What do you think?
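A sketch of that suggestion, using the downward API to expose the container's own CPU limit as an env var (the container name is illustrative):

```yaml
# Illustrative only: inject the container's CPU limit as GOMAXPROCS
# via the downward API's resourceFieldRef.
env:
  - name: GOMAXPROCS
    valueFrom:
      resourceFieldRef:
        containerName: prometheus
        resource: limits.cpu
        divisor: "1"    # express the limit in whole CPUs
```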
For those interested and related to this issue: I did some benchmarking of what happens when the CFS quota is not aligned with GOMAXPROCS. Details here: uber-go/automaxprocs#12. Before this goes in the wrong direction: these benchmarks show that there can be a potential performance impact, but as always, they are focused, synthetic benchmarks. Stress testing against Prometheus has to be done to validate whether this is really required.
@embano1 thanks for sharing! I gave this to our scalability team; we will evaluate it and might run some tests to validate whether it could have a significant impact.
That would be great, thx a ton!
|
Just wanted to add another datapoint. On a 12 CPU VM with the container CPU limit set to 4, the same query runs:

I think the main slowdown comes from the CFS quota period: the process is starved of CPU time for the remainder of each 100ms period whenever it runs over its allocated quota.
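The quota arithmetic behind that starvation can be sketched in Go. This is a simplified version of the calculation uber-go/automaxprocs performs from the cgroup's `cpu.cfs_quota_us` and `cpu.cfs_period_us`; `quotaCPUs` is a hypothetical helper, not part of any library:

```go
package main

import "fmt"

// quotaCPUs returns the whole number of CPUs implied by a CFS quota:
// the container may consume quotaUs microseconds of CPU time per
// periodUs microseconds of wall time. A quota <= 0 means "no limit".
// (Simplified: real implementations also clamp to a minimum of 1.)
func quotaCPUs(quotaUs, periodUs int64) int {
	if quotaUs <= 0 || periodUs <= 0 {
		return 0 // no limit configured
	}
	return int(quotaUs / periodUs)
}

func main() {
	// limits.cpu: "4" with the default 100ms CFS period:
	fmt.Println(quotaCPUs(400000, 100000)) // prints 4
}
```

With GOMAXPROCS left at 12 on that VM, twelve threads burn through the 400ms-per-second quota early in each period and then stall, which matches the observed slowdown.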
@redbaron Thanks for testing it! Out of curiosity, do you have the possibility to disable CFS quotas globally? That way we could validate whether that is what's causing it. Also, are you setting just the limit to 4, or both request and limit? If both, then I'd actually expect guaranteed QoS to give the container its own cpuset. I'd be curious to see if the phenomenon still occurs on a dedicated cpuset. Could you try this if you haven't already? :)
I don't see a way currently to configure the Prometheus CR so that the operator sets env on the Prometheus main container in the StatefulSet. Am I missing something? Supporting this would allow the community to experiment, and depending on the results, a change of defaults might be easier to justify. WDYT, would it make sense to support configuring env in prometheusSpec for a start?
Found a workaround: scaled down prometheus-operator, then edited the StatefulSet to add the env var. The runtime config got changed but, sadly, configuring GOMAXPROCS didn't make a difference.
you could use strategic merge patch to inject env variables in the prometheus container |
In your Prometheus CRD:

```yaml
spec:
  containers:
    - name: prometheus
      env:
        - name: "GOMAXPROCS"
          value: "4"
```
Latest Prometheus releases can also adjust GOMAXPROCS to the CPU limit with the … feature flag.
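Assuming the flag referred to above is the `auto-gomaxprocs` feature flag that recent Prometheus releases ship, enabling it would look something like this (container args are illustrative):

```yaml
# Illustrative args for the Prometheus container; with auto-gomaxprocs
# enabled, Prometheus sets GOMAXPROCS from the container's CPU limit.
args:
  - --config.file=/etc/prometheus/prometheus.yml
  - --enable-feature=auto-gomaxprocs
```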
The Prometheus documentation states that the number of OS threads is set by `GOMAXPROCS`; the default for `GOMAXPROCS` is the number of CPUs reported by `runtime.NumCPU()`.

The Prometheus Operator uses `resources.memory.requests` to set `-local.target.heap.size` for efficient memory usage. Similarly, setting `resources.cpu.requests` should set `GOMAXPROCS` for efficient CPU usage.

With Kubernetes + Docker, the default value of `GOMAXPROCS` is the number of cores available to the Kubernetes minion, not the number of cores available to the container as set by `--cpu-quota` or `--cpu-shares` in Docker. This is because `runtime.NumCPU()` is used to set the default for `GOMAXPROCS`.

For example, on a Kubernetes minion with 32 cores, `GOMAXPROCS` will be set to 32. Ideally, if we set `resources.cpu.limits` to 4, then `GOMAXPROCS` should be set to 4.

See discussion on prometheus-users.