Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MON-3304: Add option to specify resource limits for all components #2067

Merged
merged 1 commit into from Aug 17, 2023

Conversation

rexagod
Copy link
Member

@rexagod rexagod commented Aug 7, 2023

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Aug 7, 2023

@rexagod: This pull request references MON-3168 which is a valid jira issue.

In response to this:

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 7, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 7, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 7, 2023
@rexagod rexagod force-pushed the MON-3168 branch 2 times, most recently from b4bb772 to 3bcce4e Compare August 7, 2023 23:20
@rexagod rexagod changed the title MON-3168: Add option to specify resource limits for all components MON-3304: Add option to specify resource limits for all components Aug 7, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Aug 7, 2023

@rexagod: This pull request references MON-3304 which is a valid jira issue.

In response to this:

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rexagod rexagod force-pushed the MON-3168 branch 2 times, most recently from be6cdd8 to 187b71f Compare August 8, 2023 13:50
@rexagod rexagod marked this pull request as ready for review August 8, 2023 15:38
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 8, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Aug 8, 2023

@rexagod: This pull request references MON-3304 which is a valid jira issue.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from marioferh and sthaha August 8, 2023 15:39
Comment on lines -4070 to -4074
type spec struct {
replicas int32
affinity *v1.Affinity
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused.

@rexagod rexagod force-pushed the MON-3168 branch 2 times, most recently from ddfcec4 to 2ef0c67 Compare August 8, 2023 20:02
pkg/manifests/types.go Outdated Show resolved Hide resolved
Documentation/api.md Outdated Show resolved Hide resolved
Documentation/api.md Outdated Show resolved Hide resolved
pkg/manifests/types.go Outdated Show resolved Hide resolved
pkg/manifests/types.go Outdated Show resolved Hide resolved
Documentation/api.md Outdated Show resolved Hide resolved
Documentation/api.md Outdated Show resolved Hide resolved
pkg/manifests/types.go Outdated Show resolved Hide resolved
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
@@ -7,6 +7,7 @@
- [#1950](https://github.com/openshift/cluster-monitoring-operator/pull/1950) Disable CORS headers on Thanos querier by default and add a flag to enable them back.
- [#1963](https://github.com/openshift/cluster-monitoring-operator/pull/1963) Add nodeExporter settings for network devices list.
- [#2049](https://github.com/openshift/cluster-monitoring-operator/pull/2049) Remove Kube*QuotaOvercommit alerts.
- [#2067](https://github.com/openshift/cluster-monitoring-operator/pull/2067) Add options to specify resource requests and limits for all components.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, finally :)

but IIUC, these new fields would override the default requests we already set.

Is it really what we want to do? Force users to define both requests and limits even when they only want to set limits?

IIRC, when only limits are defined, Kubernetes will set requests to the same values than limits.

Copy link

@juzhao juzhao Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. if we don't set requests for the components, it will use the default requests as defined in each deployment/daemonset
  2. https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
Note: If you specify a limit for a resource, but do not specify any request, and no admission-time mechanism has applied a default request for that resource, then Kubernetes copies the limit you specified and uses it as the requested value for the resource.
  1. specify request/limit, the component should use the specified settings

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @juzhao for confirming that "if only limits are set, requests are set equal to limits".

Here are my concerns in two scenarios:

  • User only sets the limits via this feature field without knowing that the default requests are not taken into account anymore and that the requests will be equal to the limits they provided (they may have some capacity issues) (unexpected behaviour)
  • User took the care to copy the default requests in this feature field, but I a change was done that makes a component need less or more requests, we take care of changing the default requests, but the requests in the feature field are not up to date anymore. (maintenance burden)

Copy link

@juzhao juzhao Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @juzhao for confirming that "if only limits are set, requests are set equal to limits".

Here are my concerns in two scenarios:

* User only sets the limits via this feature field without knowing that the default requests are not taken into account anymore and that the requests will be equal to the limits they provided (they may have some capacity issues) (unexpected behaviour)

I think this is expected behavior, by design of Kubernetes, user should know it

* User took the care to copy the default requests in this feature field, but I a change was done that makes a component need less or more requests, we take care of changing the default requests, but the requests in the feature field are not up to date anymore. (maintenance burden)

example: prometheus-adapter default resources.requests
https://github.com/openshift/cluster-monitoring-operator/blob/release-4.14/assets/prometheus-adapter/deployment.yaml#L73-L76

  1. if use does not set resources.requests in cluster-monitoring-config configmap, prometheus-adapter will sue the default resources.requests(if the default value can not let prometheus-adapter pod start up, then it's a bug, should be fixed in CMO)
  2. if user only set resources.limits, resources.requests use the same value as resources.limits,
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    k8sPrometheusAdapter:
      resources:
        limits:
          cpu: 200m
          memory: 500Mi
  1. set resources.limits and resources.requests, pod should use the specified value
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    k8sPrometheusAdapter:
      resources:
        requests:
          cpu: 20m
          memory: 200Mi
        limits:
          cpu: 200m
          memory: 500Mi

when set resources.limits and resources.requests, user should know clearly of the prometheus-adapter cpu/memory usage, then set resources.limits and resources.requests to a proper value, otherwise, pod would not start up

@jan--f
Copy link
Contributor

jan--f commented Aug 16, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 16, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jan--f, rexagod

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2023
@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD c46c74a and 2 for PR HEAD ae4369d in total

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD edbb549 and 1 for PR HEAD ae4369d in total

@rexagod
Copy link
Member Author

rexagod commented Aug 16, 2023

/retest

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD 529349f and 0 for PR HEAD ae4369d in total

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 17, 2023

@rexagod: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-single-node ae4369d link false /test e2e-aws-ovn-single-node

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit ebaa7e3 into openshift:master Aug 17, 2023
15 of 16 checks passed
@rexagod
Copy link
Member Author

rexagod commented Aug 17, 2023

/jira refresh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants