Skip to content

ROX-15980 observability resource requests and limits#993

Merged
ludydoo merged 3 commits intomainfrom
ROX-15980-observability-resources-requests-and-limits
May 2, 2023
Merged

ROX-15980 observability resource requests and limits#993
ludydoo merged 3 commits intomainfrom
ROX-15980-observability-resources-requests-and-limits

Conversation

@ludydoo
Copy link
Copy Markdown
Contributor

@ludydoo ludydoo commented Apr 27, 2023

Sets the resource requests and limits for observability components

The values were derived from the observed metrics on prometheus/grafana.

@ludydoo ludydoo requested review from kylape and porridge April 27, 2023 11:57
@ludydoo ludydoo temporarily deployed to development April 27, 2023 11:57 — with GitHub Actions Inactive
@ludydoo ludydoo temporarily deployed to development April 27, 2023 11:57 — with GitHub Actions Inactive
@ludydoo ludydoo temporarily deployed to development April 27, 2023 12:58 — with GitHub Actions Inactive
@ludydoo ludydoo temporarily deployed to development April 27, 2023 12:58 — with GitHub Actions Inactive
@ludydoo ludydoo temporarily deployed to development April 27, 2023 12:58 — with GitHub Actions Inactive
@ludydoo
Copy link
Copy Markdown
Contributor Author

ludydoo commented Apr 27, 2023

/retest

@ludydoo ludydoo requested a review from stehessel April 27, 2023 13:39
@ludydoo ludydoo temporarily deployed to development April 27, 2023 13:49 — with GitHub Actions Inactive
@ludydoo ludydoo temporarily deployed to development April 27, 2023 13:49 — with GitHub Actions Inactive
@ludydoo ludydoo temporarily deployed to development April 27, 2023 13:49 — with GitHub Actions Inactive
resources:
requests:
cpu: 1500m
memory: 18Gi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need 18Gi on all clusters? Perhaps the request should be smaller than the limit to accommodate smaller systems?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured 18Gi would be a minimum, since for stage it currently hovers around ~10Gi with not that many centrals on it..

Screenshot 2023-04-27 at 4 19 55 PM

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's ok if the actual usage is above the request, right? As long as it's smaller than the limit.

Copy link
Copy Markdown
Contributor Author

@ludydoo ludydoo Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though if request < limits, it's not a guaranteed QOS and the pod might be descheduled. My initial thought on prometheus, alertmanager & so on, is that since it drives all of our alerts, I would consider this as a critical component that should have a guaranteed QOS. But it's just my personal opinion. Please feel free to suggest other values to put as requests / limits.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. It's a pity it takes so many resources, but I guess it is what it is.

@ludydoo ludydoo requested a review from stehessel April 27, 2023 15:48
Copy link
Copy Markdown
Collaborator

@porridge porridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this fail on deployment if there's a typo somewhere here, or do we need to check by hand if the deployed values match our intention?

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 28, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ludydoo, porridge, stehessel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ludydoo ludydoo merged commit 8620c55 into main May 2, 2023
@ludydoo ludydoo deleted the ROX-15980-observability-resources-requests-and-limits branch May 2, 2023 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants