CASMPET-6915: change cray-opa to daemonset and use a newer image #3315

Merged
mtupitsyn merged 1 commit into release/1.5 from CASMPET-6915-1.5
Apr 5, 2024

Conversation

bo-quan
Contributor

@bo-quan bo-quan commented Apr 5, 2024

Summary and Scope

Because of limitations in server-side load balancing in Kubernetes, especially with OPA, which uses gRPC over persistent connections, we often run into situations where only 1 or 2 OPA ingressgateway pods are used. This has exposed a memory leak found in older OPA Envoy plugin versions. This PR uses a cray-opa chart that changes the OPA Deployment to a DaemonSet and uses a Kubernetes beta feature that improves load balancing, together with a newer OPA Envoy plugin version, v0.62.0, which includes fixes for a memory leak (open-policy-agent/opa#5320).
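
For reviewers less familiar with the load-balancing problem, here is a rough sketch of why persistent connections concentrate traffic on a few pods. It is a toy simulation, not a model of Envoy, kube-proxy, or the cray-opa chart: the `simulate` function, the pod count, and the connection count are all made up for illustration. The only assumption it encodes is that a backend is chosen once per connection, so every request on a long-lived gRPC connection lands on the same pod.

```python
import random
from collections import Counter


def simulate(num_pods, num_connections, requests_per_connection, seed=0):
    """Toy model of connection-level balancing (hypothetical helper).

    A pod is picked once when a connection is opened; all requests sent
    over that connection then go to the same pod.
    """
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(num_connections):
        pod = rng.randrange(num_pods)  # chosen once per connection, not per request
        hits[pod] += requests_per_connection
    return hits


# Illustrative numbers only: 6 pods, 2 long-lived client connections.
hits = simulate(num_pods=6, num_connections=2, requests_per_connection=10_000)
print(f"{len(hits)} of 6 pods received traffic: {dict(hits)}")
```

With only two persistent connections, at most two pods can ever receive requests regardless of how many replicas exist, which is the usage pattern described above.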

Issues and Related PRs

List and characterize relationship to Jira/Github issues and other pull requests. Be sure to list dependencies.

  • Resolves CASMPET-6915
  • Change will also be needed in <insert branch name here>
  • Future work required by [issue id](issue link)
  • Documentation changes required in [issue id](issue link)
  • Merge with/before/after <insert PR URL here>

Testing

List the environments in which these changes were tested.

Tested on:

  • <development system>
  • Local development environment
  • Virtual Shasta

Test description:

How were the changes tested and success verified? If schema changes were part of this change, how were those handled in your upgrade/downgrade testing?

  • Were the install/upgrade-based validation checks/tests run (goss tests/install-validation doc)?
  • Were continuous integration tests run? If not, why?
  • Was upgrade tested? If not, why?
  • Was downgrade tested? If not, why?
  • Were new tests (or test issues/Jiras) created for this change?

Risks and Mitigations

Are there known issues with these changes? Any other special considerations?

Pull Request Checklist

  • Version number(s) incremented, if applicable
  • Copyrights updated
  • License file intact
  • Target branch correct
  • Testing is appropriate and complete, if applicable
  • HPC Product Announcement prepared, if applicable

@bo-quan bo-quan requested a review from a team as a code owner April 5, 2024 20:12
@mtupitsyn mtupitsyn merged commit c304e94 into release/1.5 Apr 5, 2024
2 checks passed
@mtupitsyn mtupitsyn deleted the CASMPET-6915-1.5 branch April 5, 2024 23:14
3 participants