
fix: add autoscaling eviction annotation to catalog pods #2669

Conversation

@exdx (Member) commented Feb 21, 2022

Signed-off-by: Daniel Sover dsover@redhat.com

Description of the change:
Closes #2666 (Catalog pods prevent cluster scaling down)

Because the catalog pods are standalone and not backed by a Deployment/ReplicaSet, they cause issues when draining the underlying node, which in turn affects autoscaling. This additional annotation on the catalog pods will enable the cluster autoscaler to evict them when draining a node.
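
For context, here is a minimal sketch (in Go, in the spirit of the OLM codebase) of how such an annotation could be applied to a pod. The `cluster-autoscaler.kubernetes.io/safe-to-evict` key is the standard annotation the cluster autoscaler consults for pods without a managing controller; the constant and helper function below are illustrative assumptions, not the exact change in this PR.

```go
package catalog

import corev1 "k8s.io/api/core/v1"

// safeToEvictAnnotation is the standard annotation the cluster autoscaler
// checks before evicting pods that are not backed by a controller.
const safeToEvictAnnotation = "cluster-autoscaler.kubernetes.io/safe-to-evict"

// withSafeToEvict marks a pod as safe for the autoscaler to evict during a
// node scale-down. (Hypothetical helper for illustration only.)
func withSafeToEvict(pod *corev1.Pod) {
	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	pod.Annotations[safeToEvictAnnotation] = "true"
}
```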

Motivation for the change:
Enable the cluster autoscaler to scale down nodes that run catalog pods.

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive
  • Tests marked as [FLAKE] are truly flaky
  • Tests that remove the [FLAKE] tag are no longer flaky

@perdasilva (Collaborator) commented:

/approve @exdx could you please rebase?

@openshift-ci bot commented Feb 25, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: exdx, perdasilva

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 25, 2022
@exdx force-pushed the fix/catalog-autoscaling-annotation branch from 9a67c3b to b75bbd8 on February 25, 2022 15:54
@exdx (Member, Author) commented Feb 25, 2022

> /approve @exdx could you please rebase?

done -- I think this is a reasonable stop-gap solution for our upstream users until we come up with a better story around how we want to support catalog pods being evicted from nodes during a drain or node scale-down event. If we come up with a more robust solution, this annotation should not get in the way.

@exdx force-pushed the fix/catalog-autoscaling-annotation branch from b75bbd8 to ced96dc on April 28, 2022 20:15
@elmiko left a comment:

From the autoscaler side, +1, thank you for adding this.

@oceanc80 (Contributor) commented:

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 29, 2022
@exdx force-pushed the fix/catalog-autoscaling-annotation branch from ced96dc to cf6c8e2 on April 29, 2022 15:00
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 29, 2022
@exdx (Member, Author) commented Apr 29, 2022

Rebasing off HEAD of master

Commit (title truncated): …oper draining of nodes
Signed-off-by: Daniel Sover <dsover@redhat.com>
@exdx force-pushed the fix/catalog-autoscaling-annotation branch from cf6c8e2 to 2968449 on April 29, 2022 15:30
@oceanc80 (Contributor) commented:

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 29, 2022
@openshift-merge-robot openshift-merge-robot merged commit dbd5b6f into operator-framework:master Apr 29, 2022
@kfox1111 commented:

Does this make node draining work too? In the past that hasn't worked, though I haven't tested on the newest OLMs.

@exdx (Member, Author) commented Apr 29, 2022

> Does this make node draining work too? In the past that hasn't worked, though I haven't tested on the newest OLMs.

Unfortunately no, the CatalogSource pod blocking node drains is still an issue. We are tracking that effort in an RFE: https://issues.redhat.com/browse/RFE-2737.
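
To make the distinction concrete: the annotation only influences whether the cluster autoscaler chooses to evict a pod during scale-down; it does not change how the API server handles an eviction request, and `kubectl drain` still refuses pods with no managing controller unless forced. Below is a minimal client-go sketch of the Eviction API, the mechanism both the autoscaler and `kubectl drain` use; the function name and wiring are hypothetical.

```go
package example

import (
	"context"

	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// evictCatalogPod issues an Eviction for the named pod. The safe-to-evict
// annotation affects the autoscaler's decision to make this call, not the
// call itself. (Hypothetical helper for illustration only.)
func evictCatalogPod(ctx context.Context, client kubernetes.Interface, namespace, name string) error {
	return client.PolicyV1().Evictions(namespace).Evict(ctx, &policyv1.Eviction{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
	})
}
```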
