@sbernauer sbernauer commented Aug 14, 2025

Description

https://stackable-workspace.slack.com/archives/C08GM6S8Z8D/p1755112180910679

Tested by spinning up a ZookeeperCluster with 21 nodes and setting the expiration annotation of all of them to a time in the past, all at once.
With this PR there are no error messages, only INFO lines:

2025-08-19T07:53:19.458655Z INFO stackable_commons_operator::restart_controller::pod: Tried to evict Pod, but wasn't allowed to do so, as it would violate the Pod's disruption budget. Retrying later pod=Pod.v1./simple-zk-server-default-17.default error=ApiError: Cannot evict pod as it would violate the pod's disruption budget.: TooManyRequests (ErrorResponse { status: "Failure", message: "Cannot evict pod as it would violate the pod's disruption budget.", reason: "TooManyRequests", code: 429 })

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

@sbernauer sbernauer self-assigned this Aug 14, 2025
@sbernauer sbernauer moved this to Development: In Progress in Stackable Engineering Aug 14, 2025
@sbernauer sbernauer changed the title chore: Reduce severity of Pod eviciton errors chore: Reduce severity of Pod eviction errors Aug 14, 2025
@sbernauer sbernauer moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Aug 19, 2025

@NickLarsenNZ NickLarsenNZ left a comment

LGTM, just one minor thing: we should align to Semantic Conventions as much as possible.

https://opentelemetry.io/docs/specs/semconv/registry/attributes/k8s/#k8s-pod-name

@NickLarsenNZ NickLarsenNZ moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Aug 21, 2025
@sbernauer sbernauer requested a review from NickLarsenNZ August 21, 2025 14:10
NickLarsenNZ previously approved these changes Aug 29, 2025

@NickLarsenNZ NickLarsenNZ left a comment

Added the suggestion based on the earlier comments

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

@NickLarsenNZ NickLarsenNZ left a comment

LGTM

@sbernauer sbernauer added this pull request to the merge queue Sep 2, 2025
@sbernauer sbernauer moved this from Development: In Review to Development: Done in Stackable Engineering Sep 2, 2025
Merged via the queue into main with commit d40910b Sep 2, 2025
17 checks passed
@sbernauer sbernauer deleted the chore/avoid-eviction-errors branch September 2, 2025 10:33
@lfrancke lfrancke moved this from Development: Done to Done in Stackable Engineering Sep 8, 2025
@sbernauer sbernauer added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Sep 22, 2025
@sbernauer (Member, Author)

Release notes

Reduce severity of Pod eviction errors. Previously, the operator would produce lots of "Cannot evict pod as it would violate the pod's disruption budget" errors. With this fix, these are logged at INFO level instead.
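
For readers who want a feel for the idea, here is a minimal sketch of treating a disruption-budget rejection as an expected, retryable condition rather than an error. It is not the code from this PR: `ApiErrorResponse`, `Severity`, `Action` and `classify_eviction_failure` are hypothetical names made up for illustration, and the only detail taken from the log line in the description is that the rejection arrives as HTTP 429 (TooManyRequests).

```rust
/// Hypothetical, simplified stand-in for the error response returned by the
/// Kubernetes API. The real operator uses its client library's own type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApiErrorResponse {
    pub code: u16,
    pub reason: String,
    pub message: String,
}

/// Log level to use when reporting a failed eviction (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Error,
}

/// What the caller should do after a failed eviction (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    RetryLater,
    Fail,
}

/// Assumption: a 429 response to an eviction request means the Pod's
/// disruption budget currently forbids the eviction. That is expected during
/// a rolling restart, so it is reported at INFO level and retried later.
/// Every other failure keeps ERROR severity.
pub fn classify_eviction_failure(err: &ApiErrorResponse) -> (Severity, Action) {
    if err.code == 429 {
        (Severity::Info, Action::RetryLater)
    } else {
        (Severity::Error, Action::Fail)
    }
}
```

A caller would match on the returned pair to pick the log macro and decide whether to requeue the Pod. Keeping the classification in a small pure function makes it easy to unit test without a cluster.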

Labels

customer-request release/25.11.0 release-note Denotes a PR that will be considered when it comes time to generate release notes.

5 participants