Bug 1986757: Set timeoutSeconds for keepalived liveness probe #2703

Merged

cybertron
Member

timeoutSeconds defaults to 1 second, but our liveness probe attempts
to wait for 5 seconds. This seems to be causing frequent liveness
probe timeouts in some environments, which trigger unnecessary
restarts.

This patch sets timeoutSeconds to 5 to match the expectations of
the probe.
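
For illustration, a minimal sketch of what the resulting probe stanza could look like. Only `timeoutSeconds: 5` comes from this PR; the check script path is a hypothetical placeholder, and the other fields are shown at their Kubernetes defaults rather than copied from the actual keepalived template.

```yaml
# Sketch of a keepalived container liveness probe (not the real template).
livenessProbe:
  exec:
    command:
    - /bin/bash
    - -c
    # Hypothetical check script standing in for a check that may wait up to 5 seconds.
    - timeout 5 /usr/local/bin/check-keepalived.sh
  periodSeconds: 10      # Kubernetes default, shown for context
  failureThreshold: 3    # Kubernetes default, shown for context
  timeoutSeconds: 5      # set by this PR; the Kubernetes default is 1
```

With the default of 1 second, a check that takes longer than that is counted as a failed probe even if it would eventually have succeeded, and after `failureThreshold` consecutive failures the kubelet restarts the container.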

- Description for the changelog
Fixed an issue with the keepalived liveness probe that could cause spurious timeouts.

@openshift-ci openshift-ci bot added the bugzilla/severity-medium (Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.) and bugzilla/valid-bug (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.) labels Aug 4, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 4, 2021

@cybertron: This pull request references Bugzilla bug 1986757, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (vvoronko@redhat.com), skipping review request.

In response to this:

Bug 1986757: Set timeoutSeconds for keepalived liveness probe

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cybertron
Member Author

/test e2e-ovirt
/test e2e-vsphere
/cc @mandre @Gal-Zaidman @jcpowermac @patrickdillon

@yboaron
Contributor

yboaron commented Aug 5, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 5, 2021
@yboaron
Contributor

yboaron commented Aug 5, 2021

/retest

Member

@mandre mandre left a comment

/lgtm

@yboaron
Contributor

yboaron commented Aug 8, 2021

/retest

3 similar comments
@kikisdeliveryservice
Contributor

/retest

@kikisdeliveryservice
Contributor

/retest

@yboaron
Contributor

yboaron commented Aug 12, 2021

/retest

@kikisdeliveryservice
Contributor

/test e2e-metal-ipi

@cybertron
Member Author

All of the affected platforms are now passing. This is a pretty simple change that isn't platform-specific so I think it could probably go in (although it would be even better if someone from the remaining two platforms could chime in).

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 16, 2021
Contributor

@kikisdeliveryservice kikisdeliveryservice left a comment

Adding an approval and also a hold to be removed by @cybertron once he feels it's ready and has necessary approvals.

/hold

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2021
@Gal-Zaidman
Contributor

/lgtm

@openshift-ci
Contributor

openshift-ci bot commented Aug 17, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cybertron, Gal-Zaidman, kikisdeliveryservice, mandre, yboaron

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [kikisdeliveryservice]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yboaron
Contributor

yboaron commented Aug 22, 2021

/retest

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 23, 2021
@cybertron
Member Author

/test e2e-agnostic-upgrade

1 similar comment
@mandre
Member

mandre commented Aug 25, 2021

/test e2e-agnostic-upgrade

@cybertron
Member Author

The failing jobs all appear to be permafails and none of them even deploy the changed template. I think we just need to skip them.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

14 similar comments

@openshift-ci
Contributor

openshift-ci bot commented Aug 27, 2021

@cybertron: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                            Commit   Details  Rerun command
ci/prow/e2e-aws-upgrade-single-node  88f8799  link     /test e2e-aws-upgrade-single-node
ci/prow/e2e-aws-disruptive           88f8799  link     /test e2e-aws-disruptive
ci/prow/e2e-aws-workers-rhel7        88f8799  link     /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

4 similar comments

@openshift-merge-robot openshift-merge-robot merged commit 00f349e into openshift:master Aug 28, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 28, 2021

@cybertron: All pull requests linked via external trackers have merged:

Bugzilla bug 1986757 has been moved to the MODIFIED state.

In response to this:

Bug 1986757: Set timeoutSeconds for keepalived liveness probe

@yboaron
Contributor

yboaron commented Aug 31, 2021

/cherry-pick release-4.8

@openshift-cherrypick-robot

@yboaron: new pull request created: #2741

In response to this:

/cherry-pick release-4.8

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

8 participants