Configure StackdriverLogging Windows service to restart on failure. #93765
Welcome @jeremyje! |
Hi @jeremyje. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test pull-kubernetes-e2e-windows-gce |
@jeremyje: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
/assign @pjh |
/ok-to-test |
/priority important-soon |
/test pull-kubernetes-e2e-windows-gce |
/test pull-kubernetes-e2e-windows-gce |
/retest |
I've created #94138. Where would I propose this in the sig-windows chat room? |
I found the cherry-pick guide: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-release/cherry-picks.md Someone had sent it to me in the past, so I had it bookmarked. It is not referenced anywhere under the actual contributor guide, https://github.com/kubernetes/community/tree/master/contributors/guide :( |
@pjh Given the issues with upgrading to Fluent Bit, I think we should submit this change anyway, in case the upgrade does not happen before the v1.20 cutoff. |
/retest
On Wed, Nov 11, 2020, 9:26 PM Kubernetes Prow Robot wrote:
@jeremyje: The following tests failed, say /retest to rerun all failed tests:
Test name: pull-kubernetes-e2e-windows-gce. Commit: ef5da75. Details: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/93765/pull-kubernetes-e2e-windows-gce/1292912136309182465 Rerun command: /test pull-kubernetes-e2e-windows-gce
Test name: pull-kubernetes-node-e2e. Commit: 26cdcde. Details: https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/93765/pull-kubernetes-node-e2e/1326752815418183680/ Rerun command: /test pull-kubernetes-node-e2e
Full PR test history: https://prow.k8s.io/pr-history?org=kubernetes&repo=kubernetes&pr=93765
|
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jeremyje, pjh. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold |
/retest |
/retest |
/retest |
@jeremyje: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. |
/retest Review the full test history for this PR. Silence the bot with an |
What type of PR is this?
/kind bug
What this PR does / why we need it:
This change configures the `StackdriverLogging` Windows service to restart automatically if it fails. This improves the reliability of Stackdriver log output when disruptions on the host crash the logging agent. The service is configured to restart 1 second after the first failure and 10 seconds after each subsequent failure.
Currently, there is a race between the GCE/GKE metadata server becoming reachable and this service starting up. If the metadata server is unavailable, the service crashes and never emits logs until it is started manually.
This change also improves the manual e2e testing documentation.
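For readers unfamiliar with Windows service recovery settings, the restart policy described above can be expressed as an `sc.exe failure` invocation. The sketch below only builds the argument list; it is not the code from this PR, whose diff is not shown here. The helper name `build_sc_failure_args` and the `reset=` period of 86400 seconds are assumptions made for illustration, and only the 1000 ms and 10000 ms delays come from the description.

```python
from typing import List, Sequence


def build_sc_failure_args(
    service: str, restart_delays_ms: Sequence[int], reset_seconds: int
) -> List[str]:
    """Build an `sc.exe failure` argument list (hypothetical helper).

    Each entry in restart_delays_ms becomes one `restart/<ms>` action,
    in order: first failure, second failure, and so on.
    """
    if not restart_delays_ms:
        raise ValueError("at least one restart delay is required")
    if any(delay < 0 for delay in restart_delays_ms):
        raise ValueError("restart delays must be non-negative")

    # sc.exe expects the actions as a single slash-separated token,
    # e.g. restart/1000/restart/10000.
    actions = "/".join(f"restart/{delay}" for delay in restart_delays_ms)

    # `reset=` and `actions=` are passed as separate tokens from their
    # values, matching the space sc.exe requires after the equals sign.
    return [
        "sc.exe", "failure", service,
        "reset=", str(reset_seconds),
        "actions=", actions,
    ]


if __name__ == "__main__":
    # 1 second after the first failure, 10 seconds afterwards.
    # The reset period (86400 s) is an assumed value, not from the PR.
    args = build_sc_failure_args("StackdriverLogging", [1000, 10000], 86400)
    print(" ".join(args))
```

On a Windows node the resulting list could be passed to `subprocess.run(args, check=True)` from an elevated session; the actual node setup scripts may configure the service differently.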
Which issue(s) this PR fixes:
Fixes #94138
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
NONE
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: