Bug 1750433: 1/3 etcd-member pods crashloopback #1112

Merged
openshift-merge-robot merged 1 commit into openshift:master on Sep 12, 2019

@RobertKrawitz
Contributor

commented Sep 12, 2019

- What I did
Make the etcd pod privileged to work around etcd-member sporadically failing to start because of an SELinux AVC denial (not a long-term fix).

- How to verify it
Inspect the running etcd processes and ensure they are running privileged.

- Description for the changelog

Fix etcd sporadically going into crash loop backoff on startup on one or more masters.
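
For readers unfamiliar with the setting, a container is made privileged through its container-level securityContext. The fragment below is only a sketch of that shape, not the PR's actual diff: the pod name, container name, and image are placeholders, and `privileged: true` is assumed from the description above rather than copied from the change.

```yaml
# Hypothetical pod fragment; names and image are placeholders, not taken from the PR.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-member                           # assumed name
spec:
  containers:
  - name: etcd-member                         # assumed name
    image: example.invalid/etcd:placeholder   # placeholder image
    securityContext:
      privileged: true                        # assumed value, per the description above
```

When verifying, the same field should show up under the container's securityContext in the running pod's spec.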

@openshift-ci-robot

commented Sep 12, 2019

@RobertKrawitz: This pull request references Bugzilla bug 1750433, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Bug 1750433: 1/3 etcd-member pods crashloopback

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@RobertKrawitz

Contributor Author

commented Sep 12, 2019

@@ -105,6 +109,8 @@ contents:
--listen-client-urls=https://0.0.0.0:2379 \
--listen-peer-urls=https://0.0.0.0:2380 \
--listen-metrics-urls=https://0.0.0.0:9978 \
securityContext:

@rphillips

rphillips Sep 12, 2019

Contributor

The prior line ends with a \. Does there need to be an extra line between 111 and 112 here?

@RobertKrawitz

RobertKrawitz Sep 12, 2019

Author Contributor

Shouldn't matter (and line 165 has the same issue). The securityContext is outdented, which should close off the previous YAML construct.
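
The outdenting point in this exchange can be sketched as follows. This assumes the flags live inside a YAML literal block scalar (`|`) passed to a shell, which the diff excerpt does not show; the `command` wrapper, container name, `exec etcd` line, and `privileged: true` value are illustrative assumptions. Only the three `--listen-*` flags and the `securityContext` key appear in the excerpt.

```yaml
# Sketch only: surrounding keys are assumed, not copied from the PR.
containers:
- name: etcd-member              # assumed name
  command:
  - /bin/sh
  - -c
  - |
    exec etcd \
      --listen-client-urls=https://0.0.0.0:2379 \
      --listen-peer-urls=https://0.0.0.0:2380 \
      --listen-metrics-urls=https://0.0.0.0:9978 \
  securityContext:               # indented less than the scalar body, so the scalar ends on the line above
    privileged: true             # assumed value, per the PR description
```

In YAML, a block scalar ends at the first line indented less than its content, so `securityContext` parses as a sibling key of `command` with no blank line needed. Whether the shell tolerates the trailing backslash on the last flag is a separate question that this sketch does not settle.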

@rphillips

Contributor

commented Sep 12, 2019

/lgtm

@runcom

Member

commented Sep 12, 2019

/approve

@openshift-ci-robot

commented Sep 12, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RobertKrawitz, rphillips, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot

commented Sep 12, 2019

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot merged commit 2cb39a1 into openshift:master Sep 12, 2019
8 checks passed
ci/prow/e2e-aws: Job succeeded.
ci/prow/e2e-aws-op: Job succeeded.
ci/prow/e2e-aws-scaleup-rhel7: Job succeeded.
ci/prow/e2e-aws-upgrade: Job succeeded.
ci/prow/images: Job succeeded.
ci/prow/unit: Job succeeded.
ci/prow/verify: Job succeeded.
tide: In merge pool.

@openshift-ci-robot

commented Sep 12, 2019

@RobertKrawitz: All pull requests linked via external trackers have merged. Bugzilla bug 1750433 has been moved to the MODIFIED state.

@hexfusion

Member

commented Sep 12, 2019

This went through pretty fast before I could see what was going on.

This goes against what was concluded by @joelsmith #526

I just want to make sure this change is necessary.

/cc @sjenning

@openshift-ci-robot requested a review from sjenning Sep 12, 2019

@cgwalters

Contributor

commented Sep 12, 2019

I also feel like this is working around a deeper underlying bug. And while, as we discussed, whoever controls etcd controls the cluster, it would still be a lot better for optics (and would add some reliability and safety) if the etcd pod weren't privileged. That would probably require a Kube enhancement to add the equivalent of podman's :z for host paths.

@runcom

Member

commented Sep 12, 2019

> This goes against what was concluded by @joelsmith #526

@hexfusion do we revert this? I can do that

@mrunalp

Member

commented Sep 13, 2019
