kube-apiserver failed to load SNI cert and key #1145

Merged
merged 2 commits into from Jun 19, 2021

Conversation

p0lyn0mial
Contributor

@p0lyn0mial p0lyn0mial commented Jun 9, 2021

As of today, the dynamic certificates, i.e. kube-apiserver-certs, are accessed by at least two processes: the installer pod and the cert-syncer.
Until now, there was no coordination between these processes, which might have led to many unexpected errors, such as https://bugzilla.redhat.com/show_bug.cgi?id=1963730

This PR writes all certificates atomically: it first creates a temporary file, writes the content to it, and then renames the temporary file to the original file name.

os.Rename calls syscall.Rename, which in turn uses the rename syscall (Linux). If the target already exists, rename replaces it atomically, so a reader never finds the file missing or half-written (https://man7.org/linux/man-pages/man2/rename.2.html)

The previous attempt didn't work as it would break the DR scripts - openshift/library-go#1098

@openshift-ci openshift-ci bot requested review from mfojtik and soltysh June 9, 2021 10:53
@p0lyn0mial p0lyn0mial force-pushed the atomic-certs branch 4 times, most recently from 48d8d59 to 7f3690e Compare June 10, 2021 14:13
@p0lyn0mial
Contributor Author

/retest

@p0lyn0mial p0lyn0mial changed the title from "proof for https://github.com/openshift/library-go/pull/1103" to "Bug 1963730: kube-apiserver failed to load SNI cert and key" Jun 11, 2021
@openshift-ci openshift-ci bot added the bugzilla/severity-low (Referenced Bugzilla bug's severity is low for the branch this PR is targeting) and bugzilla/valid-bug (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) labels Jun 11, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 11, 2021

@p0lyn0mial: This pull request references Bugzilla bug 1963730, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @wangke19

In response to this:

Bug 1963730: kube-apiserver failed to load SNI cert and key

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from wangke19 June 11, 2021 10:38

@mfojtik
Member

mfojtik commented Jun 11, 2021

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged) Jun 11, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 11, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mfojtik, p0lyn0mial

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) Jun 11, 2021
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@p0lyn0mial
Contributor Author

/retest

@p0lyn0mial
Contributor Author

/cherrypick release-4.8

@openshift-cherrypick-robot

@p0lyn0mial: once the present PR merges, I will cherry-pick it on top of release-4.8 in a new PR and assign it to you.

In response to this:

/cherrypick release-4.8

@p0lyn0mial
Contributor Author

/test e2e-gcp-operator

@p0lyn0mial
Contributor Author

p0lyn0mial commented Jun 15, 2021

ci/prow/e2e-gcp-operator keeps failing on:

'FailedCreatePodSandBox' Pod "installer-13-ci-op-h6m9mzk7-f6035-jxnvh-master-1" on node "ci-op-h6m9mzk7-f6035-jxnvh-master-1" observed degraded networking: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-13-ci-op-h6m9mzk7-f6035-jxnvh-master-1_openshift-kube-apiserver_465a28bd-7a40-4617-8d0c-e159c6c1dc11_0(7bd724a9ec3891756bbd3994a3258e5003ec84173f1f91e33f1a8f8ff96f4af4): Multus: [openshift-kube-apiserver/installer-13-ci-op-h6m9mzk7-f6035-jxnvh-master-1]: error getting pod: Unauthorized

@p0lyn0mial
Contributor Author

/test e2e-gcp-operator

@sttts
Contributor

sttts commented Jun 15, 2021

/retest

@p0lyn0mial
Contributor Author

The "failed to create pod network sandbox" failure will be fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1972490

@openshift-cherrypick-robot

@p0lyn0mial: new pull request created: #1153

In response to this:

/cherrypick release-4.8
