Bug 1963730: writes the dynamic certs atomically #1103

Merged

p0lyn0mial
Contributor

@p0lyn0mial p0lyn0mial commented Jun 9, 2021

As of today, the dynamic certificates (i.e. kube-apiserver-certs) are accessed by at least two processes: the installer pod and the cert-syncer.
Up until now, there was no coordination between these processes, which might have led to many unexpected errors, such as https://bugzilla.redhat.com/show_bug.cgi?id=1963730

This PR writes all certificates in an atomic way by first creating a temporary file, writing the content to it, and then renaming it to the original file.

os.Rename calls syscall.Rename, which in turn uses the rename syscall on Linux, and that syscall provides atomicity (https://man7.org/linux/man-pages/man2/rename.2.html)

The previous attempt didn't work as it would break the DR scripts - #1098

@openshift-ci openshift-ci bot added the bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. label Jun 9, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 9, 2021

@p0lyn0mial: This pull request references Bugzilla bug 1963730, which is invalid:

  • expected the bug to target the "4.8.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1963730: writes the dynamic certs atomically

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jun 9, 2021
@openshift-ci openshift-ci bot requested review from deads2k and sttts June 9, 2021 10:48
@p0lyn0mial p0lyn0mial force-pushed the wire-dynamic-certs-atomicaly branch from 8e3e42c to 6fe78a9 Compare June 9, 2021 10:48
@p0lyn0mial
Contributor Author

/assign @sttts

@p0lyn0mial p0lyn0mial force-pushed the wire-dynamic-certs-atomicaly branch from 6fe78a9 to b84bc2e Compare June 9, 2021 11:48
}

func writeTemporaryFile(content []byte, filePerms os.FileMode, contentDir, filename string) (string, error) {
klog.Infof("Creating a temporary file for %q ...", path.Join(contentDir, filename))
Contributor

Do we really need all this logging? I would streamline the func WriteFileAtomic so that it does not output anything.

@@ -408,3 +398,29 @@ func (o *InstallOptions) Run(ctx context.Context) error {
recorder.Eventf("StaticPodInstallerCompleted", "Successfully installed revision %s", o.Revision)
return nil
}

func writeConfig(content []byte, atomic bool, contentDir, filename string) error {
Contributor

Why do we need the atomic bool parameter?

Contributor Author

Because not all files need to be written that way. The other files are written to a per-revision directory, for example:

if err := o.copySecretsAndConfigMaps(ctx, resourceDir, secretPrefixes, optionalSecretPrefixes, configPrefixes, optionalConfigPrefixes, true, false); err != nil {

We could use a prefixed arg instead, but that would be a little less explicit.

@p0lyn0mial
Contributor Author

/hold

Holding until the proof PRs are green.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 9, 2021
@p0lyn0mial p0lyn0mial force-pushed the wire-dynamic-certs-atomicaly branch from 3af072d to 43d70e5 Compare June 10, 2021 14:06
@sttts
Contributor

sttts commented Jun 10, 2021

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 10, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 10, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: p0lyn0mial, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 10, 2021
@p0lyn0mial
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 11, 2021
@p0lyn0mial
Contributor Author

/bugzilla refresh

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Jun 11, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 11, 2021

@p0lyn0mial: This pull request references Bugzilla bug 1963730, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @wangke19

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jun 11, 2021
@openshift-ci openshift-ci bot requested a review from wangke19 June 11, 2021 07:37
@openshift-merge-robot openshift-merge-robot merged commit ad411b3 into openshift:master Jun 11, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 11, 2021

@p0lyn0mial: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull requests must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with /bugzilla refresh.

Bugzilla bug 1963730 has not been moved to the MODIFIED state.

In response to this:

Bug 1963730: writes the dynamic certs atomically

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-low: Referenced Bugzilla bug's severity is low for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.
3 participants