Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support setting CA validity duration via unsupportedConfigOverrides #85

Merged
merged 1 commit into from Dec 3, 2019

Conversation

marun
Copy link
Contributor

@marun marun commented Nov 15, 2019

Setting the duration of CA validity allows setting a short duration to test operator compatibility with CA rotation.

This PR is required by the proposed periodic rotation compatibility job: openshift/release#5930

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 15, 2019
@marun
Copy link
Contributor Author

marun commented Nov 22, 2019

Rebased and ready for review

@marun marun changed the title WIP Support setting CA validity duration via unsupportedConfigOverrides Support setting CA validity duration via unsupportedConfigOverrides Nov 22, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 22, 2019
@marun marun force-pushed the ca-validity-duration branch 4 times, most recently from e009dae to 747b4d3 Compare November 22, 2019 07:56
Copy link
Member

@stlaz stlaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nobody should need expiry in seconds, if they do, we'll change lib-go instead

return caConfig, nil
}
// operator-go's cert generation only accepts an expiry in days. To
// ensure support for all validity durations, even those less than a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see value in durations < 1 day, just stick to what library-go does and pass the duration in days directly to it. The API ValidityDuration: certificate validity in seconds does not sound very user friendly, either.

Copy link
Contributor Author

@marun marun Nov 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a flexible initial expiry is required to support validating that cert expiry is handled correctly in the context of a periodic job that takes less than a day to execute (see the description of the proposed rotation job). Exploratory testing has a similar requirement. If qa or someone else wants to see what happens when a cert expires, I think it makes sense to allow the specification of the appropriate interval (likely in minutes or hours) rather than forcing them to wait a whole day or depend on clock manipulation.

So, there is a requirement. And if this repo is the only one that requires it, I'm not sure why that suggests the effort of updating library-go (especially for an unsupported api).

To your point, though, it is not great UX for a general-purpose option to set the CA expiry in at most hours (since that's the maximum granularity that Duration supports). I think that suggests changing the name to ValidityDurationForTesting and maybe limiting the scope of the option to the initial cert (or manual rotation, since that's the same code path). If we are required to support custom cert duration for production use in the future, we could add a separate option specified in days. WDYT?

@@ -16,7 +16,7 @@ import (
)

const (
signingCertificateLifetimeInDays = 365 // 1 year
SigningCertificateLifetimeInDays = 365 // 1 year
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone in the future changes this to var, I will eat them

@stlaz
Copy link
Member

stlaz commented Nov 22, 2019

/test e2e-aws-operator

// provided duration is greater than zero. If an error occurs or the duration is
// zero, the original CA is returned.
func maybeUpdateExpiry(caConfig *crypto.TLSCertificateConfig, duration time.Duration) (*crypto.TLSCertificateConfig, error) {
if duration.Nanoseconds() == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn’t there duration.IsZero?

Copy link
Contributor Author

@marun marun Nov 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh. Done.

Time.IsZero is a thing. Duration.IsZero is not.

// reason indicates why a rotation of the signing CA should be forced. If the
// reason is not empty and has not been recorded as an annotation on the signing
// secret, the rotation of the signing CA will be triggered at most once.
// +optional
Reason string `json:"reason"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API lgtm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you do not share @stlaz's concerns regarding ValidityDuration being specified in at most hours?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably >= 1h makes sense. So either document the minimum or switch to hours as integer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the value in switching to an integer representing hours given that metav1.Duration supports providing a time in hours (e.g. 1h) and supporting a duration in minutes (e.g. 20m) is potentially useful for testing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See @liggitt's talk at contributor summit: metav1.Duration is considered a failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the video up yet, but I'm assuming you're suggesting that metav1.Duration not be used? That reinforces my suggestion for ValidityDurationForTesting and denominating it in minutes. If ValidityDuration becomes a thing to set the CA expiry for production use, it would likely be in days.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValidityDurationForTesting is fine with me as well. Let's decide and move forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've update the name to ValidityDurationForTesting and left it as metav1.Duration, hopefully that makes sense for you.

// Set a custom expiry if one was provided
signingCA.config, err = maybeUpdateExpiry(signingCA.config, serviceCAConfig.CAConfig.ValidityDuration)
if err != nil {
return "", fmt.Errorf("Error renewing ca for custom duration: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@stlaz
Copy link
Member

stlaz commented Nov 22, 2019

/approve

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 22, 2019
@marun
Copy link
Contributor Author

marun commented Nov 22, 2019

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 22, 2019
@marun
Copy link
Contributor Author

marun commented Nov 22, 2019

/hold cancel

Updated a couple of error messages in latest push, nothing functional.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 22, 2019
@sttts
Copy link
Contributor

sttts commented Nov 22, 2019

Sgtm overall (would switch to hours).

@marun
Copy link
Contributor Author

marun commented Nov 27, 2019

/retest

1 similar comment
@marun
Copy link
Contributor Author

marun commented Nov 28, 2019

/retest

@sttts
Copy link
Contributor

sttts commented Nov 28, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 28, 2019
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

This is intended to support testing of rotation compatibility.
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Nov 28, 2019
@marun
Copy link
Contributor Author

marun commented Nov 28, 2019

Rebased

@marun
Copy link
Contributor Author

marun commented Nov 29, 2019

/retest

1 similar comment
@marun
Copy link
Contributor Author

marun commented Nov 29, 2019

/retest

@stlaz
Copy link
Member

stlaz commented Dec 3, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marun, stlaz, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 39ba3e6 into openshift:master Dec 3, 2019
@marun marun deleted the ca-validity-duration branch December 3, 2019 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants