Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surface self-check errors in challenge resource #1244

Merged
merged 8 commits into from
Jan 23, 2019

Conversation

DanielMorsing
Copy link
Contributor

What this PR does / why we need it:
This PR surfaces self-check errors in the challenge resource, making it more obvious what is holding up issuance of a certificate.

Which issue this PR fixes:
fixes #1208

Special notes for your reviewer:
This PR works by changing the Check method on the solver interface. Instead of having a (retry, error) tuple, we eliminate errors from the various checkers, making it so that the only type of error returned is a retryable one. Because of this change, the PR looks like it changes a lot more things than it actually does. If a change seems weird, I suggest looking through the commits one-by-one

Release note:

More observable errors when waiting for self-checks to propagate

Daniel Morsing added 6 commits January 17, 2019 12:46
Signed-off-by: Daniel Morsing <dmo@jetstack.io>
Signed-off-by: Daniel Morsing <dmo@jetstack.io>
Signed-off-by: Daniel Morsing <dmo@jetstack.io>
Body read errors are just regular errors, so we can reclassify them
all into absorb errors

Since we only have absorb errors, flip the switch so that all errors
are absorbed. This will make it easier to surface errors into the
controller.

Signed-off-by: Daniel Morsing <dmo@jetstack.io>
This makes the check function into a simple precondition

Signed-off-by: Daniel Morsing <dmo@jetstack.io>
Signed-off-by: Daniel Morsing <dmo@jetstack.io>
@jetstack-bot jetstack-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 21, 2019
@jetstack-bot jetstack-bot added the area/acme Indicates a PR directly modifies the ACME Issuer code label Jan 21, 2019
if !ok {
ch.Status.Reason = fmt.Sprintf("Waiting for %s challenge propagation", ch.Spec.Type)
glog.Infof("propagation check failed: %v", err)
ch.Status.Reason = fmt.Sprintf("Waiting for %s challenge propagation: %s", ch.Spec.Type, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%s should be %v

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either will work here, since error can always be converted into a string


providerConfig, err := issuer.GetSpec().ACME.DNS01.Provider(providerName)
fqdn, value, ttl, err := util.DNS01Record(ch.Spec.DNSName, ch.Spec.Key, s.DNS01Nameservers, false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on Slack, the change from followCNAME(providerConfig.CNAMEStrategy) to false makes sense as we should be relying on the recursive DNS server to resolve CNAMEs for us anyway, so we needn't ever resolve CNAMEs ourselves as part of the DNS01 self check.

fqdn, value, ttl, err := util.DNS01Record(ch.Spec.DNSName, ch.Spec.Key, s.DNS01Nameservers, followCNAME(providerConfig.CNAMEStrategy))
if err != nil {
return false, err
return err
}

glog.Infof("Checking DNS propagation for %q using name servers: %v", ch.Spec.DNSName, s.Context.DNS01Nameservers)

ok, err := util.PreCheckDNS(fqdn, value, s.Context.DNS01Nameservers,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call here is what performs the actual self check - I think we need to make a similar change to PreCheckDNS as we have done to this function, to make it only return an error type. I may be wrong though?!

Otherwise the only information we'll ever surface is "DNS record for 'example.com' not yet propagated" (instead of some helpful error surfaced from the actual checker)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PreCheckDNS turns various conditions into errors, so the only time we'll get the "not yet propagated" error is when we had no errors but the TXT record was missing

},
expectedErr: true,
expectedErr: false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test case names here need updating? should error with expectedErr: false seems a weird combination to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there ever a case where expectedErr == true now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having another look at the tests. I think they should be valid, but they might be made simpler now that we aren't testing the OK logic

Daniel Morsing added 2 commits January 21, 2019 13:13
Signed-off-by: Daniel Morsing <dmo@jetstack.io>
Signed-off-by: Daniel Morsing <dmo@jetstack.io>
@munnerz munnerz added this to In progress in v0.7 via automation Jan 22, 2019
@munnerz munnerz added this to the v0.7 milestone Jan 22, 2019
@munnerz
Copy link
Member

munnerz commented Jan 23, 2019

/lgtm
/approve

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 23, 2019
@jetstack-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jetstack-bot jetstack-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 23, 2019
@retest-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to jetstack).
Review the full test history for this PR.
Silence the bot with an /lgtm cancel comment for consistent failures.

1 similar comment
@retest-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to jetstack).
Review the full test history for this PR.
Silence the bot with an /lgtm cancel comment for consistent failures.

@jetstack-bot jetstack-bot merged commit e930bd3 into cert-manager:master Jan 23, 2019
v0.7 automation moved this from In progress to Done Jan 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/acme Indicates a PR directly modifies the ACME Issuer code dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
No open projects
v0.7
  
Done
Development

Successfully merging this pull request may close these issues.

Surface self check errors to challenge resources 'status' field
4 participants