Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add challenge support for multiple DNS providers #49

Merged
merged 10 commits into from
May 17, 2019

Conversation

knmorgan
Copy link
Contributor

Multiple dns_challenge stanzas can now be specified to present the DNS challenge token. Previously, in situations with multiple primary DNS providers, the challenge may fail if only one provider was presented with the challenge token.

I am open to discussion on how this feature might be implemented. This was just my initial swing at a solution.

Copy link
Owner

@vancluever vancluever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @knmorgan, thanks for the contribution. This looks pretty straightforward. Wrapping the DNS challenges in a new challenge interface is a great idea too as I did notice that lego does not support multiple challenges on its own.

I left a couple of notes regarding making sure that all errors are trapped and things are not just a no-op.

In addition to those, I'd like to see the following:

  • I would prefer this to be testable as it is a fairly significant change. The test should only activate when certain environment variables are set (ie: ACME_MULTI_PROVIDERS=route53,azure or something similar). This will allow us to at least test it manually - it more than likely will not be in CI, but I'd like to be able to test it myself.
  • The migration to TypeSet will trigger diffs for everyone using the provider, and I'm not too sure I want that. Further, specifying duplicates here will just consolidate them into a single block without telling you that there's a duplicate, which I don't necessarily think is the right UX. I'd prefer if this stayed a list, and we do some extra validation in resourceACMECertificateCustomizeDiff to make sure there are no duplicate fields.

Finally, can you check and validate that the behavior of lego when a DNS challenge fails is to attempt a cleanup before completely erroring out. If it's not, that will need to happen as well, as we don't want dangling records if one challenge succeeds while others fail.

Thanks again for the contribution!

acme/resource_acme_certificate.go Show resolved Hide resolved
acme/resource_acme_certificate.go Show resolved Hide resolved
@ghost ghost added size/L and removed size/M labels May 16, 2019
@knmorgan
Copy link
Contributor Author

Thank you for the feedback, @vancluever. I have added a commit with the following changes.

  • Wrapped errors in both Present and CleanUp using go-multierror, per your suggestion.
  • Reverted TypeSet back to TypeList for dns_challenge.
  • Added validation in resourceACMECertificateCustomizeDiff to ensure duplicate providers are not specified.
  • Added a test that is only active when the ACME_MULTI_PROVIDERS environment variable is set.

I've reviewed lego and it seems it always calls CleanUp, even if Present fails.

I added a second commit with these changes so the diff is easy to see, but I generally prefer to squash them together prior to merging. I am not sure if HashiCorp has standards on commit history or not.

Please let me know if there's anything else I can do.

Copy link
Owner

@vancluever vancluever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing re: error messages: it's Go standard to generally make all error messages non-capitalized. Can you change the errors to match this standard?

Reference: https://github.com/golang/go/wiki/CodeReviewComments#error-strings

PS: The multierror lines are arguably fine either way (as they display on separate lines) but I'd prefer them to be lowercased as well to keep them consistent with other errors that may possibly get added.

acme/resource_acme_certificate.go Outdated Show resolved Hide resolved
acme/resource_acme_certificate.go Outdated Show resolved Hide resolved
acme/resource_acme_certificate.go Outdated Show resolved Hide resolved
@vancluever
Copy link
Owner

@knmorgan this is looking great. 👍

I'll give it a manual test here soon. I was thinking that maybe we could be a little more granular than only allowing one instance of each DNS provider (technically you could do a route53 provider with different zone IDs, for example), but I think this is fine for now and intra-provider-multi-zone cases are probably very edgy.

I've added some feedback that I'd like looked at but otherwise I think it's good to go.

PS: I have also been preferring single-commit PRs lately. GitHub now tracks changes across force-pushes, so it's very easy to keep a clean single-commit history in a PR and still allow reviewers to see what has changed between updates. This was added late last year: https://github.blog/changelog/2018-11-15-force-push-timeline-event/

@vancluever
Copy link
Owner

Looking good @knmorgan 👍 just testing locally here and then we will good to merge.

@vancluever
Copy link
Owner

Hey @knmorgan, looks like there's some build errors on the test binary.

==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./acme -v -run=TestAccACMECertificate_multiProviders -timeout 240m
# github.com/terraform-providers/terraform-provider-acme/acme [github.com/terraform-providers/terraform-provider-acme/acme.test]
acme/acme_structure_test.go:287:6: assignment mismatch: 1 variable but setDNSChallenge returns 3 values
acme/acme_structure_test.go:304:6: assignment mismatch: 1 variable but setDNSChallenge returns 3 values
acme/resource_acme_certificate_test.go:414:3: undefined: providers
acme/resource_acme_certificate_test.go:414:15: undefined: strings
acme/resource_acme_certificate_test.go:415:10: undefined: providers
acme/resource_acme_certificate_test.go:708:2: undefined: providers
acme/resource_acme_certificate_test.go:708:14: undefined: strings
acme/resource_acme_certificate_test.go:740:68: undefined: providers
FAIL	github.com/terraform-providers/terraform-provider-acme/acme [build failed]
make: *** [testacc] Error 2

Can you fix these up and try to run the test yourself successfully?

@knmorgan knmorgan force-pushed the multiple-dns-challenges branch 2 times, most recently from b1d37f7 to e532c5d Compare May 16, 2019 20:17
@knmorgan
Copy link
Contributor Author

@vancluever Sorry about that; the tests are building now. I will admit the tests do not pass for me, but neither did they on master (when following instructions in the README). It works as expected when I run via terraform, but I will dig into this to make sure the tests pass.

@vancluever
Copy link
Owner

@knmorgan I can test the other ones no worries. I don't see anything in your changes that would not make them. I'm just more concerned that you tested the multiple challenge functionality. 🙂

@vancluever
Copy link
Owner

@knmorgan I got it to test over here successfully as well. Can you fix this the configuration so that it has the proper hosts though?

The test config still references www and www2 even though the assertions properly check for www14 and www15.

Multiple dns_challenge stanzas can now be specified to present the DNS
challenge token. Previously, in situations with multiple primary DNS
providers, the challenge may fail if only one provider was presented
with the challenge token.
@knmorgan
Copy link
Contributor Author

@vancluever I fixed the assertions. However, I am seeing the following issues.

  1. PreCheck is apparently not called for TestAccACMECertificate_multiProviders. This fails the test if ACME_MULTI_PROVIDERS is not set. I am not yet sure why this is. The Travis checks have failed due to this.
  2. After export ACME_MULTI_PROVIDERS=gcloud,route53, the appropriate record is set in route53, but not in gcloud when running the tests. This is also true for route53,gcloud. I've inserted some logs that indicate the appropriate DNSProvider is called, but the TXT record is never created. When I execute the provider via terraform, the appropriate record is created for both.

}
}

client.Challenge.SetDNS01Provider(provider)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be client.Challenge.SetDNS01Provider(provider, challenge_opts...) so that recursive nameservers get set.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: This also would need to be done for Create, but check out my comment here which probably supersedes this.

@vancluever
Copy link
Owner

@knmorgan that's interesting - as I'm seeing the record created in both places (both in the consoles of the respective providers and also in code using delve). I am also using the route53 and gcloud providers.

I'm getting a different issue. I'm intermittently getting errors similar to:

[www14.example.com] acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up TXT for _acme-challenge.www14.example.com, url:

I'm not too sure if this is a configuration issue with my test subdomain or not. I initially thought it might be a propagation race but it doesn't look like it, as the propagation check actually checks for the record first on the configured recursive nameservers, and then on the authoritative nameservers themselves. So we should be fine there.

The recursive nameserver issue does bring up something we need to address though. As we now allow multiple DNS providers, it does not make sense to have recursive_nameservers in the dns_challenge block anymore, as the propagation check is a global operation and is not local to any specific provider. We will need to do the following:

  • Move recursive_nameservers out of the dns_challenge block and into the global scope for acme_certificate. This is found in acme_structure.go.
  • Increment the SchemaVersion for acme_certificate.
  • Create a new state migration to move the current setting for recursive_nameservers out of the dns_challenge block and into the global scope. State migrations are currently found in acme_migrations.go. This needs to be tested as well.
  • Adjust acme_structure.go and resource_acme_certificate.go respectively. We can probably just get the setting directly in create and update now. Any tests should be updated accordingly.

Thanks!

* Update acme_certificate schema
* Provide schema_version migration
* Update the provider to support recursive_nameservers in the root scope
@ghost ghost added size/XL and removed size/L labels May 17, 2019
@knmorgan
Copy link
Contributor Author

@vancluever I've made the changes you requested. Please thoroughly review the migration logic. It seems to work for me, and the tests pass, but I want to make sure.

@vancluever
Copy link
Owner

@knmorgan migration logic looks correct. I'm just adding a note in the tests to make sure that we know that the presence of recursive_nameservers in versions before schema version 3 is just to make sure that the E2E migration tests work.

I'll review the issues with test skipping now as well.

Did the following:

* Add some notes in the migration tests to make sure that we know
that "recursive_nameservers" did not exist before schema version 3
and its presence in fixtures before this version is only to ensure
that E2E tests succeed.

* Made the migration of recursive_nameservers a bit more explicit by
asserting on the first key as well (dns_challenge) and setting the
key explicitly during replace.

* Also converted the failure data displays to use go-spew so the
output is actually readable.
@vancluever
Copy link
Owner

vancluever commented May 17, 2019

@knmorgan I know why the test is failing. testAccACMECertificateConfigMultiProviders() is being run at initialization time of the TestCase literal to generate the config, and this is failing after attempting to split and then reference the nonexistent ACME_MULTI_PROVIDERS variable.

Wrapping this in a function figuring out a way to lazy evaluate it or otherwise avoid this should fix it, I can do that now, no worries.

vancluever and others added 4 commits May 17, 2019 08:53
The acceptance test framework's skipping logic (skip a test if TF_ACC
is not set) does not run until resource.Test is run, which will
evaluate everything needed to properly pass in the TestCase literal,
including the configuration redering function. This is causing
panics when attempting to reference the data in the nonexistent
ACME_MULTI_PROVIDERS variable.

This adds a small workaround that just provides an empty slice of
length/cap 2 when the variable is not found. This creates an invalid
config, but the config will never be used in this form as the test
will be skipped or failed with this invalid data anyway, as there are
checks on the variable during acceptance testing.
Multiple dns_challenge stanzas can now be specified to present the DNS
challenge token. Previously, in situations with multiple primary DNS
providers, the challenge may fail if only one provider was presented
with the challenge token.

The acme_certificate schema was updated to move recursive_nameservers
into the root scope. The schema version was bumped and a migration was
provided.
Did the following:

* Add some notes in the migration tests to make sure that we know
that "recursive_nameservers" did not exist before schema version 3
and its presence in fixtures before this version is only to ensure
that E2E tests succeed.

* Made the migration of recursive_nameservers a bit more explicit by
asserting on the first key as well (dns_challenge) and setting the
key explicitly during replace.

* Also converted the failure data displays to use go-spew so the
output is actually readable.
The acceptance test framework's skipping logic (skip a test if TF_ACC
is not set) does not run until resource.Test is run, which will
evaluate everything needed to properly pass in the TestCase literal,
including the configuration redering function. This is causing
panics when attempting to reference the data in the nonexistent
ACME_MULTI_PROVIDERS variable.

This adds a small workaround that just provides an empty slice of
length/cap 2 when the variable is not found. This creates an invalid
config, but the config will never be used in this form as the test
will be skipped or failed with this invalid data anyway, as there are
checks on the variable during acceptance testing.
@knmorgan
Copy link
Contributor Author

@vancluever thanks for that. I've squashed my commits together. Are there any other pending changes I need to address right now?

@vancluever
Copy link
Owner

@knmorgan I am just running some testing and I think we will be okay from there.

The other thing that I ran into is that we are not using any of the provider DNS propagation timeout values. I've implemented challenge.ProviderTimeout for this by basically choosing the highest configured or default propagation timeout and polling intervals out of the configured providers.

Aside from this, I just want to add some docs regarding the behavior of this new functionality and the change in recursive_nameservers. I can take care of that though.

This allows DNS propagation timeout settings to be carried from a
specific DNS provider to the multi-provider setup. The highest
propagation timeout and polling interval each are used.
@ghost ghost added the documentation label May 17, 2019
@vancluever
Copy link
Owner

@knmorgan I think we are good now. 👍

I am still getting intermittent errors, but I think it might be related to some finickiness related to the gcloud provider itself - see go-acme/lego#770.

I'm going to merge this now, I think it's shippable. Thank you for all your work on this!

…rm-provider-acme into multiple-dns-challenges
@vancluever vancluever merged commit 115a1c6 into vancluever:master May 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants