Skip to content
This repository has been archived by the owner on Sep 22, 2020. It is now read-only.

ACME provider does trigger cert renewal/replacement despite being below min_days_remaining #15

Closed
ryan-mf opened this issue May 15, 2017 · 8 comments
Assignees

Comments

@ryan-mf
Copy link

ryan-mf commented May 15, 2017

I deployed all of my virtual infrastructure over the past few months with Terraform, and pulled in your awesome ACME plugin for managing SSL certs (thanks by the way!).

Today it's almost been 90 days so it's the first time I went to renew the certs. I have min_days_remaining for my certs set to 7 days, so I figured if I just ran terraform apply on my current state, it would realize my cert(s) were less than 7 days from renewal and kick in. Unfortunately it didn't. Here's my related config:

// Create the private key for the registration (not the certificate)
resource "tls_private_key" "letsencrypt_private_key" {
  algorithm          = "RSA"
  rsa_bits           = "2048"
}

// Set up a registration using a private key from tls_private_key
resource "acme_registration" "reg" {
  server_url         = "https://acme-v01.api.letsencrypt.org/directory"
  account_key_pem    = "${tls_private_key.letsencrypt_private_key.private_key_pem}"
  email_address      = "ops@${var.ORG}.${var.TLD}"
}

/////////////// Begin api.(env).(org).(tld) SSL setup /////////////
// Create the private key for the API certificate
resource "tls_private_key" "api_ssl_private_key" {
  algorithm          = "RSA"
  rsa_bits           = "2048"
}

// GCP Output for above SSL private key
output "api_ssl_private_key" {
  value              = "${tls_private_key.api_ssl_private_key.private_key_pem}"
}

resource "tls_cert_request" "api_ssl_req" {
  key_algorithm      = "RSA"
  private_key_pem    = "${tls_private_key.api_ssl_private_key.private_key_pem}"
//  dns_names        = ["www.example.com", "www2.example.com"]

  subject {
    common_name      = "api.${var.ENV_SHORT}.${var.ORG}.${var.TLD}"
  }
}

// Create a certificate
resource "acme_certificate" "api_ssl_certificate" {
  depends_on         = ["google_dns_managed_zone.env-org-tld"]
  server_url         = "https://acme-v01.api.letsencrypt.org/directory"
  account_key_pem    = "${tls_private_key.letsencrypt_private_key.private_key_pem}"
  certificate_request_pem  = "${tls_cert_request.api_ssl_req.cert_request_pem}"
  min_days_remaining = "7"

  dns_challenge {
    provider = "gcloud"
    config {
      GCE_PROJECT    = "${var.ORG}-${var.ENV_SHORT}"
      GCE_DOMAIN     = "${var.ENV_SHORT}.${var.ORG}.${var.TLD}"
    }
  }

  registration_url   = "${acme_registration.reg.id}"
}

output "api_ssl_certificate" {
  value              = "${acme_certificate.api_ssl_certificate.certificate_pem}"
}

output "api_ssl_intermediate" {
  value              = "${acme_certificate.api_ssl_certificate.issuer_pem}"
}
/////////////// End api.(env).(org).(tld) SSL setup /////////////

Note that I did find a simple workaround that's due to #13 - I simply dropped min_days_remaining to 6 and re-ran terraform apply and that got the job done. I have a feeling that bug will eventually be fixed though =)

@vancluever vancluever self-assigned this May 15, 2017
@vancluever
Copy link
Owner

vancluever commented May 15, 2017

Hey @ryan-mf, just noting what we talked about on Hangops so that it's logged:

It looks like the renewal is happening, but since there are no resources in this config that depend on the certificate in the immediate config, the cert is being updated silently and the only way you might be able to tell is by observing the outputs. You can also observe this behaviour by using TF_LOG=debug on a test config with a really high min_days_remaining value (you should see the renewal fire every time).

Obviously, a silent update is not what we want to see here. This is happening because all of the heavy lifting is done in refresh (including renewal), due to the fact that there are no attributes that don't force a new resource at this point in time (the only one that should be that way is min_days_remaining, as documented in #13). How to de-couple update and refresh in a sane way is something I'm still trying to decide on - the main blocker being that a "renewal" does not necessarily translate well to an apply operation - update of the resource through renewal is something that normally does not require any change to the configuration, hence it's tough to differentiate between a renewal and a refresh, which is something that an update to data without a configuration change is more akin to.

I am entertaining the option of just destroying the resource by clearing its ID on expiry, like in the upstream TLS provider certificate resources, but I'll need to fully evaluate the implications of this (doing this is basically telling Terraform that the resource has gone away, and it hasn't 100% - tainting the resource would be a better option). Since lego couples authorizations with certificates, this may mean that re-authorization may need to happen, but the benefit will be much better verbosity in these kinds of cases in the Terraform output.

@ryan-mf
Copy link
Author

ryan-mf commented May 15, 2017

Thanks for the feedback @vancluever. I definitely didn't realize there would be no output on an update but now that totally makes sense. I'm juggling a bunch right now but my next SSL renewal will come up in a couple weeks so I'll test to make sure the output changes (ie new public key from Letsencrypt) as discussed.

Cheers,
-R

@vancluever
Copy link
Owner

vancluever commented May 28, 2017

Hey @ryan-mf - just an update on this one - I have been doing some work on TF core that will help facilitate a fix for this issue. You can track the progress at: hashicorp/terraform#14887

That PR will give providers the ability to customize the diff, allowing us to leave the certificate alone on refresh, instead using the custom diff functionality to check the expiry and setting the certificate to computed in the diff when it has expired. This will trigger an update correctly, and also show it in the diff.

I expect it will be a few weeks at least before this feature is live but once it is I'll be adding it to the plugin and working on a new release!

@ryan-mf
Copy link
Author

ryan-mf commented May 30, 2017

@vancluever awesome, thanks for the update!

@vancluever
Copy link
Owner

Hey @ryan-mf - you might have seen but custom diff stuff will not be making it into v0.10.

However, in the meantime I've just vendored the fork into this repo and I've been playing with it :) and got good progress :) Check it out! I will probably push out a release with this in the next few days but it will need TF v0.10 to run (you could grab a beta to try it out).

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

  ~ acme_certificate.certificate
      certificate_pem: "-----BEGIN CERTIFICATE-----\nxxxxxxx\n-----END CERTIFICATE-----\n" => "<computed>"

vancluever added a commit that referenced this issue Jul 7, 2017
Thanks to the new experimental custom diff support, we have been able to
make the certificate renewal process more verbose:

 * Adjusting min_days_remaining does not force a new resource anymore,
   and will take effect immediately. Ref: #13
 * More importantly, certificate renewals now show up in the diff - when
   a certificate has passed its expiry date, it will be set to
   computed in the diff, which should propagate down the stack to
   anything else that depends on it (tests pending on this before this
   goes fully live). Ref: #15

As part of this work, we are now using partial state in Create and
Update, to protect against potential errors that could leave the state
inconsistent - namely, the certificate missing, which would only be
recoverable by a taint.
@ryan-mf
Copy link
Author

ryan-mf commented Jul 7, 2017

@vancluever wow that's awesome, thanks! Can't wait to try it out.

@vancluever
Copy link
Owner

Hey @ryan-mf, version v0.4.0 is now out which fixes this issue - as such closing this now, but let me know if there's any issues (especially since the update to v0.10.0-beta2 as the base).

Cheers!

@ryan-mf
Copy link
Author

ryan-mf commented Jul 19, 2017

@vancluever awesome! I haven't moved to Terraform .10 yet but can't wait to be able to utilize this once I do. Thank you!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants