This repository has been archived by the owner on Sep 22, 2020. It is now read-only.

Error 409 - urn:acme:error:malformed - Certificate already revoked #30

Closed
adamlc opened this issue Mar 1, 2018 · 6 comments


adamlc commented Mar 1, 2018

In my certificates I have create_before_destroy set in the lifecycle settings. I do this because I always want a valid certificate to be active. Without it, the cert is destroyed before the new one is created, which leaves a few minutes with no active certificate.

During the destroy process it timed out, which has left a deposed resource in my state file. When I run a plan, it comes up as a destroy operation:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  - acme_certificate.jumplead-io (deposed)


Plan: 0 to add, 0 to change, 1 to destroy.

But when applying this plan it fails because the certificate has already been removed:

acme_certificate.jumplead-io.deposed: Destroying... (ID: https://acme-v01.api.letsencrypt.org/ac...t/048b267ca05e8031828b905109e0b4a7f8b6)

Error: Error applying plan:

1 error(s) occurred:

* acme_certificate.jumplead-io (destroy): 1 error(s) occurred:

* acme_certificate.jumplead-io (deposed #0): 1 error(s) occurred:

* acme_certificate.jumplead-io (deposed #0): acme: Error 409 - urn:acme:error:malformed - Certificate already revoked

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

This has left me with a broken state file, as I'm unable to apply the destroy. Would it be possible to carry on with the normal Terraform destroy when a 409 status is returned?

@vancluever vancluever added the bug label Mar 1, 2018
@vancluever
Owner

Hey @adamlc, this for sure sounds doable. Not too sure if it's possible to get the error code programmatically from lego but we can check for sure.

Might be a bit till I can get to it - how hard would it be for you to manually remove this from your state in the meantime?

@vancluever vancluever added enhancement and removed bug labels Mar 1, 2018
@adamlc
Author

adamlc commented Mar 1, 2018

Cool! I've actually managed to manually fix my state file for now, so no rush 👍

@vancluever
Owner

Awesome! Will fix this in a sweep of the other issues then 👍

@berney

berney commented Mar 12, 2018

To clear the issue, run something similar to terraform state rm acme_certificate.my_certificate_name.

abn added a commit to abn/terraform-provider-acme that referenced this issue Mar 23, 2018
Attempting to revoke an already revoked certificate will result in the
server responding with a status code of 409. This change ensures that
this scenario is handled gracefully.

Resolves: vancluever#30
@abn
Contributor

abn commented Mar 24, 2018

@vancluever I have a fix in #35. It checks whether the error is of type acme.RemoteError and, if so, whether its StatusCode is 409. If it is, the destroy proceeds without returning the error.

I verified it in a scenario I encountered that was similar to @adamlc's.

@Luzifer

Luzifer commented Apr 24, 2018

@vancluever the patch mentioned above has been available for a month, and I just had to fiddle around with the state file because I ran into the same issue. Can we get this merged, please?

vancluever added a commit that referenced this issue Jun 9, 2018
Revocation has been taking longer than it used to take on ACME, both on
staging and production, and we had bug reports (#30, #32) and PRs (#35)
that have been working to address this.

Looking at lego, the library really does not have much in the way of
support for timeouts or contexts, at least none that are exposed to the
API at this point in time. Aside from the elegance drawbacks, I don't
really see this as much of a large issue as the only process that really
seems to have much in the way of issues is revocation and the OCSP poll
that takes place after.

This update sets things up so that we honor the destroy timeout that can
be set in the standard "timeout" attribute in any Terraform resource.
The default for this is 20 minutes, so in reality,
(hopefully) this will never need to be changed again, but if need be,
the avenue is there.

Fixes #32.