Can not create aws_api_gateway_domain_name on the first run #10447

Closed
speller opened this issue Oct 10, 2019 · 6 comments
Labels
documentation Introduces or discusses updates to documentation. good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. service/acm Issues and PRs that pertain to the acm service. service/apigateway Issues and PRs that pertain to the apigateway service.

@speller
Contributor

speller commented Oct 10, 2019

Unable to create an aws_api_gateway_domain_name on the first run (when no infrastructure has been created yet). On re-run the resource is created successfully. The failure reproduces consistently.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.6

Affected Resource(s)

  • aws_api_gateway_domain_name

Terraform Configuration Files

provider "aws" {
  version = "~> 2.31"
  region  = var.aws_default_region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}

variable "aws_default_region" {
  default = "ap-northeast-1"
}

variable "aws_access_key" {}
variable "aws_secret_key" {}
variable "domain_zone_name" {}

variable "auth_subdomain" {
  default = "test-cert"
}

locals {
  test_subdomain = "${var.auth_subdomain}.${var.domain_zone_name}"
}

resource "aws_acm_certificate" "domain-cert" {
  domain_name = local.test_subdomain
  validation_method = "DNS"
}

data "aws_route53_zone" "api-zone" {
  name = "${var.domain_zone_name}."
}

resource "aws_route53_record" "validation-record" {
  name = aws_acm_certificate.domain-cert.domain_validation_options[0].resource_record_name
  type = aws_acm_certificate.domain-cert.domain_validation_options[0].resource_record_type
  zone_id = data.aws_route53_zone.api-zone.id
  records = [aws_acm_certificate.domain-cert.domain_validation_options[0].resource_record_value]
  ttl = 60
}

resource "aws_acm_certificate_validation" "cert-validation" {
  certificate_arn = aws_acm_certificate.domain-cert.arn
  validation_record_fqdns = [aws_route53_record.validation-record.fqdn]
}

resource "aws_api_gateway_domain_name" "gateway-domain" {
  domain_name = local.test_subdomain
  regional_certificate_arn = aws_acm_certificate.domain-cert.arn
  endpoint_configuration {
    types = ["REGIONAL"]
  }
}

Prerequisites:

  • An AWS account with API credentials
  • A Route53 zone (the domain does not have to be registered with AWS)
  • A subdomain name for which the certificate will be issued. To use it without buying a domain in Route53 or transferring one there, copy the NS values of the zone to the subdomain name at your main domain registrar.

Debug Output

https://gist.github.com/speller/23fbc81c1c8c53cef8ce5cf63db86221

Regular output:

data.aws_route53_zone.api-zone: Refreshing state...
aws_acm_certificate.domain-cert: Creating...
aws_acm_certificate.domain-cert: Still creating... [10s elapsed]
aws_acm_certificate.domain-cert: Creation complete after 12s [id=arn:aws:acm:ap-northeast-1:860602778092:certificate/7f6b4521-bd60-41f5-96d5-89a55ddb0be4]
aws_api_gateway_domain_name.gateway-domain: Creating...
aws_route53_record.validation-record: Creating...
aws_route53_record.validation-record: Still creating... [10s elapsed]
aws_route53_record.validation-record: Still creating... [20s elapsed]
aws_route53_record.validation-record: Still creating... [30s elapsed]
aws_route53_record.validation-record: Still creating... [40s elapsed]
aws_route53_record.validation-record: Still creating... [50s elapsed]
aws_route53_record.validation-record: Creation complete after 1m0s [id=Z2EQCMF2D8EJH3__784ab3a2bdc724558bdbc61aa520a1ad.test-cert.project-dev.me.uk._CNAME]
aws_acm_certificate_validation.cert-validation: Creating...
aws_acm_certificate_validation.cert-validation: Creation complete after 5s [id=2019-10-10 09:25:11 +0000 UTC]

Error: Error creating API Gateway Domain Name: BadRequestException: The provided certificate does not exist.
        status code: 400, request id: 4b398357-c39a-4324-a546-4b25a0ac5cf2

  on main.tf line 53, in resource "aws_api_gateway_domain_name" "gateway-domain":
  53: resource "aws_api_gateway_domain_name" "gateway-domain" {

Expected Behavior

The aws_api_gateway_domain_name resource is created without issues.

Actual Behavior

The exception is thrown and the script fails.

Steps to Reproduce

  1. terraform apply

References

Workaround

Add depends_on = [aws_acm_certificate_validation.cert-validation] to the aws_api_gateway_domain_name definition, or change the certificate ARN property to regional_certificate_arn = aws_acm_certificate_validation.cert-validation.certificate_arn. An explicit dependency on the aws_acm_certificate_validation resource is required. This should either be fixed or mentioned in the docs. Without this workaround, TF fails even though it does wait until the aws_acm_certificate_validation resource is created.
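
As a minimal sketch of the first option, reusing the resource names from the configuration above (Terraform 0.12 syntax), the explicit dependency could look like this:

```hcl
resource "aws_api_gateway_domain_name" "gateway-domain" {
  domain_name              = local.test_subdomain
  regional_certificate_arn = aws_acm_certificate.domain-cert.arn

  endpoint_configuration {
    types = ["REGIONAL"]
  }

  # Do not start creating the domain name until certificate
  # validation has completed.
  depends_on = [aws_acm_certificate_validation.cert-validation]
}
```

The rest of the configuration stays unchanged; only the depends_on line is new.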

@ghost ghost added service/acm Issues and PRs that pertain to the acm service. service/apigateway Issues and PRs that pertain to the apigateway service. service/route53 Issues and PRs that pertain to the route53 service. labels Oct 10, 2019
@speller speller changed the title Can not create aws_api_gateway_domain_name at the first run Can not create aws_api_gateway_domain_name on the first run Oct 10, 2019
@bflad bflad added needs-triage Waiting for first response or review from a maintainer. documentation Introduces or discusses updates to documentation. good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. and removed needs-triage Waiting for first response or review from a maintainer. service/route53 Issues and PRs that pertain to the route53 service. labels Oct 10, 2019
@bflad
Member

bflad commented Oct 10, 2019

Hi @speller 👋 Thanks for reporting this and sorry for the hassle.

Add depends_on = [aws_acm_certificate_validation.cert-validation] to the aws_api_gateway_domain_name definition. Or change cert arn property to this: regional_certificate_arn = aws_acm_certificate_validation.cert-validation.certificate_arn. Explicit dependency on the aws_acm_certificate_validation resource is required. This should be fixed or mentioned in docs.

The explicit depends_on usage or implicit dependency via aws_acm_certificate_validation is expected and, as you mention, should be documented better if necessary. We cannot do anything in the Terraform AWS Provider code to enforce this, as resource ordering is handled outside the provider.

The documentation lives in this codebase at website/docs/r/api_gateway_domain_name.html.markdown if you or anyone is interested in handling this.

@speller
Contributor Author

speller commented Oct 11, 2019

@bflad Thank you for your answer. My concern is: why does TF wait for the validation resource to be created and still fail even after it is created? Can it be fixed? The error message is incorrect, by the way. It complains about an invalid certificate ARN, but the problem is actually with validation.

@speller
Contributor Author

speller commented Oct 11, 2019

I've added a note in the documentation #10466 for this issue.

@bflad
Member

bflad commented Oct 11, 2019

My concern is: why does TF wait for the validation resource to be created and still fail even after it is created?

When you give Terraform a configuration to apply, it generates a directed acyclic graph (DAG) that determines if operations have dependencies or can otherwise be done in parallel. By default, Terraform performs operations (that do not have dependencies on each other) with a concurrency of 10. You can see this graph if you run terraform graph. The advanced details about this process are available in the Resource Graph documentation.

When a particular node in the graph has an error, nodes that are dependent on that node are not executed. If there are other nodes in the graph that are not dependent on that node, they are allowed to continue applying since (theoretically) they should not be affected by the failure. Configuring Terraform with implicit or explicit dependencies is one way to ensure expected behavior when applying a configuration, should there be any failures.

Since the original configuration does not have an implicit or explicit dependency between aws_acm_certificate_validation and aws_api_gateway_domain_name, the two run independently: aws_api_gateway_domain_name starts as soon as aws_acm_certificate is created (as the regular output in the report shows), and its failure does not stop aws_acm_certificate_validation from completing. If aws_acm_certificate had errored, then neither of those resources would have executed since they both depend on that resource.
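
For illustration, here is a sketch of the implicit form, assuming the resource names from the original configuration. Reading the certificate ARN through the validation resource makes aws_api_gateway_domain_name depend on aws_acm_certificate_validation in the graph, so it is not created until validation has finished:

```hcl
resource "aws_api_gateway_domain_name" "gateway-domain" {
  domain_name = local.test_subdomain

  # Referencing the validation resource (instead of
  # aws_acm_certificate.domain-cert.arn) creates the implicit dependency.
  regional_certificate_arn = aws_acm_certificate_validation.cert-validation.certificate_arn

  endpoint_configuration {
    types = ["REGIONAL"]
  }
}
```

With this reference in place, a failure in the validation resource would also prevent the domain name from being attempted.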

The error message is incorrect, by the way. It complains about an invalid certificate ARN, but the problem is actually with validation.

The error message:

BadRequestException: The provided certificate does not exist.
        status code: 400, request id: 4b398357-c39a-4324-a546-4b25a0ac5cf2

This message is generated by the AWS API and just passed through by Terraform. For improvements to this error messaging, you would need to contact AWS. Of note though, ACM changes can sometimes exhibit eventual consistency issues with other AWS services during certificate creation/deletion, so the API Gateway service may not be able to see the new ACM certificate immediately after it's created.

In certain resources where setting up the correct dependencies can still display eventual consistency issues like these (notorious when working with IAM and S3 for example), we do introduce retry logic for a few minutes (up to 5 minutes) on specific error messaging to help operators. In this case though, ACM certificate validation can be manual or take upwards of 45 minutes. Most likely, we would not want to introduce retries that long into the aws_api_gateway_domain_name resource here because that would present a bad user experience (waiting 45 minutes for an error to return) should there be a legitimate configuration issue such as never validating the ACM certificate.
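
If the long wait is a concern on the validation side, one option may be to bound it there instead. The following is only a sketch: it assumes the aws_acm_certificate_validation resource accepts a timeouts block with a create value in the provider version in use, and the 10m figure is an arbitrary illustration, not a recommendation.

```hcl
resource "aws_acm_certificate_validation" "cert-validation" {
  certificate_arn         = aws_acm_certificate.domain-cert.arn
  validation_record_fqdns = [aws_route53_record.validation-record.fqdn]

  # Assumption: fail after 10 minutes rather than waiting for the
  # resource's default validation timeout.
  timeouts {
    create = "10m"
  }
}
```

Resources that depend on this one, such as the API Gateway domain name once a dependency is configured, would then be skipped if validation times out.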

I've added a note in the documentation #10466 for this issue.

Much appreciated. 😄

@bflad bflad added this to the v2.37.0 milestone Nov 15, 2019
@bflad bflad closed this as completed Nov 18, 2019
@ghost

ghost commented Nov 18, 2019

This has been released in version 2.37.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@ghost

ghost commented Mar 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 29, 2020