Unable to define http/s health checks for Network Loadbalancer #2708

Closed
jkroepke opened this issue Dec 19, 2017 · 28 comments · Fixed by #2906
Labels: bug (addresses a defect in current functionality), regression (pertains to a degraded workflow resulting from an upstream patch or internal enhancement)
Milestone: v1.7.0

Comments

@jkroepke

Hi there,

Thank you for opening an issue. Please note that we try to keep the Terraform issue tracker reserved for bug reports and feature requests. For general usage questions, please see: https://www.terraform.io/community.html.

Terraform Version

Terraform 0.11.1 with AWS Provider 1.6

Affected Resource(s)

Please list the resources as a list, for example:

  • aws_lb_target_group

If this issue appears to affect multiple resources, it may be an issue with Terraform's core, so please mention this.

Terraform Configuration Files

resource "aws_lb_target_group" "tcp" {
 # ...
 protocol    = "TCP"
 # ...
 health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = "10"
    port                = "443"
    path                = "/healthz"
    protocol            = "HTTPS"
    interval            = 30
    matcher             = "200-399"
  }
}

Debug Output

Please provide a link to a GitHub Gist containing the complete debug output: https://www.terraform.io/docs/internals/debugging.html. Please do NOT paste the debug output in the issue; just paste a link to the Gist.

Expected Behavior

Set up the expected network load balancer, as in AWS provider 1.5.

Actual Behavior

* module.lb_internal_master.aws_lb_target_group.tcp: 1 error(s) occurred:

* module.lb_internal_master.aws_lb_target_group.tcp: arn:aws:elasticloadbalancing:eu-central-1:191844718867:targetgroup/openshift-tg-internal-master-443/652998c18664d76d: custom matcher is not supported for target_groups with TCP protocol

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform apply

Important Factoids

N/A

References

N/A

@gaelL

gaelL commented Dec 19, 2017

Hi, same issue here with the path.

Looking at the AWS doc http://docs.aws.amazon.com/fr_fr/elasticloadbalancing/latest/APIReference/API_CreateTargetGroup.html

The HTTP health check parameters path and matcher should be available for Network Load Balancers.

Might be related to a6d1266

@jasonkuehl

I just found this error myself.

Terraform 0.11.1

  • provider.aws: version = "~> 1.6"

Example code

resource "aws_lb_target_group" "testexternal" {
  name     = "testexternal"
  protocol = "TCP"
  port     = 22
  vpc_id      = "${aws_vpc.bla.id}"

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 2
    interval            = 10
  }
}

resource "aws_lb" "testexternal" {
  name                        = "testserver"

  load_balancer_type          = "network"
  internal                    = false
  subnets                     = ["${module.subnet.ELB-subnet-ids}"]
  enable_deletion_protection  = true
}

resource "aws_lb_listener" "testexternal" {
  load_balancer_arn = "${aws_lb.testexternal.arn}"
  protocol          = "TCP"
  port              = "22"

  default_action {
    target_group_arn = "${aws_lb_target_group.testexternal.arn}"
    type             = "forward"
  }
}

resource "aws_lb_target_group_attachment" "testexternal" {
  target_group_arn = "${aws_lb_target_group.testexternal.arn}"
  target_id        = "${aws_instance.bla-002.id}"
  port             = 22
}

@meyertime

I ran into this issue as well.

My theory is that it is checking the rules based on the target group protocol (which in this case is TCP) when it should be checking against the health check protocol (which is HTTPS). Neither path nor matcher applies to TCP health checks, but both apply to HTTP and HTTPS health checks, even when the target group's protocol is TCP.

I might also mention that even version 1.5 had some wonky behavior regarding health checks on network load balancers / TCP target groups. This is because network load balancers have certain restrictions on the health checks that application load balancers do not: The unhealthy threshold must equal the healthy threshold, timeout is fixed to 10 and cannot be changed, interval has only two possible values (10 and 30), and matcher (if applicable) is fixed to 200-399.

Unfortunately, terraform would still try to manage these parameters even if they weren't supplied. This led to errors in some cases. In others, it would work upon creation, but then a subsequent apply would detect changes to those parameters and attempt to fix them, resulting in errors. This made it necessary to specify all the parameters and make sure they were valid for network load balancers. For instance, matcher = "200-399" had to be specified in this case in order to avoid errors, even though matcher is always that value and cannot be changed. However, now in 1.6, terraform won't let you specify matcher. I haven't had a chance to see what happens in 1.6 when you don't specify a matcher, though, because in this case we need to specify a path, and that is not allowed now either. So we have to revert to 1.5 for now to work around this.

In any case, the rules will have to take into account not only the protocol of the target group (which determines whether it's an alb or nlb) but also the protocol of the health check.

@apparentlymart apparentlymart added bug Addresses a defect in current functionality. regression Pertains to a degraded workflow resulting from an upstream patch or internal enhancement. labels Dec 20, 2017
@apparentlymart
Member

Hi all! Sorry for the regression here.

It seems that this is caused by the additional validation checks added in #2380. The goal of these changes was to catch more errors at plan time that were previously only caught at apply time, regarding the various subtle differences between application and network load balancers.

@deftflux is correct that the validation code is checking the target group protocol to recognize whether a given target group is an application or network target group, but indeed it does seem like the health check protocol is the correct thing to check in this case, per the relevant API documentation, which describes this particular property (Matcher) as being for "HTTP/HTTPS health checks" rather than for network load balancers in particular.

It seems that the same bug exists for HealthCheckPath. The docs also seem to disagree with our implementation about the timeout attribute: we currently permit it only for HTTP/HTTPS target groups, but the docs suggest it can work for all target groups, with a different range of valid values and a different default depending on the health check protocol.

@apparentlymart
Member

After playing with this some more it seems like the current validation is correct here, per what's enforced by the underlying API. After weakening the check in the provider, I see the following error from the remote API during apply:

InvalidConfigurationRequest: Custom health check matchers are not supported for health checks for target groups with the TCP protocol

The logic I'd implemented -- based on the documentation -- was to allow custom health check matchers if the healthcheck protocol is HTTP, but it seems that there is an undocumented additional restriction that Matcher may not be set for TCP target groups, regardless of the healthcheck protocol.

However, I see that you all saw something working prior to this validation being added, so I'm now trying to figure out what the old implementation (prior to 1.6) was actually doing in this scenario that was allowing it to work.

@apparentlymart
Member

As far as I can tell, this was only working before because it was totally ignoring these attributes:

https://github.com/terraform-providers/terraform-provider-aws/blob/840a82babd3ef0deed25ca7e06104f998577bbab/aws/resource_aws_lb_target_group.go#L218-L224

So while indeed this wasn't an error before, it seems like it was never actually working. In principle we could restore the previous behavior of just silently ignoring these arguments for TCP target groups, but that seems counter to Terraform's usual goal of doing what it says it will do, or failing loudly if it can't.

Given that these attributes were not functional before anyway, I'd like to propose that we move forward with these additional checks in place (arguably it was a bug that these checks were not present before) and require removing these previously-non-functional attributes from configuration when upgrading to 1.6 and above. Of course we ideally would've noticed this change in behavior and included it in the 1.6 changelog, which we can do now retroactively although it won't be visible within the v1.6.0 tag's version of the changelog since that is now frozen.

Please let me know if any of you have a use-case where including these arguments even though they are ignored is important; we can then think about how we might strike a compromise to retain the now-more-correct validation while still making those use-cases work.

Sorry for the accidental undocumented compatibility break here! 😖

@meyertime

Thanks for looking into this @apparentlymart !

I did a little testing, and there is still a problem with 1.6. Consider this test configuration:

resource "aws_lb_target_group" "foo" {
    name = "tf-nlb-health-check-test"
    protocol = "TCP"
    port = "1234"
    vpc_id = "${local.vpc}"

    health_check {
        protocol = "HTTPS"
        port = 12345
        #path = "/custom/path"
        #matcher = "200-399"
        interval = 30
        #timeout = 10
        healthy_threshold = 3
        unhealthy_threshold = 3
    }
}

The lines that are commented out are the ones that I used in 1.5 but that are considered invalid now in 1.6. In 1.5 with those lines uncommented, it works for both creation and subsequent plan and apply.

With those lines commented out, 1.6 will create the target group successfully. However, subsequent plan or apply will produce the following error:

aws_lb_target_group.foo: Refreshing state... (ID: arn:aws:elasticloadbalancing:us-east-1:...nlb-health-check-test/f37488d894f4b0a6)

Error: Error running plan: 1 error(s) occurred:

* aws_lb_target_group.foo: 1 error(s) occurred:

* aws_lb_target_group.foo: arn:aws:elasticloadbalancing:us-east-1:365567845318:targetgroup/tf-nlb-health-check-test/f37488d894f4b0a6: custom matcher is not supported for target_groups with TCP protocol

Apparently, terraform is detecting a change in the matcher, presumably because normally the default matcher is assumed by terraform to be 200, but for TCP target groups, AWS locks it to 200-399. That's why previously, explicitly specifying the correct default fixed the problem, but that workaround is no longer possible in 1.6. (However, it is strange that this error happens while refreshing the state rather than when applying a change.)

So it looks like these attributes are being correctly ignored when creating a TCP target group, but not when managing an existing TCP target group.

As far as requiring the removal of these attributes, I suppose it is a bit of a bug that we had to explicitly specify them previously. But we could still maintain backwards compatibility if we allowed the only valid value to be specified, as in the commented lines above. I would definitely be in favor of at least fixing it so that we do not have to specify those attributes, however.

@arminbuerkle

I'm currently experiencing the same problem.

I created an aws_lb_target_group in v1.5.0 with matcher = "200-399".
When I upgraded to v1.6.0 I got the following message, even after removing the matcher attribute:

custom matcher is not supported for target_groups with TCP protocol

When I downgrade to v1.5.0 again and remove the matcher attribute I get:

Error modifying Target Group: ValidationError: Health check matcher HTTP code cannot be empty

How is an upgrade supposed to happen with the current implementation?
Currently the only option I have is to stay on v1.5.0 with matcher set.

@meyertime

You are right, @arminbuerkle. Currently in 1.6, there is no way to have an HTTP/S health check for a TCP target group without getting errors on subsequent plan or apply.

@apparentlymart
Member

Thanks for that extra detail @deftflux, and sorry for the silence while I was on my holiday break.

Indeed it does seem like there is an issue here based on what you described. I suppose what's happening here is that we're reading back some server-provided defaults from the API that are then causing validation to fail.

Probably the best solution for that would be to add some extra checks to the Read implementation to force the relevant attributes to be saved as empty regardless of what the API returns, so that the values in state stay consistent with the empty values we now require in the configuration.

@jantman

jantman commented Jan 8, 2018

I guess I may have gotten a bit lost in the discussion here, but I'm hitting this issue as well. Via both awscli and the Console UI, I can create an NLB with an HTTP health check against a custom path. Assuming the AWS Console UI is correct: a TCP target group can have an HTTP health check, and the configurable properties on it are Path, Port, and Healthy Threshold. Unhealthy Threshold, Timeout, Interval, and Matcher ("Success Codes" in the UI) are grayed out and fixed values.
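
For illustration, a minimal sketch of the shape described above in 0.11-era syntax (the resource name, port numbers, and path are placeholders of mine, not values from this thread); this is what the provider should accept once the validation follows the Console's rules:

resource "aws_lb_target_group" "example" {
  name     = "example-nlb-tg"
  protocol = "TCP"
  port     = 443
  vpc_id   = "${var.vpc_id}"

  health_check {
    protocol          = "HTTP"
    path              = "/healthz" # configurable per the Console UI
    port              = 8080       # configurable per the Console UI
    healthy_threshold = 3          # configurable; AWS forces the unhealthy threshold to match
    # interval, timeout, and matcher are fixed for TCP target groups, so they stay unset
  }
}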

@whereisaaron
Contributor

Hi @jantman, yes, you are correct: you can create TCP Target Groups with HTTP health checks, and Terraform should be letting us do that too, but isn't right now.

What you can't do is create a TCP Target Group with a TCP health check and then change the health check to HTTP (likewise for any protocol change, even HTTP -> HTTPS). Changing the health check protocol requires destroying and recreating the Target Group.

Likewise, unhealthy threshold, timeout, interval, and success codes (matcher) can be set on creation, but all require the Target Group to be recreated to change them (which seems harsh!).

Another note about 'interval': in the UI, for HTTP/HTTPS health checks you can freely set the interval, but for TCP health checks you can only choose 10 or 30 seconds. That might be an API restriction too, though I guess Terraform could leave that validation up to AWS.
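
On the recreation point: a common mitigation (an assumption on my part, not something tested in this thread) is to give the Target Group a name_prefix and create_before_destroy, so that when a replacement is planned the new group can be created alongside the old one without a name collision. Note this only helps once Terraform actually plans the replacement, which is the missing piece here:

resource "aws_lb_target_group" "example" {
  name_prefix = "tg-" # generated suffix avoids a name clash with the group being replaced
  protocol    = "TCP"
  port        = 443
  vpc_id      = "${var.vpc_id}"

  lifecycle {
    create_before_destroy = true # create the replacement group before destroying the old one
  }
}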

@bflad
Member

bflad commented Jan 12, 2018

This has been released in terraform-provider-aws version 1.7.0. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

@bflad bflad added this to the v1.7.0 milestone Jan 12, 2018
@whereisaaron
Contributor

PR #2906 lets you create HTTP/HTTPS health checks now 🎉, but it doesn't fully resolve validation. The protocol can be changed between HTTP and HTTPS, but changing to or from TCP should trigger a recreation plan for the Target Group.

* module.nlb.aws_lb_target_group.nlb[0]: 1 error(s) occurred:

* aws_lb_target_group.nlb.0: Error modifying Target Group: InvalidConfigurationRequest: You cannot change the health check protocol for a target group with the TCP protocol
        status code: 400, request id: 1234567-f7db-11e7-8d49-f37d7e7f4cf3

@iancward
Contributor

iancward commented Apr 4, 2018

@whereisaaron I just ran into this issue switching from TCP to HTTP/HTTPS (the Target Group needs to be re-created and Terraform doesn't know that); was a separate issue ever opened to address that?

@whereisaaron
Contributor

@iancward no, sorry, I don't know of any issue for this bug where recreation isn't triggered. I currently have to intervene manually.

@maneesh8

@whereisaaron it doesn't let me change to a custom health check with the TCP protocol, while the AWS console does let me. So I don't think this is something rejected by the AWS APIs.

wking added a commit to wking/openshift-installer that referenced this issue Dec 16, 2018
As suggested by Dani Comnea [1].  When we switched to network load
balancers in 16dfbb3 (data/aws: use nlbs instead of elbs,
2018-11-01, openshift#594), we replaced things like:

  resource "aws_elb" "api_internal" {
    ...
    health_check {
      healthy_threshold   = 2
      unhealthy_threshold = 2
      timeout             = 3
      target              = "SSL:6443"
      interval            = 5
    }
    ...
  }

with:

  resource "aws_lb_target_group" "api_internal" {
    ...
    health_check {
      healthy_threshold   = 3
      unhealthy_threshold = 3
      interval            = 10
      port                = 6443
      protocol            = "TCP"
    }
  }

This resulted in logs like [2]:

  [core@ip-10-0-11-88 ~]$ sudo crictl ps
  CONTAINER ID        IMAGE                                                                                                                                           CREATED             STATE               NAME                    ATTEMPT
  1bf4870ea6eea       registry.svc.ci.openshift.org/openshift/origin-v4.0-2018-12-15-160933@sha256:97eac256dde260e8bee9a5948efce5edb879dc6cb522a0352567010285378a56   2 minutes ago       Running             machine-config-server   0
  [core@ip-10-0-11-88 ~]$ sudo crictl logs 1bf4870ea6eea
  I1215 20:23:07.088210       1 bootstrap.go:37] Version: 3.11.0-356-gb7ffe0c7-dirty
  I1215 20:23:07.088554       1 api.go:54] launching server
  I1215 20:23:07.088571       1 api.go:54] launching server
  2018/12/15 20:24:17 http: TLS handshake error from 10.0.20.86:28372: EOF
  2018/12/15 20:24:18 http: TLS handshake error from 10.0.20.86:38438: EOF
  2018/12/15 20:24:18 http: TLS handshake error from 10.0.47.69:26320: EOF
  ...

when the health check opens a TCP connection (in this case to the
machine-config server on 49500) and then hangs up without completing
the TLS handshake.  Network load balancers [3,4] do not have an analog
to the classic load balancers' SSL protocol [5,6,7], so we're using
HTTPS.

There's some discussion in [8] about the best way to perform
unauthenticated liveness checks on the Kubernetes API server.  For
now, I'm assuming that both 200 and 401 responses to /healthz requests
indicate a functional server, and we can evaluate other response
status codes as necessary.  Checking against a recent cluster:

  $ curl -i https://wking-api.devcluster.openshift.com:6443/healthz
  curl: (60) Peer's Certificate issuer is not recognized.
  More details here: http://curl.haxx.se/docs/sslcerts.html

  curl performs SSL certificate verification by default, using a "bundle"
   of Certificate Authority (CA) public keys (CA certs). If the default
   bundle file isn't adequate, you can specify an alternate file
   using the --cacert option.
  If this HTTPS server uses a certificate signed by a CA represented in
   the bundle, the certificate verification probably failed due to a
   problem with the certificate (it might be expired, or the name might
   not match the domain name in the URL).
  If you'd like to turn off curl's verification of the certificate, use
   the -k (or --insecure) option.
  $ curl -ik https://wking-api.devcluster.openshift.com:6443/healthz
  HTTP/1.1 200 OK
  Cache-Control: no-store
  Date: Sun, 16 Dec 2018 06:18:23 GMT
  Content-Length: 2
  Content-Type: text/plain; charset=utf-8

I don't know if the network load balancer health checks care about
certificate validity or not.  I guess we'll see how CI testing handles
this.

Ignition is only exposed inside the cluster, and checking that from a
master node:

  [core@ip-10-0-26-134 ~]$ curl -i https://wking-api.devcluster.openshift.com:49500/
  curl: (60) Peer's Certificate issuer is not recognized.
  More details here: http://curl.haxx.se/docs/sslcerts.html

  curl performs SSL certificate verification by default, using a "bundle"
   of Certificate Authority (CA) public keys (CA certs). If the default
   bundle file isn't adequate, you can specify an alternate file
   using the --cacert option.
  If this HTTPS server uses a certificate signed by a CA represented in
   the bundle, the certificate verification probably failed due to a
   problem with the certificate (it might be expired, or the name might
   not match the domain name in the URL).
  If you'd like to turn off curl's verification of the certificate, use
   the -k (or --insecure) option.
  [core@ip-10-0-26-134 ~]$ curl -ik https://wking-api.devcluster.openshift.com:49500/
  HTTP/1.1 404 Not Found
  Content-Type: text/plain; charset=utf-8
  X-Content-Type-Options: nosniff
  Date: Sun, 16 Dec 2018 06:30:14 GMT
  Content-Length: 19

  404 page not found

Unfortunately, setting matcher [9] is not allowed for network load
balancers (e.g. see [10,11]).  Setting it leads to errors like:

  ERROR  * module.vpc.aws_lb_target_group.api_internal: 1 error occurred:
  ERROR  * aws_lb_target_group.api_internal: Error creating LB Target Group: InvalidConfigurationRequest: Custom health check matchers are not supported for health checks for target groups with the TCP protocol
  ERROR  status code: 400, request id: 25a53d63-00fe-11e9-80c5-59885e191c9c

So I've left it unset here, and we'll just hope the 401s don't start
happening.

[1]: openshift#923
[2]: https://groups.google.com/d/msg/openshift-4-dev-preview/Jmt6AK0EJR4/Ed3W7yZyBQAJ
[3]: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
[4]: https://www.terraform.io/docs/providers/aws/r/lb_target_group.html#protocol
[5]: https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-healthchecks.html
[6]: https://www.terraform.io/docs/providers/aws/r/elb.html#target
[7]: hashicorp/terraform-provider-aws#6866
[8]: kubernetes/kubernetes#43784
[9]: https://www.terraform.io/docs/providers/aws/r/lb_target_group.html#matcher
[10]: https://github.com/terraform-providers/terraform-provider-aws/pull/2906/files#diff-375aea487c27a6ada86edfd817ba2401R612
[11]: hashicorp/terraform-provider-aws#2708 (comment)
wking added a commit to wking/machine-config-operator that referenced this issue Dec 19, 2018
The server currently 404s the root path.  From a master:

  [core@ip-10-0-26-134 ~]$ curl -ik https://wking-api.devcluster.openshift.com:49500/
  HTTP/1.1 404 Not Found
  Content-Type: text/plain; charset=utf-8
  X-Content-Type-Options: nosniff
  Date: Sun, 16 Dec 2018 06:30:14 GMT
  Content-Length: 19

  404 page not found

but we need a reliable response in the range 200-399 to satisfy our
network load balancer health checks, which do not support configurable
response status codes [1,2] (these are Terraform links, but they
discuss an AWS restriction that is not Terraform-specific).  This
commit adds a /healthz endpoint which always 204s (when the server is
alive to handle it).

[1]: hashicorp/terraform-provider-aws#2708 (comment)
[2]: https://github.com/terraform-providers/terraform-provider-aws/pull/2906/files#diff-375aea487c27a6ada86edfd817ba2401R612
wking added a commit to wking/openshift-installer that referenced this issue Jan 9, 2019
(This commit message is nearly identical to the Dec 16 commit quoted above. The differences: it cites the kubernetes/kubernetes#43784 discussion as suggesting that 401s are possible in some configurations, and it checks the new /healthz endpoint from openshift/machine-config-operator@d0a7ae21 (server: Add /healthz, 2019-01-04, openshift/machine-config-operator#267) instead of the root path, which 404s; /healthz on a recent cluster returned HTTP/1.1 200 OK with body "ok". The matcher restriction and references [1]-[11] are unchanged.)
@MichalPloski

Unfortunately the problem persists with Terraform v0.11.11:

resource "aws_lb_target_group" "hapee_nlb_target" {
  name = "hapee-test-nlb-tg"

  vpc_id = "${aws_vpc.default.id}"

  port     = 443
  protocol = "TCP"

  health_check {
    interval            = 30
    path                = "/haproxy_status"
    port                = 8080
    timeout             = 5
    healthy_threshold   = 3
    unhealthy_threshold = 3
    matcher             = "200,202"
  }
}

aws_lb_target_group.hapee_nlb_target: Error creating LB Target Group: InvalidConfigurationRequest: Custom health check timeouts are not supported for health checks for target groups with the TCP protocol

@reid-harrison

@MichalPloski yes, this looks to still be validated along with other similar parameters: https://github.com/terraform-providers/terraform-provider-aws/blob/d0edc835f07ef347937892b691b9ab0a602b2372/aws/resource_aws_lb_target_group.go#L673

That check technically allows 0 to be set in the case of a TCP health check, but unfortunately the timeout parameter then fails validation against the allowed range 2-60.

With terraform-provider-aws = 1.38.0:

Error: aws_lb_target_group.my_tg: expected health_check.0.timeout to be in the range (2 - 60), got 0

@red8888

red8888 commented Feb 26, 2019

Why is this closed? It still seems to not be working. Or maybe there is a workaround?

@red8888

red8888 commented Feb 26, 2019

Whoops, never mind. I can see in the AWS UI that you actually can't change the timeout at all (greyed out).

@reid-harrison

@red8888 It is still an issue for me because I would like to re-use the aws_lb_target_group resource for both HTTP and TCP protocols (as a module, for example). I should be able to set the health_check timeout argument to 0 or 10 for a TCP target group, but with this bug Terraform will always throw an error when health_check timeout is set while the protocol is TCP.
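
One way to keep a single module serving both protocols in 0.11-era Terraform (a sketch under the assumption that the module exposes hypothetical variables protocol, port, vpc_id, and health_check_timeout; none of these names come from this thread) is to declare two resource blocks toggled by count, so the TCP variant never mentions timeout at all:

resource "aws_lb_target_group" "tcp" {
  count    = "${var.protocol == "TCP" ? 1 : 0}"
  name     = "example-tcp"
  protocol = "TCP"
  port     = "${var.port}"
  vpc_id   = "${var.vpc_id}"

  health_check {
    # timeout intentionally omitted: AWS fixes it for TCP target groups
    healthy_threshold   = 3
    unhealthy_threshold = 3
  }
}

resource "aws_lb_target_group" "http" {
  count    = "${var.protocol == "TCP" ? 0 : 1}"
  name     = "example-http"
  protocol = "${var.protocol}"
  port     = "${var.port}"
  vpc_id   = "${var.vpc_id}"

  health_check {
    timeout = "${var.health_check_timeout}"
  }
}

Callers would then read the ARN from whichever resource has count = 1, e.g. "${element(concat(aws_lb_target_group.tcp.*.arn, aws_lb_target_group.http.*.arn), 0)}".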

@sqlaide

sqlaide commented Jun 18, 2019

Is there a workaround for this? Why was this closed?

@kaushikreddi9

I am using provider.aws version = "~> 2.23" and I am still facing this issue. Let me know how it can be resolved, or if there is any workaround.

@djordje-petrovic

Had the same issue. TF version 0.11.4, provider 2.27.0. Solved it by setting matcher to 200-399. Anything else would fail with the same error the rest are having.

@kaushikreddi9

Had the same issue. TF version 0.11.4, provider 2.27.0. Solved it by setting matcher to 200-399. Anything else would fail with the same error the rest are having.

I have made the changes you suggested, but I am still facing the same issue. Below is the NLB target group resource:

resource "aws_lb_target_group" "nlb_target" {
name = "nlb-target"
port = XXX
protocol = "TCP"
vpc_id = "XXX"

health_check {
healthy_threshold = 3
unhealthy_threshold = 3
timeout = 6
protocol = "HTTPS"
port = XXX
path = "XXX"
interval = 30
matcher = "200,399"
}
}

Let me know if I need to configure any additional parameters.

@kaushikreddi9

I have fixed the issue by setting the values below:

  timeout  = 10
  matcher  = "200-399"
  interval = 30
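
Assembled back into the resource from the earlier comment, the working shape would be roughly as follows (a sketch only; the XXX placeholders are kept from the original comment, and note the matcher uses a hyphen, not a comma):

resource "aws_lb_target_group" "nlb_target" {
  name     = "nlb-target"
  port     = XXX
  protocol = "TCP"
  vpc_id   = "XXX"

  health_check {
    healthy_threshold   = 3
    unhealthy_threshold = 3
    timeout             = 10        # the fixed value AWS enforces for TCP target groups
    protocol            = "HTTPS"
    port                = XXX
    path                = "XXX"
    interval            = 30        # NLB health checks allow only 10 or 30
    matcher             = "200-399" # the only matcher NLB target groups accept
  }
}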

etwillbefine added a commit to goci-io/aws-api-gateway-proxy that referenced this issue Oct 6, 2019
as per hashicorp/terraform-provider-aws#2708
and try to use HTTP as protocol to support a health check path
@ghost

ghost commented Nov 1, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 1, 2019