Handling services that Consul has deregistered #146

randomswdev · 2019-09-05T10:18:57Z

Terraform Version

0.12

Affected Resource(s)

consul_service

Terraform Configuration Files

resource "consul_service" "redis" {
  name = "redis"
  node = "redis"
  port = 6379

  check {
    name                              = "Redis health check"
    interval                          = "5s"
    timeout                           = "1s"
    deregister_critical_service_after = "30s"
  }
}

Error Output

...
redis: Refreshing state... [id=redis]

Error: Failed to retrieve service: 'redis', services: 1

Actual Behavior

If the service goes into the critical state, Consul deregisters it after the interval defined in deregister_critical_service_after. When terraform is run again after that, it fails with the error above because the service is still defined in the state but can no longer be refreshed from Consul (it no longer exists).

Proposal

I would like to discuss options to avoid the error condition, for example by removing the service from the state when the refresh fails. This behavior could always apply or could be controlled through a switch at the resource or provider level. I don't know whether there are other design alternatives; any of them are welcome.
If we can agree on a solution for the issue, I can implement and contribute the patch.
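
To make the proposal concrete, here is a minimal, self-contained Go sketch of a refresh that treats a missing service as "gone" instead of as an error. It is not the provider's actual code: `resourceData`, `catalog`, `readService`, and `errNotFound` are hypothetical stand-ins invented for illustration. The one real convention it assumes is that, in the Terraform plugin SDK, a read function that calls `SetId("")` marks the resource as no longer existing, so it is dropped from the state.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel for "Consul no longer has this service".
var errNotFound = errors.New("service not found")

// resourceData is a hypothetical stand-in for the SDK's *schema.ResourceData.
type resourceData struct{ id string }

func (d *resourceData) Id() string      { return d.id }
func (d *resourceData) SetId(id string) { d.id = id }

// catalog is a hypothetical lookup table: service ID -> registered port.
type catalog map[string]int

func (c catalog) lookup(id string) (int, error) {
	port, ok := c[id]
	if !ok {
		return 0, errNotFound
	}
	return port, nil
}

// readService refreshes one service. A missing service is not an error:
// the ID is cleared so the resource drops out of the state.
func readService(d *resourceData, c catalog) error {
	_, err := c.lookup(d.Id())
	if errors.Is(err, errNotFound) {
		d.SetId("")
		return nil
	}
	return err
}

func main() {
	d := &resourceData{id: "redis"}
	// Consul has already deregistered "redis", so the catalog is empty.
	if err := readService(d, catalog{}); err != nil {
		fmt.Println("refresh failed:", err)
		return
	}
	fmt.Printf("id after refresh: %q\n", d.Id())
}
```

With the ID cleared during refresh, the next plan would propose creating the service again rather than stopping with an error. A resource-level or provider-level switch, as suggested above, would only decide whether the not-found branch clears the ID or returns the error.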

remilapeyre · 2019-09-05T10:42:29Z

Hi @randomswdev, can you run terraform version to see what version of the provider you are using?

randomswdev · 2019-09-05T11:22:15Z

We use a Consul provider built from the master branch because we need some features/fixes that have been included in the master branch but are not available in version v2.5.0.

Terraform v0.12.7
+ provider.consul (unversioned)

We built the provider about a week or two ago, using the source code available at that time in the master branch.

remilapeyre · 2019-09-05T11:31:46Z

I just tried with master (64de0bb) and it seems to work:

➜  terraform-provider-consul git:(master) ✗ terraform apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # consul_service.redis will be created
  + resource "consul_service" "redis" {
      + address    = (known after apply)
      + datacenter = (known after apply)
      + id         = (known after apply)
      + name       = "redis"
      + node       = "MacBook-Pro-de-Remi.local"
      + port       = 6379
      + service_id = (known after apply)

      + check {
          + check_id                          = (known after apply)
          + deregister_critical_service_after = "30s"
          + interval                          = "5s"
          + method                            = "GET"
          + name                              = "Redis health check"
          + status                            = "critical"
          + timeout                           = "1s"
          + tls_skip_verify                   = false
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

consul_service.redis: Creating...
consul_service.redis: Creation complete after 0s [id=redis]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
➜  terraform-provider-consul git:(master) ✗ terraform apply
consul_service.redis: Refreshing state... [id=redis]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # consul_service.redis will be created
  + resource "consul_service" "redis" {
      + address    = (known after apply)
      + datacenter = (known after apply)
      + id         = (known after apply)
      + name       = "redis"
      + node       = "MacBook-Pro-de-Remi.local"
      + port       = 6379
      + service_id = (known after apply)

      + check {
          + check_id                          = (known after apply)
          + deregister_critical_service_after = "30s"
          + interval                          = "5s"
          + method                            = "GET"
          + name                              = "Redis health check"
          + status                            = "critical"
          + timeout                           = "1s"
          + tls_skip_verify                   = false
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

consul_service.redis: Creating...
consul_service.redis: Creation complete after 0s [id=redis]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
➜  terraform-provider-consul git:(master) ✗ g rev-parse HEAD
64de0bb68fde35178b2a2f23db38da6ad7a5212e

Did I miss something to reproduce the issue? We used to have a bug like this but it's supposed to have been solved in 194fff3.

randomswdev · 2019-09-05T15:47:39Z

I provided an example that was too simple 😄

I finally reproduced the issue using the following fragment of terraform:

provider "consul" {
}
resource "consul_node" "redis1" {
  name = "redis/redis1"
  address = "hostname1"
}
resource "consul_node" "redis2" {
  name = "redis/redis2"
  address = "hostname2"
}

resource "consul_service" "redis1" {
  name = "redis"
  node = consul_node.redis1.name
  port = 6379

  check {
    check_id                          = "service:redis1"
    name                              = "Redis health check"
    tcp                               = "127.0.0.1:6379"
    interval                          = "5s"
    timeout                           = "1s"
    deregister_critical_service_after = "30s"
  }
}

resource "consul_service" "redis2" {
  name = "redis"
  node = consul_node.redis2.name
  port = 6379

  check {
    check_id                          = "service:redis1"
    name                              = "Redis health check"
    tcp                               = "127.0.0.1:6379"
    interval                          = "5s"
    timeout                           = "1s"
    deregister_critical_service_after = "30s"
  }
}

If you deregister one node and run the terraform command again, terraform will complain that it is not able to refresh the service.

The difference here is that the same service is defined on two nodes: if you delete a node, the service still exists in Consul but no longer exists on the deleted node, and I think this somehow confuses the provider.
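
The scenario can be illustrated with a small, self-contained Go sketch. It is only a model of the situation described above, not the change that was later made in the provider: the `instance` type and the `findOnNode` helper are hypothetical. It shows why a lookup by service name alone is ambiguous once two nodes carry the same service, and how matching on the node as well tells "gone from this node" apart from "still registered elsewhere".

```go
package main

import "fmt"

// instance is a hypothetical, simplified catalog entry: one registration
// of a service on one node.
type instance struct {
	Node      string
	ServiceID string
	Port      int
}

// findOnNode returns the instance of serviceID registered on node, if any.
// Matching on the service alone is not enough when several nodes carry it.
func findOnNode(entries []instance, node, serviceID string) (instance, bool) {
	for _, e := range entries {
		if e.Node == node && e.ServiceID == serviceID {
			return e, true
		}
	}
	return instance{}, false
}

func main() {
	// redis/redis1 has been deregistered; only redis/redis2 remains.
	entries := []instance{{Node: "redis/redis2", ServiceID: "redis", Port: 6379}}

	for _, node := range []string{"redis/redis1", "redis/redis2"} {
		_, ok := findOnNode(entries, node, "redis")
		fmt.Printf("%s registered: %v\n", node, ok)
	}
}
```

In this model the service list for "redis" is not empty, yet there is no entry for redis/redis1, which is the case a per-resource refresh has to recognise.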

remilapeyre · 2019-09-05T16:06:24Z

Yes, this is a mistake. Could you confirm that #147 fixes the issue for you?

randomswdev · 2019-09-06T13:35:06Z

The fix worked for me. Thank you very much @remilapeyre

remilapeyre · 2019-09-06T21:31:38Z

Thanks for the info :)

remilapeyre pushed a commit to remilapeyre/terraform-provider-consul that referenced this issue Sep 5, 2019

Fix resourceConsulServiceRead for services with multiple instances

1dc81f2

Closes hashicorp#146

remilapeyre mentioned this issue Sep 5, 2019

Fix plan for consul_service when there is multiple service instances #147

Merged

remilapeyre closed this as completed in #147 Sep 6, 2019

remilapeyre pushed a commit that referenced this issue Sep 6, 2019

Fix plan for consul_service when there is multiple service instances (#…

0941064

…147) Closes #146
