TERRA-79 ⁃ Database region order leads to configuration drift #170

Open
giannoul opened this issue Sep 23, 2022 · 13 comments · May be fixed by #217

Comments

@giannoul

giannoul commented Sep 23, 2022

Terraform Version

Terraform v1.1.8
on linux_amd64

Affected Resource(s)

Please list the resources as a list, for example:

  • astra_database

Terraform Configuration Files

resource "astra_database" "current" {
  name           = var.name
  keyspace       = var.keyspace
  cloud_provider = var.cloud_provider
  regions        = var.regions

  timeouts {
    create = format("%sm", length(var.regions) * local.database_creation_delay)
  }
}

which is fed with variable values:

name           = "test"
keyspace       = "test"
cloud_provider = "gcp"
regions        = ["us-central1", "us-west4", "us-east4"]

Debug Output

None

Panic Output

None

Expected Behavior

After terraform apply we should not be asked for the following change in subsequent applies:

  # module.astra_databases["joya"].astra_database.current will be updated in-place
  ~ resource "astra_database" "current" {
        id                   = "111111-1111-11111-111-111111"
        name                 = "test"
      ~ regions              = [
            "us-central1",
          - "us-west4",
            "us-east4",
          + "us-west4",
        ]
        # (14 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Actual Behavior

On every apply or plan we are notified about the regions change.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. Create a multi-region database with regions (in order) ["us-central1", "us-east4", "us-west4"].
  2. From the UI, remove the region "us-east4". The UI then shows regions ["us-central1", "us-west4"].
  3. Re-add the region "us-east4". The UI then shows regions ["us-central1", "us-west4", "us-east4"].
  4. In Terraform, modify regions to reflect the order in the UI, like so: ["us-central1", "us-west4", "us-east4"].
  5. Every subsequent apply keeps asking to change the order of the regions.

Important Factoids

No

References

No

┆Issue is synchronized with this Jira Task by Unito
┆friendlyId: TERRA-79
┆priority: Major

@sync-by-unito sync-by-unito bot changed the title from "Database region order leads to configuration drift" to "TERRA-79 ⁃ Database region order leads to configuration drift" Sep 23, 2022
@emerkle826
Contributor

@giannoul What version of the provider are you using? I just tried this with v2.1.4 (it should also work with 2.1.5), and while it is a pain to have to reorder the regions list in your Terraform config file, it does seem to work. I followed your steps and did not have an issue with terraform apply as long as I reversed the order of the regions (us-west4 ahead of us-east4).

@giannoul
Author

@emerkle826 I have created another database to provide more concrete data on the above

variable values:

name     = "igtestdb"
keyspace = "igtestdb"
regions  = ["us-central1", "us-west4", "us-east4"]
  • After the terraform apply (all looking good):
$ terraform state show 'module.astra_databases["igtestdb"].astra_database.current'

# module.astra_databases["igtestdb"].astra_database.current:
resource "astra_database" "current" {
    additional_keyspaces = []
    cloud_provider       = "GCP"
    cqlsh_url            = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/cqlsh"
    data_endpoint_url    = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/api/rest"
    datacenters          = {
        "GCP.us-central1" = "xxxxxxxxxxxx-xxxxxxxxxxxx-1"
        "GCP.us-east4"    = "xxxxxxxxxxxx-xxxxxxxxxxxx-2"
        "GCP.us-west4"    = "xxxxxxxxxxxx-xxxxxxxxxxxx-3"
    }
    grafana_url          = "https://xxxxxxxxxxxx-us-central1.dashboard.astra.datastax.com/d/cloud/dse-cluster-condensed?refresh=30s&orgId=1&kiosk=tv"
    graphql_url          = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/api/graphql"
    id                   = "xxxxxxxxxxxx"
    keyspace             = "igtestdb"
    name                 = "igtestdb"
    node_count           = 9
    organization_id      = "xxxxxxxxxxxx"
    owner_id             = "xxxxxxxxxxxx"
    regions              = [
        "us-central1",
        "us-east4",
        "us-west4",
    ]
    replication_factor   = 1
    status               = "ACTIVE"
    total_storage        = 5

    timeouts {
        create = "45m"
    }
}
  • After removing the region us-east4 in the UI, setting regions = ["us-central1", "us-west4"], and issuing terraform apply (all looking good):
$ terraform state show 'module.astra_databases["igtestdb"].astra_database.current'

# module.astra_databases["igtestdb"].astra_database.current:
resource "astra_database" "current" {
    additional_keyspaces = []
    cloud_provider       = "GCP"
    cqlsh_url            = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/cqlsh"
    data_endpoint_url    = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/api/rest"
    datacenters          = {
        "GCP.us-central1" = "xxxxxxxxxxxx-xxxxxxxxxxxx-1"
        "GCP.us-west4"    = "xxxxxxxxxxxx-xxxxxxxxxxxx-3"
    }
    grafana_url          = "https://xxxxxxxxxxxx-us-central1.dashboard.astra.datastax.com/d/cloud/dse-cluster-condensed?refresh=30s&orgId=1&kiosk=tv"
    graphql_url          = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/api/graphql"
    id                   = "xxxxxxxxxxxx"
    keyspace             = "igtestdb"
    name                 = "igtestdb"
    node_count           = 6
    organization_id      = "xxxxxxxxxxxx"
    owner_id             = "xxxxxxxxxxxx"
    regions              = [
        "us-central1",
        "us-west4",
    ]
    replication_factor   = 1
    status               = "ACTIVE"
    total_storage        = 5

    timeouts {
        create = "45m"
    }
}
  • After re-adding the region us-east4 via the UI and setting regions = ["us-central1", "us-west4", "us-east4"] in the Terraform code like you mentioned (notice the regions order):
$ terraform state show 'module.astra_databases["igtestdb"].astra_database.current'

# module.astra_databases["igtestdb"].astra_database.current:
resource "astra_database" "current" {
    additional_keyspaces = []
    cloud_provider       = "GCP"
    cqlsh_url            = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/cqlsh"
    data_endpoint_url    = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/api/rest"
    datacenters          = {
        "GCP.us-central1" = "xxxxxxxxxxxx-xxxxxxxxxxxx-1"
        "GCP.us-east4"    = "xxxxxxxxxxxx-xxxxxxxxxxxx-5"
        "GCP.us-west4"    = "xxxxxxxxxxxx-xxxxxxxxxxxx-3"
    }
    grafana_url          = "https://xxxxxxxxxxxx-us-central1.dashboard.astra.datastax.com/d/cloud/dse-cluster-condensed?refresh=30s&orgId=1&kiosk=tv"
    graphql_url          = "https://xxxxxxxxxxxx-us-central1.apps.astra.datastax.com/api/graphql"
    id                   = "xxxxxxxxxxxx"
    keyspace             = "igtestdb"
    name                 = "igtestdb"
    node_count           = 9
    organization_id      = "xxxxxxxxxxxx"
    owner_id             = "xxxxxxxxxxxx"
    regions              = [
        "us-central1",
        "us-east4",
        "us-west4",
    ]
    replication_factor   = 1
    status               = "ACTIVE"
    total_storage        = 5

    timeouts {
        create = "45m"
    }
}

From this point onwards, every terraform apply or plan results in the following message:

  # module.astra_databases["igtestdb"].astra_database.current will be updated in-place
  ~ resource "astra_database" "current" {
        id                   = "xxxxxxxxxxxx"
        name                 = "igtestdb"
      ~ regions              = [
            "us-central1",
          - "us-west4",
            "us-east4",
          + "us-west4",
        ]
        # (14 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy. 

To verify that the order is wrong, I added one more region, europe-west1:
image

However, with regions = ["us-central1", "us-west4", "us-east4", "europe-west1"], the regions are planned as follows:

  # module.astra_databases["igtestdb"].astra_database.current will be updated in-place
  ~ resource "astra_database" "current" {
        id                   = "xxxxxxxxxxxx"
        name                 = "igtestdb"
      ~ regions              = [
          + "europe-west1",
            "us-central1",
          - "us-west4",
            "us-east4",
          - "europe-west1",
          + "us-west4",
        ]
        # (14 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

As you can see, the order is either lexicographic or it comes from this endpoint:

curl -s -H "Authorization: Bearer xxxx" "https://api.astra.datastax.com/v2/regions/serverless" | jq | grep GCP -B 1 -A 7 | grep name

Either way, it always leads to drift in Terraform.

@giannoul
Author

The version of the datastax/astra provider is 2.1.4

@StevenLacerda

@emerkle826 Hey Erik, I see that you responded to this query, so I don't know if you're working on it, but can you update this?

@emerkle826
Contributor

@giannoul @StevenLacerda Sorry I haven't responded sooner. I've been swamped with other work and haven't had a chance to get to this. I am trying to shuffle things so I can bring some much needed attention back to the provider soon.

@gtseres

gtseres commented Feb 21, 2023

@emerkle826 Did you have any chance to look into this?

@gtseres

gtseres commented Mar 22, 2023

@emerkle826 kind reminder

@emerkle826
Contributor

@gtseres Sorry again for the delay. I am revisiting this now.

@emerkle826
Contributor

@giannoul @gtseres I think I have a solution for this, but it may come at a cost. The problem with the current implementation is that the list of regions is treated as an ordered list inside the Terraform SDK. Anything I do to try to ignore the ordering can't be applied before Terraform detects a change. So once the order doesn't match exactly, any plan or apply will detect a change and require you to accept it. The only way to prevent that is to switch the internal structure to an unordered set.
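As a rough illustration of the list-versus-set distinction described above, the sketch below compares the same three regions in two different orders, first as ordered sequences and then as sets. It uses only Terraform's built-in toset function and == operator in the configuration language; it is not the provider's schema code, and the local and output names are made up for this example.

```hcl
# Hypothetical names, for illustration only; not part of the astra provider.
locals {
  configured = ["us-central1", "us-west4", "us-east4"] # order written in the .tf file
  reported   = ["us-central1", "us-east4", "us-west4"] # order seen in the state above
}

# Ordered comparison: element positions matter, so a reorder counts as a difference.
output "equal_as_lists" {
  value = local.configured == local.reported
}

# Set comparison: only membership matters, so a reorder is not a difference.
output "equal_as_sets" {
  value = toset(local.configured) == toset(local.reported)
}
```

The same trade-off applies inside the provider: an ordered attribute preserves which region comes first, while an unordered one cannot report a reorder as a change.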

Switching to a set, however, means that we lose the special handling of the first element in the regions list being interpreted as the primary datacenter region. If I change the regions array to a set internally, I will have to separate the primary region from the additional regions in order to not lose track of the primary region, which will require a change to your current astra_database definitions.

In addition to a definition change, switching the additional regions from an ordered list to an unordered set means that adding datacenters in other regions will not necessarily be done in the order they are listed in your definition. So if controlling which region DC-2 is deployed to matters, this change would just create new problems.

I would appreciate your input about this change. As an example, this is how a definition would look for multi-datacenters going forward:

resource "astra_database" "terra79" {
  name                = "terra79"
  keyspace            = "testks"
  cloud_provider      = "gcp"
  region              = "us-west4"
  additional_regions  = ["us-east1", "us-east4", "us-central1"]
  deletion_protection = false

  timeouts {
    create = "45m"
    update = "45m"
    delete = "45m"
  }
}

And here is how that database would be represented in the state file:

resource "astra_database" "terra79" {
    additional_keyspaces = []
    additional_regions   = [
        "us-central1",
        "us-east1",
        "us-east4",
    ]
    cloud_provider       = "GCP"
    cqlsh_url            = "https://d0b415b7-41d9-4edd-bb68-2d3283d87874-us-west4.apps.astra.datastax.com/cqlsh"
    data_endpoint_url    = "https://d0b415b7-41d9-4edd-bb68-2d3283d87874-us-west4.apps.astra.datastax.com/api/rest"
    datacenters          = {
        "GCP.us-central1" = "d0b415b7-41d9-4edd-bb68-2d3283d87874-5"
        "GCP.us-east1"    = "d0b415b7-41d9-4edd-bb68-2d3283d87874-3"
        "GCP.us-east4"    = "d0b415b7-41d9-4edd-bb68-2d3283d87874-4"
        "GCP.us-west4"    = "d0b415b7-41d9-4edd-bb68-2d3283d87874-1"
    }
    deletion_protection  = false
    grafana_url          = "https://d0b415b7-41d9-4edd-bb68-2d3283d87874-us-west4.dashboard.astra.datastax.com/d/cloud/dse-cluster-condensed?refresh=30s&orgId=1&kiosk=tv"
    graphql_url          = "https://d0b415b7-41d9-4edd-bb68-2d3283d87874-us-west4.apps.astra.datastax.com/api/graphql"
    id                   = "d0b415b7-41d9-4edd-bb68-2d3283d87874"
    keyspace             = "testks"
    name                 = "terra79"
    node_count           = 12
    organization_id      = <edited for security>
    owner_id             = <edited for security>
    region               = "us-west4"
    replication_factor   = 1
    status               = "ACTIVE"
    total_storage        = 5

    timeouts {
        create = "45m"
        delete = "45m"
        update = "45m"
    }
}

Notice that the order of additional_regions and datacenters is alphabetical and doesn't necessarily match the order they were defined in my definition file. However, with my local changes to the code, I can change the order of the additional_regions and it will not detect a change unless the set of regions has a change (an addition, a deletion, or both).

To sum up, I would appreciate your opinions on this change. In order for the plugin to not detect a change in the set of additional datacenters/regions, you would have to give up deterministic ordering when adding datacenters/regions. If the order in which DCs are added is not important, then I think this change would solve this issue (and be a better design).

@emerkle826 emerkle826 linked a pull request Mar 24, 2023 that will close this issue
@giannoul
Author

Hello @emerkle826
It seems indeed that the order of regions is messed up in some cases without any obvious reason. Splitting the primary region and the additional ones seems like a sound approach. In our case we do not handle the Data Center 2 (DC-2) in any special way, so I am not expecting this to impact us.

We are using the astra_database resource within a custom module but I guess that we will need to customize it in order to accept the now split settings regarding the regions.
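A minimal sketch of how such a wrapper module might accept the split settings, assuming the region and additional_regions attributes proposed earlier in this thread are released as shown. The two variable declarations are hypothetical; var.name, var.keyspace, var.cloud_provider and local.database_creation_delay are assumed to be defined elsewhere in the module, as in the original configuration.

```hcl
# Sketch only: variable names mirror the attributes proposed in this thread
# (region, additional_regions); they are assumptions, not a released schema.
variable "region" {
  type        = string
  description = "Primary region of the database."
}

variable "additional_regions" {
  type        = set(string)
  default     = []
  description = "Extra regions; their order is not significant."
}

resource "astra_database" "current" {
  name               = var.name
  keyspace           = var.keyspace
  cloud_provider     = var.cloud_provider
  region             = var.region
  additional_regions = var.additional_regions

  timeouts {
    # One primary region plus the additional ones, scaled as in the original module.
    create = format("%sm", (1 + length(var.additional_regions)) * local.database_creation_delay)
  }
}
```

Declaring additional_regions as set(string) keeps the module input consistent with the unordered semantics discussed above, so callers cannot rely on ordering that is not honored.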

@emerkle826
Contributor

@giannoul Thank you for the response. I'm finishing up the work in #217. When I release it, I think I will bump the version from 2.1.x to 2.2.x, as it will change how astra_database resources and data sources are defined. I will try to add a short section to the README on how to handle the upgrade when it is released.

@pgier
Collaborator

pgier commented Jul 10, 2023

Did you already try doing a terraform apply --refresh-only? That should re-sync the state with the server. Then just adjust the order in main.tf to match what's in the Astra UI, and the diffs go away.

@CrackerJackMack

Still happens on 2.1.17, even with refresh-only. I had to ignore changes for regions, which makes this Terraform module and resource unusable right now.
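For reference, the ignore-changes workaround mentioned above presumably looks something like the following, using Terraform's standard lifecycle meta-argument on the resource from the original report. This is a sketch of the general technique, not the commenter's actual configuration.

```hcl
resource "astra_database" "current" {
  name           = var.name
  keyspace       = var.keyspace
  cloud_provider = var.cloud_provider
  regions        = var.regions

  lifecycle {
    # Suppresses the perpetual reorder diff, but also hides genuine
    # additions or removals made to var.regions after creation.
    ignore_changes = [regions]
  }
}
```

The cost is that Terraform no longer acts on any later edits to regions, which is likely why it is described as leaving the resource unusable for managing regions.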
