
Migrate cluster schema to use the go-sdk struct #3076

Merged: 32 commits from edwardfeng-db/migrate-cluster-schema into master on Jan 29, 2024

Conversation


@edwardfeng-db edwardfeng-db commented Jan 5, 2024

Changes

  • Before we can migrate the cluster resources to the go-sdk schema, we need a way to represent the current tf tags on the manually maintained structs.
  • This PR introduces a new interface, ResourceProviderStruct, with methods for aliases and for schema customizations (sketched below):
    • Aliases() map[string]string
    • CustomizeSchema(map[string]*schema.Schema) map[string]*schema.Schema
  • Also added a series of customization functions (SetOptional, etc.) so that customization logic can be written in the CustomizeSchema function to replace the current tf tags.
  • Migrated the cluster schema to the go-sdk schema using this new interface.
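
A minimal sketch of the interface described above (the two method signatures come from this PR; the package layout, import, and comments are assumptions):

package common

import (
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// ResourceProviderStruct is the contract sketched above: implementations wrap
// a go-sdk struct and declare how its derived schema differs from the plain
// reflection output.
type ResourceProviderStruct interface {
	// Aliases maps go-sdk field names to the tf attribute names used
	// historically, so renamed SDK fields keep their existing state keys.
	Aliases() map[string]string
	// CustomizeSchema applies the overrides (required, optional, computed,
	// suppress diff, ...) that were previously encoded as tf struct tags.
	CustomizeSchema(map[string]*schema.Schema) map[string]*schema.Schema
}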

Tests

  • make test run locally
  • relevant change in docs/ folder
  • covered with integration tests in internal/acceptance
  • relevant acceptance tests are passing
  • using Go SDK

@edwardfeng-db edwardfeng-db force-pushed the edwardfeng-db/migrate-cluster-schema branch 2 times, most recently from 39a8b24 to 3eb2e0b Compare January 8, 2024 17:57

codecov-commenter commented Jan 8, 2024

Codecov Report

Attention: 73 lines in your changes are missing coverage. Please review.

Comparison is base (16c61e5) 84.31% compared to head (78fc653) 83.46%.
Report is 20 commits behind head on master.

Additional details and impacted files


@@            Coverage Diff             @@
##           master    #3076      +/-   ##
==========================================
- Coverage   84.31%   83.46%   -0.86%     
==========================================
  Files         162      166       +4     
  Lines       14412    14710     +298     
==========================================
+ Hits        12152    12278     +126     
- Misses       1563     1719     +156     
- Partials      697      713      +16     
Files                          Coverage Δ
clusters/clusters_api.go       84.48% <ø> (ø)
clusters/resource_cluster.go   87.86% <100.00%> (+2.09%) ⬆️
common/reflect_resource.go     82.93% <59.25%> (-1.45%) ⬇️
common/customizable_schema.go  0.00% <0.00%> (ø)

... and 20 files with indirect coverage changes

@edwardfeng-db edwardfeng-db marked this pull request as ready for review January 8, 2024 22:38
@edwardfeng-db edwardfeng-db requested review from a team as code owners January 8, 2024 22:38
field := strings.TrimSuffix(k, ".#")
log.Printf("[DEBUG] Suppressing diff for list or set %v: no value configured but platform returned some value (likely {})", field)
return true
}
edwardfeng-db (Contributor, Author) commented:

This change is introduced in #3044
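
For context, this fragment sits inside a diff-suppress function; a hedged reconstruction of the surrounding shape (the function name and exact guard condition are assumptions, the real code lives in #3044):

package common

import (
	"log"
	"strings"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// makeEmptyBlockSuppressFunc is a hypothetical reconstruction: suppress the
// diff on a list/set counter key ("foo.#") when nothing is configured but
// the platform returned some value (typically an empty block).
func makeEmptyBlockSuppressFunc() schema.SchemaDiffSuppressFunc {
	return func(k, old, new string, d *schema.ResourceData) bool {
		if strings.HasSuffix(k, ".#") && new == "0" && old != "0" {
			field := strings.TrimSuffix(k, ".#")
			log.Printf("[DEBUG] Suppressing diff for list or set %v: no value configured but platform returned some value (likely {})", field)
			return true
		}
		return false
	}
}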

@@ -74,6 +218,111 @@ func MustSchemaPath(s map[string]*schema.Schema, path ...string) *schema.Schema
return sch
}

edwardfeng-db (Contributor, Author) commented:

Will need to figure out a way to merge these with the code introduced in #3044. Shouldn't be too hard.

common.CustomizeSchemaPath(s, "cluster_log_conf", "dbfs", "destination").SetRequired()
common.CustomizeSchemaPath(s, "cluster_log_conf", "s3", "destination").SetRequired()
common.CustomizeSchemaPath(s, "spark_version").SetRequired()
common.CustomizeSchemaPath(s, "cluster_id").SetReadOnly()
edwardfeng-db (Contributor, Author) commented:

This will introduce a "breaking" diff, but it shouldn't be a problem since users are not expected to set this anyway. This also addresses #2966.
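
For illustration, SetReadOnly plausibly flips an attribute to computed-only; a hedged sketch (the CustomizableSchema wrapper shape and the method body are assumptions):

package common

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

// CustomizableSchema wraps a *schema.Schema for chained customization calls;
// the wrapper shape is assumed for this sketch.
type CustomizableSchema struct {
	Schema *schema.Schema
}

// SetReadOnly marks an attribute as provider-populated: users can no longer
// set it in configuration (hence the "breaking" diff mentioned above) and
// the provider fills it in instead.
func (s *CustomizableSchema) SetReadOnly() *CustomizableSchema {
	s.Schema.Optional = false
	s.Schema.Required = false
	s.Schema.Computed = true
	return s
}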

@edwardfeng-db edwardfeng-db force-pushed the edwardfeng-db/migrate-cluster-schema branch from bcff531 to 72164a9 Compare January 12, 2024 09:40
@tanmay-db tanmay-db left a comment

Thanks for the PR, left some comments. Just to confirm: the nightly did pass on this PR, right?

s["autotermination_minutes"].Default = 60
s["cluster_id"] = &schema.Schema{
func (ClusterResourceProvider) CustomizeSchema(s map[string]*schema.Schema) map[string]*schema.Schema {
common.CustomizeSchemaPath(s, "enable_elastic_disk").SetOptional().SetComputed()
tanmay-db (Contributor) commented:

This looks a bit difficult to maintain, and I think something similar would be required for the jobs resource as well. Should we split this up so it's easier to read, for example by top-level field (init_scripts, workload_type)? A question: how do we check that the customizations are complete?

edwardfeng-db (Contributor, Author) replied:

Yeah, for the jobs resource we would just set whatever path leads to the cluster reference to the schema derived from ClusterResourceProvider, so we don't have to duplicate it.

For making sure the customizations are complete, we rely on the diff-schema PR check, and I verified that there's no breaking change (only added fields, nothing removed).

mgyucht (Contributor) commented:

Ultimately, these things need to be derived one way or another from the OpenAPI specification. From my read, the main classes of issues are:

  1. Computed fields (not in the OpenAPI spec, but potentially could be derived from comparing get to create/update).
  2. Optional fields (this should be in the OpenAPI spec).
  3. Required fields (this should be in the OpenAPI spec).
  4. Fields that conflict with other fields (not yet in the OpenAPI spec, but can be with oneOf support).
  5. OpenAPI spec missing information (in the short term, we can file a ticket to the upstream team, @edwardfeng-db would be great if you can do this; in the long term, our OpenAPI generator from @hectorcast-db will address this).
  6. MaxItems validation (this could also be in the OpenAPI spec).
  7. Deprecated fields (this is in the OpenAPI spec but is lost in the translation to the Go SDK).
  8. Enum values (these are in the OpenAPI spec, but there is no convenient way to get all enum values in the Go SDK).
  9. Sensitive data (needs an extension, probably, but could be in the OpenAPI spec).
  10. Suppress diff

Here's how I think we can handle this:
2: Fields are optional by default, so I'm curious whether/why we need to call SetOptional() for so many of them. @edwardfeng-db why is this?
3, 5: File a ticket with the Clusters team to update their OpenAPI spec so that it is complete, including missing fields and the correct annotations for required/optional.
8: We can add a method to enums to list all values; then we could automatically compute a validator for enum fields in the resource (a hypothetical sketch follows below). Let's file a ticket for this @edwardfeng-db.
1, 4, 6, 7, 9, 10: I think most of these will be most easily addressed as we try to autogenerate the provider implementation from the OpenAPI spec itself; otherwise, they require hacks to embed this metadata somehow in Go SDK struct field tags.
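
To illustrate item 8: if the go-sdk generated a Values() helper on each enum, the provider could derive validators mechanically. A hypothetical sketch (no such helper exists today, which is exactly the gap; the hardcoded list stands in for it):

package common

import (
	"github.com/databricks/databricks-sdk-go/service/compute"
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/validation"
)

// awsAvailabilityValues stands in for a generated Values() method on the
// compute.AwsAvailability enum; hardcoded here only for the sketch.
func awsAvailabilityValues() []string {
	return []string{
		string(compute.AwsAvailabilityOnDemand),
		string(compute.AwsAvailabilitySpot),
		string(compute.AwsAvailabilitySpotWithFallback),
	}
}

// enumValidator turns an enum's value list into a validator that the schema
// builder could attach automatically to every enum-typed field.
func enumValidator(values []string) schema.SchemaValidateFunc {
	return validation.StringInSlice(values, false)
}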

@edwardfeng-db edwardfeng-db commented Jan 18, 2024

Thanks. This sounds good. I'll create the follow-up tickets.

For 2, there were many calls to SetOptional() because I was following the previous override function, but I realized that a lot of these can be taken out because omitempty exists in the OpenAPI spec. I took the unnecessary ones out; it should look cleaner now (see the example below).
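
For reference, optionality falls out of the json tags on the go-sdk structs; a small illustrative example (field names are examples, not the exact SDK struct):

package common

// ClusterSpecExample is illustrative: the reflection-based schema builder
// treats fields tagged with omitempty as optional, so they need no explicit
// SetOptional() call; fields without omitempty come out as required.
type ClusterSpecExample struct {
	SparkVersion           string `json:"spark_version"`                     // required
	NumWorkers             int    `json:"num_workers,omitempty"`             // optional
	AutoterminationMinutes int    `json:"autotermination_minutes,omitempty"` // optional
}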

@edwardfeng-db edwardfeng-db commented Jan 18, 2024

Ticket for enums: https://databricks.atlassian.net/browse/DECO-8056

Ticket for the Clusters team to fix the OpenAPI spec: https://databricks.atlassian.net/browse/ES-1004564

@mgyucht mgyucht left a comment

Looking great. I don't think we need to block on the OpenAPI changes for this PR, but I do think we should take that as an immediate follow-up. Just a few comments, and we need to add a unit test for the alias behavior changes.

@tanmay-db let's also take note of these elements as things that need to be addressed as part of the autogeneration of TF resources from OpenAPI spec.

@edwardfeng-db edwardfeng-db force-pushed the edwardfeng-db/migrate-cluster-schema branch from 4670568 to 61d4747 Compare January 18, 2024 09:29
@mgyucht mgyucht left a comment

LGTM, thanks man!

@alexott alexott left a comment

There are still changes required in the cluster resource.


alexott commented Jan 18, 2024

There is a permanent drift at least on Azure:

  # databricks_cluster.shared_autoscaling will be updated in-place
  ~ resource "databricks_cluster" "shared_autoscaling" {
        id                           = "0118-192555-7kqkpwml"
        # (14 unchanged attributes hidden)

      ~ azure_attributes {
          - availability       = "ON_DEMAND_AZURE" -> null
          - first_on_demand    = 1 -> null
          - spot_bid_max_price = -1 -> null
        }

        # (1 unchanged block hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Code to reproduce:

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "1.34.0"
    }
  }
}

provider "databricks" {
  # Configuration options
}

data "databricks_node_type" "smallest" {
  local_disk = true
}

data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}

resource "databricks_cluster" "shared_autoscaling" {
  cluster_name            = "Shared Autoscaling"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  idempotency_token       = "123"
  autoscale {
    min_workers = 1
    max_workers = 50
  }
}

@edwardfeng-db edwardfeng-db force-pushed the edwardfeng-db/migrate-cluster-schema branch from eefd9ad to 78bf8bd Compare January 19, 2024 13:39
@edwardfeng-db edwardfeng-db commented:

@alexott Thanks! I fixed the permanent drift on azure_attributes; it was caused by suppress_diff not being set on those fields. Now we set them properly (a sketch of the shape of the fix follows below).
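
A hedged sketch of that fix, assuming the SetSuppressDiff customization helper (the function name and exact paths are illustrative; the real call sites live in the cluster schema customization):

package clusters

import (
	"github.com/databricks/terraform-provider-databricks/common"
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// suppressCloudAttributeDrift marks the cloud-specific attribute blocks as
// suppress-diff so that server-populated defaults (availability,
// first_on_demand, spot_bid_max_price, ...) no longer surface as drift.
func suppressCloudAttributeDrift(s map[string]*schema.Schema) map[string]*schema.Schema {
	for _, attr := range []string{"aws_attributes", "azure_attributes", "gcp_attributes"} {
		common.CustomizeSchemaPath(s, attr).SetSuppressDiff()
	}
	return s
}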


mgyucht commented Jan 22, 2024

TODO: check that aws and gcp attributes don't drift. We can add an integration test verifying that cluster creation works with no custom attributes set.


mgyucht commented Jan 24, 2024

Checked other nightlies: they did fail previously before suppress diff was added to the cloud-specific resources.

@mgyucht mgyucht requested a review from alexott January 24, 2024 15:42
@mgyucht mgyucht added this pull request to the merge queue Jan 29, 2024
Merged via the queue into master with commit 277537a Jan 29, 2024
5 checks passed
@mgyucht mgyucht deleted the edwardfeng-db/migrate-cluster-schema branch January 29, 2024 10:00
tanmay-db added a commit that referenced this pull request Feb 5, 2024
### New Features and Improvements
* Exporter: timestamps are now added to log entries ([#3146](#3146)).
* Validate metastore id for databricks_grant and databricks_grants resources ([#3159](#3159)).
* Exporter: Skip emitting of clusters that come from more cluster sources ([#3161](#3161)).
* Fix typo in docs ([#3166](#3166)).
* Migrate cluster schema to use the go-sdk struct ([#3076](#3076)).
* Introduce Generic Settings Resource ([#2997](#2997)).
* Update actions/setup-go to v5 ([#3154](#3154)).
* Change default branch from `master` to `main` ([#3174](#3174)).
* Add .codegen.json configuration ([#3180](#3180)).
* Exporter: performance improvements for big workspaces ([#3167](#3167)).
* update ([#3192](#3192)).
* Exporter: fix generation of cluster policy resources ([#3185](#3185)).
* Fix unit test ([#3201](#3201)).
* Suppress diff should apply to new fields added in the same chained call to CustomizableSchema ([#3200](#3200)).
* Various documentation updates ([#3198](#3198)).
* Use common.Resource consistently throughout the provider ([#3193](#3193)).
* Extending customizable schema with `AtLeastOneOf`, `ExactlyOneOf`, `RequiredWith` ([#3182](#3182)).
* Fix `databricks_connection` regression when creating without owner ([#3186](#3186)).
* add test code for job task order ([#3183](#3183)).
* Allow using empty strings as job parameters ([#3158](#3158)).
* Fix notebook parameters in acceptance test ([#3205](#3205)).
* Exporter: Add retries for `Search`, `ReadContext` and `Import` operations when importing the resource ([#3202](#3202)).
* Fixed updating owners for UC resources ([#3189](#3189)).
* Adds `databricks_volumes` as data source  ([#3150](#3150)).

### Documentation Changes

### Exporter

### Internal Changes
@tanmay-db tanmay-db mentioned this pull request Feb 5, 2024
github-merge-queue bot pushed a commit that referenced this pull request Feb 6, 2024

* Release v1.35.1 (changelog identical to the entry above)