Migrate cluster schema to use the go-sdk struct #3076
Conversation
Force-pushed from 39a8b24 to 3eb2e0b
Codecov Report

Additional details and impacted files:
@@ Coverage Diff @@
## master #3076 +/- ##
==========================================
- Coverage 84.31% 83.46% -0.86%
==========================================
Files 162 166 +4
Lines 14412 14710 +298
==========================================
+ Hits 12152 12278 +126
- Misses 1563 1719 +156
- Partials 697 713 +16
common/reflect_resource.go
Outdated
field := strings.TrimSuffix(k, ".#")
log.Printf("[DEBUG] Suppressing diff for list or set %v: no value configured but platform returned some value (likely {})", field)
return true
}
This change is introduced in #3044
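For context, the suppression logic above can be sketched as a standalone check (a simplified sketch, not the provider's actual `DiffSuppressFunc`; the key/value semantics mirror how Terraform encodes list and set lengths under keys ending in `.#`):

```go
package main

import (
	"fmt"
	"strings"
)

// suppressEmptyListDiff is a simplified sketch of the check in the snippet
// above: when nothing is configured for a list or set (new == "0") but the
// platform returned some value (old != "0"), the diff is suppressed.
func suppressEmptyListDiff(k, old, new string) bool {
	if strings.HasSuffix(k, ".#") && new == "0" && old != "0" {
		field := strings.TrimSuffix(k, ".#")
		fmt.Printf("[DEBUG] Suppressing diff for list or set %v\n", field)
		return true
	}
	return false
}

func main() {
	// Platform returned a block the user never configured: suppress.
	fmt.Println(suppressEmptyListDiff("workload_type.#", "1", "0"))
	// Not a list-length key: never suppress.
	fmt.Println(suppressEmptyListDiff("num_workers", "1", "2"))
}
```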
common/reflect_resource.go
Outdated
@@ -74,6 +218,111 @@ func MustSchemaPath(s map[string]*schema.Schema, path ...string) *schema.Schema
	return sch
}
Will need to figure out a way to merge these with the code introduced in #3044. Shouldn't be too hard
clusters/resource_cluster.go
Outdated
common.CustomizeSchemaPath(s, "cluster_log_conf", "dbfs", "destination").SetRequired()
common.CustomizeSchemaPath(s, "cluster_log_conf", "s3", "destination").SetRequired()
common.CustomizeSchemaPath(s, "spark_version").SetRequired()
common.CustomizeSchemaPath(s, "cluster_id").SetReadOnly()
This will introduce a "breaking" diff, but it shouldn't be a problem since users are not expected to set this anyway. It will also address #2966.
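As a rough illustration of what marking a field read-only implies (a minimal sketch with a stand-in schema type, not the provider's actual `CustomizableSchema` implementation):

```go
package main

import "fmt"

// Schema is a minimal stand-in for the Terraform SDK's schema.Schema,
// reduced to the three flags relevant here.
type Schema struct {
	Optional bool
	Required bool
	Computed bool
}

// SetReadOnly sketches the customization referenced above: a read-only
// field such as cluster_id is server-assigned, so it is computed and can
// be neither required nor set by the user.
func (s *Schema) SetReadOnly() *Schema {
	s.Optional = false
	s.Required = false
	s.Computed = true
	return s
}

func main() {
	clusterID := &Schema{Optional: true}
	clusterID.SetReadOnly()
	fmt.Printf("optional=%v required=%v computed=%v\n",
		clusterID.Optional, clusterID.Required, clusterID.Computed)
}
```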
Force-pushed from bcff531 to 72164a9
Thanks for the PR, left some comments. Just to confirm, the nightly did pass on this PR, right?
clusters/resource_cluster.go
Outdated
s["autotermination_minutes"].Default = 60
s["cluster_id"] = &schema.Schema{
func (ClusterResourceProvider) CustomizeSchema(s map[string]*schema.Schema) map[string]*schema.Schema {
common.CustomizeSchemaPath(s, "enable_elastic_disk").SetOptional().SetComputed()
This looks a bit difficult to maintain, and I think something similar would be required for the jobs resource as well. Should we split this up so it's easier to read? For example, we could split on top-level fields such as `init_scripts` and `workload_type`. A question: how do we check if the customizations are complete?
Yeah, for the jobs resource we would just set whatever path leads to the cluster reference to the schema derived from `ClusterResourceProvider`, so we don't have to duplicate it.
For making sure the customizations are complete, we rely on the diff-schema PR check, and I verified that there's no breaking change (only added fields, nothing removed).
Ultimately, these things need to be derived one way or another from the OpenAPI specification. From my read, the main classes of issues are:
1. Computed fields (not in the OpenAPI spec, but potentially could be derived from comparing get to create/update).
2. Optional fields (this should be in the OpenAPI spec).
3. Required fields (this should be in the OpenAPI spec).
4. Fields that conflict with other fields (not yet in the OpenAPI spec, but can be with `oneOf` support).
5. OpenAPI spec missing information (in the short term, we can file a ticket to the upstream team, @edwardfeng-db would be great if you can do this; in the long term, our OpenAPI generator from @hectorcast-db will address this).
6. MaxItems validation (this could also be in the OpenAPI spec).
7. Deprecated fields (this is in the OpenAPI spec but is lost in the translation to the Go SDK).
8. Enum values (these are in the OpenAPI spec, but there is no convenient way to get all enum values in the Go SDK).
9. Sensitive data (needs an extension, probably, but could be in the OpenAPI spec).
10. Suppress diff.
Here's how I think we can handle this:
- 2: Fields are optional by default, so I'm curious whether/why we need to call `SetOptional()` for so many of them. @edwardfeng-db why is this?
- 3, 5: File a ticket with the Clusters team to update their OpenAPI spec so that it is complete, including missing fields and the correct annotations for required/optional.
- 8: We can add a method to enums to list all values; then, we could automatically compute a validator for enum fields in the resource. Let's file a ticket for this @edwardfeng-db.
- 1, 4, 6, 7, 9, 10: I think most of these will be most easily addressed as we try to autogenerate the provider implementation from the OpenAPI spec itself; otherwise, they require hacks to embed this metadata somehow in Go SDK struct field tags.
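The enum idea in point 8 could look roughly like this (a sketch: the `Values()` method is the proposal and does not exist in the Go SDK today, and the enum type and its values here are local stand-ins, not SDK definitions):

```go
package main

import "fmt"

// RuntimeEngine is a local stand-in for a string-backed Go SDK enum.
type RuntimeEngine string

const (
	RuntimeEngineStandard RuntimeEngine = "STANDARD"
	RuntimeEnginePhoton   RuntimeEngine = "PHOTON"
)

// Values lists every allowed value; this is the proposed method.
func (RuntimeEngine) Values() []string {
	return []string{string(RuntimeEngineStandard), string(RuntimeEnginePhoton)}
}

// enumValidator builds a validation func from the listed values,
// approximating an auto-computed schema validator for enum fields.
func enumValidator(values []string) func(string) error {
	allowed := make(map[string]bool, len(values))
	for _, v := range values {
		allowed[v] = true
	}
	return func(v string) error {
		if !allowed[v] {
			return fmt.Errorf("%q is not one of %v", v, values)
		}
		return nil
	}
}

func main() {
	validate := enumValidator(RuntimeEngine("").Values())
	fmt.Println(validate("PHOTON") == nil) // valid value
	fmt.Println(validate("TURBO") == nil)  // invalid value
}
```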
Thanks, this sounds good. I'll create the follow-up tickets.
For 2, there were many calls to `SetOptional()` because I was following the previous override function, but I realized that a lot of these can be taken out because `omit_empty` exists in the OpenAPI spec. I just took the unnecessary ones out; it should look cleaner now.
Ticket for enums: https://databricks.atlassian.net/browse/DECO-8056
Ticket for the cluster team to fix the OpenAPI spec: https://databricks.atlassian.net/browse/ES-1004564
Looking great. I don't think we need to block on the OpenAPI changes for this PR, but I do think we should take that as an immediate follow-up. Just a few comments, and we need to add a unit test for the alias behavior changes.
@tanmay-db let's also take note of these elements as things that need to be addressed as part of the autogeneration of TF resources from OpenAPI spec.
Force-pushed from 4670568 to 61d4747
LGTM, thanks man!
There are still changes required in the cluster resource. There is a permanent drift, at least on Azure.
Code to reproduce:

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "1.34.0"
    }
  }
}

provider "databricks" {
  # Configuration options
}

data "databricks_node_type" "smallest" {
  local_disk = true
}

data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}

resource "databricks_cluster" "shared_autoscaling" {
  cluster_name            = "Shared Autoscaling"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  idempotency_token       = "123"
  autoscale {
    min_workers = 1
    max_workers = 50
  }
}
Force-pushed from eefd9ad to 78bf8bd
@alexott Thanks! I fixed the problem with the permanent drift on Azure.
Todo: check that AWS & GCP attributes don't drift. Can add an integration test that cluster creation works with no custom attributes set.
Checked the other nightlies: they did fail previously, before suppress diff was added to the cloud-specific resources.
### New Features and Improvements
* Exporter: timestamps are now added to log entries ([#3146](#3146)).
* Validate metastore id for databricks_grant and databricks_grants resources ([#3159](#3159)).
* Exporter: Skip emitting of clusters that come from more cluster sources ([#3161](#3161)).
* Fix typo in docs ([#3166](#3166)).
* Migrate cluster schema to use the go-sdk struct ([#3076](#3076)).
* Introduce Generic Settings Resource ([#2997](#2997)).
* Update actions/setup-go to v5 ([#3154](#3154)).
* Change default branch from `master` to `main` ([#3174](#3174)).
* Add .codegen.json configuration ([#3180](#3180)).
* Exporter: performance improvements for big workspaces ([#3167](#3167)).
* update ([#3192](#3192)).
* Exporter: fix generation of cluster policy resources ([#3185](#3185)).
* Fix unit test ([#3201](#3201)).
* Suppress diff should apply to new fields added in the same chained call to CustomizableSchema ([#3200](#3200)).
* Various documentation updates ([#3198](#3198)).
* Use common.Resource consistently throughout the provider ([#3193](#3193)).
* Extending customizable schema with `AtLeastOneOf`, `ExactlyOneOf`, `RequiredWith` ([#3182](#3182)).
* Fix `databricks_connection` regression when creating without owner ([#3186](#3186)).
* add test code for job task order ([#3183](#3183)).
* Allow using empty strings as job parameters ([#3158](#3158)).
* Fix notebook parameters in acceptance test ([#3205](#3205)).
* Exporter: Add retries for `Search`, `ReadContext` and `Import` operations when importing the resource ([#3202](#3202)).
* Fixed updating owners for UC resources ([#3189](#3189)).
* Adds `databricks_volumes` as data source ([#3150](#3150)).

### Documentation Changes

### Exporter

### Internal Changes
Changes
- Removed the `tf` tags on the manually maintained structs.
- Introduced `ResourceProviderStruct`, which contains a method for aliases and all of the customizations: `Aliases() map[string]string` and `CustomizeSchema(map[string]*schema.Schema) map[string]*schema.Schema`.
- Added helper methods such as `SetOptional`, etc. so that we can easily write customization logic into the `CustomizeSchema` function to replace the current `tf` tags.

Tests
- `make test` run locally
- relevant change in `docs/` folder
- covered with integration tests in `internal/acceptance`