Use dynamic walking to validate unique resource keys #1614
…ction and modify all diagnostics paths to be relative to the bundle root path
k := p[1].Key()

// dyn.Path under the hood is a slice. So, we need to clone it.
pathsByKey[k] = append(pathsByKey[k], slices.Clone(p))
You can use dyn.Path.Append; it works correctly because it copies internally.
dyn.Path.Append is used to combine two paths into a single path. In this case we are tracking two separate paths by keeping them in a []dyn.Path.
@shreyas-goenka but why do you need clone then? You don't seem to modify the path anywhere anyway
The code would not be correct in that case. We would end up appending p itself (i.e. the slice, which behaves like a pointer to its backing array), and the value it points to is changed upstream as we walk the configuration tree.
func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
// The key for the resource. Eg: "my_job" for jobs.my_job.
The value p here is effectively a pointer, and the underlying value is the prefix dyn.Path in the visit functions, which apparently reuse the same pointer.
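The aliasing concern can be reproduced without the dyn package. The following is a minimal sketch, assuming a walker that reuses one backing array for the path prefix; `walk` and `collect` are hypothetical stand-ins, not the library's visit functions.

```go
package main

import (
	"fmt"
	"slices"
)

// walk is a hypothetical stand-in for a tree walker. It builds each path by
// appending to a prefix that has spare capacity, so every path handed to fn
// shares the same backing array.
func walk(keys []string, fn func(p []string)) {
	prefix := make([]string, 1, 2) // length 1, capacity 2
	prefix[0] = "resources"
	for _, k := range keys {
		p := append(prefix, k) // no reallocation: writes into the shared array
		fn(p)
	}
}

// collect stores every visited path, optionally cloning it first.
func collect(keys []string, clone bool) [][]string {
	var out [][]string
	walk(keys, func(p []string) {
		if clone {
			p = slices.Clone(p) // copy into a fresh backing array
		}
		out = append(out, p)
	})
	return out
}

func main() {
	keys := []string{"jobs", "pipelines"}
	fmt.Println(collect(keys, false)) // [[resources pipelines] [resources pipelines]]
	fmt.Println(collect(keys, true))  // [[resources jobs] [resources pipelines]]
}
```

Without the clone, every stored path ends up showing the last visited key, which is the failure mode described in this thread.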
libs/dyn/mapping.go
return m.pairs
pairs := make([]Pair, len(m.pairs))
copy(pairs, m.pairs)
sort.Slice(pairs, func(i, j int) bool {
Why do we need it?
To make the order in which dyn.MapByPattern walks configuration fields deterministic.
Even though dyn.Mapping represents key-value fields as a slice, the order of the elements in that slice is influenced by multiple sources of randomness, like the order in which configuration files are parsed, glob patterns are expanded, or empty values are added to the configuration tree during normalization.
Without this we won't be able to make the assertions on []dyn.Location and []dyn.Path that we make in the unit tests added in this PR.
This modification also makes the Pairs() function nicer, since the order is now guaranteed based on the content, similar to how maps work in C++.
> similar to how maps work in C++.

Except they are implemented differently in C++, not with sorting :)

> Without this we wont be able to make the assertions on []dyn.Location and []dyn.Path we make in the unit tests added in this PR.

Instead of assert.Equal(t, tc.diagnostics, diags), you could loop through the paths / locations and use Contains, right?
The downside is that what used to be an O(1) call now becomes an O(n log n) call, and we use .Pairs() extensively (think of the visit call as an example).

> randomness like the order in which configuration files are parsed, glob patterns are expanded or empty values are added to the configuration tree during normalization

Is it really true? Most if not all of these properties are represented as slices and should be deterministic.
If we really need alphabetical order and want to preserve performance, we could do the sorting in the Add / Append operation on the map.
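The suggestion to sort at insertion time can be sketched as follows. This is a minimal illustration, assuming a simplified mapping; `pair`, `mapping`, `Set`, and `Keys` are hypothetical names and not the dyn.Mapping API.

```go
package main

import (
	"fmt"
	"slices"
	"sort"
)

// pair and mapping are hypothetical simplifications of a key-value mapping.
type pair struct {
	Key   string
	Value int
}

type mapping struct {
	pairs []pair // invariant: always sorted by Key
}

// Set inserts or updates a key while keeping pairs sorted.
// The binary search is O(log n); the insert shifts elements, so it is O(n).
func (m *mapping) Set(key string, value int) {
	i := sort.Search(len(m.pairs), func(i int) bool { return m.pairs[i].Key >= key })
	if i < len(m.pairs) && m.pairs[i].Key == key {
		m.pairs[i].Value = value
		return
	}
	m.pairs = slices.Insert(m.pairs, i, pair{Key: key, Value: value})
}

// Keys returns the keys in their stored (sorted) order.
func (m *mapping) Keys() []string {
	keys := make([]string, len(m.pairs))
	for i, p := range m.pairs {
		keys[i] = p.Key
	}
	return keys
}

func main() {
	m := &mapping{}
	m.Set("b", 1)
	m.Set("a", 2)
	m.Set("c", 3)
	fmt.Println(m.Keys()) // [a b c]
}
```

With this layout a reader can return the stored slice directly, so the read path does not pay for a copy and sort on every call; the cost moves to writes instead.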
> Is it really true? Most if not all of this properties are represented as slices and should be deterministic.

Yeah, examples of non-determinism include:
cli/libs/dyn/convert/normalize.go, line 128 in 383d580:
for k, index := range info.Fields {
Line 43 in 383d580:
for k, v := range vin {
The feedback about performance is fair. I figured the cost would be dwarfed by the API calls / file IO, but let me think of a way to retain performance.
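Both cited loops use the `for k, v := range` form; assuming the ranged values are Go maps, their iteration order is unspecified and can differ between runs. A common way to get a stable order is to collect the keys and sort them first. The helper name `sortedKeys` below is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, so that callers can
// iterate over the map deterministically.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m { // map iteration order is unspecified in Go
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	fields := map[string]int{"pipelines": 2, "jobs": 1, "models": 3}
	for _, k := range sortedKeys(fields) {
		fmt.Println(k, fields[k])
	}
}
```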
Done, now we sort the locations inline to assert the diags.
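Sorting inline before asserting might look roughly like the sketch below. The `location` struct and `sortLocations` helper are hypothetical stand-ins for illustration; the actual test code in the PR may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// location is a hypothetical stand-in for a source location in a config file.
type location struct {
	File   string
	Line   int
	Column int
}

// sortLocations orders locations by file, then line, then column, so that
// two slices with the same elements compare equal regardless of walk order.
func sortLocations(locs []location) {
	sort.Slice(locs, func(i, j int) bool {
		a, b := locs[i], locs[j]
		if a.File != b.File {
			return a.File < b.File
		}
		if a.Line != b.Line {
			return a.Line < b.Line
		}
		return a.Column < b.Column
	})
}

func main() {
	locs := []location{
		{File: "resources2.yml", Line: 4, Column: 7},
		{File: "databricks.yml", Line: 13, Column: 7},
		{File: "databricks.yml", Line: 10, Column: 7},
	}
	sortLocations(locs)
	fmt.Println(locs)
}
```

This keeps the cost of ordering inside the tests, so production callers of Pairs() are unaffected.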
Bundles:
* Add resource for UC schemas to DABs ([#1413](#1413)).

Internal:
* Use dynamic walking to validate unique resource keys ([#1614](#1614)).
* Regenerate TF schema ([#1635](#1635)).
* Add upgrade and upgrade eager flags to pip install call ([#1636](#1636)).
* Added test for negation pattern in sync include exclude section ([#1637](#1637)).
* Use precomputed terraform plan for `bundle deploy` ([#1640](#1640)).
Changes
This PR:
* Uses dynamic walking (via the dyn.MapByPattern func) to validate that no two resources have the same resource key. This allows us to remove this validation at merge time.
* Modifies dyn.Mapping to always return a sorted slice of pairs. This makes traversal functions like dyn.Walk or dyn.MapByPattern deterministic.

Tests
Unit tests. Also manually.
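The validation described under Changes can be sketched without the dyn package: group every visited path by its resource key and flag keys that appear at more than one path. This is a simplified illustration; `duplicateKeys` is a hypothetical helper, and paths are assumed to be relative to the resources root (e.g. jobs.my_job), so index 1 holds the resource key.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// duplicateKeys groups paths by resource key and returns only the keys that
// are defined at more than one path. Each path is assumed to have the form
// [resourceType, resourceKey], e.g. ["jobs", "my_job"].
func duplicateKeys(paths [][]string) map[string][]string {
	pathsByKey := map[string][]string{}
	for _, p := range paths {
		k := p[1] // the resource key, e.g. "my_job" for jobs.my_job
		pathsByKey[k] = append(pathsByKey[k], strings.Join(p, "."))
	}
	dups := map[string][]string{}
	for k, ps := range pathsByKey {
		if len(ps) > 1 {
			sort.Strings(ps) // stable order for diagnostics and assertions
			dups[k] = ps
		}
	}
	return dups
}

func main() {
	paths := [][]string{
		{"pipelines", "foo"},
		{"jobs", "foo"},
		{"jobs", "bar"},
	}
	fmt.Println(duplicateKeys(paths)) // map[foo:[jobs.foo pipelines.foo]]
}
```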