Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic deepmerge() function #31815

Closed
schollii opened this issue Sep 19, 2022 · 23 comments
Closed

Generic deepmerge() function #31815

schollii opened this issue Sep 19, 2022 · 23 comments
Labels
enhancement functions new new issue not yet triaged

Comments

@schollii
Copy link

schollii commented Sep 19, 2022

Terraform Version

1.2.9

Use Cases

Merging of files read via jsondecode and yamldecode, merge of nested maps and objects tree typically used for configuration or lookups

Attempted Solutions

  • merge(): does not recursively merge values that are objects and maps

  • Kyle Kotowik's deepmerge module: is computationally intensive (since the tree must be flattened, merged, then re-created) and is currently not usable if infracost is used for PR costing

  • Provider: not attempted yet but it is a fallback solution if this proposal is not accepted, because more work to create and maintain than the proposed builtin, and more work to use (since will have to add a provider, define data source etc)

Proposal

The deepmerge(list_of_items_to_merge, merge_config}) would behave like this:

Takes a list of items to merge [item1, item2, ...], and an optional merge-configuration object {list_merge_strategy="replace", type_mismatch_strategy="complex_wins"}, and returns a single item that is a merge of all items, based on the merge configuration.

The merge algorithm for list_of_items_to_merge is as follows:

  • items of different types are handled based on the type_mismatch_strategy:
    - complex_wins: object and map win over lists which win over literals; the losers are removed from the merge;
    - abort: abort the terraform plan or apply
  • if all items are literals, the rightmost item is used
  • if the remaining items are lists, the list_merge_strategy specified (which defaults to "replace") is applied:
    - replace: the list of the right most argument applies
    - concat: the sequence of lists is concatenated in the same order as arguments
    - merge: each list is appended with null so they all have the same length, then the deepmerge() is applied to each slice through the lists
  • if the remaining items are maps or objects (the most common case), if more than one defines the same key or attribute, the deepmerge() is applied to the sequence of associated values across all the items

This is backwards compatible with all versions of terraform, and should cover 99% of use cases.

Examples:

deepmerge([a, b, c, d]):

  • if a, b, c, d are all literals (string, bool, number), then d is return
  • if all items are lists, then they are all merged according to list_merge_strategy
  • if a and b are lists and c and d are literals, then c and d are discarded and a and b are merged according to list_merge_strategy
  • if a and b are maps or objects then c and d are discarded and for each key in a, b, deepmerge() is called on the corresponding list of associated values, and objects that don't have the key are not included. So if a has key1 and key2, and b has key1 and key3, then the set of keys of the result is (key1, key2, key3) and there will be 3 calls to deepmerge(): deepmerge([a.key1, b.key1]), deepmerge([a.key2]), and deepmerge([b.key3]) (the merge-config is passed to these but left out here for simplicity)
  • if all objects are maps or objects then the previous rule is applied to all items instead of just a and b

References

Many threads of discussion about this on community forum and github issues related to merge(). Also, the table in #24987 (comment).

@schollii schollii added enhancement new new issue not yet triaged labels Sep 19, 2022
@schollii schollii changed the title Simple deepmerge() function Generic deepmerge() function Sep 19, 2022
@crw
Copy link
Collaborator

crw commented Sep 19, 2022

Thanks for gathering these requirements from the issues.

@sftim
Copy link

sftim commented Dec 2, 2022

I'd expect deepmerge() to take items with the same top-level as its arguments a varying number of items as its function arguments. Otherwise it will not have an interface that matches merge().

Example documentation:

deepmerge takes an arbitrary number of maps or objects, and returns a single map or object that contains a merged set of elements from all arguments. If the elements are maps or objects, this function performs a deep merge (like calling merge on each child element); if they are plain types (strings, numbers, or bools) then the behavior is the same as merge

If more than one given map or object defines the same key or attribute, then the one that is later in the argument sequence takes precedence. If the argument types do not match, the resulting type will be an object matching the type structure of the attributes after the merging rules have been applied.

@schollii
Copy link
Author

schollii commented Dec 7, 2022

I don't understand the statement, so I'm not sure what might need changing.

@sftim
Copy link

sftim commented Dec 7, 2022

I'm saying that deepmerge should have the same interface as merge.

@schollii
Copy link
Author

schollii commented Dec 7, 2022

Thanks for the clarification.

I think you cannot have same interface as merge AND support multiple ways of deep merging, without resorting to multiple deepmerge functions.

Alternatively, could a deepmerge function check its first argument (a map) for a certain structure, like the presence of only one key, called "options", and if the map fits then the function uses it to choose the merge behavior and applies merge to remainder of map args.

The docs would still be straightforward addition to what you wrote. Eg "... If the first arg is a map that satisfies certain conditions (see section XYZ for description), it will be used to control the merge of list/map/object elements of the map args, instead of being merged with the other map args."

@sftim
Copy link

sftim commented Dec 7, 2022

The first argument could be a list of merge options and then still support every input that merge can accept.

@schollii
Copy link
Author

schollii commented Dec 7, 2022

That works, although using a map has several advantages over a list:

  • it is optional since it is of the same type as the remaining args ... Or can a function be coded so as to accept a first arg of 2 types and take action accordingly?
  • each option is optional : with a list the options are positional, whereas with a map they are named, this allows you to specify only those options that are different from the default, not possible with list

It would be nice to not have to specify the options at all, and to be able to only specify those that need to be.

@yordis
Copy link

yordis commented Jul 2, 2023

Hey peeps, any updates over here?

@aladdin-atypon
Copy link

Ping!

@crw
Copy link
Collaborator

crw commented Oct 25, 2023

This would likely need to be implemented as a provider (external) function, if or when that becomes a possibility. Please see: #27696.

@nitrocode
Copy link

There is a provider as a workaround used by this submodule

https://registry.terraform.io/modules/cloudposse/config/yaml/0.2.0/submodules/deepmerge

and without a provider using this module

https://registry.terraform.io/modules/Invicton-Labs/deepmerge/null/latest

The intent of ticket is to use a native function to avoid these workarounds

@crw
Copy link
Collaborator

crw commented Mar 6, 2024

Thank you for your continued interest in this issue.

Terraform version 1.8 launches with support of provider-defined functions. It is now possible to implement your own functions! We would love to see this implemented as a provider-defined function.

Please see the provider-defined functions documentation to learn how to implement functions in your providers. If you are new to provider development, learn how to create a new provider with the Terraform Plugin Framework. If you have any questions, please visit the Terraform Plugin Development category in our official forum.

We hope this feature unblocks future function development and provides more flexibility for the Terraform community. Thank you for your continued support of Terraform!

@hegerdes
Copy link

It is great to see the possibility to implement this via provider functions and I really believe these can be useful for provider specific tasks but recursive merge is a generic problem and it seems like that hashicorp just outsources the problem to the providers.

Creating providers with a specific resource to solve such a problem was possible before and has been done by people, the problem is that this open the door for non standard implementations for the same function/task. Deep merge is not an obscure function and people wanting this since TF v0.12.16 (2020 #24987) with 129 upvotes since locked.

Having a great standard library that solves a common problems that everybody has is impotent, failing this results in so much effort and 100 different implementations from each provider that sort of do the same - except when they dont. In the extrem the results can be seen in the javerscript world.

Sad to see that the built in functions in terraform are kind of limited and dont get the attention they deserve.

@sftim
Copy link

sftim commented Apr 14, 2024

I actually expect that either Terraform or OpenTofu (or both) will make some of the built-in functions be syntactic sugar for functions from a built-in provider. For example, base64decode as an alias for provider::base64::decode.
If I guessed right and that is the direction of travel, having deep merge live in a provider makes a lot of sense.

@isometry
Copy link

I've successfully drafted a terraform-provider-deepmerge using the new provider functionality in Terraform 1.8.0. It uses mergo under the hood and currently exposes two functions: provider::deepmerge::replace (mergo with "WithOverride" for recursive merge-like behaviour), and provider::deepmerge::append (mergo with both "WithOverride" and "WithAppendSlice") to have lists append rather than overwrite. At present, no special support for merging of Sets (they're just treated as dumb lists), nor any advanced merge-lists-by-key logic.
Any thoughts on the function naming or basic functionality before I publish?

@isometry
Copy link

isometry commented Apr 14, 2024

I actually expect that either Terraform or OpenTofu (or both) will make some of the built-in functions be syntactic sugar for functions from a built-in provider. For example, base64decode as an alias for provider::base64::decode.

To be fair, ~95% of Terraforms built-in functions are already just calls to the function library in go-cty1 (including the existing merge2). Those wanting a builtin deepmerge should probably start by contributing an implementation there rather than here.

@schollii
Copy link
Author

I've successfully drafted a terraform-provider-deepmerge using the new provider functionality...
Any thoughts on the function naming or basic functionality before I publish?

Nice! Would it make sense for your provider to be slightly higher level, eg "tree" and provide functions related to tree structures, like deepmerge() would be the first function. Community could contribute more, making "tree" the go-to provider for all tree related processing.

Otherwise we could end up with a pile of tiny providers, but maybe that's not such a bad thing either.

@luis-guimaraes-exoawk
Copy link

I actually expect that either Terraform or OpenTofu (or both) will make some of the built-in functions be syntactic sugar for functions from a built-in provider. For example, base64decode as an alias for provider::base64::decode.

To be fair, ~95% of Terraforms built-in functions are already just calls to the function library in go-cty1 (including the existing merge2). Those wanting a builtin deepmerge should probably start by contributing an implementation there rather than here.

This seems like the correct answer to this problem. It would help standardize a deepmerge function instead of relying on providers to implement it "correctly".

@isometry
Copy link

I've published an initial implementation with mergo semantics at https://registry.terraform.io/providers/isometry/deepmerge/latest

Code is at https://github.com/isometry/terraform-provider-deepmerge, contributions welcome!

tfplugindocs generated documentation "isn't great", and I haven't spent the time to work out how to bend it to my will yet, but usage is simple:

output "test" {
  value = provider::deepmerge::mergo(local.map1, local.map2)
}

An arbitrary number of arbitrarily deep maps are theoretically supported, with similar merge logic to the built-in merge by default (i.e. key-values overridden with priority to the right). As it's invalid to merge a string with a map, I've tried a "options-pattern" approach to overriding mergo behaviour: you can provide any combination of the additional arguments "no_override" (to merge without override, i.e. priority to the left) and "append_lists" (to have list values append rather than replace; alias "append"). Position doesn't matter, and other strings will throw the error "unrecognised option".

@isometry
Copy link

This seems like the correct answer to this problem. It would help standardize a deepmerge function instead of relying on providers to implement it "correctly".

Well… you tried. If you don't like the mergo semantics I've implemented, you're welcome to contribute an alternative to my provider (this is why I ended up explicitly using mergo as the function name, to leave room for other implementations).

@luis-guimaraes-exoawk
Copy link

@isometry Yeah...
I have no issue with your implementation, seems like a sane one 😄 . I was just pushing to get a standard one so we can avoid similarly named providers/functions with different behaviours. If we can make your provider a "standard" one, it probably won't be an issue!

@apparentlymart
Copy link
Member

Hi all,

As discussed over in the earlier issue on this subject #24987 (comment), we chose to encourage implementing this in external providers rather than as a builtin because there seems to be no strong consensus for what exactly the "deep merge" operation implies, and so it seems that multiple implementations is both inevitable and desirable to meet the different use-cases that led to those differing definitions of the function in other languages.

I see that the mergo function in isometry/deepmerge is aiming to deal with multiple variations at once using special string arguments, which is an interesting approach! I've not tried that provider directly myself, but those who were interested in this issue might wish to study it to see if it meets their needs. If you have a need it doesn't meet then you still have the option of implementing your own provider with different behavior if you wish.

Since there is now at least one provider offering an answer to this, I'm going to close this issue. Thanks!

Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement functions new new issue not yet triaged
Projects
None yet
Development

No branches or pull requests

10 participants