lang/funcs: add jsonpath function #22460
Hi @bonifaido! Thanks for working on this. This is an interesting proposal for sure, but I feel a little unsure as to what this adds that isn't already possible using the native Terraform language syntax and existing functions. For example:
If you'd be willing, I'd prefer to take a step back here and talk about what the use-cases are, and then we can discuss this and other potential solutions to address those use-cases. While I do very much appreciate you taking the time to work on this PR, I'm cautious about introducing another competing traversal syntax into Terraform since it is likely to cause confusion as to which approach is best for a given situation, and confusion for readers of configuration about whether these two approaches have a significantly different meaning or purpose. If we can identify some specific use-cases to address, hopefully we can find a way to address them that doesn't also duplicate existing functionality along the way. Thanks! |
Hi @apparentlymart! Yes, as I mentioned, this is just a POC/proposal, so I'm also not 100% sure that we need this, and I have no issue if we drop it entirely. I think that accessing deeply nested objects could be easier when some parts of the path don't exist; for example, in this example, what happens if
Of course, this can be solved with existing Terraform syntax as well, with a chain of "lookup with default as empty map" calls, but this version is hard to read:
lookup(lookup(lookup(local.mymap, "a", {}), "b", {}), "c", 12)
We mostly use this to parse YAML/JSON configuration files. |
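For reference, the nested lookup chain above can be written out as a complete snippet. This is only a sketch: the contents of `local.mymap` and the output name `nested_lookup` are assumptions made for illustration and are not taken from the PR.

```hcl
locals {
  # Hypothetical nested value; the innermost "c" key is deliberately absent.
  mymap = { a = { b = {} } }
}

output "nested_lookup" {
  # Each inner lookup falls back to an empty object when its key is missing,
  # so the next lookup still has something to search. The outermost lookup
  # supplies the real default (12) for the missing "c" key.
  value = lookup(lookup(lookup(local.mymap, "a", {}), "b", {}), "c", 12)
}
```

Every additional level of nesting adds one more `lookup` wrapper, which is the readability problem described above.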
@apparentlymart as @bonifaido pointed out we lookup values in nested objects on various levels. At the moment that's cumbersome to say the least (see the example above). We decided to propose jsonpath, because it's an established pattern for traversing nested JSON objects (eg. aws cli uses it as well), but we would be happy to work on any kind of solution that you think fits Terraform more. |
Ping @apparentlymart |
+1 on In my use case, I have a series of config blocks that I filter down and then apply a
none of the following work:
FWIW, it seems like a simple path traversal might catch a good chunk of the use cases without the full power/complexity of a JsonPath. |
Hi again! Sorry for the silence here. Got distracted by some other work... I think this boils down to a couple different language design tensions:
For these reasons, I still feel ambivalent about the idea of encouraging "try this and use a default if it doesn't work" as a common design pattern, but I can at least say that I'd rather not use JSONPath to meet that use-case if we do move forward with it, and would want to use Terraform's (HCL's) relative traversal syntax instead, matching what we see in the If either of you are able to share some examples of situations where you've needed to use constructions like |
Thanks for the detailed and thoughtful response! I don't know how common my use-case is, but it seems like something that may become more common with terraform's recent We use terraform to provision a few dozen SFTP clients and their client-specific configurations in GCP, including the SFTP server they connect to, the cron job schedules that trigger data pulls, cloud storage + pubsub notifications associated with outgoing data, encryption keys for each client, etc. We have a single
As you can see, we essentially have a collection of features available for each partner in several categories. For triggering, we offer pubsub, GCS, and scheduled jobs. For src/destination, we can connect to SFTP servers or directly read+write to partner GCS buckets. Since each option requires some additional config, we can't really switch on a single enum, or list out a bunch of features and have boolean on/off switches and I think that approach would be much less readable. Having this single file makes configuring a new partner dead simple-- folks without terraform experience can see what their options are by inspection. At the same time, we can interpolate values into the config (e.g., Our terraform resources and modules slice this config into pieces and use
Instead, the filter step becomes even more convoluted:
I wholeheartedly agree with you about making sure the schema is configured properly. Unfortunately Terraform's native type system, even with TF .12 changes, is not strong enough for this kind of config. Instead, we validate the schema of this blob using an As I said, this may be an extreme usecase, or maybe not what terraform is intended for. I will say that this pattern of "uber config that get sliced into parts and divvied out to modules + resources" is becoming popular on our repos, especially since we recently upgraded and now allow |
Thanks for the detailed answer!
In our case it's rather having a default value and allowing the user to override it. Technically it's still an incomplete structure, but to add another detail: we decode YAML files and allow users to pass complex structures there. The reason we do that is because those YAML structures actually end up in helm chart installations and it's easier for users to write a familiar syntax for helm chart values. Given the huge amount of parameters, terraform's flat variable design doesn't really work for us in these scenarios. Some of those values though are used in other places as well which is when we want to traverse a nested structure and return a default value. |
Thanks for sharing these use-case details, both! @jakebiesinger-onduo, this sort of "mega-object" configuration style is not a configuration style we've seen a lot, but I can see the attractions to it. We've generally been encouraging a quite different strategy of separation of concerns via module composition, where the root module consists mainly of calls to various modules that each describe one part of the overall system, and data flows between them so that it's easier to see how the components are interconnected. For the system you described here, I suppose that would end up looking like one module per partner where the module itself replaces the That is a variant of the idea discussed under "multi-cloud abstractions" in the Module Composition guide, albeit probably abstracting over a few different ways to achieve the same thing in the same "cloud" in your case. My worry about providing a big, all-encompassing data structure like that and then pulling it apart into individual resources would be that it seems likely that it would be hard to predict exactly how a change to the data structure will impact the described infrastructure, and in turn folks making those changes are less likely to be able to confidently interpret Terraform's plan. One reason why we try to discourage "heavy" abstractions is that anyone making changes to a Terraform configuration ideally ought to be able to look at the resulting plan and confirm that it matches what they expected to happen. @sagikazarmark in your case I feel like I'd want to approach it by first normalizing the data structure (applying the defaults, making sure all of the specified values are of the correct type, etc) all together in one place, and then in the rest of the configuration just assume that the data is already in the right shape. 
That would then keep all of the "ugly" (subjectively) dynamic type wrangling together in one place and let most of the configuration be straightforward, unconditional references into that normalized data structure. With that said, it's not easy to write that sort of normalization in Terraform today either. I'm not sure if you were talking specifically about the Helm
locals {
  raw_chart = yamldecode(file("${path.module}/chart.yaml"))
  chart = {
    # Required fields: referencing them directly fails if they are missing.
    apiVersion    = local.raw_chart.apiVersion
    name          = local.raw_chart.name
    version       = local.raw_chart.version
    # Optional fields: fall back to a default when the key is absent.
    kubeVersion   = lookup(local.raw_chart, "kubeVersion", ">= 0.0.0")
    description   = lookup(local.raw_chart, "description", null)
    keywords      = lookup(local.raw_chart, "keywords", [])
    home          = lookup(local.raw_chart, "home", null)
    sources       = lookup(local.raw_chart, "sources", [])
    maintainers = [
      for m in lookup(local.raw_chart, "maintainers", []) : {
        name  = m.name
        email = lookup(m, "email", null)
        url   = lookup(m, "url", null)
      }
    ]
    engine        = lookup(local.raw_chart, "engine", "gotpl")
    icon          = lookup(local.raw_chart, "icon", null)
    appVersion    = lookup(local.raw_chart, "appVersion", null)
    deprecated    = lookup(local.raw_chart, "deprecated", false)
    tillerVersion = lookup(local.raw_chart, "tillerVersion", ">2.0.0")
  }
}
With that normalization in place then elsewhere in the configuration I could e.g. write
Maybe it would serve us better to work on making it easier to write normalization expressions like the above in a more readable way, so that you could meet your use-case in a way that would reduce the sprawling complexity of conditional lookups. I remember in another issue (which sadly I wasn't able to find quickly now to reference) we were discussing a possible 🤔 Lots of language design questions to noodle on here.
Thanks again to both of you for sharing your use-cases. Both of them are somewhat novel approaches that I've not seen a lot in the wild, but I want to be clear that I'm not trying to tell you that they are invalid approaches, I'm just thinking aloud about how they compare to existing approaches I have seen before and whether there are different ways to meet these intents within existing language features.
I'm thinking maybe we should divert this discussion into a feature request issue rather than this PR, since this has become more of a design question than a review of a proposed implementation. I'm running out of time for the day today, so I won't be able to do this immediately, but I'm thinking maybe we record each of those slightly-different use-cases in its own issue and then discuss some different ways we could address them. Does that sound reasonable to you both? If so, I'm happy to do the paperwork of adapting what you already shared into the "Feature Request" template, since you already took the trouble of writing out your examples in some detail. |
Thanks for the detailed answer again! Unfortunately we are talking about We have given it quite a few thoughts actually, and nested object traversing is the best we could come up with. |
Thanks again for the detailed feedback! What you describe here makes sense and we've certainly contemplated it. The idea of having a module instantiation per partner certainly groups the associated resources more nicely (e.g., you see changes to That said, the two approaches aren't incompatible. Lots more folks understand config files than understand terraform modules. The super-config presented here just needs a All that said, we've effectively reduced the depth by 1 here. Further modularization can help abstract stuff and in doing so reduce depth, but the general problem hasn't gone away IMHO. We want more and more things to be infra-as-code and those things are often nested and complex. It feels like maybe a false dichotomy to offer up modularization + decomposition as the solution-- I think we can recommend that path while still allowing an escape hatch for more complex cases? As I mentioned before, I assumed that simple path-based traversal was a thing in terraform. It just seems odd to allow arbitrarily nested maps without a clean way to get to those values. |
Aaaand sorry, yes, happy to move this conversation to a feature request. Probably should have done that before the novel of a response :) |
Yeah, same here, we don't want to force this solution at all, if it starts a conversation, we are more than happy. |
Closing this for now. I hope it will be reborn some time in the form of a feature request instead :) |
Hi all! Thanks for the feedback. I've created #23378 "Top-level Configuration Abstractions" (for want of a better name for the use-case) to represent the first of the break-out use-cases discussed above. I tried to elaborate there on some more of my thought process and collect some links to other issues that seem in a similar spot. I had been planning to open a second one named something like "Data Structure Normalization" but it sounds like I misread what @sagikazarmark described and, reflecting on it again now with fresh eyes, I see that a summary like that would be describing a solution rather than a use-case anyway. I think I'd prefer to make it more specific to the problem at hand, so something like "Passing Helm Charts |
In the forthcoming release v0.12.20 we're planning to include a new function
locals {
  mymap  = { "a" = { "b" = 3 } }
  mylist = [local.mymap]
}
output "jsonpath-found" {
  value = try(local.mymap.a.b, 12)
}
output "jsonpath-notfound" {
  value = try(local.mymap.a.b.c, 12)
}
output "jsonpath-list" {
  value = try(local.mylist[0], {})
}
The As is common with special language features like this, it will of course be possible to use My other recommendation with
locals {
  raw_chart = yamldecode(file("${path.module}/chart.yaml"))
  chart = {
    apiVersion    = local.raw_chart.apiVersion
    name          = local.raw_chart.name
    version       = local.raw_chart.version
    kubeVersion   = try(local.raw_chart.kubeVersion, ">= 0.0.0")
    description   = try(local.raw_chart.description, null)
    keywords      = try(local.raw_chart.keywords, [])
    home          = try(local.raw_chart.home, null)
    sources       = try(local.raw_chart.sources, [])
    maintainers = [
      for m in lookup(local.raw_chart, "maintainers", []) : {
        name  = m.name
        email = try(m.email, null)
        url   = try(m.url, null)
      }
    ]
    engine        = try(local.raw_chart.engine, "gotpl")
    icon          = try(local.raw_chart.icon, null)
    appVersion    = try(local.raw_chart.appVersion, null)
    deprecated    = try(local.raw_chart.deprecated, false)
    tillerVersion = try(local.raw_chart.tillerVersion, ">2.0.0")
  }
}
My intent here is that all of the "trickery" for normalizing the value can be gathered together into a single spot, and the rest of the configuration can just have straightforward references into I'm hoping that this new Thanks again for sharing the use-cases here, and giving us some food for thought on how something like this might fit with the existing Terraform language features. |
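As a rough sketch of what such straightforward references might look like, the fragment below reads from the normalized `local.chart` object defined in the comment above. It is not standalone, since it relies on that `locals` block, and the output names are hypothetical.

```hcl
# Assumes the `chart` local from the normalization block above is in scope.
output "chart_engine" {
  # No try() or lookup() needed here: normalization already applied the default.
  value = local.chart.engine
}

output "chart_maintainer_names" {
  # `maintainers` is always a list after normalization, possibly empty.
  value = [for m in local.chart.maintainers : m.name]
}
```

The point of the pattern is that defaulting happens in exactly one place, so downstream expressions never need to handle missing attributes.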
This is mostly a POC/proposal but I would be more than happy if this or something similar gets accepted. I'm highly open to any kind of feedback and/or ideas. JSONPath is a quite common object query language and could make the life of Terraform users easier when looking up values in deeply nested objects/maps/lists. It is very similar to the lookup function, but instead of a key it accepts a path parameter.
Usage example:
Some issues:
interface{} from the cty.Value, what is the official way to do this? :)