Add support for HCL2 #11

Open
mattolenik opened this issue Apr 23, 2019 · 8 comments

@mattolenik
Owner

HCL2 lands with Terraform 0.12.

@mattolenik mattolenik added this to the 0.7.0 milestone Apr 23, 2019
@mattolenik mattolenik self-assigned this Apr 23, 2019
@skyzyx

skyzyx commented Jan 5, 2020

What do I need to know about how hclq works so that I can help implement support for HCL 2?

@mattolenik
Owner Author

mattolenik commented Jan 7, 2020

HCL2 is essentially a complete rewrite of the language, and its API works in a fundamentally different way. That prompted me to do a full rewrite of hclq, including a new "query language" with a PEG grammar that can express more complex queries adequately and unambiguously.

I've been working off the HCL2 godocs (https://godoc.org/github.com/hashicorp/hcl2), specifically the hclsyntax API. There are a few different APIs at different levels of abstraction: gohcl is a high-level API for serializing in and out of structs, much like a JSON or YAML parser; hclwrite can generate new HCL but isn't suitable for reading; and so forth. I've been using hclsyntax because it seems to be the appropriate level of abstraction.

Essentially the new design works as follows:

  1. Parse the input query with pigeon, generating a custom AST of Go structs
  2. Parse the HCL and get an AST for the document
  3. Walk the query AST at the same time as the HCL tree, checking whether the HCL node can be matched by the expression in the query's current node. If so, add a reference to that node to a list of aggregates representing all results of the query
  4. Take the aggregate nodes from the previous step and perform whatever the query operation is, such as getting or setting values
  5. Output the modified HCL document
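
Steps 3 and 4 above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `Node` is a hypothetical stand-in for an HCL syntax node, the query is a plain slice of path segments rather than an AST produced by pigeon, and `*` is an invented wildcard. It is not the code on the v2 branch.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for an HCL syntax node: a named block or
// attribute with an optional raw value and optional children.
type Node struct {
	Name     string
	Value    string
	Children []*Node
}

// Match walks the query path and the document tree together. At each level
// it descends only into children whose name matches the current path
// segment ("*" matches any name) and collects the nodes that satisfy the
// whole path.
func Match(root *Node, path []string) []*Node {
	if len(path) == 0 {
		return []*Node{root}
	}
	var results []*Node
	for _, child := range root.Children {
		if path[0] == "*" || path[0] == child.Name {
			results = append(results, Match(child, path[1:])...)
		}
	}
	return results
}

func main() {
	doc := &Node{Name: "root", Children: []*Node{
		{Name: "module", Children: []*Node{
			{Name: "vpc", Children: []*Node{
				{Name: "source", Value: "git::https://example.com/vpc.git"},
			}},
			{Name: "db", Children: []*Node{
				{Name: "source", Value: "./modules/db"},
			}},
		}},
	}}

	// Step 3: aggregate every node matched by the query.
	matches := Match(doc, []string{"module", "*", "source"})

	// Step 4: perform the query operation, here a simple "get".
	for _, m := range matches {
		fmt.Println(m.Value)
	}
}
```

Because `Match` returns pointers into the document tree, a "set" operation could assign to `m.Value` in the same loop before the document is written back out.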

Right now I'm on #3. Sadly I just haven't had the time to work on it lately, and motivation is especially difficult after using Terraform all day at work. :/

I'd recommend a thorough read of the HCL2 docs; it took me a while just to wrap my head around them. HCL2 is way more expressive than HCL1, which brings many benefits but also some complexity. It's a design choice, too, in terms of how complicated the tool should be.

For example, HCL2 supports expressions, and the API allows you to pass in values and evaluate the expression itself. Should hclq allow the evaluation of said expressions, or only have the option to return raw text? HCL2 supports richer value types, so now a query that returns a whole object (rather than just one attribute) becomes a much more important use case. That kind of thing.

If you want to take a crack at it, check out the v2 branch, which is where I've been doing all of this work. Be forewarned, though: it's a bit of a mess and is more of a playground than anything at the moment.

It may be that a full on "query language" for hclq is more than most people require -- maybe there's a simpler way? Any ideas would be welcome!

@skyzyx

skyzyx commented Jan 8, 2020

> Right now I'm on #3. Sadly I just haven't had the time to work on it lately, and motivation is especially difficult after using Terraform all day at work. :/

As a long-time OSS maintainer, I get it. :)

> For example, HCL2 supports expressions, and the API allows you to pass in values and evaluate the expression itself. Should hclq allow the evaluation of said expressions, or only have the option to return raw text? HCL2 supports richer value types, so now a query that returns a whole object (rather than just one attribute) becomes a much more important use case. That kind of thing.

Makes sense, although as a Terraform user I hadn't realized that HCL2 was such a divergence from HCL1.

> It may be that a full on "query language" for hclq is more than most people require -- maybe there's a simpler way? Any ideas would be welcome!

Depends on the use case, as always. For JSON, I use jq which is pretty amazing (although I sometimes dip into JMESPath as well if I'm doing a lot of AWS stuff). I generally use it to discover matching query values from JSON I don't own.

In my more immediate Terraform use-case, however, I'm looking to programmatically discover references to Git addresses for Terraform Modules, so that I can rewrite them to be a local Git submodule directory. In my mind, this is essentially querying the HCL, changing a value, then reserializing back to HCL. But then again, maybe I should just write a single-purpose Golang app?

I think that something like hclq obviously has a valuable use-case in the Terraform world (just look at how many started/stopped projects exist that are called hclq). I think that many people, like myself, would wish for something like jq for HCL2.

Anyway, thanks for the dialogue. I'll start by trying to wrap my head around the hclsyntax docs, and go from there.

Thanks!

@hongkongkiwi

Could you release compiled binaries with v2 support?

@apparentlymart

Hi all! I just ran across this thanks to following a link from someone asking why hclq doesn't work with modern Terraform configurations. I work on Terraform at HashiCorp and was one of the contributors to HCL 2.

The discussion above about what would be a suitable level of abstraction for a tool like hclq is definitely interesting, and I don't have a definitive answer to recommend but I do have some context that might be useful when thinking about it.

The main thing I'd note is that HCL 2 is not intended to be at the same level of abstraction as JSON or YAML: instead of being a serialization of a data structure, it's instead a toolkit for building a configuration language, and so ultimately it's up to the calling application to decide how (and indeed whether) to convert the result into a different sort of data structure. In most HCL 2-based applications so far, the resulting data structure has only existed in memory during execution.

Terraform's a particularly interesting example in that the Terraform runtime is built directly around the HCL API, and so there isn't a single up-front decoding/evaluation step like you might expect for a typical JSON/YAML-based format. Instead, the Terraform language runtime decodes and evaluates more and more of the data structure gradually as it walks through its dependency graph, which is what allows the configuration inside one block to refer to values generated by another block.

With that said then, I suspect that it may not really be practical to make a generalized query tool similar to jq at the HCL level of abstraction, or at least if you did build such a thing then it would likely not be able to work with a Terraform configuration except in a very shallow sense.

The Terraform team maintains a library called terraform-config-inspect which understands a shallower form of the core Terraform language, ignoring any parts that would typically be defined by a provider plugin. Although not the main purpose of that codebase, it does also include a command line tool which can produce a JSON serialization of the subset of the Terraform language constructs it understands, which could then in principle be processed with jq.

For editing HCL-based files there is a third-party utility hcledit which is built in terms of the HCL2 hclwrite package. Again it's tricky to build something at the HCL level of abstraction that can work with everything an application can potentially do in its HCL-based language, but at least for simple edits it seems to work well. In particular, I think it should work okay for the problem of rewriting the source argument inside a particular module block, as long as you know which of your .tf files contains that module block. (terraform-config-inspect could help with that part, since it reports the filename where it found each object)
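
As a rough sketch of the "find which file contains the module block" part, the JSON produced by the terraform-config-inspect command line tool could be filtered with a few lines of Go. The field names used here (`module_calls`, `source`, `pos.filename`, `pos.line`) are my assumption about that tool's output shape and should be checked against real output. The `git::` and `git@` prefix test is only a simple heuristic for spotting Git addresses, and `GitModuleCalls` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// inspectOutput mirrors only the module-call part of the JSON.
// The field names are assumptions; verify them against real output.
type inspectOutput struct {
	ModuleCalls map[string]struct {
		Source string `json:"source"`
		Pos    struct {
			Filename string `json:"filename"`
			Line     int    `json:"line"`
		} `json:"pos"`
	} `json:"module_calls"`
}

// GitModuleCall records where a Git-sourced module call was found.
type GitModuleCall struct {
	Name     string
	Source   string
	Filename string
	Line     int
}

// GitModuleCalls returns the module calls whose source looks like a Git
// address, sorted by module name so the result is deterministic.
func GitModuleCalls(inspectJSON []byte) ([]GitModuleCall, error) {
	var out inspectOutput
	if err := json.Unmarshal(inspectJSON, &out); err != nil {
		return nil, err
	}
	var calls []GitModuleCall
	for name, mc := range out.ModuleCalls {
		// Heuristic only: other Git address forms are not detected.
		if strings.HasPrefix(mc.Source, "git::") || strings.HasPrefix(mc.Source, "git@") {
			calls = append(calls, GitModuleCall{name, mc.Source, mc.Pos.Filename, mc.Pos.Line})
		}
	}
	sort.Slice(calls, func(i, j int) bool { return calls[i].Name < calls[j].Name })
	return calls, nil
}

func main() {
	// Hand-written sample input, not captured from a real run.
	sample := []byte(`{"module_calls": {
		"vpc": {"source": "git::https://example.com/vpc.git", "pos": {"filename": "main.tf", "line": 3}},
		"db":  {"source": "./modules/db", "pos": {"filename": "db.tf", "line": 1}}
	}}`)
	calls, err := GitModuleCalls(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range calls {
		fmt.Printf("%s:%d module %q source %s\n", c.Filename, c.Line, c.Name, c.Source)
	}
}
```

The filenames reported this way could then be handed to an editing tool, or to a small hclwrite-based program, to rewrite the `source` argument.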

I typically write little tools of this sort in Go directly with the hclwrite API, which is the most general answer with the fewest limitations and assumptions that might not apply to all applications. I know writing software in Go is not for everyone, but as a practical matter that's the language HCL is written in today and so it's the path that will require the least amount of fighting against technology! 😀

I think probably the most practical path for the future is for Terraform itself to grow some new features for dealing with the most common use-cases that require configuration querying and editing. It's unlikely to be as general as something like hclq would be, but would avoid the need to reimplement various Terraform assumptions elsewhere. I do have some old design sketches for some CLI commands for adding and changing modules and providers which I'd like to do something with one day (the terraform providers command we have today is actually one tiny part of one of those) but we're being more cautious about expanding the CLI surface area right now because we want to let the dust settle from the Terraform 1.0 Compatibility Promises before we add additional things that we'd then be compelled to support indefinitely. In the meantime, specialized external tools based on hclwrite seem to have worked reasonably well for some folks.

I hope this unstructured pile of information is helpful in some way!

@mattolenik
Owner Author

@apparentlymart thank you so much for your comment, I really appreciate your input! I actually rewrote hclq during a hack week at work a while ago, but haven't had time to continue it. The problem I've run into is just what you describe: HCL2 is not a static language like JSON or YAML. That's true for HCL1 as well, but especially for HCL2. Querying it in a static fashion is less than useful.

> I think probably the most practical path for the future is for Terraform itself to grow some new features for dealing with the most common use-cases that require configuration querying and editing.

I agree, and I was just thinking about this the other day. The main use case I had for hclq was custom linting: you could use it to perform checks in just a shell script. I was thinking that it'd be great if Terraform had an integration point after plan but before apply, where a plugin could inspect the structure and tell Terraform whether it should or shouldn't continue with apply.

A simple example would be inspecting tags (e.g. tags on AWS resources) to ensure they conform to some organizational tagging standard. It's common to have something like a local.tags that's merged with the tags for each resource. You can't look at that statically and have a good idea of what the resulting value would actually be. If I could write a program that can receive the entire configuration and inspect it, I could enforce any kind of rules that I want.

I'm sure there are more and better examples, but I really like the idea of being able to intercept a plan and do additional inspection before apply. It's bound to have a lot more uses than just linting.

@apparentlymart

A common existing way to meet policy-related use-cases is to save a plan file (terraform plan -out=tfplan), have Terraform generate a JSON-based summary of it (terraform show -json tfplan), inspect that with software of your choice, and then decide whether or not to apply the plan (terraform apply tfplan).

That approach is not the same as static analysis of the configuration, and in particular it requires having enough information available to configure all of the providers that will participate in the plan, but on the other hand it can get a more thorough result because you can check against the final result of evaluating an expression, rather than the raw expression syntax.

Not a suitable answer for all problems, but I think a reasonable one for checks such as whether the final tags map for all resource instances matches organization conventions.
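
A minimal sketch of the "inspect that with software of your choice" step, assuming the JSON from `terraform show -json tfplan` has already been captured. It reads only the `resource_changes[].address` and `resource_changes[].change.after` fields of that output. The `MissingTags` helper and the required tag names are hypothetical, and values that are only known after apply are not handled here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan mirrors only the small part of the plan JSON that this check needs.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Change  struct {
			After map[string]interface{} `json:"after"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// MissingTags returns, per resource address, the required tag keys that are
// absent from the planned "tags" value. Resources whose planned value has
// no "tags" attribute at all are skipped.
func MissingTags(planJSON []byte, required []string) (map[string][]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, err
	}
	missing := map[string][]string{}
	for _, rc := range p.ResourceChanges {
		raw, ok := rc.Change.After["tags"]
		if !ok {
			continue
		}
		// A null or non-object tags value is treated as an empty tag map.
		tags, _ := raw.(map[string]interface{})
		for _, key := range required {
			if _, ok := tags[key]; !ok {
				missing[rc.Address] = append(missing[rc.Address], key)
			}
		}
	}
	return missing, nil
}

func main() {
	// Hand-written sample input, not captured from a real plan.
	sample := []byte(`{"resource_changes": [
		{"address": "aws_s3_bucket.logs",
		 "change": {"after": {"tags": {"Owner": "platform"}}}}
	]}`)
	missing, err := MissingTags(sample, []string{"Owner", "CostCenter"})
	if err != nil {
		panic(err)
	}
	for addr, keys := range missing {
		fmt.Printf("%s is missing tags: %v\n", addr, keys)
	}
}
```

A wrapper script could run a program like this between `terraform plan -out=tfplan` and `terraform apply tfplan`, and skip the apply whenever the result is non-empty.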

@magodo

magodo commented Dec 27, 2021

Just FYI, I've just created a project, https://github.com/magodo/hclgrep, to allow syntax-based grep on HCL2 files.
Although hclgrep uses a different strategy to query HCL files than hclq does, I hope it can bring some insights to the refactoring of hclq v2.


5 participants