Ideas to improve the user experience #817

Closed
1mamute opened this issue Jun 15, 2021 · 5 comments
Labels
enhancement New feature or request


1mamute commented Jun 15, 2021

Infracost is awesome, we already know that. We can estimate the cost of our cloud infrastructure easily, set usage configurations and even run our own cloud pricing API. But, as a new user, I had a hard time setting it up. I subconsciously assumed I could generate an Infracost report directly from my state files, but this was not the case. It doesn't feel as natural to me as the other Terraform tools I've been using.

The problem

The three main points I'd like to discuss are:

  1. There are many caveats about which file formats Infracost actually supports

Currently, Infracost only supports reading directly from a Terraform directory or from a Terraform JSON output that is the result of a terraform show -json command (see #810).
There is an open issue (see #811) discussing support for Terraform's internal JSON state file, but I think that tfstate files should also be included.
The documentation is a bit all over the place in this regard. I'll document what I've gathered below in the "Currently supported workflow for infracost usage" section; feel free to use it in the documentation if it's adequate.

  2. No support for remote state, plan and infracost breakdown's JSON output files

There isn't an option to reach any file remotely (maybe this is by design); everything needs to be available to Infracost locally at runtime.
Combined with the fact that Infracost can't parse state files by itself, this means that we can't estimate infrastructure costs from state files stored in a remote backend (e.g. S3, GitLab) or from remote execution plans (e.g. Terraform Cloud plans) without some scripting (see CI/CD below).

  3. Infracost feels like a script and not like a standalone tool

I actually think that points 1 and 2 lead to point 3. Infracost depends on external tools to (download and) parse the Terraform configuration into an output format that it can ingest. These external tools need to be installed, and their commands need to run before Infracost itself does any work. Even though you can run them manually, Infracost does it under the hood for you, running one after another, in the right order, to get the output it needs. Much like a script.

Local experience

I think this was the team's main target environment, because the local experience is fine if you have the Terraform files, even though it still needs the terraform binary to be installed (or another compatible binary, e.g. terragrunt, which you can set via the INFRACOST_TERRAFORM_BINARY environment variable), as I explained before. If you don't have the Terraform files, just a state or plan file (which is not usual, btw), things get tricky: see CI/CD below, the same applies.

CI/CD

Using Infracost in CI/CD is quite hacky. You have to clone the repository, cd into the Terraform directory and run infracost (which runs terraform under the hood). This means that the image running the pipeline gets bigger because it needs the terraform binary to be present, and it also means that you can't generate uniform reports from different branches (different workspaces do work, though) without some serious scripting.

I worked my way through a pipeline that has 3 remote state files (one for each branch) by curling the state files from the remote state backend and converting them to the Terraform JSON output that Infracost expects, as you can see in #810. The Alpine image with curl, terraform and infracost was relatively big and it didn't feel right at all.
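The per-branch pipeline described above could look roughly like the sketch below. It is only an illustration under assumptions: the helper name breakdown_all_states, the base URL argument, the output directory layout and the environment names are all hypothetical, and the Authorization header simply mirrors the one used later in this thread.

```shell
# Sketch: download one remote state per environment, convert it, and cost it.
# breakdown_all_states is a hypothetical helper, not an Infracost command.
# Usage: breakdown_all_states BASE_URL OUT_DIR ENV [ENV...]
breakdown_all_states() {
  local base_url="$1" out_dir="$2" env
  shift 2
  for env in "$@"; do
    # Fetch the raw state for this environment from the remote backend.
    curl --fail --silent --show-error \
      --header "Authorization: Bearer ${CI_JOB_TOKEN}" \
      "${base_url}/${env}" -o "${out_dir}/${env}.tfstate" || return 1
    # Convert the state into the JSON output that Infracost can read.
    terraform show -json "${out_dir}/${env}.tfstate" > "${out_dir}/${env}.json" || return 1
    # Write one JSON breakdown per environment.
    infracost breakdown --path "${out_dir}/${env}.json" --format json \
      > "${out_dir}/${env}.breakdown.json" || return 1
  done
}
```

A call such as breakdown_all_states "https://gitlab.com/api/v4/projects/XXXXX/terraform/state" /tmp/costs dev staging prod would leave one breakdown file per environment in /tmp/costs, which still requires curl, terraform and infracost in the image.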

Currently supported workflow for infracost usage

| Commands | Terraform directory | Terraform State JSON Output | tfstate | plan binary | plan json |
|---|---|---|---|---|---|
| infracost diff | ✔️ | N/A | N/A | ✔️ if converted to JSON with terraform show -json tfplan.binary > plan.json (only local) | ✔️ if converted to JSON with terraform show -json tfplan.json > plan.json (only local) |
| infracost breakdown | ✔️ | ✔️ (only local) | ✔️ with --use-terraform-state flag or converting to JSON with terraform show -json terraform.tfstate > terraform.json (only local) | ✔️ if converted to JSON with terraform show -json tfplan.binary > plan.json (only local) | ✔️ if converted to JSON with terraform show -json tfplan.json > plan.json (only local) |

PS: infracost output only supports local JSON files produced by infracost breakdown, e.g. infracost breakdown --path /path/to/project1 --format json > project1.json.

Here's the link for terraform's show command documentation.
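The plan-file route in the table above could be scripted roughly as follows. This is a minimal sketch, assuming terraform and infracost are on the PATH; the function name plan_json_breakdown and the file names tfplan.binary and plan.json are illustrative, not part of Infracost.

```shell
# Sketch: cost breakdown of a Terraform project via an intermediate plan JSON.
# plan_json_breakdown is a hypothetical helper name.
plan_json_breakdown() {
  local project_dir="$1"
  (
    cd "$project_dir" || exit 1
    terraform init -input=false >/dev/null || exit 1
    # Save the binary plan, then convert it to the JSON that Infracost reads.
    terraform plan -input=false -out=tfplan.binary >/dev/null || exit 1
    terraform show -json tfplan.binary > plan.json || exit 1
    infracost breakdown --path plan.json
  )
}
```

Running plan_json_breakdown path/to/project leaves tfplan.binary and plan.json in the project directory, so a real pipeline would also need to decide where those files live and who cleans them up.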

Proposal

I think in a perfect world Infracost users would have the option to generate breakdown reports from a local Terraform directory or from any state or plan file, either local or remote. They could also generate diff reports from a local Terraform directory or any plan file, either local or remote, and compare those against remote state files as well.

Remote

Other Terraform tools like driftctl have successfully found a way to reference a remote tfstate; maybe it's worth checking them out.
This might be a common problem for other tools like tflint and tfsec, as they don't yet have a way to reference a remote state file either. They also work differently (by linting the HCL syntax) but my point still stands.
It may even be worth coming together as a community with those other tools and creating a generic Go module that does this well, so everyone can use it and improve it together.

Terraform's plans and states in JSON, binary and .tfstate

If we can't find a way to parse the state files ourselves, which might be a bad idea anyway (as stated in this page), a solution would be to bundle HashiCorp's terraform-exec module into Infracost and run the Terraform CLI commands from within the application itself. It still feels scripty, but it's much more transparent to the user. This would make Infracost a standalone tool that doesn't even need terraform to work. This comes with an increased binary size, but personally, I think it's worth it.

EDIT: Maybe this tool kvz/json2hcl can help in some way?

@aliscott
Member

Thanks @1mamute, this is incredible feedback. Let me provide some context to explain where we currently are, and then I'll dive into some ideas of what we can do/where to start.

  • The use-cases we've seen so far with Infracost have been mainly for pre-deploy, after the plan stage, so we haven't concentrated much on anything after the apply stage yet.

  • We've started with the formats that Terraform recommends for integrations, but this means that they need to be generated which doesn't make it user-friendly.

  • We added the wrappers around terraform init, plan and show to try and make it more user-friendly for running against just a Terraform project. But this adds a lot of complexity due to the dependencies and flags, especially for CI/CD. It becomes more complicated with different terraform versions, terragrunt, etc.

In terms of what we should look at, here's some ideas:

  1. Supporting the internal state representation (as per Support internal Terraform state representation #811) is a good shout and should probably be our first focus, since this is the number one thing that would make your use-case simpler, as it removes the terraform binary dependency.

  2. I can see the use case for supporting remote files for Terraform state, so we could look at how driftctl is doing this. Initially, I see this as less of a problem than #1, since it's easier to work around with curl, etc. Is there a use-case for remote files for plans as well or would you see this as only working with state files?

  3. For improving the tool in general and so it feels less like a script/removes dependencies, here's some initial ideas:

    1. Only support the raw files (plan JSON, state files, etc) and leave anything that requires the Terraform binary up to the user. This simplifies the interface, but might make it more difficult to get immediate value from Infracost, and would still require these dependencies to be added to CI/CD.

    2. Directly parse the HCL - this is what a lot of the static analyzers/security tools do. My assumption is that a lot of cost information is configured in variables, so we'd need to work out how we can evaluate these. This ticket might help.

    3. Use terraform-exec (as you've suggested 😄). This could solve the dependency issue - we'd need to think about how this works with different versions and terragrunt, how variables are passed, etc.

@1mamute
Author

1mamute commented Jun 17, 2021

The use-cases we've seen so far with Infracost have been mainly for pre-deploy, after the plan stage, so we haven't concentrated much on anything after the apply stage yet.

I thought the purpose of infracost breakdown was to generate a breakdown of the costs of your infrastructure, not only for the planned changes (in a pull request for example) but also for an existing state of the infrastructure. I don't see why people aren't using it for that.

I can see the use case for supporting remote files for Terraform state, so we could look at how driftctl is doing this. Initially, I see this as less of a problem than #1, since it's easier to workaround with curls, etc.

I think the importance of this is directly related to supporting the internal state representation. Curling a state file is not the problem; the overhead of having to execute terraform show -json after downloading it is what's way too big, imho. Of course, it all depends on whether we want Infracost to address the state file breakdown use case or not, because if we do, there is another point that shows why this matters:

If someone wants to see a breakdown of a remote state file today, either in a CI/CD environment or locally, then after downloading the file via curl or any other method they need to manage JSON files on the filesystem for every step of the process, because Infracost doesn't currently read from stdin.

That causes more overhead because we have to deal with file names, filesystem permissions and security/compliance concerns for those files. If Infracost handled this internally, all of that overhead would go away.

Using Gitlab Remote State, for example, today we have to:

# Download the state; double quotes let the shell expand ${CI_JOB_TOKEN}
curl --header "Authorization: Bearer ${CI_JOB_TOKEN}" 'https://gitlab.com/api/v4/projects/XXXXX/terraform/state/dev' -o /tmp/dev.state.json
# Convert the state to the JSON format that infracost reads
terraform show -json /tmp/dev.state.json > /tmp/converted.dev.state.json
infracost breakdown --path=/tmp/converted.dev.state.json > /tmp/breakdown.dev.state.json
# Clean up every file once the report has been consumed
rm /tmp/dev.state.json /tmp/converted.dev.state.json /tmp/breakdown.dev.state.json

Instead of piping the plan to infracost's stdin:

curl --header "Authorization: Bearer ${CI_JOB_TOKEN}" \
'https://gitlab.com/api/v4/projects/XXXXX/terraform/state/dev' -o /tmp/dev.state.json
# Desired behaviour: infracost reads the converted JSON from stdin (not supported today)
terraform show -json /tmp/dev.state.json | infracost breakdown
rm /tmp/dev.state.json

Notice that, unfortunately, terraform show doesn't read from stdin either. There's already an issue open for this feature request: hashicorp/terraform#23094. That by itself makes using terraform-exec or reading directly from the state files even more appealing.
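Until either tool reads from stdin, the file handling could be hidden behind a small wrapper. The following is a sketch only: breakdown_from_stdin is a hypothetical helper (not an Infracost command) that spools stdin into a private temporary directory, converts it with terraform show -json, and removes everything on exit.

```shell
# Sketch: emulate "read state from stdin" with private, self-cleaning temp files.
# breakdown_from_stdin is a hypothetical helper; extra arguments go to infracost.
# The body runs in a subshell so the EXIT trap fires when the function ends.
breakdown_from_stdin() (
  umask 077                          # files readable by the current user only
  workdir="$(mktemp -d)" || exit 1
  trap 'rm -rf "$workdir"' EXIT      # clean up even if a step fails
  cat > "$workdir/terraform.tfstate" # spool stdin to a file terraform can open
  terraform show -json "$workdir/terraform.tfstate" > "$workdir/state.json" || exit 1
  infracost breakdown --path "$workdir/state.json" "$@"
)
```

With this in place the download step could be piped straight in, for example curl ... | breakdown_from_stdin --format json, at the cost of still needing the terraform binary.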

Is there a use-case for remote files for plans as well or would you see this as only working with state files?

I never saw anyone using this and I don't plan to use this feature myself tbh but it might be a feature for some scenarios with multiple terraform repositories or big monorepos.

If we address the remote state file, I don't think remote plans would be a major effort to implement, since the mechanism to get the files would be the same. I don't know if a remote plan file follows the same schema as any other plan file. If it does, there's no problem besides downloading it; if it doesn't, there is more work to do. Maybe we can investigate whether the schemas are the same and, if they aren't, raise an issue to see if this is important for the community?

Directly parse the HCL - this is what a lot of the static analyzers/security tools do. My assumption is that a lot of cost information is configured in variables, so we'd need to work out how we can evaluate these.

To be frank, I don't know the complexity of this, but it would open numerous avenues for Infracost besides making it a standalone tool. Imagine the kinds of integrations that would be possible: using Infracost in git hooks to see the cost of planned changes before committing (setting a budget and failing the hook if the planned changes exceed it!) or integrating with terraform-ls to see real-time infrastructure costs in the IDE (that would be f* next level!)
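The budget idea could be prototyped with a tiny comparison helper in a git hook. This is a sketch under assumptions: over_budget is a hypothetical function, and the commented usage assumes that jq is installed and that the JSON produced by infracost breakdown --format json exposes a top-level totalMonthlyCost field.

```shell
# Sketch: budget gate for a git hook or CI step.
# over_budget COST BUDGET succeeds (exit 0) only when COST is greater than BUDGET.
over_budget() {
  # awk does the decimal comparison that plain shell arithmetic cannot.
  awk -v cost="$1" -v budget="$2" 'BEGIN { exit !(cost + 0 > budget + 0) }'
}

# Possible hook body (assumes jq and a top-level totalMonthlyCost field):
#   cost="$(infracost breakdown --path . --format json | jq -r '.totalMonthlyCost')"
#   if over_budget "$cost" 500; then
#     echo "Estimated monthly cost $cost exceeds the budget" >&2
#     exit 1
#   fi
```

Keeping the comparison separate from the extraction makes the gate easy to test without running Infracost at all.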

Use terraform-exec (as you've suggested 😄). This could solve the dependency issue - we'd need to think about how this works with different versions and terragrunt, how variables are passed, etc.

Short term, terraform-exec seems like the best path to follow while we work out where we want to go. I don't think passing variables would be a problem; we can simply wrap the planConfig struct in our --terraform-plan-flags flag and go from there.

I think we should focus on Terraform 1.0, as HashiCorp promises 18 months of maintenance and interoperability with many versions of Terraform. We just need to check whether terraform-exec follows these commitments too.

Only support the raw files (plan JSON, state files, etc) and leave anything that requires the Terraform binary up to the user.

As a new user and contributor, I'm still not sure where Infracost's place should be in an engineer's or company's workflow. Is it a tool best suited to generating reports in a CI/CD environment, or on a local machine while developing infrastructure as code? Who are our main target users? Only experienced engineers, or do we also want to make the Infracost workflow easy for new developers or non-technical people (e.g. business intelligence) to generate infrastructure cost reports?

Leaving these things to the user, which we currently do to some extent, creates complexity and, like you said, makes it "more difficult to get immediate value from Infracost". Personally, I think we should make the tool as simple as possible to use, but I respect any decisions by the maintainers.

@aliscott
Member

aliscott commented Jun 17, 2021

To be frank, I don't know the complexity of this, but it would open numerous avenues for Infracost besides making it a standalone tool.

Yeah this could be really interesting and definitely something I think we should investigate more. Even if it's not possible to be as accurate as using a plan file, it could provide users with immediate feedback and be used in a lot more places.

Instead of piping the plan to infracost's stdin

Yeah I can see this being useful, especially if we remove the requirement to do terraform show -json for state files.

Thanks for all the feedback, there's definitely a lot we need to do to make Infracost better for the user 🚀

alikhajeh1 added the enhancement label on Jul 23, 2021
@alikhajeh1
Member

Closing this issue for now, thanks @1mamute for starting the discussion, we've created focused GH issues out of the discussion:
#811 (upvoted quite a number of times)
#821

@aliscott
Member

aliscott commented Mar 3, 2022

To be frank, I don't know the complexity of this, but it would open numerous avenues for Infracost besides making it a standalone tool. Imagine the kinds of integrations that would be possible: using Infracost in git hooks to see the cost of planned changes before committing (setting a budget and failing the hook if the planned changes exceed it!)

@1mamute this issue has been a huge inspiration to what we've been working on for the last month. It may have been a while but we have made some progress on this...

In v0.9.19 we shipped an experimental feature that parses the HCL code directly. We really need users to try it out on their terraform code so we can iron out all the issues, so it would be great if you could give it a shot by running something like the below, comparing the results to how you run Infracost normally:

# --terraform-var-file loads variables from the provided file, similar to Terraform's -var-file flag.
# --terraform-var sets a value for one input variable, similar to Terraform's -var flag.
# Both flags can be used multiple times.
infracost breakdown --path=path/to/code --terraform-parse-hcl \
  --terraform-var-file="myvars.tfvars" \
  --terraform-var "my_var=value" \
  --terraform-var "my_other_var=value"

(Also - we'll be discussing this more in the March Community Call if you're interested)
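One way to do the comparison requested above is to run both modes and diff the reports. This is a rough sketch: compare_hcl_parse is a hypothetical helper, it assumes the installed version accepts the --terraform-parse-hcl flag shown in this comment, and it diffs the default text output, so cosmetic differences will also show up.

```shell
# Sketch: run Infracost normally and with HCL parsing, then diff the two reports.
# compare_hcl_parse PATH [extra infracost flags...]
# Returns 0 when the reports match, 1 when they differ, 2 when a run fails.
compare_hcl_parse() {
  local path="$1" normal hcl status=0
  shift
  normal="$(mktemp)" || return 2
  hcl="$(mktemp)" || { rm -f "$normal"; return 2; }
  if infracost breakdown --path "$path" "$@" > "$normal" &&
     infracost breakdown --path "$path" --terraform-parse-hcl "$@" > "$hcl"; then
    # diff exits 1 when the files differ; capture that instead of aborting.
    diff -u "$normal" "$hcl" || status=$?
  else
    status=2
  fi
  rm -f "$normal" "$hcl"
  return "$status"
}
```

For example, compare_hcl_parse path/to/code --terraform-var-file="myvars.tfvars" prints a unified diff only when the two modes disagree, which is the kind of output that would be useful to attach to a bug report.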


3 participants