QUESTION: Terraform plan output may contain sensitive data. #163

caquino · 2018-07-04T16:55:12Z

Hi,

We are evaluating adoption of Atlantis and one of our concerns regarding security is that while doing 'atlantis plan' secret data can be displayed on the plan output.

Is there any recommendation on how to overcome this issue?

Thanks

lkysow · 2018-07-04T17:53:44Z

Hi Cassiano, right now there's no built-in way to Atlantis to overcome that issue however there are some mechanisms within Terraform itself and some options I'd like to evaluate for atlantis:

Within Terraform

As per hashicorp/terraform#16554 (comment), output secrets can be hidden by setting sensitive = true:

output "out" {
  value     = "$secret"
  sensitive = true
}

Resulting in an apply output of:

Outputs:

out = <sensitive>

Within resources, it's provider dependent to annotate arguments as secret. For example, the aws_db_instance resource marks its password argument as secret, so:

resource "aws_db_instance" "default" {
  # ...
  username = "foo"
  password = "foobarbaz"
}

Results in:

Terraform will perform the following actions:

  + aws_db_instance.default
     ...
      password:                   <sensitive>
      username:                   "foo"

Plan: 1 to add, 0 to change, 0 to destroy.

Just for my knowledge is there a particular resource you're managing that you're worried about?

For Atlantis

I'm going to try out https://github.com/opencredo/terrahelp which has a mask function and I'll get back to you if that works and if so how to use it within Atlantis.

caquino · 2018-07-05T10:50:55Z

@lkysow Thanks for such fast and comprehensive answer!

First of all please read this with the "brainstorm hat", I'm not demanding or telling what should be done.

We are aware that this is not really an Atlantis issue, its a terraform issue, but it turned out to be even more of a nuisance in our case because of having this content on GH comments.

Sometimes, data like URIs, IP addresses and such, that are not really "sensitive data" gets exposed which gives more information about our infrastructure than needed, I can imagine some possible options to solve the issue:

1 - To have options on atlantis.yml to specify regular expressions to filter out the plan output before submitting it to GH, which IMHO is not the best option because it may mask data that is useful for the user to determine if the plan should/should not be applied.

2 - In conjunction to 1, instead of masking the data as sensitive let the regular expression specify filters to encrypt parts of the plan output, which lets the user decrypt the data if it's something necessary for them to analyze the plan output.

Having an output similar to :

Terraform will perform the following actions:

  + aws_db_instance.default
     ...
      password:                   ENCRYPTED:Hd8ajksa89alashdhkasd==
      username:                   "foo"

Plan: 1 to add, 0 to change, 0 to destroy.

Which would allow users if necessary to decrypt the data if they have the key and even if they want to be fancy to build a browser extension to do it automatically.

Lets imagine that I could do something like: atlantis plan pgp_encrypt:(my pgp key id)

3 - For the plan an option to have the plan to be hosted on the Atlantis web interface itself, which I also don't like as it makes Atlantis a lot more complex as it will need to start having state to be able to fulfill compliance auditing reports.

Being completely honest, after writing the options down, none feels like a good option they all make me feel bad for different reasons, but most of all none feels like a good user experience. (I suspect this is what makes things more secure?)

BTW the idea of the encryption on the state output was inspired by https://github.com/mozilla/sops

lkysow · 2018-07-05T12:44:38Z

Yes security tends to do that :) but I think this is definitely something that Atlantis should aim to tackle since it's the one posting the plan output on github! I think building this solution is best tackled with a well-defined use-case that I can fully understand. If you'd be interested, I'd love to have a quick call with you to more fully understand your use-case and how you deal with secrets. You can email me at lkysow [at] gmail if you'd like to set that up or message me in the slack channel. I think it would be dangerous to build this in a vacuum without a real-world use case.

Also brainstorming hat on...

A fully authed and rich Atlantis UI is a direction I want to go but it's not a short-term solution. I could do something short-term where you can mark a plan output as sensitive, causing it to only be available on the atlantis server as just a hosted file and then rely on users to add their own auth system on top of the UI (see #49 (comment)). For this I have a couple questions:

is this a project-wide thing where you mark the plan output as sensitive or is it a per individual plan thing in which case you'd need to disable autoplanning and then manually plan with new flag to atlantis, ex. atlantis plan --sensitive
do people want to run atlantis behind their own auth proxy?

Without the UI, I think the solution is some form of:

A config that allows users to specify which variables or resource arguments are sensitive
A mechanism within Atlantis for detecting those being output from terraform plan and masking them
A mechanism for the Atlantis user to view those masked variables.

I think 1 and 2 are easier than 3. And in terms of order of operations, I might want to implement 1 and 2, say for 3 that you just need to run plan locally (for now) and then build out 3 later. I know codecov.io has a simple UI-based secret mechanism where you paste in your string and it gives you the encrypted version back, Atlantis could have the same thing but the opposite. Questions I have about that:

Is it variables or resource attributes that are sensitive or both? The config format would need to be capable of defining both and I'm not sure a regex would be flexible enough?
Is it obvious beforehand what is going to be sensitive? I'm wondering what the workflow for an engineer would be. Do they see the plan has something sensitive, delete the comment and add another line to the config file. Or are whole repos/directories known to be sensitive.

lkysow · 2018-07-12T13:32:49Z

I've got Atlantis working with terrahelp mask which masks sensitive variables from a plan. For example:

terraform plan | terrahelp mask

terrahelp figures out which variables are sensitive simply by assuming that all variables in a terraform.tfvars file (which Terraform includes automatically) are sensitive. Obviously if you put sensitive variables into that file and then checked it in, it would have no purpose so you would have to create that file at runtime, maybe by downloading from Vault?

Either way, maybe this helps. You can see the Atlantis config here: #46 (comment) and you'll need the terrahelp binary to be available where Atlantis is running.

You can read more about terrahelp here: https://github.com/opencredo/terrahelp

lkysow · 2018-07-17T14:10:21Z

If #121 were implemented (store plan output on server) then this would also be a possible solution. I don't think you'd need to keep the plan output around for auditing reasons because it's not the plan that matters but the apply log. If the apply log could still be printed to the pull request (in which any sensitive outputs could be marked as such) then this would be another solution.

sterfield · 2019-03-21T08:57:55Z

Unfortunately, sensitive outputs is not covering all cases.

For example, if you use random_string to generate passwords, the ID of the random_string is... the content of the string generated. So in plan or apply, you would see random_string.database_password (ID:<the password>)...

And relying on terrahelp or string regexs is tricky, as a single change in output format in Terraform could break it and leak your secrets in Github comments.

IslamAzab · 2019-09-25T13:39:43Z

Is there a way do mark variables as sensitive so that they don't show up in the plan output?

For example

source

resource "aws_sns_topic_subscription" "subscription" {
  endpoint = "${var.url}${var.token}"

output

resource "aws_sns_topic_subscription" "subscription" {
  endpoint  = "https://www.url-example.com/path?access_token=SENSITIVE_DATA"

lkysow · 2019-09-25T16:45:16Z

Not built in to Atlantis

h3lo · 2020-02-22T21:10:32Z

Is there a way do mark variables as sensitive so that they don't show up in the plan output?

For example

source
resource "aws_sns_topic_subscription" "subscription" {
  endpoint = "${var.url}${var.token}"
output
resource "aws_sns_topic_subscription" "subscription" {
  endpoint  = "https://www.url-example.com/path?access_token=SENSITIVE_DATA"

I know this issue is old, but I am running into this exact situation (API key in aws_sns_topic_subscription resource) being exposed by TF. Could we perhaps have a way to mark certain attributes like this as sensitive so they get masked by atlantis during plan/apply?

I know the issue is in the provider, but as mentioned above, it is Atlantis that makes this an issue since it posts comments to git. Disabling auto-planning and requiring people to look at the Atlantis server kind of defeats the purpose of Atlantis imo...

IslamAzab · 2020-02-23T09:05:00Z

@h3lo I am not sure if it will be addressed in Atlantis.

You can try Terraform sensitive output

output "db_password" {
  value       = aws_db_instance.db.password
  description = "The password for logging in to the database."
  sensitive   = true
}

h3lo · 2020-02-23T15:37:37Z

That attribute is only for outputs. I'm talking about the aws_sns_topic_subscription resource, and more specifically, the endpoint attribute, which is likely to be sensitive for anyone using some 3rd party's API over HTTPS.

I even tried creating a module with nothing in it but my sensitive data, and outputting it with the sensitive attribute set, then using it in my aws_sns_topic_subscription resource's endpoint attribute by reference, but TF still outputs its interpolated value.

This isn't really a big risk when doing terraform from the command line, it's specifically a function of Atlantis making the plan and apply output semi-public that causes risk. Of course, this aspect of Atlantis is also one of the main reasons to use it (tracking of what's been done).

Atlantis generally works great, thank you for your hard work. I hope my feedback can help improvements be made that will be useful to everyone.

wlonkly · 2020-02-23T16:42:39Z

I think

TF still outputs its interpolated value

is the key bit here, and would encourage opening up an issue on the provider, since I think that's the right place to mark sensitivity rather than having special cases in various providers inside Atlantis. The same problem would happen in eg. Terraform Cloud, and can be avoided by setting the sensitive flag in the provider schema.

Heck, it's probably a one-line PR, so you could be a Terraform contributor! 😄

(FWIW, I agree that endpoint is sensitive for the reasons you described and would be happy to upvote the issue there.)

h3lo · 2020-02-24T21:58:24Z

@wlonkly I feel quite strongly that this doesn't belong in the provider. The provider outputting some sensitive data isn't inherently wrong because from TF perspective, the host you run plans and applies on is secure. It is Atlantis that makes this output public and therefore it is specifically Atlantis that should handle masking it.

jasonrberk · 2021-04-08T15:25:52Z

@h3lo I think it would be nice if the provider provided a way for you to "mark" that something is sensitive, or even denoted sensitive data in some standardized way. Then, Atlantis could use that standard accordingly, instead of a potentially non-deterministic regex. This would still allow TF to spit out the real info, that you want to see when you run locally, but also allow Atlantis to scrub it when posting it in a PR.

Another thought..... after the PR merges, could / should the Atlantis comments be deleted? Just getting started with TF and Atlantis, so I'm no expert....but wouldn't that at least limit exposure. I know our TF PRs don't generally stay open for very long. By the time a dev is ready to PR, they should have done the plan/apply and even a destroy in a lower env.

I don't have enough history to know if I'd ever need to look back on those plan / apply PRs later to debug something.

nitrocode · 2022-10-19T07:12:35Z

For example, if you use random_string to generate passwords, the ID of the random_string is... the content of the string generated. So in plan or apply, you would see random_string.database_password (ID:)...

@sterfield I know this is an old issue. I have a small addition here which may not have been available at the time.

https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/password

Identical to random_string with the exception that the result is treated as sensitive and, thus, not displayed in console output. Read more about sensitive data handling in the Terraform documentation.

GMartinez-Sisti · 2024-03-09T13:15:25Z

We're having a similar issue with this masking values on the plan output. Due to some architecture decisions, we have some variables that are considered secrets, created on terraform and passed into templates variables to templatefile functions. Since they are passed as text, terraform loses the context of them being sensitive and outputs them as clear text on the plan. Before having automation on a PR this was not a concern, as the terraform executor would also have access to the terraform state file, so it would be acceptable to have it on an output, but now the value is available to anyone with access to the repository where Atlantis comments.

I understand this might be considered a hack, since the real solution is to address the way the secret is passed, but as we all know, sometimes we don't have the cycles to fix this when it's already working across multiple production environments and the business has other requirements.

My proposal is to add a feature to allow masking the plan output based on a configurable regex. The regex argument can be a list to make it easier to achieve multiple masks without increasing the complexity of the regex pattern. I've tested this locally with a simple pipe that sends terraform plan output to sed and it works as long as we can figure out a valid regex for the desired outcome, I believe this would be somewhat easy to achieve with Atlantis. Afaik Atlantis doesn't really interfere with terraform, just calls the binary and interprets results so I can't think of another way to achieve this.

I'm happy to develop this feature. Thoughts?

PS: Another option is to allow a custom post processing command on

atlantis/server/core/config/valid/repo_cfg.go

Lines 176 to 183 in 57aef30

    
           // PostProcessRunOutputOption is an enum of options for post-processing RunCommand output 
        
           type PostProcessRunOutputOption string 
        
           const ( 
        
           	PostProcessRunOutputShow            = "show" 
        
           	PostProcessRunOutputHide            = "hide" 
        
           	PostProcessRunOutputStripRefreshing = "strip_refreshing" 
        
           )

This was referenced Jan 14, 2019

Feature Request: Support masking output dmlittle/scenery#5

Closed

[terrahelp] add package cloudposse/packages#137

Merged

lkysow added the docs Documentation label Apr 4, 2019

MPV mentioned this issue Feb 27, 2020

Option to use GitHub App's "Checks API" instead of PR comments #936

Open

GMartinez-Sisti mentioned this issue Mar 10, 2024

feat: allow masking output on comments #4331

Open

dosubot bot added the Stale label Oct 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QUESTION: Terraform plan output may contain sensitive data. #163

QUESTION: Terraform plan output may contain sensitive data. #163

caquino commented Jul 4, 2018

lkysow commented Jul 4, 2018 •

edited by nitrocode

Loading

caquino commented Jul 5, 2018 •

edited

Loading

lkysow commented Jul 5, 2018

lkysow commented Jul 12, 2018

lkysow commented Jul 17, 2018

sterfield commented Mar 21, 2019

IslamAzab commented Sep 25, 2019

lkysow commented Sep 25, 2019

h3lo commented Feb 22, 2020

IslamAzab commented Feb 23, 2020

h3lo commented Feb 23, 2020

wlonkly commented Feb 23, 2020

h3lo commented Feb 24, 2020

jasonrberk commented Apr 8, 2021

nitrocode commented Oct 19, 2022

GMartinez-Sisti commented Mar 9, 2024 •

edited

Loading

QUESTION: Terraform plan output may contain sensitive data. #163

QUESTION: Terraform plan output may contain sensitive data. #163

Comments

caquino commented Jul 4, 2018

lkysow commented Jul 4, 2018 • edited by nitrocode Loading

Within Terraform

For Atlantis

caquino commented Jul 5, 2018 • edited Loading

lkysow commented Jul 5, 2018

lkysow commented Jul 12, 2018

lkysow commented Jul 17, 2018

sterfield commented Mar 21, 2019

IslamAzab commented Sep 25, 2019

lkysow commented Sep 25, 2019

h3lo commented Feb 22, 2020

IslamAzab commented Feb 23, 2020

h3lo commented Feb 23, 2020

wlonkly commented Feb 23, 2020

h3lo commented Feb 24, 2020

jasonrberk commented Apr 8, 2021

nitrocode commented Oct 19, 2022

GMartinez-Sisti commented Mar 9, 2024 • edited Loading

lkysow commented Jul 4, 2018 •

edited by nitrocode

Loading

caquino commented Jul 5, 2018 •

edited

Loading

GMartinez-Sisti commented Mar 9, 2024 •

edited

Loading