Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(terraform): recursively detect all Root Modules #4457

Merged
merged 6 commits into from
Jun 8, 2023

Conversation

jof
Copy link
Contributor

@jof jof commented May 24, 2023

Addreses #4463

Description

This forces trivy to use defsec to recursively scan for all Terraform Root Modules.

This fixes some immediate issues and sets a new default.
This defsec scanner option could also make sense as a CLI option, conditionally enabled by Trivy.

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@jof jof requested a review from knqyf263 as a code owner May 24, 2023 05:17
@jof jof changed the title <fix>(terraform): recursively detect all Root Modules fix(terraform): recursively detect all Root Modules May 24, 2023
@jof
Copy link
Contributor Author

jof commented May 24, 2023

It'll take me a lot more time to learn the test setup in this repo, so I'm hoping to get some positive feedback from a maintainer first before spending more time making tests and docs to complete this PR.

@simar7
Copy link
Member

simar7 commented May 24, 2023

Thanks I'll take a look at it.

Signed-off-by: Simar <simar@linux.com>
Signed-off-by: Simar <simar@linux.com>
@simar7
Copy link
Member

simar7 commented May 25, 2023

Thanks for the PR again, I had some time to work on this finally and I think your suggestion to add another feature flag makes sense for two reasons:

  1. It keeps current behaviour (although it might not correct per se, changing it will be a breaking change). We can re-visit this default behaviour at a later stage to change it.
  2. Some users might actually not want to recursively scan by default. This could be due to large number of files present or otherwise.

Therefore, I've added a command line flag --scan-all-dirs to optionally enable this behaviour.

Before

$ tree ./Bug2
./Bug2
├── empty.tf
└── test
    └── main.tf

1 directory, 2 files


$ trivy config  ./Bug2 
2023-05-24T18:19:47.948-0600    INFO    Misconfiguration scanning is enabled
2023-05-24T18:19:48.451-0600    INFO    Detected config files: 0

After

$ trivy config --scan-all-dirs ./Bug2
2023-05-24T18:20:06.131-0600    INFO    Misconfiguration scanning is enabled
2023-05-24T18:20:06.626-0600    INFO    Detected config files: 1

test/main.tf (terraform)

Tests: 10 (SUCCESSES: 1, FAILURES: 9, EXCEPTIONS: 0)
Failures: 9 (UNKNOWN: 0, LOW: 1, MEDIUM: 2, HIGH: 6, CRITICAL: 0)


<snip>

@simar7 simar7 self-assigned this May 25, 2023
@simar7 simar7 added scan/misconfiguration Issues relating to misconfiguration scanning kind/feature Categorizes issue or PR as related to a new feature. labels May 25, 2023
@reedloden
Copy link

  1. It keeps current behaviour (although it might not correct per se, changing it will be a breaking change). We can re-visit this default behaviour at a later stage to change it.

Coming from tfsec, this is the opposite of what is expected as the default. But apparently, this actually is a regression, which explains why we haven't seen as many warnings from tfsec lately. See https://github.com/aquasecurity/tfsec/issues/1927. So, maybe there's actually a bug in defsec that is causing this.

@AnaisUrlichs
Copy link
Member

Can someone please clarify this for me -- maybe @jof -- from what I understood

  • Trivy config has problems to scan ALL the files in terraform subdirectories, right? e.g.
  • by using the flag "--scan-all-dirs " you basically force Trivy to take a lot at all of the directories present?
  • Why does this flag have to be present in all of the trivy commands e.g. trivy image?
  • If this is the behaviour could we put a note in the documentation? Right now, the PR does not make the user aware of this issue and the workaround e.g. somewhere here specific to Terraform https://aquasecurity.github.io/trivy/v0.41/docs/scanner/misconfiguration/#terraform-value-overrides

Separately, thank you for taking initiative on this and contributing :)

@jof
Copy link
Contributor Author

jof commented May 25, 2023

Thank you for having a proper look at this @simar7 and @AnaisUrlichs -- it is really appreciated.

I have a similar perspective as @reedloden -- coming from tfsec/defsec to Trivy/defsec, this feels like a regression in user experience and expectations.
In any moderately-complex Terraform repo with multiple Root Modules, Trivy now covers a lot less than what tfsec was previously scanning.

For example, in one of our internal repos:

# tfsec
$ tfsec --verbose 2>&1 | grep -E "Parsing '.*\.tf'..." | wc -l
     639

# Trivy
$ trivy config -d . 2>&1 | grep -E 'Scanned config file.*.tf' | wc -l
       3

Some points of clarification:

  • This forceAllDirs/ScannerWithAllDirectories option seems only applicable in the context of using the Terraform scanner from defsec.
  • I think this option should be enabled by default in all Terraform config-scanning contexts
    • Finding and understanding the need for this additional option will probably only happen to users have been hurt by having missing coverage. Without enabling debug logging, it isn't immediately clear which Terraform files are being scanned and that some files may be getting ignored or missed.
    • This becomes a bit of a design-philosophy question: I have the perspective that security scanning tools should err on the side of scanning for and detecting as many risks as possible, and optionally provide ways to filter down or limit what is being examined or reported.
  • If we choose to keep this new functionality optional, we should probably:
    • Only expose this option for Terraform-scanning contexts. It seems strange to see reference to this new option in the docs/docs/references/configuration/cli/trivy_kubernetes.md doc, for example.
    • Rename the option and docs to describe what it does in the Terraform context: recursively discover all Terraform Root Modules. Maybe a CLI name like: --tf-find-all-roots "Recursively search for all Terraform Root Modules"
    • Add corresponding options in the aquasecurity/trivy-action repo to enable this functionality in GitHub Actions.

Addressing each of @AnaisUrlichs's questions:

Trivy config has problems to scan ALL the files in terraform subdirectories, right? e.g.
by using the flag "--scan-all-dirs " you basically force Trivy to take a lot at all of the directories present?

Something like this. Without this option set, defsec does something like a breadth-first search for Terraform Root Modules. Once it finds any Terraform Root Modules at a layer of depth, it stops descending into deeper subdirectories.

This probably works just fine for simple cases where there is a single Terraform Root Module, or there is one repo per-Root Module, but fails in moderately complex environments with multiple Terraform Root Modules in a single repo.

Ideally we want to point Trivy's Config scanning at a directory, and have it examine everything inside of that directory.

Why does this flag have to be present in all of the trivy commands e.g. trivy image?

This is a better question for @simar7; I think any option for this should only be needed in Trivy Config scanning contexts when the Terraform scanner is used.

If this is the behaviour could we put a note in the documentation? Right now, the PR does not make the user aware of this issue and the workaround e.g. somewhere here specific to Terraform https://aquasecurity.github.io/trivy/v0.41/docs/scanner/misconfiguration/#terraform-value-overrides

I feel like the docs changes will depend on what we decide about making this a default option or not. But regardless, we should describe in the docs how the defsec Terraform scanner will be called (i.e. "scans for the first Root Modules it finds; set this option for more" or "defsec Terraform will recursively search for all Terraform Root Modules")

@simar7
Copy link
Member

simar7 commented May 25, 2023

To summarize:

  1. This option should only be available Terraform scans
  2. This should be the new default, rather than an opt in?

As for 1) this option is only used within terraform https://github.com/aquasecurity/trivy/pull/4457/files#diff-50dbe9b7496c81601826d673eea9cf88d33170ff79be2ca201032a187adc5219R255-R256 – I am not sure why the generated markdowns include changes for other subcommands as well. @knqyf263 would you know why that's the case?

As for 2) I'm open to changing it, but in my opinion this will be a breaking change as for the reasons I mentioned above. @itaysk and @knqyf263 any thoughts?

@jof
Copy link
Contributor Author

jof commented May 25, 2023

Ah, I didn't realize those documentation files were auto-generated. It makes some sense then, that it might reflect an option being defined for all config scans into all of those docs contexts.
I think so long as we rename the option to something Terraform-ey, it will be clear.

Could we clarify what a "breaking change" is?
While this does create some outward changes in behavior of Trivy, the before/after difference is that Trivy will start detecting things it was otherwise previously ignoring.
I'm trying to imagine a context where this new visibility would be undesirable or break existing expectations of Trivy, and I can't think of any.
From my context-perspective, Trivy is already broken and this just fixes the breakage.

While I would advocate for this recursive Terraform scanning becoming a new default, the outcome is not super important to me: I only care that we can enable this recursive scanning through GitHub Actions (whether as a new default that comes with a Trivy/trivy-action release, or with an added option that we pass into the job spec)

@jof
Copy link
Contributor Author

jof commented May 26, 2023

I was curious to explore how/why this regressed (IMO) from tfsec to trivy and was surprised to see that tfsec was calling defsec's Terraform scanner without this forceAllDirs option!

I still feel like the changes in here appear to do the right thing, but now I'm having some doubts about my approach and could use some more debugger time (or original author perspective) to figure out why the behavior is different between the tools.

@knqyf263
Copy link
Collaborator

knqyf263 commented Jun 4, 2023

@knqyf263 would you know why that's the case?

The flag group is also used for other subcommands having the functionality to detect misconfigurations.

// MisconfFlagGroup composes common printer flag structs used for commands providing misconfinguration scanning.
type MisconfFlagGroup struct {
IncludeNonFailures *Flag
ResetPolicyBundle *Flag
ScanAllDirs *Flag
// Values Files
HelmValues *Flag
HelmValueFiles *Flag
HelmFileValues *Flag
HelmStringValues *Flag
TerraformTFVars *Flag
}

You've got to explicitly disable it in irrelevant subcommands if it is unneeded.

reportFlagGroup.ExitOnEOL = nil // disable '--exit-on-eol'

As for 2) I'm open to changing it, but in my opinion this will be a breaking change as for the reasons I mentioned above. @itaysk and @knqyf263 any thoughts?

If it is a regression, it is just a bug fix. I don't think it's a breaking change. I'd vote for changing the default behavior. Actually, I expected any reason to change the behavior in tfsec, and blamed commits, but didn't find any clue. Looks like it is simply a regression.

Signed-off-by: Simar <simar@linux.com>
Signed-off-by: Simar <simar@linux.com>
@simar7
Copy link
Member

simar7 commented Jun 7, 2023

@knqyf263 would you know why that's the case?

The flag group is also used for other subcommands having the functionality to detect misconfigurations.

// MisconfFlagGroup composes common printer flag structs used for commands providing misconfinguration scanning.
type MisconfFlagGroup struct {
IncludeNonFailures *Flag
ResetPolicyBundle *Flag
ScanAllDirs *Flag
// Values Files
HelmValues *Flag
HelmValueFiles *Flag
HelmFileValues *Flag
HelmStringValues *Flag
TerraformTFVars *Flag
}

You've got to explicitly disable it in irrelevant subcommands if it is unneeded.

reportFlagGroup.ExitOnEOL = nil // disable '--exit-on-eol'

As for 2) I'm open to changing it, but in my opinion this will be a breaking change as for the reasons I mentioned above. @itaysk and @knqyf263 any thoughts?

If it is a regression, it is just a bug fix. I don't think it's a breaking change. I'd vote for changing the default behavior. Actually, I expected any reason to change the behavior in tfsec, and blamed commits, but didn't find any clue. Looks like it is simply a regression.

Fair enough, I've changed the behaviour to be true by default now. @knqyf263 could you take another look at reviewing this?

Comment on lines 16 to 21
ScanAllDirsFlag = Flag{
Name: "scan-all-dirs",
ConfigName: "misconfiguration.scan-all-dirs",
Value: true,
Usage: "scan all dirs recursively",
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question: do we need the flag? We can just change the default behavior and add this flag when someone complains about it. I'm wondering if almost all users want to enable it. Please correct me if I'm wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I left it in since at least having the ability to keep the behaviour that exists today, is guaranteed with such a flag. This would at least give users the time to fix all the issues before their pipeline is operational again.

I'll remove it for now as you mentioned.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would at least give users the time to fix all the issues before their pipeline is operational again

It also makes sense. We can probably explain that Trivy had a bug on Terraform scanning and didn't work well. They can keep using older versions until they fix issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for just making this the default -- it really feels like the right thing for the tool.

Signed-off-by: Simar <simar@linux.com>

Signed-off-by: Simar <simar@linux.com>
@knqyf263 knqyf263 merged commit 52cbe79 into aquasecurity:main Jun 8, 2023
9 checks passed
@jof
Copy link
Contributor Author

jof commented Jun 13, 2023

Thanks for merging this in. Making this change and putting it as default really feels like the right thing for the tool to do.

After a bit more testing and debugger time, I couldn't reproduce a simpler test case with tfsec, so I think my last comment might have been a bit of a distraction.

AnaisUrlichs pushed a commit to AnaisUrlichs/trivy that referenced this pull request Aug 10, 2023
Signed-off-by: Simar <simar@linux.com>
Co-authored-by: Simar <simar@linux.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants