
Filter expansion #1

Merged
lukeatayde merged 14 commits into master from filterExpansion
Nov 29, 2022

Conversation

@lukeatayde
Collaborator

@lukeatayde lukeatayde commented Nov 18, 2022

Description:

This PR introduces a new mechanism for filtering out false positives.

In prior versions of Gitleaks, one could filter out results using a collection of regex patterns known as the "allowlist". Unfortunately, patterns in the allowlist only match on the secret itself, not on the lines that enclose the secret. This means that filtering rules that need to consider the surrounding context of a secret were impossible to express prior to this update.

Consider the following code snippet:

non_sensitive_key_1 = "<SOME_HASH_1>"
non_sensitive_key_2 = "<SOME_HASH_2>"
non_sensitive_key_3 = "<SOME_HASH_3>"
non_sensitive_key_4 = "<SOME_HASH_4>"

Previously, each of these keys would be caught by Gitleaks' generic key matching rule, and the user had no way to filter out the results. This is because the allowlist regex patterns were only ever supplied the <SOME_HASH_i> strings, not the entire line. As a result, a filter pattern like non_sensitive_key_.* could not suppress the findings.

To address this problem, the update introduces the ability to configure an "enclosing line allowlist". These allowlists match against all the lines enclosed in a finding, not just the secret itself. As a result, one can filter out results like non_sensitive_key_1 = "<SOME_HASH_1>" using the pattern non_sensitive_key_.*. Like the older allowlists, these patterns can be applied globally or to a particular rule.

While this update expands the set of patterns a user can filter on, it isn't a complete solution. Go's built-in regex library doesn't support positive or negative "lookaround" assertions. This was an intentional decision by the language's developers to maintain linear-time performance in Go's standard library. With these assertions, a developer would have greater flexibility over what Gitleaks can natively filter out. Unfortunately, supporting them would require a new regex library, refactoring, and giving up the linear-time performance guarantee.
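The lookaround limitation is easy to confirm with the standard `regexp` package alone. The sketch below is illustrative; the `compiles` helper and the sample patterns are hypothetical and not part of this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's standard regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// A plain pattern is accepted.
	fmt.Println(compiles(`non_sensitive_key_.*`)) // true

	// Negative lookahead is rejected at compile time.
	fmt.Println(compiles(`(?!non_sensitive_)key_.*`)) // false

	// Positive lookbehind is rejected as well.
	fmt.Println(compiles(`(?<=token=)\w+`)) // false
}
```

Because the rejection happens when the pattern is compiled, a user-supplied lookaround pattern cannot be made to work without switching to a different regex engine.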

The modifications proposed in this PR could be simplified by updating the older version of the allowlist to match on the entire finding instead of the secret alone. I decided to keep the old allowlist around to maintain backwards compatibility. Any existing patterns that use regex anchor tokens (^ or $) would likely break otherwise.
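To make the backwards-compatibility concern concrete, here is a small sketch of how an anchored pattern behaves under each matching mode. The pattern, the sample secret, and the sample line are all made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored is a hypothetical existing allowlist pattern written with the
// assumption that it is matched against the secret alone.
var anchored = regexp.MustCompile(`^[0-9a-f]{8}$`)

func main() {
	secret := "deadbeef"
	line := `test_fixture_key = "deadbeef"`

	// Matched against the secret alone, the anchors line up with the value.
	fmt.Println(anchored.MatchString(secret)) // true

	// Matched against the whole line, ^ and $ now refer to the line's
	// boundaries, so the same pattern silently stops matching.
	fmt.Println(anchored.MatchString(line)) // false
}
```

Keeping the two allowlists separate means patterns like this one continue to behave as their authors intended.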

Other Changes

  • Renamed the "line" field in the findings structure to "lines". This is to reflect the fact that the field can hold more than one line when it encodes newline characters.

Checklist:

  • [✅] Does your PR pass tests?
  • [✅] Have you written new tests for your changes?
  • [✅] Have you linted your code locally prior to submission?

@Kennysan Kennysan self-assigned this Nov 22, 2022
@lukeatayde lukeatayde requested a review from blevz November 22, 2022 21:15
@lukeatayde lukeatayde requested a review from blevz November 28, 2022 19:23
Collaborator

@blevz blevz left a comment


lgtm

@lukeatayde lukeatayde merged commit b671eb3 into master Nov 29, 2022

3 participants