Add an improved rule selection mechanism #51

Closed
bradlarsen opened this issue May 5, 2023 · 2 comments
Labels
detection: Related to rules or detection of sensitive information
enhancement: New feature or request

Comments

@bradlarsen
Collaborator

The current mechanism for selecting rules to be used when scanning is very simplistic: you can use all the default rules, and you can specify additional YAML-format rules to load with the --rules FILE_OR_DIR option.

What are the problems with this?

  • There is no way to disable a particular rule at scan time
  • There is no way to enable only a particular rule at scan time
  • Some of the default rules are much noisier than others (the Generic * rules in particular) and account for the largest proportion of reported findings

Let's improve the rule selection mechanism. I'm thinking that this would involve a new "ruleset" mechanism, which is an explicitly-specified set of available rules. Perhaps a YAML list of rule names to enable. Or perhaps gitignore format.
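To make the gitignore-style option a bit more concrete, here is a rough Python sketch of how such a ruleset might be resolved. Everything in it is an assumption for illustration: the function name `resolve_ruleset`, the rule names other than the `Generic *` pattern, and the semantics (lines are applied in order, a plain glob enables matching rules, a leading `!` disables them, later lines override earlier ones). It is one possible reading of "gitignore format", not a settled design.

```python
from fnmatch import fnmatchcase


def resolve_ruleset(available, ruleset_lines):
    """Return the rule names from `available` that the ruleset leaves enabled.

    Lines are applied top to bottom, so a later line overrides an earlier one.
    """
    enabled = set()
    for raw in ruleset_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        negate = line.startswith("!")
        pattern = line[1:].strip() if negate else line
        matched = {name for name in available if fnmatchcase(name, pattern)}
        if negate:
            enabled -= matched  # "!pattern" disables matching rules
        else:
            enabled |= matched  # a plain pattern enables matching rules
    # Report in the order the rules were originally listed.
    return [name for name in available if name in enabled]


# Placeholder rule names, purely for illustration.
AVAILABLE = [
    "AWS API Key",
    "Generic Secret",
    "Generic API Key",
    "GitHub Personal Access Token",
]

RULESET = [
    "# start from everything, drop the noisy generic rules, keep one of them",
    "*",
    "!Generic *",
    "Generic API Key",
]

selected = resolve_ruleset(AVAILABLE, RULESET)
print(selected)
# ['AWS API Key', 'Generic API Key', 'GitHub Personal Access Token']
```

A plain YAML list of rule names would be the degenerate case of this: only exact names, no `!` lines, and nothing enabled unless listed.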

We will also want a new rules list CLI command, which will print out the set of selected rules according to some ruleset.

@bradlarsen added the enhancement and detection labels on May 5, 2023
@bradlarsen
Collaborator Author

The possible design space is large! For example, one approach would be to add additional properties to the rules, such as severity, precision, or keyword tags, and add a small expression language to select rules according to a filter.

This is a slippery slope. I do want to add properties like that to rules (which would help with #34, for example), but I don't want to accidentally invent an ad-hoc expression language. Inventing/choosing an expression language does seem unavoidable though: even a simple allow/deny list of rules is a simple expression language! Let's be thoughtful and deliberate about what the expression language is.
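To illustrate the point that an allow/deny list is already a tiny expression language, here is a minimal Python sketch in which selectors are just composable predicates over rules. The `Rule` fields (`severity`, `tags`), the combinator names (`named`, `tagged`, `negate`, `all_of`), and the rule names are all hypothetical; nothing here reflects an existing rule schema.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str = "medium"            # hypothetical property
    tags: FrozenSet[str] = frozenset()  # hypothetical property


# A selector is any predicate over a rule.
Selector = Callable[[Rule], bool]


def named(*names: str) -> Selector:
    wanted = frozenset(names)
    return lambda rule: rule.name in wanted


def tagged(tag: str) -> Selector:
    return lambda rule: tag in rule.tags


def negate(sel: Selector) -> Selector:
    return lambda rule: not sel(rule)


def all_of(*sels: Selector) -> Selector:
    return lambda rule: all(s(rule) for s in sels)


def select(rules, selector):
    """Names of the rules accepted by `selector`, in their original order."""
    return [r.name for r in rules if selector(r)]


RULES = [
    Rule("AWS API Key", "high", frozenset({"cloud"})),
    Rule("Generic Secret", "low", frozenset({"generic"})),
    Rule("Generic API Key", "low", frozenset({"generic"})),
]

allow_list = named("AWS API Key")            # an allow list is one expression
deny_list = negate(named("Generic Secret"))  # a deny list is another
richer = all_of(negate(tagged("generic")), lambda r: r.severity == "high")

print(select(RULES, allow_list))
print(select(RULES, deny_list))
print(select(RULES, richer))
```

The allow list and deny list use the same two primitives as the richer filter, which is why it seems worth deciding up front which combinators are in the language and which are not.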

@CameronLonsdale

From my perspective, it would be great to take advantage of transparent updates to the tool, which include the community's newest secret rules. That way I don't need to spend extra effort going through them and selectively enabling them (since, after all, I don't have a clue what I might find in my scans).

A simple deny list for entire patterns would work for cases where, for example, the API key is not a secret but just something the SaaS uses for rate limiting. I'd rather exclude those entirely.
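As a rough Python sketch of what I mean: the deny list is applied on top of whatever the default rules are at the time, so rules added in a later release are picked up without touching the configuration. The function name `apply_deny_list` and all rule names are made up for illustration, and reporting unknown deny entries is just one idea for catching typos or renamed rules.

```python
def apply_deny_list(default_rules, denied):
    """Return (selected rule names, deny entries that match no default rule)."""
    denied = set(denied)
    selected = [name for name in default_rules if name not in denied]
    unknown = sorted(denied - set(default_rules))
    return selected, unknown


# Made-up rule names; V2 stands in for the defaults after a tool update.
DEFAULTS_V1 = ["Rate Limit API Key", "AWS API Key"]
DEFAULTS_V2 = DEFAULTS_V1 + ["New Vendor Token"]
DENIED = ["Rate Limit API Key"]

selected_v1, unknown_v1 = apply_deny_list(DEFAULTS_V1, DENIED)
selected_v2, unknown_v2 = apply_deny_list(DEFAULTS_V2, DENIED)

print(selected_v1)  # ['AWS API Key']
print(selected_v2)  # ['AWS API Key', 'New Vendor Token']
```

With an allow list, the second call would have needed a configuration change to pick up the new rule.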

Then for within-rule filtering, you'd go with #59 or #52 (comment)
