Add overrides via .semgrepconfig.yml #319

underyx · 2021-07-01T14:03:02Z

You can add a .semgrepconfig.yml that looks like this:

overrides:
  - if.path: "tests/*" # the first two lines in this example are "conditions"
    if.rule_id: "secrets.*aws*"
    mute: true # this last line in this example is an "action"

  - if.policy_slug: "*important*"
    if.severity_in: ["ERROR"]
    unmute: true # report issues even if they were muted with # nosemgrep
    set_severity: WARNING # but lower their severity a bit

Every finding will be checked against these override definitions one by one.
The override definitions will run in the same order you have them in your config file.

An override's actions are applied when all if. conditions are true. If an override applies, it'll take actions as described by keys not starting with if.*.

Available conditions

key	example value	`True` if the finding's…
`if.path`	`"tests/*"`	path matches the given glob
`if.rule_id`	`"secrets.aws"`	rule ID matches the given glob
`if.ruleset_id`	`"secrets"`	rule is from a ruleset matching the given glob
`if.finding_id`	`"1fd8aac00"`	finding's `syntactic_id` starts with this value
`if.policy_slug`	`"security*"`	rule is from a policy matching the given glob
`if.severity_in`	`["INFO", "WARNING"]`	rule's severity is in the given list

Available actions

key	example value	description
`mute`	`true`	acts as if the line had a `# nosemgrep` comment
`unmute`	`true`	ignores any `# nosemgrep` comments
`set_severity`	`"INFO"`	changes the reported severity of the issue

aryx · 2021-07-01T14:27:29Z

Why not be more general and just let people filter the JSON with whatever they want (jq, python comprehension, whatever)?

underyx · 2021-07-01T14:36:26Z

Why not be more general and just let people filter the JSON with whatever they want (jq, python comprehension, whatever)?

First, as context, here's my earlier response to this from Slack:

That’s because semgrep-agent publishes the findings to semgrep.dev as well. So if you’re thinking of a pipeline such as scan | postprocess | publish it’s not trivial to place a processing step in the middle cause semgrep-agent already does scan | publish, and by design it’s supposed to handle the piping of data as well (from scan to publish) so that users can drop the Semgrep CI config in their repo and don’t have to think about it.

Now, a more specific list of reasons against that solution:

users would need to learn semgrep's JSON schema
we'd need to commit much more to a stable JSON format (including internal-only keys within metadata:)
it'd make it easier for users to introduce bugs (including security bugs) in Semgrep CI's behavior
maybe there is a clean way to do this but I haven't yet thought of a way to inject arbitrary code in the middle of semgrep-agent's execution — I'd want to avoid eval() if possible
this is even more difficult if we want to allow people to use any language or even binary (such as jq), cause then we need to pipe to an arbitrary subprocess
the semgrep-agent docker image doesn't even include jq nor any language environment other than Python right now, we try to keep this image lean
but even if we include some extra languages, what if a user doesn't know how to write code in any of them?
and even if users can already write python, it sounds annoying for everyone to individually have to search Python docs for how to do glob matching and similar operations
with this PR's declarative overrides, we can apply them at any point in semgrep-agent's execution, while piping results to jq is post-processing only. this PR also only does post-processing for the sake of simplicity, but we could later keep using the same config format and make some overrides skip scanning entire files for a performance boost

DrewDennison · 2021-07-01T17:13:24Z

Thoughts on naming this semgrep.yml? I feel like we have an opportunity grow this into semgrep’s version of package.json or Dockerfile so we should choose wisely on the name. I would not do .hidden since want people to see that semgrep is installed when running ls

underyx · 2021-07-01T17:15:25Z

Thoughts on naming this semgrep.yml?

I'd be okay with it. I just wanted the solution to work from the get-go and iterate in the PR. semgrep.yml already has a special meaning as it contains local rules if you want your rules committed, so we'll need to redo some of the special handling of that. But it's not too bad since rules: and overrides: can easily coexist.

Should I update the PR to name it semgrep.yml?

ievans · 2021-07-01T18:35:40Z

@underyx is/will this be documented in semgrep-docs?

underyx · 2021-07-01T18:40:44Z

@underyx is/will this be documented in semgrep-docs?

Depends on the reaction here, really. Currently I'm not even sure if this PR will be accepted. If there's very good reception from users and r2seers, then we can add it in semgrep-docs.

brendongo

IMO we should plan out more what pain point this is solving and if adding more fields and complexity to semgrep-action is the best solution for those problems.

aryx requested review from nbrahms, DrewDennison and ievans July 1, 2021 14:27

DrewDennison approved these changes Jul 1, 2021

View reviewed changes

brendongo suggested changes Jul 1, 2021

View reviewed changes

sabrinabrogren linked an issue Jul 2, 2021 that may be closed by this pull request

False positives management #322

Open

underyx added 5 commits July 7, 2021 11:38

Add overrides via .semgrepconfig.yml

58cdebb

Update readme

6ae99c7

Update readme

6614e13

Make JSON output commit_date stable in tests

f1a69a6

Add if.finding_id condition

59a26be

underyx force-pushed the bence/semgrepconfig branch from 4c07dfd to 59a26be Compare July 7, 2021 09:01

readme update

cf83058

ievans mentioned this pull request Aug 27, 2021

Ability to exclude rules from rulesets on command line semgrep/semgrep#2530

Closed

brendongo closed this Nov 19, 2021

brendongo deleted the bence/semgrepconfig branch March 8, 2022 19:02

underyx restored the bence/semgrepconfig branch April 6, 2022 22:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add overrides via .semgrepconfig.yml #319

Add overrides via .semgrepconfig.yml #319

underyx commented Jul 1, 2021 •

edited

Loading

aryx commented Jul 1, 2021

underyx commented Jul 1, 2021 •

edited

Loading

DrewDennison commented Jul 1, 2021 •

edited

Loading

underyx commented Jul 1, 2021

ievans commented Jul 1, 2021

underyx commented Jul 1, 2021

brendongo left a comment

Add overrides via .semgrepconfig.yml #319

Add overrides via .semgrepconfig.yml #319

Conversation

underyx commented Jul 1, 2021 • edited Loading

Available conditions

Available actions

aryx commented Jul 1, 2021

underyx commented Jul 1, 2021 • edited Loading

DrewDennison commented Jul 1, 2021 • edited Loading

underyx commented Jul 1, 2021

ievans commented Jul 1, 2021

underyx commented Jul 1, 2021

brendongo left a comment

Choose a reason for hiding this comment

underyx commented Jul 1, 2021 •

edited

Loading

underyx commented Jul 1, 2021 •

edited

Loading

DrewDennison commented Jul 1, 2021 •

edited

Loading