Add per-line whitelisting capabilities #900

mschwager · 2020-06-03T17:31:39Z

We currently support a few different means of whitelisting: --exclude, --exclude-dir, and pattern-not to some extent. These are useful, but don't have the level of precision an AppSec team may need. Often a security engineer will want to whitelist a specific line in a file, not the entire file. Let's add functionality around whitelisting a specific line in a file. For example:

os.system('ls -alh')  # noqa command-injection-os-system: constant argument is okay

A good first pass would be simply including something like noqa as a comment to whitelist a line. A more fully featured version would be something like:

noqa <rule-id>: <message>

Where <rule-id> is optional and will only whitelist the specific rule. A naked noqa will whitelist all rules. The <message> is also optional and is used to describe why it's whitelisted. A common line will be something like noqa command-injection-os-system: AppSec reviewed.

Here are how some other linter/static analysis apps do it:

flake8: https://flake8.pycqa.org/en/latest/user/violations.html - they have a very comprehensive story for selecting and ignoring/whitelisting.
bandit: https://bandit.readthedocs.io/en/latest/config.html#suppressing-individual-lines
eslint: https://eslint.org/docs/user-guide/configuring#disabling-rules-with-inline-comments - they also have a good story here.

The text was updated successfully, but these errors were encountered:

mjambon · 2020-06-12T23:11:37Z

Here's a list of whitelisting methods I'm aware of:

excluding paths or rules from the command line.
using .semgrepignore files that would be similar to .gitignore files. They're familiar and local to a subfolder. It's nice to be able to rename or move a folder without having to update the semgrepignore paths. Possible extension: Instead of excluding whole files from semgrep scanning, it could only exclude certain rules. This would require a special syntax for the user to learn. Perhaps a yaml format would be appropriate.
from within a file, disabling certain rules for the whole file.
from within a file, disabling certain rules for a piece of code.

(1) and (2) can be done in the same way regardless of the target language. This is pretty nice to the user, who can reuse their knowledge to work across all languages.

(3) and (4) are both useful. (3) could be done using an external mechanism ((1) or (2)) so it's not critical to support it. (4) however needs to be tackled on its own. A familiar solution to disable (or enable) a check locally consists in interpreting special comments. The syntax is typically described as:

"place a comment containing semgrep: disable foo on the line exhibiting pattern foo, for which semgrep should remain silent".
or "place such special comment on its own line at the beginning of the block to which it applies".

The difficulty for (3) and (4) is that we need to establish a special syntax for comments, in all supported languages, and make sure it's easy and natural for the user. On the implementation side, it requires special work for each language, to parse special comments occurring in specific locations. This is commonly done for most languages, so there should be a way.

In conclusion:

(3) and (4) require a lot more work to implement and document than (1) and (2).
local override (4) seems valuable and without a great alternative, but also feels intrusive (what if there's also a linter and a documentation extraction tool, both requiring special comments as well?).

Personally, at this time:

I'd like to see how far local .semgrepignore files can take us.
Supporting special comment syntax in all the languages we support looks like a lot of work for the implementors but also for the user who works with multiple languages.
I'm actually not sure if disabling a check for just a local block of code or line is that useful. Disabling the check for the whole file may be a worthy compromise.

aryx · 2020-06-14T20:37:57Z

Yep want that feature too!
Note that regarding the comments for each language, we may not have to handle all kind of
comments, regexp sometimes can be good an universal :) just looking for semgrep-disable:
in the line before should be enough and work immediately for all languages.

aryx · 2020-06-16T18:49:25Z

Note a huge fan of "nosem". What does "noqa" means btw?
Otherwise great! thx!

mschwager · 2020-06-16T19:00:29Z

Note a huge fan of "nosem". What does "noqa" means btw?
Otherwise great! thx!

noqa comes from flake8, which I believe stands for "no QA" like "no quality assurance." But this is also a somewhat familiar trend as bandit uses nosec too. We could follow eslint's lead with something like semgrep-disable, which is much more descriptive but obviously misses brevity 🤷‍♂️ (although with rule names like python.flask.experimental.security.audit.secure-cookies.secure-session-cookies.secure-session-cookies I don't know if we'll see much brevity :))

mschwager added enhancement New feature or request UX labels Jun 3, 2020

dlukeomalley added feature:cli-ux and removed UX labels Jun 5, 2020

mschwager self-assigned this Jun 16, 2020

mschwager pushed a commit that referenced this issue Jun 16, 2020

Fix #900, add inline whitelisting capabilities

3e412e6

mschwager closed this as completed in 4b171bd Jun 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add per-line whitelisting capabilities #900

Add per-line whitelisting capabilities #900

mschwager commented Jun 3, 2020

mjambon commented Jun 12, 2020 •

edited

Loading

aryx commented Jun 14, 2020

aryx commented Jun 16, 2020

mschwager commented Jun 16, 2020 •

edited

Loading

Add per-line whitelisting capabilities #900

Add per-line whitelisting capabilities #900

Comments

mschwager commented Jun 3, 2020

mjambon commented Jun 12, 2020 • edited Loading

aryx commented Jun 14, 2020

aryx commented Jun 16, 2020

mschwager commented Jun 16, 2020 • edited Loading

mjambon commented Jun 12, 2020 •

edited

Loading

mschwager commented Jun 16, 2020 •

edited

Loading