Describe the bug
I have a code base where a certain sanitizer is used both as sanitize(id) and as sanitize(id.toString()) (this is Kotlin). I've defined two sanitizer rules, each of which correctly matches one of these patterns. However, when they are used together, it looks like the more generic one (sanitize(id)) takes priority and the other one is completely ignored:
# These two sanitizer rules together should produce zero findings
# (identifying sanitizers in all four test cases), but they produce
# (false positive) findings at test1 and test2
pattern-sanitizers:
  # Alone, produces findings at test1 and test2
  - patterns:
      - pattern: sanitize($X)
      - focus-metavariable: $X
    by-side-effect: true
  # Alone, produces findings at test3 and test4
  - patterns:
      - pattern: sanitize($X.toString())
      - focus-metavariable: $X
    by-side-effect: true
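For readers without access to the playground, here is a minimal sketch of the two call shapes the rules are aimed at. Only sanitize(...) comes from the report. The source() and sink() helpers, the function names, and the function bodies are assumptions made for illustration, not the actual test file.

```kotlin
// Assumed helpers: only sanitize(...) appears in the report itself.
fun source(): StringBuilder = StringBuilder("tainted")   // hypothetical taint source
fun sanitize(x: Any) { /* assumed to clean x by side effect */ }
fun sink(x: Any) = println(x)                             // hypothetical taint sink

// Shape A: the sanitizer receives id.toString().
// Intended to be covered by `sanitize($X.toString())` with focus on $X (= id).
fun viaToString() {
    val id = source()
    sanitize(id.toString())
    sink(id)
}

// Shape B: the sanitizer receives id directly.
// Intended to be covered by `sanitize($X)` with focus on $X (= id).
fun direct() {
    val id = source()
    sanitize(id)
    sink(id)
}

fun main() {
    viaToString()
    direct()
}
```

Note that `sanitize($X)` also matches shape A syntactically, with $X bound to id.toString() rather than id, which is presumably where the two rules start to interact.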
Expected behavior
All sanitizer rules should be matched separately. It should never be possible to create more matches by adding a new sanitizer rule to existing ones, but that's exactly what we observe right now: commenting out the second rule will avoid matching test1 and test2, but enabling it produces these false positive matches.
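The expectation can be stated as a small toy model: a test case is reported unless at least one sanitizer recognises it, so adding a sanitizer can only shrink the set of findings. This is a sketch of the expected semantics, not of Semgrep's implementation. The call shapes assigned to test1 through test4 are assumptions inferred from the comments in the rule, and findings, direct, and viaToString are hypothetical names.

```kotlin
// Toy model of the expected semantics; this is not how Semgrep is implemented.
typealias Sanitizer = (String) -> Boolean

// A test case is reported unless at least one sanitizer recognises its call shape.
fun findings(calls: Map<String, String>, sanitizers: List<Sanitizer>): Set<String> =
    calls.filterValues { call -> sanitizers.none { matches -> matches(call) } }.keys

// Assumed call shapes, inferred from the comments in the rule.
val calls = mapOf(
    "test1" to "sanitize(id.toString())",
    "test2" to "sanitize(id.toString())",
    "test3" to "sanitize(id)",
    "test4" to "sanitize(id)"
)

val direct: Sanitizer = { it == "sanitize(id)" }
val viaToString: Sanitizer = { it == "sanitize(id.toString())" }

fun main() {
    val one = findings(calls, listOf(viaToString))
    val both = findings(calls, listOf(viaToString, direct))
    println(one)   // [test3, test4]
    println(both)  // []
    check(one.containsAll(both)) // adding a sanitizer never adds findings
}
```

Under this model the findings with both rules enabled are always a subset of the findings with either rule alone, which is the property the observed behavior violates.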
Screenshots
With only one sanitizer pattern enabled (sanitize($X.toString())), test1 and test2 examples are correctly identified as sanitized and not matched:
After adding a second sanitizer pattern, matching of the previous rule appears to be affected, as test1 and test2 are now considered matches:
What is the priority of the bug to you?
P0: blocking your adoption of Semgrep or workflow
P1: important to fix or quite annoying
P2: regular bug that should get fixed
Environment
The issue surfaced when using Semgrep Pro for integration within the CI/CD, but it is reproducible on semgrep.dev as well.
Use case
Significantly reduce false positives, increasing confidence in the Semgrep results and enabling its adoption in our CI/CD.
There are three sanitization rules, each of them matching the relevant test example when the other two are commented out, but as soon as any two are enabled, they all get effectively cancelled and produce three false positives:
@andrew-konstantinov for the first issue here you have a workaround: https://semgrep.dev/playground/s/7KNqw; the second rule I would have written it like this: https://semgrep.dev/playground/s/8GNlB. Since sources/sanitizers/sinks can be arbitrary search-mode formulas, there is often a way to work around things. But yeah we need to address those limitations to make it simpler to write taint rules, both were known to us, but still very useful to get this bug report, so thanks for that and for the examples.
Thank you, @IagoAbal! I will try to apply this approach to my rules.
Are there any resources you could share about debugging issues such as this one, besides https://semgrep.dev/docs/troubleshooting/rules? I like the "Inspect Rule" gadget in the Playground UI, but sometimes it's not enough. I'm comfortable reading/hacking the source code, so pointers to useful methods would suffice as well.
Hope it helps! For the issues you reported here, I think "Inspect Rule" is probably one of the best tools you have. In the second example, you can observe that the individual pattern-insides are matching code, but the patterns matches nothing (because it computes the intersection). If you use the semgrep-core binary you also have the -dfg_tainting flag, although I don't think it would have helped you for any of these issues. You are also welcome to tag me in our Community Slack ("iago (Semgrep)") any time.
To Reproduce
A concise example that demonstrates the problem (Kotlin): https://semgrep.dev/playground/s/5rzll