Commit
matching: ellipsis: Always try going deep (#8440)
Pattern `... bar()` would behave strangely because it would only try to match deeply if it did not match anything non-deeply. So depending on how the target code looked, `... bar()` would match or not match inside e.g. an `if` statement. This was confusing and not a great property for Semgrep to have.

The regression was introduced in PR #852 as part of a set of optimizations done back in 0.9.0 (!), but at present reverting this one thing does not seem to have any negative perf impact.

Follows: c1ca429 ("Optimize deep statement matching (#852)")
Closes semgrep/semgrep-rules#660
Closes PA-2992

test plan:
make test # new tests

Also, compared against develop running p/default on 32 repos from stress-test-monorepo, and no meaningful slowdown or increase in memory usage was observed.
Showing 7 changed files with 108 additions and 10 deletions.
@@ -0,0 +1,25 @@
Fixed a regression introduced three years ago in 0.9.0, when optimizing
the evaluation of `...` (ellipsis) to be faster. We made `...` only match
deeply (inside an `if` for example) if nothing matched non-deeply, so
that this pattern:

```python
foo()
...
bar($A)
```

would produce only one match rather than two on this code:

```python
foo()
if cond:
    bar(x)
bar(y)
```

Semgrep matched from `foo()` to `bar(y)` and because of that it did not
try to match inside the `if`, thus there was no match from `foo()` to `bar(x)`.
However, if we commented out `bar(y)`, then Semgrep did match `bar(x)`.

Semgrep now produces the two expected matches.
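
The behavior described in the changelog entry above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Semgrep's actual matcher (which is written in OCaml): statements are plain strings, a nested block is a hypothetical `(header, body)` tuple, and the names `walk`, `match_ends`, `match`, and the `always_deep` flag are invented for this illustration.

```python
from typing import Callable, Iterator, List, Tuple, Union

# Toy AST (an assumption of this sketch): a simple statement is a string,
# a compound statement is a (header, body) tuple such as ("if cond", [...]).
Stmt = Union[str, Tuple[str, list]]


def walk(block: List[Stmt], depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, statement) for every simple statement, in source order."""
    for stmt in block:
        if isinstance(stmt, str):
            yield depth, stmt
        else:
            _header, body = stmt
            yield from walk(body, depth + 1)


def match_ends(rest: List[Stmt], is_end: Callable[[str], bool],
               always_deep: bool) -> List[str]:
    """Find statements that can close a `start ... end` pattern."""
    candidates = [(d, s) for d, s in walk(rest) if is_end(s)]
    shallow = [s for d, s in candidates if d == 0]
    # Modeled old behavior: stop as soon as something matched non-deeply.
    if shallow and not always_deep:
        return shallow
    # Modeled new behavior: always consider nested statements too.
    return [s for _d, s in candidates]


def match(block: List[Stmt], start: str, is_end: Callable[[str], bool],
          always_deep: bool) -> List[Tuple[str, str]]:
    """Return (start, end) pairs for the pattern `start ... end`."""
    results = []
    for i, stmt in enumerate(block):
        if stmt == start:
            ends = match_ends(block[i + 1:], is_end, always_deep)
            results += [(start, end) for end in ends]
    return results


def is_bar(stmt: str) -> bool:
    """Stand-in for matching `bar($A)`."""
    return stmt.startswith("bar(")


target = ["foo()", ("if cond", ["bar(x)"]), "bar(y)"]

print(match(target, "foo()", is_bar, always_deep=False))
# [('foo()', 'bar(y)')]
print(match(target, "foo()", is_bar, always_deep=True))
# [('foo()', 'bar(x)'), ('foo()', 'bar(y)')]
print(match(target[:-1], "foo()", is_bar, always_deep=False))
# [('foo()', 'bar(x)')]
```

The last call mirrors the "commented out `bar(y)`" case: in the old model the deep match only appears once no shallow match exists, which is the inconsistency this commit removes.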
@@ -0,0 +1,13 @@
# https://github.com/returntocorp/semgrep-rules/issues/660

def decorator_factory( foo ):
    def decorator( function ):
        # ok:reproducer-660
        def function_wrapper( *args, **kwargs ):
            # Do something with 'foo'.
            return function( *args, **kwargs )
        return function_wrapper
    return decorator

@decorator_factory( 'bar' )
def test( ): ''' Simple reproducer. '''
@@ -0,0 +1,29 @@
rules:
  - id: reproducer-660
    patterns:
      - pattern-inside: |
          def $F(...):
            ...
            def $FF(...):
              ...
            ...
      - pattern-not-inside: |
          def $F(...):
            ...
            def $FF(...):
              ...
            ...
            <... $FF ...>
      - pattern: |
          def $FF(...):
            ...
      - focus-metavariable: $FF
    message: function `$FF` is defined inside a function but never used
    languages:
      - python
    severity: ERROR
    metadata:
      category: maintainability
      technology:
        - python
      license: Commons Clause License Condition v1.0[LGPL-2.1-only]
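
The intent of the `reproducer-660` rule above (flag an inner function that is defined but never referenced in its enclosing function) can be approximated with Python's standard `ast` module. This is a rough sketch and not how Semgrep evaluates the rule: the helper name `unused_inner_functions` is hypothetical, and unlike the rule it counts a reference anywhere in the enclosing body (including a recursive call inside the inner function itself), not only after the definition.

```python
import ast
from typing import List

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def unused_inner_functions(source: str) -> List[str]:
    """Names of functions defined directly in another function's body
    whose name is never referenced anywhere in that enclosing body."""
    unused = []
    for outer in ast.walk(ast.parse(source)):
        if not isinstance(outer, FUNC_TYPES):
            continue
        for inner in outer.body:
            if not isinstance(inner, FUNC_TYPES):
                continue
            # A `def` stores its name as a plain string, so only real
            # references (ast.Name nodes) count as a use here.
            used = any(
                isinstance(node, ast.Name) and node.id == inner.name
                for stmt in outer.body
                for node in ast.walk(stmt)
            )
            if not used:
                unused.append(inner.name)
    return unused


# The reproducer from the test file: every inner function is returned,
# so nothing should be reported (matching the `ok:` annotation).
REPRODUCER = """
def decorator_factory( foo ):
    def decorator( function ):
        def function_wrapper( *args, **kwargs ):
            return function( *args, **kwargs )
        return function_wrapper
    return decorator

@decorator_factory( 'bar' )
def test( ): ''' Simple reproducer. '''
"""

# A hypothetical counter-example with a genuinely unused inner function.
UNUSED = """
def outer():
    def helper():
        pass
    return 1
"""

print(unused_inner_functions(REPRODUCER))  # []
print(unused_inner_functions(UNUSED))      # ['helper']
```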
@@ -0,0 +1,13 @@

foo();

obj.met(async () => {
  something();
  // ruleid: test
  x = baz();
});

bar();

// ok: test
y = baz();
@@ -0,0 +1,19 @@
rules:
  - id: test
    message: >
      Test
    languages:
      - typescript
    severity: WARNING
    patterns:
      - pattern: baz()
      - pattern-inside: |
          foo();
          ...
          $X = baz();
      - pattern-not-inside: |
          foo();
          ...
          bar();
          ...
          $X = baz();
Submodule semgrep-rules updated from c86b76 to 61a7cb