Skip to content

Commit

Permalink
matching: ellipsis: Always try going deep (#8440)
Browse files Browse the repository at this point in the history
Pattern `... bar()` would behave strangely because it would only try to
match deeply if it did not match anything non-deeply. So depending on
how the target code looked, `... bar()` would match or not match inside
e.g. an `if` statement. This was confusing and not a great property for
Semgrep to have.

The regression was introduced in PR #852 as part of a set of
optimizations done back in 0.9.0 (!), but at present reverting this one
thing does not seem to have any negative perf impact.

Follows: c1ca429 ("Optimize deep statement matching (#852)")

Closes semgrep/semgrep-rules#660
Closes PA-2992

test plan:
make test # new tests
Also, compared against develop running p/default on 32 repos from
stress-test-monorepo, and no meaningful slowdown or increase in memory
usage was observed.
  • Loading branch information
IagoAbal committed Aug 17, 2023
1 parent 0c849f2 commit 8e718be
Show file tree
Hide file tree
Showing 7 changed files with 108 additions and 10 deletions.
25 changes: 25 additions & 0 deletions changelog.d/pa-2992.fixed
@@ -0,0 +1,25 @@
Fixed a regression introduced three years ago in 0.9.0, when optimizing
the evaluation of `...` (ellipsis) to be faster. We made `...` only match
deeply (inside an `if` for example) if nothing matched non-deeply, thus
causing that this pattern:

```python
foo()
...
bar($A)
```

would only produce a match rather than two on this code:

```python
foo()
if cond:
bar(x)
bar(y)
```

Semgrep matched from `foo()` to `bar(y)` and because of that it did not
try to match inside the `if`, thus there was no match from `foo()` to `bar(x)`.
However, if we commented out `bar(y)`, then Semgrep did match `bar(x)`.

Semgrep now produces the two expected matches.
17 changes: 8 additions & 9 deletions src/matching/Generic_vs_generic.ml
Expand Up @@ -2249,15 +2249,14 @@ and m_stmts_deep ~inside ~less_is_ok (xsa : G.stmt list) (xsb : G.stmt list) =
m_stmts_deep ~inside ~less_is_ok xsa bbs
| ( ({ s = G.ExprStmt ({ e = G.Ellipsis _i; _ }, _); _ } :: _ as xsa),
(_ :: _ as xsb) ) ->
(* let's first try without going deep *)
m_list__m_stmt xsa xsb >!> fun () ->
if_config
(fun x -> x.go_deeper_stmt)
~then_:
(match SubAST_generic.flatten_substmts_of_stmts xsb with
| None -> fail () (* was already flat *)
| Some (xsb, _UNUSED_last_stmt) -> m_list__m_stmt xsa xsb)
~else_:(fail ())
m_list__m_stmt xsa xsb
>||> if_config
(fun x -> x.go_deeper_stmt)
~then_:
(match SubAST_generic.flatten_substmts_of_stmts xsb with
| None -> fail () (* was already flat *)
| Some (xsb, _UNUSED_last_stmt) -> m_list__m_stmt xsa xsb)
~else_:(fail ())
(* dots: metavars: $...BODY *)
| ( ({ s = G.ExprStmt ({ e = G.N (G.Id ((s, _), _idinfo)); _ }, _); _ } :: _
as xsa),
Expand Down
13 changes: 13 additions & 0 deletions tests/rules/ellipsis_stmts_deep.py
@@ -0,0 +1,13 @@
# https://github.com/returntocorp/semgrep-rules/issues/660

def decorator_factory( foo ):
def decorator( function ):
# ok:reproducer-660
def function_wrapper( *args, **kwargs ):
# Do something with 'foo'.
return function( *args, **kwargs )
return function_wrapper
return decorator

@decorator_factory( 'bar' )
def test( ): ''' Simple reproducer. '''
29 changes: 29 additions & 0 deletions tests/rules/ellipsis_stmts_deep.yaml
@@ -0,0 +1,29 @@
rules:
- id: reproducer-660
patterns:
- pattern-inside: |
def $F(...):
...
def $FF(...):
...
...
- pattern-not-inside: |
def $F(...):
...
def $FF(...):
...
...
<... $FF ...>
- pattern: |
def $FF(...):
...
- focus-metavariable: $FF
message: function `$FF` is defined inside a function but never used
languages:
- python
severity: ERROR
metadata:
category: maintainability
technology:
- python
license: Commons Clause License Condition v1.0[LGPL-2.1-only]
13 changes: 13 additions & 0 deletions tests/rules/ellipsis_stmts_deep1.ts
@@ -0,0 +1,13 @@

foo();

obj.met(async () => {
something();
// ruleid: test
x = baz();
});

bar();

// ok: test
y = baz();
19 changes: 19 additions & 0 deletions tests/rules/ellipsis_stmts_deep1.yaml
@@ -0,0 +1,19 @@
rules:
- id: test
message: >
Test
languages:
- typescript
severity: WARNING
patterns:
- pattern: baz()
- pattern-inside: |
foo();
...
$X = baz();
- pattern-not-inside: |
foo();
...
bar();
...
$X = baz();
2 changes: 1 addition & 1 deletion tests/semgrep-rules

0 comments on commit 8e718be

Please sign in to comment.