Releases: semgrep/semgrep
Release v1.56.0
Release v1.55.2
1.55.2 - 2024-01-05
Fixed
- taint-mode: Semgrep was missing some sources occurring inside type expressions,
for example: `char *p = new char[source(x)]; sink(x);`
Now, if `x` is tainted by side-effect, Semgrep will check `x` inside the type
expression `char[...]`, record the taint, and generate a finding for
`sink(x)`. (pa-3313)
- taint-mode: C/C++: Sanitization by side-effect was not working correctly for
`ptr->fld` l-values. In particular, if `ptr` is tainted, and then `ptr->fld` is
sanitized, Semgrep will now correctly consider `ptr->fld` as clean. (pa-3328)
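The `ptr->fld` behavior above can be pictured with a small toy model. This is only an illustrative sketch, not Semgrep's implementation: the function name `is_tainted`, the tuple encoding of l-values, and the "longest prefix wins" rule are all assumptions made for the example, and the order of events is not modeled.

```python
def is_tainted(path, tainted, sanitized):
    """Toy field-sensitive lookup: the most specific known prefix decides.

    `path` is an l-value encoded as a tuple, e.g. ("ptr", "fld") for ptr->fld.
    `tainted` and `sanitized` are sets of such tuples (hypothetical encoding).
    """
    # Walk from the full l-value towards its base, e.g. ptr->fld, then ptr.
    for n in range(len(path), 0, -1):
        prefix = path[:n]
        if prefix in sanitized:
            return False
        if prefix in tainted:
            return True
    return False


tainted = {("ptr",)}            # ptr is tainted
sanitized = {("ptr", "fld")}    # then ptr->fld is sanitized
```

With this state, `is_tainted(("ptr", "fld"), tainted, sanitized)` is false in the model, while `ptr` itself is still treated as tainted.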
Release v1.55.1
Release v1.54.3
1.54.3 - 2023-12-22
Added
- Pro only: taint-mode: Added experimental `at-exit: true` option for sinks, which
makes a sink spec apply only to the "exit" instructions/statements of a function,
that is, the instructions after which the control flow exits the function. This is
useful for writing rules to find "leaks", such as checking that file descriptors
are being closed within the same function where they were opened.

For example, given this taint rule:

```yaml
pattern-sources:
  - by-side-effect: true
    patterns:
      - pattern: $FILE = open(...)
      - focus-metavariable: $FILE
pattern-sanitizers:
  - by-side-effect: true
    patterns:
      - pattern: $FILE.close(...)
      - focus-metavariable: $FILE
pattern-sinks:
  - at-exit: true
    pattern: |
      def $FUN(...):
        ...
```

Semgrep will report a finding in the code below since at `print(content)`, after
which the control flow reaches the exit of the function, the `file` has not yet
been closed:

```python
def test():
    file = open("test.txt")
    content = file.read()
    print(content) # FINDING
```

(pa-3266)
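As a sketch of how such a leak rule might play out on target code, the two hypothetical functions below differ only in whether the handle is closed before the function exits. Assuming the taint rule shown above, `read_leaky` mirrors the reported case, while `read_closed` calls `file.close()` (the sanitizer) before the exit and would presumably not be reported. The function names and the `path` parameter are illustrative only.

```python
def read_leaky(path):
    # `file` becomes tainted by side-effect at the open(...) call.
    file = open(path)
    content = file.read()
    # The function exits after this statement while `file` is still open,
    # which is the situation an at-exit sink is meant to catch.
    return content


def read_closed(path):
    file = open(path)
    content = file.read()
    # close(...) matches the sanitizer pattern, cleaning `file` before the exit.
    file.close()
    return content
```

Both functions return the same content at runtime; the difference is only visible to an analysis that tracks whether the handle was closed on the way out.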
Release v1.54.2
1.54.2 - 2023-12-21
Added
- metrics: added more granular information about pro engine configurations to
help differentiate scans using different engine capabilities. For instance,
maintainers are now able to distinguish intraprocedural scans without secrets
validation from intraprocedural scans with secrets validation. This allows us
to have a better understanding of usage and more accurately identify
product-specific issues (e.g., to see if something only affects secrets scans). (ea-297)
Fixed
- Revised the error message shown when running `semgrep ci` without being logged in
to clarify that `--config` is used with `semgrep scan`. (gh-9485)
Release v1.54.1
1.54.1 - 2023-12-20
No significant changes.
Release v1.54.0
1.54.0 - 2023-12-19
Added
- Pro only: taint-mode: In a function/method call, it is now possible to arbitrarily
propagate taint between arguments and the callee. For example, in C one can
propagate taint from the second argument of `strcat` to the first, that is,
`strcat($TO, $FROM)`. As another example, in C++ one can propagate taint from the
left operand of `>>` to the right one, that is, `$FROM >> $TO`. (pa-3131)
- Semgrep IDE integrations will now cache workspace targets, so a full traversal of
a workspace is no longer needed on every scan. (pdx-148)
Changed
- OCaml: switched to the tree-sitter-based parser instead of the menhir parser.
The tree-sitter parser has a more complete AST, especially for objects and
classes. (ocaml)
Fixed
- solidity: support ellipsis in the init part of a for-loop header. (gh-9431)
- taint-mode: Fixed the recently added `by-side-effect: only` option for taint sources,
so that it does not incorrectly taint expressions that are not l-values. For example,
given this taint source:

```yaml
pattern-sources:
  - by-side-effect: only
    patterns:
      - pattern: delete $VAR;
      - focus-metavariable: $VAR
```

The `get(*from)` expression should not become tainted since it's not an l-value:

```cpp
delete get(*from);
```

(pa-2980)
- In C++, a string literal now has the type `char *`, so it won't match the
`string` type. For instance,

```yaml
- metavariable-type:
    metavariable: $EXPR
    type: string
```

will only match

```cpp
string f;
// MATCH
int x = f.length();
```

but not

```cpp
const char *s;
// OK
s = "foo";
```

(pa-3236)
- taint-mode: Semgrep will now treat lambdas' parameters as fresh, so a taint rule
that finds double-deletes should not be triggered on the code below:

```cpp
for (ListNode *node : list) {
  list.erase(node, [](ListNode *p) { delete p; });
}
```

(pa-3298)
- Fixed a bug where empty tables in pyproject.toml files would fail to parse. (sc-1196)
Release v1.53.0
1.53.0 - 2023-12-12
Added
- Users can now ignore findings locally in Semgrep IDE Extensions, per workspace, and this will persist between restarts (pdx-154)
- Added a new subcommand 'semgrep test', which is an alias for 'semgrep scan
--test'. This means that if you were running semgrep on a directory named
'test', you will now have to use 'semgrep scan test'; otherwise it
will be interpreted as the new 'semgrep test' subcommand. (subcommand_test)
Changed
- Handling qualified identifiers in constant propagation

We've added support for qualified identifiers in constant propagation. Notably,
this enables the following matches (with the pro engine):

```yaml
rules:
  - id: cpp-const-field
    languages:
      - cpp
    message: testing
    severity: INFO
    pattern: std::cout<<1
```

```cpp
#include<iostream>
#include "a.h"

namespace B {
  class Bar {
    public:
      static const int one = 1;
  };
}

int main() {
  // ruleid: cpp-const-field
  std::cout<<1<<std::endl;
  // ruleid: cpp-const-field
  std::cout<<A::Foo::one<<std::endl;
  // ruleid: cpp-const-field
  std::cout<<B::Bar::one<<std::endl;
}
```

(gh-9354)
Fixed
- Updated the parser used for Rust. The largest change relates to how macros are
parsed. (rust)
Release v1.52.0
1.52.0 - 2023-12-05
Added
- Java: Semgrep will now recognize `String.format(...)` expressions as constant
strings when all their arguments are constant, but it will still not know
what exact string it is. For example, the code `String.format("Abc %s", "123")`
will match the pattern `"..."` but it will not match the pattern `"Abc 123"`. (pa-3284)
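One way to picture "constant, but with an unknown value" is a tiny abstract domain. The sketch below is a toy model under stated assumptions, not Semgrep's constant-propagation code: `ConstString`, `matches`, and `string_format` are hypothetical names, and `None` stands for "not known to be a constant string".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConstString:
    """A string known to be constant; `value` is None when the exact text is unknown."""
    value: Optional[str] = None


def matches(pattern: str, expr: Optional[ConstString]) -> bool:
    """Toy matcher: '"..."' accepts any constant string, other patterns need the exact value."""
    if expr is None:  # not a constant string at all
        return False
    if pattern == '"..."':
        return True
    return expr.value is not None and pattern == f'"{expr.value}"'


def string_format(*args: Optional[ConstString]) -> Optional[ConstString]:
    """Model of the entry above: constant if every argument is constant, value unknown."""
    if all(a is not None for a in args):
        return ConstString(value=None)
    return None


formatted = string_format(ConstString("Abc %s"), ConstString("123"))
```

Here `matches('"..."', formatted)` holds while `matches('"Abc 123"', formatted)` does not, mirroring the behavior described for `String.format("Abc %s", "123")`.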
Changed
- Inter-file diff scan will be gradually introduced to a small percentage of
users through a slow rollout process. Users who enable the pro engine and
run differential PR scans on GitHub or GitLab may experience the impact
of this update. (ea-268)
- secrets: now performs more aggressive deduplication for instances where an
invalid and valid match are reported at the same range. Instead of reporting
both, we now report only the valid match when they are otherwise visually
identical. (scrt-271)
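The secrets deduplication described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the dictionary keys (`path`, `start`, `end`, `text`, `valid`) are hypothetical field names, and comparing `text` is only a stand-in for "otherwise visually identical".

```python
def dedupe_by_range(findings):
    """Keep one finding per (path, start, end, text), preferring a valid match."""
    best = {}
    for finding in findings:
        key = (finding["path"], finding["start"], finding["end"], finding["text"])
        kept = best.get(key)
        # Keep the first finding seen, unless a valid one can replace an invalid one.
        if kept is None or (finding["valid"] and not kept["valid"]):
            best[key] = finding
    return list(best.values())


findings = [
    {"path": "a.py", "start": 3, "end": 3, "text": "token=abc", "valid": False},
    {"path": "a.py", "start": 3, "end": 3, "text": "token=abc", "valid": True},
    {"path": "a.py", "start": 9, "end": 9, "text": "token=xyz", "valid": False},
]
deduped = dedupe_by_range(findings)
```

In this example `deduped` holds two findings: the valid match at line 3 and the unrelated invalid match at line 9.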
Fixed
- In expression-based languages, definitions are also expressions.
This change allows dataflow to properly handle definition expressions.
For example, the pattern `0 == 0` will match `x == 0` in

```
def f(c) do
  x = (y = 0)
  x == 0
end
```

because now dataflow is able to handle the expression `y = 0`. (pa-3262)
- In version 1.14.0 (pa-2477) we made sink-matching more precise when the sink
specification was like:

```yaml
pattern-sinks:
  - patterns:
      - pattern: sink($X, ...)
      - focus-metavariable: $X
```

Where the sink specification most likely has the intent to specify the first
argument of `sink` as a sink, and `sink(ok1 if tainted else ok2)` should NOT
produce a finding, because `tainted` is not really what is being passed to
the `sink` function.

But we only intercepted the most simple pattern above, and more complex sink
specifications that had the same intent were not properly recognized.

Now we have generalized that pattern to cover more complex cases like:

```yaml
patterns:
  - pattern-either:
      - patterns:
          - pattern-inside: |
              def foo(...):
                ...
          - pattern: sink1($X)
      - patterns:
          - pattern: sink2($X)
          - pattern-not: bar(...)
  - focus-metavariable: $X
```

(pa-3284)
- Updated the parser used for Rust. (rust)
Release v1.51.0
1.51.0 - 2023-11-29
Added
- taint-mode: Added experimental rule option `taint_match_on: source` that makes
Semgrep report taint findings on the taint source rather than on the sink. (pa-3272)
Changed
- Elixir got moved to Pro. (elixir_pro)
- The 'fix_regex' field has been removed from the semgrep JSON output. Instead,
the 'fix' field contains the result of applying the fix_regex. (fix_regex)
- taint-mode: Tweaked experimental option `taint_only_propagate_through_assignments`
so that when it is enabled, `tainted.field` and `tainted(args)` will no longer
propagate taint. (pa-2193)
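For tooling that consumed 'fix_regex' from the JSON output, the change above means reading 'fix' instead. The snippet below is a hedged sketch: the `SAMPLE` document is a hypothetical, trimmed-down stand-in for real output, and placing `fix` under each result's `extra` object is an assumption about the output layout that should be checked against your Semgrep version.

```python
import json

# Hypothetical, trimmed-down sample of Semgrep JSON output; real output has more fields.
SAMPLE = """
{"results": [
  {"check_id": "example.rule", "path": "app.py", "extra": {"fix": "safe_call(x)"}},
  {"check_id": "other.rule", "path": "app.py", "extra": {}}
]}
"""


def collect_fixes(raw):
    """Return (check_id, fix) pairs for results that carry a 'fix' field."""
    data = json.loads(raw)
    fixes = []
    for result in data.get("results", []):
        # Assumption: the fix text lives at result["extra"]["fix"].
        fix = result.get("extra", {}).get("fix")
        if fix is not None:
            fixes.append((result["check_id"], fix))
    return fixes


fixes = collect_fixes(SAMPLE)
```

Results without a `fix` entry are simply skipped, so the same function works for rules that offer no autofix.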
Fixed
- Fixed Kotlin parse error.
Previously, code like this would throw a parse error

```kotlin
fun f1(context : Context) {
    Foo(context).elem = var1
}
```

due to not recognizing `Foo(context).elem = ...` as valid.
Now calls are recognized as valid in the left-hand side of
assignments. (ea-104)
- Python: `async` statements are now translated into the Dataflow IL so Semgrep
will be able to report findings e.g. inside `async with ...` statements. (gh-9182)
- In GitLab output, use the correct URL attached to the rule instead of generating it.
This fixes the URL for supply chain findings. (gitlab)
- The language server will no longer crash on startup for IntelliJ. (language-server)
- The language server no longer crashes when installed through pip on Mac platforms. (language-server-macos)
- taint-mode: When we encountered an assignment `lval := expr` where `expr` returned
no taints, we automatically cleaned `lval`. This was correct in the early days of
taint-mode, before we introduced taint by side-effect, but it is wrong now. The LHS
`lval` may be tainted by side-effect, in which case we cannot clean it just because
`expr` returns no taint. Now that we introduced `by-side-effect: only` it is also
possible for `expr` to taint `lval` by side-effect and return no immediate taint.

This kind of source should now work as expected:

```yaml
- by-side-effect: true
  patterns:
    - pattern: |
        $X = source()
    - focus-metavariable: $X
```

(pa-3164)
- taint-mode: Fixed a bug in the recently added `by-side-effect: only` option:
when matching l-values of the form `l.x` and `l[i]`, the `l`
occurrence would unexpectedly become tainted too. This led to FPs in some
typestate rules like those checking for double-lock or double-free.

Now a source such as:

```yaml
- by-side-effect: only
  patterns:
    - pattern: lock($L)
    - focus-metavariable: $L
```

will not produce FPs on code such as:

```
lock(obj.l)
unlock(obj.l)
lock(obj.l)
```

(pa-3282)
- taint-mode: Removed a hack that prevented `lval = new ...` assignments from cleaning
the `lval` even though the RHS was not tainted. This caused FPs in double-free rules.
For example, given this source:

```yaml
pattern-sources:
  - by-side-effect: only
    patterns:
      - pattern: delete $VAR;
      - focus-metavariable: $VAR
```

And the code below:

```cpp
while (nondet) {
  int *v = new int;
  delete v; // FP
}
```

The `delete v` statement was reported as a double-free, because Semgrep did not
consider that `v = new int` would clean the taint in `v`. (pa-3283)