Fast and syntax-aware semantic code pattern search for many languages: like grep but for code



semgrep is a tool for easily detecting and preventing bugs and anti-patterns in your codebase. It combines the convenience of grep with the correctness of syntactical and semantic search. Developers, DevOps engineers, and security engineers use semgrep to write code with confidence.

Try it now:


Language support:

Python, JavaScript, Go, Java, and C are supported. TypeScript and PHP are coming.

Example patterns:

Pattern                                  Matches
$X == $X                                 if (node.id == node.id): ...
requests.get(..., verify=False, ...)     requests.get(url, timeout=3, verify=False)
os.system(...)                           from os import system; system('echo semgrep')
$ELEMENT.innerHTML                       el.innerHTML = "<img src='x' onerror='alert(`XSS`)'>";
$TOKEN.SignedString([]byte("..."))       ss, err := token.SignedString([]byte("HARDCODED KEY"))

See more example patterns in the live registry viewer.


On macOS, binaries are available via Homebrew:

brew install returntocorp/semgrep/semgrep

On Ubuntu, an install script is available with each release.


To try semgrep without installation, you can also run it via Docker:

docker run --rm -v "${PWD}:/home/repo" returntocorp/semgrep --help


Example Usage

Here is a simple Python example. We want to retrieve an object by ID:

def get_node(node_id, nodes):
    for node in nodes:
        if node.id == node.id:  # Oops, supposed to be 'node_id'
            return node
    return None

This is a bug. Let's use semgrep to find bugs like it, using a simple search pattern: $X == $X. It will find all places in our code where the left- and right-hand sides of a comparison are the same expression:

$ semgrep --lang python --pattern '$X == $X'
3:        if node.id == node.id:  # Oops, supposed to be 'node_id'
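
To make the idea behind `$X == $X` concrete, here is a minimal sketch built on Python's standard `ast` module. It is not how semgrep is implemented. The helper name `find_self_comparisons` is hypothetical, and the sketch only handles a single `==` comparison in Python source.

```python
import ast


def find_self_comparisons(source):
    """Return the line numbers where both sides of an == are the same expression."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)
            # ast.dump compares structure, so spacing and line positions are ignored
            and ast.dump(node.left) == ast.dump(node.comparators[0])
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Sample input modelled on the get_node example
SAMPLE = '''def get_node(node_id, nodes):
    for node in nodes:
        if node.id == node.id:
            return node
    return None
'''

print(find_self_comparisons(SAMPLE))  # [3]
```

Comparing syntax trees rather than text is what lets a pattern like this ignore whitespace and formatting differences.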


For simple patterns use the --lang and --pattern flags. This mode of operation is useful for quickly iterating on a pattern on a single file or folder:

semgrep --lang javascript --pattern 'eval(...)' path/to/file.js

Configuration Files

For advanced configuration use the --config flag. This flag automagically handles a multitude of input configuration types:

  • --config <file|folder|yaml_url|tarball_url|registry_name>

In the absence of this flag, a default configuration is loaded from .semgrep.yml or multiple files matching .semgrep/**/*.yml.
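
The default lookup described above could be sketched roughly as follows. This is an illustration only: the function name `find_default_configs` is hypothetical, and the precedence (a single `.semgrep.yml` wins over the `.semgrep/` folder) is an assumption, not something this README states.

```python
from pathlib import Path


def find_default_configs(root):
    """Return candidate default config files under root (illustrative only)."""
    root = Path(root)
    single = root / ".semgrep.yml"
    # ASSUMPTION: a top-level .semgrep.yml takes precedence over the folder
    if single.is_file():
        return [single]
    folder = root / ".semgrep"
    if folder.is_dir():
        # Mirrors the .semgrep/**/*.yml glob mentioned in the text
        return sorted(folder.glob("**/*.yml"))
    return []
```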


As mentioned above, you may also specify a registry_name as configuration. r2c provides a registry of rules. These rules have been tuned on thousands of repositories using our analysis platform.

You can browse the registry online. To run a set of rules, use a rule ID or namespace.

# Run a specific rule
semgrep --config=

# Run a set of rules
semgrep --config=

The registry features rules for many programming errors, including security issues and correctness bugs. Security rules are annotated with CWE and OWASP metadata when applicable. OWASP rule coverage per language is displayed below.

Pattern Features

semgrep patterns make use of two primary features:

  • Metavariables like $X, $WIDGET, or $USERS_2. Metavariable names can only contain uppercase characters, or _, or digits, and must start with an uppercase character or _. Names like $x or $some_value are invalid. Metavariables are used to track a variable across a specific code scope.
  • The ... (ellipsis) operator. The ellipsis operator abstracts away sequences of zero or more arguments, statements, characters, and more.
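
The metavariable naming rule can be expressed as a regular expression. The sketch below is a reading of the rule as written in this README, not code taken from semgrep. The names `METAVAR_RE` and `is_valid_metavariable` are hypothetical.

```python
import re

# "$" followed by an uppercase letter or "_", then uppercase letters, digits, or "_"
METAVAR_RE = re.compile(r"\$[A-Z_][A-Z_0-9]*")


def is_valid_metavariable(name):
    """Check a name against the metavariable rule described above."""
    return METAVAR_RE.fullmatch(name) is not None


for candidate in ["$X", "$WIDGET", "$USERS_2", "$x", "$some_value"]:
    print(candidate, is_valid_metavariable(candidate))
```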

For example,

$FILE = open(...)

will find all occurrences in your code where the result of an open() call with zero or more arguments is assigned to a variable.

Composing Patterns

You can also construct rules by composing multiple patterns together.

Let's consider an example:

rules:
  - id: open-never-closed
    patterns:
      - pattern: $FILE = open(...)
      - pattern-not-inside: |
          $FILE = open(...)
          ...
          $FILE.close()
    message: "file object opened without corresponding close"
    languages: [python]
    severity: ERROR

This rule looks for files that are opened but never closed. It accomplishes this by looking for the open(...) pattern and not a following close() pattern. The $FILE metavariable ensures that the same variable name is used in the open and close calls. The ellipsis operator allows for any arguments to be passed to open and any sequence of code statements in-between the open and close calls. We don't care how open is called or what happens up to a close call, we just need to make sure close is called.
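
As an illustration, here is a small Python target file with one function the rule is meant to flag and one it is meant to skip. The function names are hypothetical, and the comments describe the behavior expected from the rule description above, not verified semgrep output.

```python
def read_without_close(path):
    # Expected to match: the file is opened and no close() call follows
    f = open(path)
    return f.read()


def read_and_close(path):
    # Expected not to match: the same variable is closed after the open
    f = open(path)
    data = f.read()
    f.close()
    return data
```

Both functions return the same data. The only difference is whether the variable bound by `$FILE` is later closed.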

For more information on rule fields like patterns and pattern-not-inside see the configuration documentation.


Equivalences are another key concept in semgrep. semgrep automatically searches for code that is semantically equivalent to the pattern. For example, the pattern subprocess.Popen(...) will fire on a direct call to subprocess.Popen and also on the following code, which calls the same function through an import alias:

from subprocess import Popen as sub_popen

result = sub_popen("ls")
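
One way to picture this kind of equivalence is to resolve import aliases back to fully qualified names before matching. The sketch below does that with Python's `ast` module. It is an illustration of the concept, not semgrep's implementation. `resolve_calls` is a hypothetical helper, and it only understands `from m import n as alias` and plain `module.func(...)` calls.

```python
import ast


def resolve_calls(source):
    """Return fully qualified call names, expanding `from m import n as alias`."""
    tree = ast.parse(source)
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module:
            for item in node.names:
                # Map the local name (alias, if any) to module.name
                aliases[item.asname or item.name] = f"{node.module}.{item.name}"
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                calls.append(aliases.get(func.id, func.id))
            elif isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
                calls.append(f"{func.value.id}.{func.attr}")
    return calls


ALIASED = 'from subprocess import Popen as sub_popen\n\nresult = sub_popen("ls")\n'
DIRECT = 'import subprocess\n\nsubprocess.Popen("ls")\n'

print(resolve_calls(ALIASED))  # ['subprocess.Popen']
print(resolve_calls(DIRECT))   # ['subprocess.Popen']
```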

For a full list of semgrep feature support by language see the language matrix.

Programmatic Usage

To integrate semgrep's results with other tools, you can get results in machine-readable JSON format with the --json option, or formatted according to the SARIF standard with the --sarif flag.

See our output documentation for details.
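
A minimal sketch of consuming the JSON output from Python is shown below. It uses only the `--config` and `--json` flags mentioned in this README. The key names `results`, `check_id`, and `path` are assumptions about the output shape, so check the output documentation before relying on them. The function names are hypothetical.

```python
import json
import subprocess


def run_semgrep_json(config, target):
    """Run semgrep with --json and decode its stdout (requires semgrep on PATH)."""
    completed = subprocess.run(
        ["semgrep", "--config", config, "--json", target],
        capture_output=True,
        text=True,
        check=False,
    )
    return json.loads(completed.stdout)


def summarize(report):
    """Return (rule id, file path) pairs from a decoded report."""
    # ASSUMPTION: findings live under "results" with "check_id" and "path" keys
    return [(r.get("check_id"), r.get("path")) for r in report.get("results", [])]
```

Keeping the parsing step separate from the subprocess call makes it easy to test against a saved report without having semgrep installed.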



semgrep is LGPL-licensed. Feel free to help out: see CONTRIBUTING.

semgrep is a frontend to a larger program analysis library named pfff. pfff began and was open-sourced at Facebook but is now archived. The primary maintainer now works at r2c. semgrep was originally named sgrep and was renamed to avoid collisions with existing projects.

Commercial Support

semgrep is proudly supported by r2c. We're hiring!

Interested in a fully-supported, hosted version of semgrep? Drop your email and we'll ping you!
