At Slack, proactively securing our systems is a top priority. One way we achieve this is by automating vulnerability detection with static code analysis. Although an abundance of tools exists for scanning most programming languages, our codebase is overwhelmingly written in Hack, a language not widely used outside of Slack. Rather than building our own tool from scratch, we are extending an open source static analysis tool, Semgrep, to be compatible with Hack. But how do we teach Semgrep the Hack programming language?

Like human languages, programming languages have a structure known as a grammar. Grammar rules are used to create a parser, which converts source code into a concrete syntax tree (CST), a structural representation of the code. Tree-sitter is a fast and robust library that can generate a CST from our Hack grammar rules. This CST has many use cases, such as robust syntax highlighting, code folding, and linting. Most importantly, Semgrep uses this CST to understand Hack on a semantic level. Combined with Semgrep rules, this semantic understanding makes it possible to detect vulnerabilities in source code. This process is demonstrated by the following diagram.

[Figure: tree-sitter-hack use in Semgrep]

In summary, we use tree-sitter-hack to teach Semgrep the Hack programming language.


$ git clone
$ cd tree-sitter-hack
$ npm install


$ printf 'function main(): void {\n  print "wyd, world\\n";\n}\n' > script.hack
$ npx tree-sitter generate
$ npx tree-sitter parse script.hack
(script [0, 0] - [3, 0]
  (function_declaration [0, 0] - [2, 1]
    name: (identifier [0, 9] - [0, 13])
    (parameters [0, 13] - [0, 15])
    return_type: (primitive_type [0, 17] - [0, 21])
    body: (compound_statement [0, 22] - [2, 1]
      (expression_statement [1, 2] - [1, 23]
        (print_expression [1, 2] - [1, 22]
          (string [1, 8] - [1, 22]))))))
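
Each node in the output above carries a zero-based `[row, column]` start position and an exclusive end position. As a quick way to read those ranges, here is a minimal Python sketch that slices the script text by position. It is not part of this repo: `node_text` is a hypothetical helper, and it assumes ASCII source, so column offsets match character offsets.

```python
# Contents of script.hack: a three-line function followed by a trailing newline.
SOURCE = 'function main(): void {\n  print "wyd, world\\n";\n}\n'


def node_text(source: str, start: tuple[int, int], end: tuple[int, int]) -> str:
    """Return the text covered by a node, given zero-based (row, column) positions.

    The end position is treated as exclusive, as in the parse output.
    """
    lines = source.split("\n")
    start_row, start_col = start
    end_row, end_col = end
    if start_row == end_row:
        return lines[start_row][start_col:end_col]
    # Multi-line node: tail of the first line, whole middle lines, head of the last line.
    parts = [lines[start_row][start_col:]]
    parts.extend(lines[start_row + 1:end_row])
    parts.append(lines[end_row][:end_col])
    return "\n".join(parts)


print(node_text(SOURCE, (0, 9), (0, 13)))  # range of the identifier node
print(node_text(SOURCE, (1, 8), (1, 22)))  # range of the string node
```

Slicing with the range of the `script` node, `(0, 0)` to `(3, 0)`, returns the whole file, including its trailing newline.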


$ npx tree-sitter generate
$ bin/test-corpus



bin/generate-parser is a wrapper around tree-sitter generate that skips parser generation if grammar.js hasn't changed since the last run.
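
One way such a skip-if-unchanged wrapper could work is to store a checksum of grammar.js after each successful generation and compare against it on the next run. The Python sketch below illustrates that idea only. The actual script may detect changes differently (for example, by modification time), and the stamp file and all function names here are assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_regeneration(grammar: Path, stamp: Path) -> bool:
    """True if no digest has been recorded yet or the grammar has changed since."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != file_digest(grammar)


def generate_if_changed(grammar: Path, stamp: Path, generate: Callable[[], None]) -> bool:
    """Call `generate` only when the grammar changed; return whether it ran.

    The digest is recorded after `generate` returns, so a failed run
    (one that raises) is retried next time.
    """
    if not needs_regeneration(grammar, stamp):
        return False
    generate()
    stamp.write_text(file_digest(grammar))
    return True
```

In a real wrapper, `generate` could be something like `lambda: subprocess.run(["npx", "tree-sitter", "generate"], check=True)`, with the stamp file (hypothetical) kept next to grammar.js and excluded from version control.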


Unlike most other Tree-sitter projects, we break out test cases into separate files (see test/cases). This gives editors an easier time syntax highlighting test cases, and individual files are also easier to navigate than the corpus.txt files used by Tree-sitter.

We use bin/generate-corpus to generate test/corpus/case1.txt from the individual files in test/cases so we can still use tree-sitter test.
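
To illustrate the idea, here is a minimal Python sketch that stitches per-case files into a single file in Tree-sitter's corpus format (a name between two rows of `=`, the source code, a `---` separator, then the expected tree). It is not the repo's actual script. The layout it assumes, a `.hack` source file paired with an `.exp` file holding the expected tree, and the function names are hypothetical.

```python
from pathlib import Path

RULE = "=" * 80  # Tree-sitter accepts a header delimiter of three or more "=".


def corpus_entry(name: str, source: str, expected: str) -> str:
    """Format one test case in Tree-sitter's corpus file layout."""
    return f"{RULE}\n{name}\n{RULE}\n\n{source.rstrip()}\n\n---\n\n{expected.strip()}\n"


def build_corpus(cases_dir: Path) -> str:
    """Concatenate every <case>.hack / <case>.exp pair under cases_dir."""
    entries = []
    for source_file in sorted(cases_dir.rglob("*.hack")):
        expected_file = source_file.with_suffix(".exp")
        if not expected_file.exists():
            continue  # skip cases that have no expected tree yet
        # Use the path relative to cases_dir (without extension) as the test name.
        name = source_file.relative_to(cases_dir).with_suffix("").as_posix()
        entries.append(
            corpus_entry(name, source_file.read_text(), expected_file.read_text())
        )
    return "\n".join(entries)
```

Writing the result of `build_corpus(Path("test/cases"))` to a file under test/corpus would then let tree-sitter test pick the cases up as usual.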


bin/test-corpus runs bin/generate-corpus and bin/generate-parser before running tree-sitter test.


bin/test-dir recursively runs bin/ts-errors on all files with a .hack or .php extension in the given directory.

$ ./bin/test-dir hhvm/hphp/hack/test
(3,11)-(3,18) extends
(3,1)-(6,1) function foo(): string {\n  return "AUTO332\n}\n
(4,10)-(6,1) "AUTO332\n}\n
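
The recursive file selection described above can be sketched in a few lines of Python. This is an illustration rather than the repo's script. `find_sources`, `failing_files`, and the `has_errors` predicate (a stand-in for running bin/ts-errors on one file) are hypothetical names.

```python
from pathlib import Path
from typing import Callable

SOURCE_SUFFIXES = {".hack", ".php"}


def find_sources(root: Path) -> list[Path]:
    """All .hack and .php files under root, searched recursively, in sorted order."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )


def failing_files(root: Path, has_errors: Callable[[Path], bool]) -> list[Path]:
    """Subset of the source files for which the given check reports errors."""
    return [p for p in find_sources(root) if has_errors(p)]
```

Keeping the error check as a parameter separates file discovery from parsing, so the same walk can back both a verbose report and one that lists only failing files.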


A quieter version of bin/test-dir that only outputs failing files.


If you're interested in contributing, please see the guide.


npm doesn't allow package names containing the word "hack" in its registry, which is why the repo name does not match the package name. The rejection message from npm reads:

"Unfortunately, the word "hack" triggers our spam detection and can't be used in package names. We recommend choosing other keywords that highlight your package's functionality."


There's no officially published Hack language spec, so we have to make do.