An extensible, heuristic-based vulnerability scanning tool for installed npm packages.
WARNING: npm-scan is still very much in early development and should not be used in production. We are developing more accurate heuristics. We are actively seeking new contributors with ideas for additional heuristics, so please do get in touch :)
Another important project you can contribute to is npm-zoo, where past malicious packages are uploaded for research. We need more examples in order to develop better heuristics.
npm install https://github.com/spaceraccoon/npm-scan.git
npx npm-scan

Usage: npx npm-scan [options]

Options:
  -V, --version                      output the version number
  -p, --packages-dir <dir path>      set directory path for packages. defaults to node_modules
  -e, --exclude-heuristics <items>   exclude comma-separated list of heuristics
  -o, --output <file path>           set file path for JSON output
  -v, --verbose                      print more details for each package scan
  -s, --strict                       include low-risk heuristics
  -h, --help                         output usage information

git clone https://github.com/spaceraccoon/npm-scan.git
npm link
npm run scan
npm run test
npm run lint
Push changes on a separate branch.
To add a new heuristic, you will require the following:
- name - Name of the heuristic
- message - Description of the heuristic
- reference - URL to a report or disclosure of the vulnerability/suspicious code
- run - A function that runs the tests, returning a result object if the test is positive and null otherwise.
Refer to the existing heuristics for formatting.
There are two types of heuristics: file-based regex checks and manifest-based checks (such as checking version numbers, last update time, etc.). The type is specified in lib/heuristics/index.js and affects how the scanner runs the heuristic.
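As a rough illustration, a file-based heuristic with the four fields above might look like the sketch below. Only the field names come from this README; the heuristic itself (`evalUsage`), the `run(line, lineNumber)` signature, the shape of the result object, and the reference URL are assumptions made for the example, so check the existing heuristics in lib/heuristics for the actual format.

```javascript
// Hypothetical heuristic: flags lines that call eval().
// The run() signature and result shape are assumptions, not npm-scan's actual API.
const evalUsage = {
  name: 'evalUsage',
  message: 'Line calls eval(), which can execute dynamically built code',
  reference: 'https://example.com/report', // placeholder URL, not a real disclosure
  run(line, lineNumber) {
    const pattern = /\beval\s*\(/;
    if (!pattern.test(line)) {
      return null; // negative result: nothing suspicious on this line
    }
    // Positive result: describe what was found and where.
    return {
      name: this.name,
      message: this.message,
      lineNumber,
      line: line.trim(),
    };
  },
};

console.log(evalUsage.run('const x = eval(payload);', 12));
console.log(evalUsage.run('const y = 1 + 2;', 13));
```

The second call returns null because the line contains no call to eval().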
On 26/11/2018, a popular NPM package, event-stream, with millions of weekly installs was found to contain obfuscated and encrypted malicious code that tries to steal a user's bitcoins. This was caused by an attacker, posing as a new maintainer of event-stream, adding an unknown dependency (flatmap-stream) that contained the malicious code.
This incident highlighted the shocking lack of accountability in NPM, with immense ramifications. It is the norm for packages to be linked to a chain of other packages, making it hard to maintain trust. Furthermore, NPM by default saves dependencies with caret ranges (for example ^1.2.3), which accept all new minor and patch versions of a package, making it even harder to keep track of what is actually installed.
Q. So how can we prevent such incidents from happening again?
A. We want to give all users and developers power to check their currently installed node_modules for malicious intent.
Q. How will you do that?
A. We created npm-scan. It uses simple regex-based heuristics to check for suspicious lines of code in any installed node module. A package with many suspicious lines of code indicates possible malicious behavior. The resulting per-package scores are compiled into a report so that the user can check which dependencies contain suspicious code and determine whether there are any areas of concern.
Q. How is this different from other scanners out there?
A. Most scanners, such as SourceClear and Black Duck, conduct their scanning based on databases such as the National Vulnerability Database. This is slow: it can take weeks or months for a vulnerability to be disclosed, in which time the malicious package would have been automatically updated on millions of devices.
Our heuristics-based approach gives immediate feedback on how suspicious a package is without having to run it. The heuristics flag any behavior that deviates from the norm for typical node packages. For example, flatmap-stream is shipped in minified form (under dependencies in package.json), which is not typical behavior (none of the top 50 node packages, encompassing 1000+ dependencies, ship in minified form). Although this will be flagged with a low severity score (since it isn't exactly malicious behavior in itself), combine that with other heuristics, such as containing the hexadecimal version of the string "AES256", and flatmap-stream starts to look very suspicious. This will all be reflected in npm-scan's report.
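To make the "AES256" example concrete, here is a minimal sketch of how two such signals could be checked on a single line of source. It is not npm-scan's implementation: the flag names, the `flagLine` helper, and the line-length threshold used as a stand-in for "minified" are all assumptions for illustration.

```javascript
// Hex encoding of the ASCII string "AES256".
const hexNeedle = Buffer.from('AES256', 'utf8').toString('hex');

// Case-insensitive so that hex digits a-f would match in either case
// if a different needle were used.
const hexPattern = new RegExp(hexNeedle, 'i');

// Assumed threshold: treat a very long line as a hint that the file is minified.
const MINIFIED_LINE_LENGTH = 500;

// Hypothetical helper: returns the names of all flags raised by one line.
function flagLine(line) {
  const flags = [];
  if (hexPattern.test(line)) flags.push('hexEncodedCipherName');
  if (line.length > MINIFIED_LINE_LENGTH) flags.push('possiblyMinified');
  return flags;
}

console.log(hexNeedle); // 414553323536
console.log(flagLine('var k = "414553323536";'));
```

Neither flag is conclusive on its own; the point of the sketch is that a line or package raising several flags at once deserves a closer look.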
Currently, our detection consists of line-based regex. We score each package's severity based on the number of flagged lines.
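A minimal sketch of this kind of line-based scoring, assuming the score is simply the count of lines matched by at least one pattern. The two patterns and the `scoreSource` function are hypothetical; the real heuristics live in lib/heuristics and may weight results differently.

```javascript
// Hypothetical patterns for illustration only.
const patterns = [/\beval\s*\(/, /414553323536/i];

// Returns the number of lines matched by at least one pattern.
function scoreSource(source) {
  return source
    .split(/\r?\n/)
    .filter((line) => patterns.some((p) => p.test(line)))
    .length;
}

const sample = [
  'const a = 1;',
  'eval(code);',
  'var k = "414553323536";',
].join('\n');

console.log(scoreSource(sample)); // 2
```

A line that matches several patterns is still counted once here; counting per match instead would be an equally plausible design.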
In the future, we can assign categories to each heuristic to do more complex scoring, such as one based on CVSS v3.0.