Analysis tools are programs that analyze the codebases that are affected by the CVEs in the dataset. For the purposes of the CVE Benchmark, an analysis tool is expected to identify the weaknesses in the vulnerable source code without raising an alert on the patched source code.
The analysis tools are integrated into the CVE Benchmark tooling through tool drivers: small scripts that act as the interface between the analysis tools and the CVE Benchmark tooling.
The purpose of an analysis tool driver is to execute its analysis tool appropriately on the relevant commits of a CVE and convert its results to a common form that can be used to generate benchmark reports.
The contrib/tools directory contains analysis tool drivers that have been contributed by the community. The names of these directories are not significant, and drivers do not have to be installed into this directory in order to be used by bin/cli run.
Some versions of the following analysis tools are supported with drivers:
Analysis tool | Driver location |
---|---|
CodeQL | contrib/tools/codeql |
ESLint | contrib/tools/eslint |
NodeJSScan (njsscan) | contrib/tools/nodejsscan |
Some vendors might decide not to publicly release a tool driver that integrates their security product into the CVE Benchmark. Please contact your security tool vendor for more information about their support.
Before using bin/cli run to evaluate the ability of a code analysis tool to generate alerts, you should:
- Install the analysis tool.
- Configure the analysis tool driver in a configuration file.
For each supported analysis tool, there should be documentation about how to install the tool. For example, see eslint/README.md to learn how to install eslint.
A driver must be configured with an entry in the tools section of a configuration file (config.json). Drivers need to be configured with at least some knowledge about where their backing analysis tool is located. For more information about configuration files, see Configuration or the README for each driver. Note that it is up to the driver how paths in these configurations are interpreted, and that there are no general guarantees about how relative paths will be resolved.
For example, to benchmark eslint you must insert the following snippet in your configuration file:
{
  "tools": {
    "eslint-default": {
      "bin": "node",
      "args": [
        "/home/user-name/ossf-cve-benchmark/build/ts/contrib/tools/eslint/src/eslint.js"
      ],
      "options": {
        "eslintDir": "/home/user-name/analysis-tools/eslint-default"
      }
    }
  }
}
This snippet provides the identifier you must use with bin/cli commands that require a --tool option. For example, in the snippet above eslint-default is the identifier for using eslint with bin/cli run.
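As a rough illustration of how the identifiers relate to the configuration file, the sketch below parses a configuration and lists the keys of its tools section. The `ToolConfig` and `BenchmarkConfig` interfaces and the `toolIds` helper are hypothetical names inferred from the snippet above, not part of the CVE Benchmark tooling.

```typescript
// Shape of one entry in the "tools" section, inferred from the snippet above.
interface ToolConfig {
  bin: string;
  args: string[];
  options?: unknown; // driver-specific, so left untyped in this sketch
}

// Assumed top-level shape; a real configuration file may hold more sections.
interface BenchmarkConfig {
  tools?: Record<string, ToolConfig>;
}

// Hypothetical helper: returns the tool identifiers defined in a configuration.
function toolIds(configText: string): string[] {
  const config = JSON.parse(configText) as BenchmarkConfig;
  return Object.keys(config.tools ?? {});
}

const exampleConfig = `{
  "tools": {
    "eslint-default": {
      "bin": "node",
      "args": ["/home/user-name/ossf-cve-benchmark/build/ts/contrib/tools/eslint/src/eslint.js"],
      "options": { "eslintDir": "/home/user-name/analysis-tools/eslint-default" }
    }
  }
}`;

console.log(toolIds(exampleConfig));
```

Because the configuration must be valid JSON, a stray trailing comma in the args array would make `JSON.parse` throw before any identifier is found.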
To add support for a new analysis tool, you should write an analysis tool driver. Typically, implementing a driver only takes a couple of hundred lines of code! To add a new driver, you should:
- Clone or fork the openssf-cve-benchmark repository.
- Add a new directory to contrib/tools.
- Add a README.md file that describes how the analysis tool can be installed, and how the driver can be configured.
- Optionally, add an executable installers/install.sh or installers/install.cmd script that can install a version of the analysis tool. For example, see contrib/tools/eslint/installers/install.sh.
- Add your new driver to the table of supported analysis tools.
- Open a pull request!
When adding support for a new driver, you might want to reuse the existing logic for running analysis tools on multiple CVEs. For more information about using driver.ts in a new driver, see eslint.ts.
When a user runs bin/cli run, they are required to supply a tool identifier <TOOL_ID> with the --tool option. bin/cli looks up <TOOL_ID> in the configuration file of the run, and executes the bin property of that driver configuration with the following positional arguments:
- 1 ... n. fixed arguments: the values in the args property of the driver configuration. This is usually just a single string that points to the driver implementation.
- n + 1. options: a path to a dynamically generated JSON file that contains the inputs for the driver. The format of this JSON file is described by the DriverInputs type.
For example, consider the following contents of config.json:
{
  "tools": {
    "eslint-default": {
      "bin": "node",
      "args": [
        "/home/user-name/ossf-cve-benchmarking/build/ts/contrib/tools/eslint/src/eslint.js"
      ],
      "options": {
        "eslintDir": "/home/user-name/analysis-tools/eslint-2020-12-08"
      }
    }
  }
}
Here, the <TOOL_ID> is eslint-default and the args value is the path to the driver implementation eslint.js. To run an analysis on the command line you would use:
$ bin/cli run --tool eslint-default CVE-123-456 CVE-789-000
which executes:
$ node \
/home/user-name/ossf-cve-benchmarking/build/ts/contrib/tools/eslint/src/eslint.js \
/tmp/driver-inputs.json
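The lookup and argument assembly can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the actual bin/cli implementation: `buildDriverCommand`, `ToolConfig` and `BenchmarkConfig` are hypothetical names, and the inputs path is passed in rather than generated.

```typescript
interface ToolConfig {
  bin: string;
  args: string[];
  options?: unknown;
}

interface BenchmarkConfig {
  tools: Record<string, ToolConfig>;
}

// Hypothetical helper: returns the executable followed by its positional arguments.
function buildDriverCommand(
  config: BenchmarkConfig,
  toolId: string,
  inputsPath: string
): string[] {
  const tool = config.tools[toolId];
  if (tool === undefined) {
    throw new Error(`No driver configured for tool identifier "${toolId}"`);
  }
  // Arguments 1..n are the fixed args; argument n + 1 is the driver inputs file.
  return [tool.bin, ...tool.args, inputsPath];
}

const config: BenchmarkConfig = {
  tools: {
    "eslint-default": {
      bin: "node",
      args: [
        "/home/user-name/ossf-cve-benchmarking/build/ts/contrib/tools/eslint/src/eslint.js",
      ],
    },
  },
};

const command = buildDriverCommand(config, "eslint-default", "/tmp/driver-inputs.json");
console.log(command.join(" "));
```

Keeping the inputs file as the last argument means a driver can locate it without knowing how many fixed arguments precede it.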
The JSON file with the driver inputs contains the --tool option value, the CVEs to analyze, and the effective configuration.
In this example, the effective configuration contains two things of particular interest:
- tools.<TOOL_ID>.options: a driver-specific JSON value with additional configuration options for the driver, for instance where the analysis tool is installed
- results: the directory where the driver should emit result files for the analysis of each CVE
For the current example, the /tmp/driver-inputs.json file will contain the following fragments:
{
  "toolID": "eslint-default",
  "bcves": [ { "CVE": "CVE-123-456", ... }, { "CVE": "CVE-789-000", ... } ],
  "config": {
    "results": "/home/user-name/ossf-cve-benchmarking/work/results",
    "tools": {
      "eslint-default": {
        "bin": "node",
        "args": [
          "/home/user-name/ossf-cve-benchmarking/build/ts/contrib/tools/eslint/src/eslint.js"
        ],
        "options": {
          "eslintDir": "/home/user-name/analysis-tools/eslint-2020-12-08"
        }
      },
      ...
    }
  }
}
So after /home/user-name/ossf-cve-benchmarking/build/ts/contrib/tools/eslint/src/eslint.js has executed the eslint installed at /home/user-name/analysis-tools/eslint-2020-12-08, the /home/user-name/ossf-cve-benchmarking/work/results directory should contain one to four files with information about how the analysis tool performed on the vulnerable and fixed commits of CVE-123-456 and CVE-789-000.
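On the driver side, the first step is to pull the relevant values out of the inputs file. The sketch below shows one way to do that, assuming only the fields visible in the fragment above; the real DriverInputs type may contain more. `readDriverSetup` and `DriverSetup` are hypothetical names, and the JSON text is passed in as a string (a real driver would read it from the file named by its last command-line argument).

```typescript
// Field names follow the fragment above; anything not shown there is omitted.
interface DriverInputs {
  toolID: string;
  bcves: { CVE: string }[];
  config: {
    results: string;
    tools: Record<
      string,
      { bin: string; args: string[]; options?: Record<string, unknown> }
    >;
  };
}

// Hypothetical summary of what a driver needs before it starts analyzing.
interface DriverSetup {
  cves: string[];
  resultsDir: string;
  options: Record<string, unknown>;
}

function readDriverSetup(inputsText: string): DriverSetup {
  const inputs = JSON.parse(inputsText) as DriverInputs;
  const tool = inputs.config.tools[inputs.toolID];
  if (tool === undefined) {
    throw new Error(`No configuration found for tool "${inputs.toolID}"`);
  }
  return {
    cves: inputs.bcves.map((bcve) => bcve.CVE),
    resultsDir: inputs.config.results,
    options: tool.options ?? {}, // drivers without options get an empty object
  };
}

const exampleInputs = `{
  "toolID": "eslint-default",
  "bcves": [{ "CVE": "CVE-123-456" }, { "CVE": "CVE-789-000" }],
  "config": {
    "results": "/home/user-name/ossf-cve-benchmarking/work/results",
    "tools": {
      "eslint-default": {
        "bin": "node",
        "args": ["/home/user-name/ossf-cve-benchmarking/build/ts/contrib/tools/eslint/src/eslint.js"],
        "options": { "eslintDir": "/home/user-name/analysis-tools/eslint-2020-12-08" }
      }
    }
  }
}`;

const setup = readDriverSetup(exampleInputs);
console.log(setup.cves, setup.resultsDir, setup.options);
```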
When bin/cli run ... executes a driver, the driver should emit files to the provided results directory. It's up to the driver how it names these files, but they should be valid according to Log.schema.json.
Below is an example with a run of the eslint driver on commit ba6a6f13691000ffaf22ef8e731513737659447f of CVE-2020-4066. We can see that an alert was raised on line 106 of file classifiers/svm/SvmLinear.js. It is then up to the subsequent report generator to decide the value of this alert with respect to CVE-2020-4066.
{
  "runs": [
    {
      "CVE": "CVE-2020-4066",
      "commit": "ba6a6f13691000ffaf22ef8e731513737659447f",
      "toolID": "eslint-default",
      "status": "SUCCESS",
      "alerts": [
        ...
        {
          "ruleID": "security/detect-non-literal-fs-filename",
          "location": {
            "file": "classifiers/svm/SvmLinear.js",
            "line": 106
          }
        }
        ...
      ]
    }
  ]
}
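A driver could assemble such a log with a few plain types before serializing it. The sketch below mirrors only the fields shown in the example; Log.schema.json remains the authoritative definition and may require or allow more. `makeLog` and `resultFileName` are hypothetical helpers, the file naming scheme is an arbitrary choice, and `"SUCCESS"` is the only status value confirmed by the example.

```typescript
interface Alert {
  ruleID: string;
  location: { file: string; line: number };
}

interface Run {
  CVE: string;
  commit: string;
  toolID: string;
  status: string; // only "SUCCESS" appears in the example above
  alerts: Alert[];
}

interface Log {
  runs: Run[];
}

// Hypothetical helper: wraps the alerts of one analyzed commit in a log object.
function makeLog(cve: string, commit: string, toolID: string, alerts: Alert[]): Log {
  return { runs: [{ CVE: cve, commit, toolID, status: "SUCCESS", alerts }] };
}

// Hypothetical naming scheme; drivers are free to name result files differently.
function resultFileName(run: Run): string {
  return `${run.toolID}-${run.CVE}-${run.commit}.json`;
}

const log = makeLog(
  "CVE-2020-4066",
  "ba6a6f13691000ffaf22ef8e731513737659447f",
  "eslint-default",
  [
    {
      ruleID: "security/detect-non-literal-fs-filename",
      location: { file: "classifiers/svm/SvmLinear.js", line: 106 },
    },
  ]
);

// A real driver would write this text to a file in the results directory.
console.log(resultFileName(log.runs[0]));
console.log(JSON.stringify(log, null, 2));
```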
When analyzing the commits of a CVE, drivers are expected to behave in much the same way as when the corresponding analysis tool has been configured by an expert. For example:
- The driver should use the analysis rules that are relevant for the selected CVE data. For example, if a CVE is about a buffer overflow in C++ code, the driver shouldn't run queries about cross-site scripting in JavaScript. If a driver establishes that an analysis tool has no relevant rules or queries for a particular CVE, the driver can abort early without starting the analysis run.
- Drivers should use any extra information that is normally available to an analysis tool when configured by an expert. For example, some analysis tools need information about special build instructions that are required to produce meaningful results for a particular project. Such information could be downloaded as part of the install steps for a driver.
- If external information is not available for an analysis tool, the driver can use information contained in the benchmark CVE data. For example, the CWEs value of a benchmark CVE entry can be used to indicate the queries that are relevant for a driver to run. Additionally, the extensions of the files that contain known weaknesses indicate the relevant programming language, which can also be used to determine which queries are relevant.
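The rule selection described in the last item can be sketched as a filter over a rule catalog. Everything here is an assumption made for illustration: the `Rule` shape, the catalog entries, the CWE-to-rule mapping, the CWE identifier format, and the `selectRules` helper are not taken from any analysis tool or from the benchmark data.

```typescript
interface Rule {
  id: string;
  cwes: string[];
  extensions: string[];
}

// Hypothetical catalog; the CWE mapping is illustrative only.
const RULES: Rule[] = [
  { id: "security/detect-non-literal-fs-filename", cwes: ["CWE-022"], extensions: [".js"] },
  { id: "example/cpp-buffer-overflow", cwes: ["CWE-120"], extensions: [".c", ".cpp"] },
];

// Returns the lower-cased extension of a path, or "" when there is none.
function extensionOf(file: string): string {
  const base = file.slice(file.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot <= 0 ? "" : base.slice(dot).toLowerCase();
}

// Keeps the rules that match both a CWE of the CVE and the language of its weak files.
function selectRules(cwes: string[], weaknessFiles: string[], rules: Rule[] = RULES): Rule[] {
  const extensions = weaknessFiles.map(extensionOf);
  return rules.filter(
    (rule) =>
      rule.cwes.some((cwe) => cwes.indexOf(cwe) !== -1) &&
      rule.extensions.some((ext) => extensions.indexOf(ext) !== -1)
  );
}

const selected = selectRules(["CWE-022"], ["classifiers/svm/SvmLinear.js"]);
if (selected.length === 0) {
  // No relevant rules: the driver can abort early without starting the analysis run.
  console.log("no relevant rules, skipping analysis");
} else {
  console.log(selected.map((rule) => rule.id));
}
```

Requiring both a CWE match and an extension match keeps, for example, a C++ buffer overflow rule from running against a JavaScript project.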