Release of 133 statements (strict subset) #371
Merged
Background
While operating Eclipse Steady internally at SAP, the SAP Security Research team collected a dataset of approximately 1400 vulnerability statements. A first portion of it was published in 2019 as part of project KB (and described in MSR 2019).
Our goal is to disclose an additional batch of vulnerability statements from the SAP-internal dataset and make them available to the community. To do so, we used Prospector to search for fix commits for the vulnerabilities in that internal dataset, and we compared the tool's findings with the fix commits we had identified through manual search.
Objective
With this PR we release 133 new statements for which the results found by Prospector (according to the criteria detailed below) matched the fix commits that appear in the statements of our private dataset. Unlike #369 and #370, the commits of these new statements are a strict subset of Prospector's findings.
Analysis Process
The process begins by running Prospector to automatically identify fix commits for every vulnerability listed in our private dataset, using the vulnerability identifier and the URL of the vulnerable project's GitHub repository as input parameters. Both inputs were taken from the internal dataset.
Once Prospector had finished, we examined all of its results and extracted the candidate fix commits based on the rules they matched.
The ranking system of Prospector evaluates each candidate fix commit based on predefined rules, assigning a relevance value to each. To ensure the highest level of confidence in identifying the commit as an effective patch, the statements released with this PR only contain fix commits that matched at least one high-relevance rule.
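The filtering step described above can be sketched as follows. This is a minimal illustration, not Prospector's actual code: the `MatchedRule` and `Candidate` classes, the rule names, and the `HIGH_RELEVANCE_THRESHOLD` value are all hypothetical and stand in for whatever Prospector's report format and rule configuration actually provide.

```python
from dataclasses import dataclass, field

# Hypothetical cut-off: the relevance value at or above which a rule is
# treated as "high-relevance". The real value is defined by Prospector's rules.
HIGH_RELEVANCE_THRESHOLD = 32


@dataclass(frozen=True)
class MatchedRule:
    """A rule that matched a candidate commit (illustrative structure)."""
    name: str
    relevance: int


@dataclass
class Candidate:
    """A candidate fix commit with the rules it matched (illustrative structure)."""
    commit_id: str
    matched_rules: list = field(default_factory=list)


def high_confidence_commits(candidates, threshold=HIGH_RELEVANCE_THRESHOLD):
    """Return the ids of candidates that matched at least one high-relevance rule."""
    return {
        c.commit_id
        for c in candidates
        if any(rule.relevance >= threshold for rule in c.matched_rules)
    }


if __name__ == "__main__":
    sample = [
        Candidate("abc123", [MatchedRule("HYPOTHETICAL_STRONG_RULE", 64)]),
        Candidate("def456", [MatchedRule("HYPOTHETICAL_WEAK_RULE", 2)]),
        Candidate("0a1b2c", []),
    ]
    print(high_confidence_commits(sample))
```

A candidate that matched only low-relevance rules, or no rule at all, is dropped, so only commits with at least one strong signal are kept for the comparison with the internal dataset.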
Note that Prospector introduces the concept of twin commits. A twin commit is an equivalent fix commit on a different, parallel branch: the same change is made on one branch and then applied to other branches that support different versions of the project.
To better understand the impact of identifying twin commits when comparing the results gathered from Prospector with the internal dataset, we proceeded with two distinct evaluative measures.
After having extracted all high-confidence commits from Prospector's findings and grouped them for each vulnerability, we compared the results with our internal dataset. We decided to release as valid statements those that aligned with at least one of the following three validation criteria.
Results
In addition to the cases where Prospector's results corresponded exactly to the commits in the dataset, we also publish new statements in which the dataset's commits form a strict subset of the high-confidence fix commits identified by Prospector. This allows the disclosure of 133 additional statements validated through Prospector. For this initial publication, the released statements contain only the commits present in the dataset, without any of the additional commits found by Prospector.
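The strict-subset condition can be expressed with plain set operations. The sketch below is only an illustration under assumed names: `classify_statement` and `commits_to_release` are hypothetical helpers, and commits are represented simply as id strings.

```python
def classify_statement(dataset_commits, prospector_commits):
    """Compare the dataset's fix commits with Prospector's high-confidence ones.

    Returns "exact" when both sets are equal, "strict-subset" when every
    dataset commit was found by Prospector and Prospector found more,
    and "other" in every remaining case.
    """
    dataset = set(dataset_commits)
    found = set(prospector_commits)
    if not dataset:
        return "other"  # nothing to validate against
    if dataset == found:
        return "exact"
    if dataset < found:  # proper (strict) subset
        return "strict-subset"
    return "other"


def commits_to_release(dataset_commits, prospector_commits):
    """For strict-subset statements, release only the dataset's own commits."""
    if classify_statement(dataset_commits, prospector_commits) == "strict-subset":
        return sorted(set(dataset_commits))
    return []


if __name__ == "__main__":
    print(classify_statement({"abc123"}, {"abc123", "def456"}))
    print(commits_to_release({"abc123"}, {"abc123", "def456"}))
```

Note that the extra commits found by Prospector are used only for validation here; they are not added to the released statement.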