Generate SARIF File Of Project Vulnerability Findings #3561

aravindparappil46 · 2024-03-18T06:14:16Z

Description

If request header Accept: application/sarif+json is provided to the GET Finding By Project UUID API, it will now return a SARIF file with the vulnerability findings in that project.

SARIF file is generated based on a pebble template.

Addressed Issue

Closes #909

Additional Details

In the GET /v1/finding/project/{uuid} API, added a new Media Type application/sarif+json to the @Produces, to indicate the new response Content-Type.

Added a sarif.peb template file to a new directory under templates/findings/sarif.peb

Sample bom.json and it's resulting .sarif file is attached below:
bom-and-sarif.zip

I have tried to include most of the fields from Finding.java into a meaningful field in the SARIF file based on the SARIF schema here. Those fields for which I could not find an adequate mapping I have kept under the properties object (fields such as epssScore, severityRank etc).

Have tested the generated SARIF files against the validator here with "GitHub Ingestion Rules" enabled as well. Looks like most of the violations are geared towards static analysis results (like, region, helpUri etc missing).

Feedback Needed

Would appreciate if the sarif.peb template is scrutinized so that I am not making any blatant mistakes with the field mappings (from Finding.java to the SARIF schema). Perhaps there are more meaningful fields in the sarif schema that we could possibly assign a finding field to.

A picture showing the fields in Finding.java and its mapped field on the SARIF file is shown below. The green comments are the target fields in the SARIF schema.

results.properties mean that the finding field is added as a key/value pair into the properties object of the schema.

I have added logic to calculate the results.level field of the SARIF schema like below:

if finding.vulnerability.severity in ('LOW', 'INFO')
  level = "note"
else if finding.vulnerability.severity == 'MEDIUM':
  level = "warning"
else if finding.vulnerability.severity in ('HIGH', 'CRITICAL'):
  level = "error"
else:
  level = "none"

Do let me know in case we need to modify it

Even though we have set content-disposition header in the response, I was not able to confirm that a file named findings-{{uuid}}.sarif was indeed being downloaded locally via cURL or Postman testing. It just displays the sarif json on the terminal now. Please let me know in case there is a way to test the actual file download. (I hope we do not have to change the media type to octet stream instead?)
For rules.name, I have concatenated finding.component.name and finding.vulnerability.cweName, hoping that it's more friendly to consume for an end-user. Please do let me know if this needs to be changed.

Checklist

I have read and understand the contributing guidelines
This PR fixes a defect, and I have provided tests to verify that the fix is effective
This PR implements an enhancement, and I have provided tests to verify that it works as intended
This PR introduces changes to the database model, and I have added corresponding update logic
This PR introduces new or alters existing behavior, and I have updated the documentation accordingly

If request header `Accept: application/sarif+json` is provided to the GET Finding By Project UUID API, it will now return a SARIF file with the vulnerability findings in that project. SARIF file is generated based on a pebble template Signed-off-by: Aravind Parappil <aravindparappil@gmail.com>

codacy-production · 2024-03-18T06:44:52Z

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation	Diff coverage
✅ -0.05% (target: -1.00%)	✅ 91.43% (target: 70.00%)

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`e55e950`)	21527	15953	74.11%
Head commit (`55300d3`)	21606 (+79)	16001 (+48)	74.06% (-0.05%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#3561)	35	32	91.43%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings Change summary preferences

_{You may notice some variations in coverage metrics with the latest Coverage engine update. For more details, visit the documentation}

nscuro · 2024-03-19T18:20:10Z

Great work @aravindparappil46!

Looking at the generated SARIF, I noticed that you chose to key rules by vulnerability ID. This can lead to situation where duplicate rules are created, if a vulnerability affects more than one component. For example:

{
  "id": "CVE-2023-6378",
  "name": "logback-classic - Deserialization of Untrusted Data",
  "shortDescription": {
    "text": "A serialization vulnerability in logback receiver component part of  logback version 1.4.11 allows an attacker to mount a Denial-Of-Service  attack by sending poisoned data."
  }
},
{
  "id": "CVE-2023-6378",
  "name": "logback-core - Deserialization of Untrusted Data",
  "shortDescription": {
    "text": "A serialization vulnerability in logback receiver component part of  logback version 1.4.11 allows an attacker to mount a Denial-Of-Service  attack by sending poisoned data."
  }
},

Consider removing the component name from the name field, that way you can safely reference the same rule multiple times, for multiple components. You will need to add some logic to ensure that you're not creating duplicate rule as you iterate over the findings, which in turn might require more pre-computation before rendering the template.

The SARIF validator also appears to prefer rule.names in pascal case, e.g.

SARIF2012: runs[0].tool.driver.rules[16].name: 'spring-security-crypto - Use of Insufficiently Random Values' is not a Pascal-case identifier. For uniformity of experience across all tools that produce SARIF, the friendly name should be a single Pascal-case identifier, for example, ProvideRuleFriendlyName.

For the physicalLocation of results, it is necessary to provide a region:

GH1003: runs[0].results[0].locations[0].physicalLocation: The 'region' property is absent. GitHub Advanced Security code scanning can display the correct location only for results that provide a 'region' object with line and optional column information. At minimum, 'region.startLine' is required. 'region' can also provide 'startColumn', 'endLine', and 'endColumn', although all of those have reasonable defaults.

Looking at how Trivy does it, it just sets all fields to 1: https://github.com/aquasecurity/trivy-action/blob/062f2592684a31eb3aa050cc61e7ca1451cecd3d/test/data/image-sarif.test#L58-L63

OTOH, I'm wondering if we should use logical locations instead? https://github.com/microsoft/sarif-tutorials/blob/main/docs/2-Basics.md#physical-and-logical-locations

src/main/resources/templates/findings/sarif.peb

src/test/java/org/dependencytrack/resources/v1/FindingResourceTest.java

aravindparappil46 · 2024-03-20T04:56:40Z

Thanks a lot for the detailed feedback @nscuro !

Regarding name field in the SARIF - makes sense to remove the component name from there. Will do some data wrangling so that only unique vulnerabilities are populated under rules (plus can pascal-case the cweName)

Regarding physicalLocation - Yes, I feel going for logicalLocations.fullyQualifiedName is better than hacking around with region in physicalLocation

I'm done with the above changes in my local branch. Have to work on updating the test and will push soon.

The rules array in the generated SARIF should contain only unique values, so that the same rule can be referenced by multiple results. Rule name in the SARIF should be PascalCased. Using WordUtils from apache commons library to convert cweName to PascalCase. Set default escape strategy to the Pebble Engine to json, to escape linebreaks and double quotes in vulnerability description. Update test case to assert whole SARIF json. Signed-off-by: Aravind Parappil <aravindparappil@gmail.com>

Since fullName is required by Azure DevOps ingestion rules, added it as concatenation of app name and version. shortDescription has been changed to be vulnerability ID and added fullDescription field as the actual vulnerability description (same as Trivy's SARIF) Signed-off-by: Aravind Parappil <aravindparappil@gmail.com>

aravindparappil46 · 2024-03-21T03:13:50Z

Have pushed the changes. More details in the commit messages :)

Here's an updated SARIF sample:
bom-and-sarif-v2.zip

The pascal-casing for some CWE names look weird because of characters like hyphens (e.g., Out-of-boundsWrite ) but the SARIF validator is not complaining anymore.

One thought I had was whether to conditionally add fields to the results.properties object.
For e.x., Right now, even if a vulnerability does not have a recommendation, it is still added as an empty string to its results.properties. Could add a null check on the sarif.peb for each field under properties but wanted to check if it makes sense

src/test/java/org/dependencytrack/resources/v1/FindingResourceTest.java

nscuro · 2024-03-25T19:45:01Z

One thought I had was whether to conditionally add fields to the results.properties object.

Honestly I don't know 😅

It might make sense to omit fields with empty values, on the other hand it might make sense to have properties consistent across all findings. What's your opinion?

For the JSON assertion, removed matchers which can be replaced with string literals directly in the expected JSON string Signed-off-by: Aravind Parappil <aravindparappil@gmail.com>

aravindparappil46 · 2024-03-25T22:52:11Z

Thanks @nscuro ! Have removed the unwanted matchers.

For the empty properties -- I am of the opinion that we should keep the properties object consistent across all findings, so that the SARIF is more predictable and follows a consistent schema. Might even be useful for downstream tools to ingest and display all properties in a uniform manner.

codacy-production · 2024-03-25T23:14:32Z

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation	Diff coverage
✅ -0.05% (target: -1.00%)	✅ 91.43% (target: 70.00%)

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`e55e950`)	21527	15953	74.11%
Head commit (`6b1b68c`)	21606 (+79)	16001 (+48)	74.06% (-0.05%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#3561)	35	32	91.43%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings Change summary preferences

_{You may notice some variations in coverage metrics with the latest Coverage engine update. For more details, visit the documentation}

nscuro

Thanks!

aravindparappil46 mentioned this pull request Mar 18, 2024

Add SARIF support #909

Closed

nscuro reviewed Mar 19, 2024

View reviewed changes

src/main/resources/templates/findings/sarif.peb Outdated Show resolved Hide resolved

src/test/java/org/dependencytrack/resources/v1/FindingResourceTest.java Outdated Show resolved Hide resolved

nscuro added the enhancement New feature or request label Mar 19, 2024

aravindparappil46 added 2 commits March 20, 2024 20:43

nscuro reviewed Mar 25, 2024

View reviewed changes

src/test/java/org/dependencytrack/resources/v1/FindingResourceTest.java Outdated Show resolved Hide resolved

Remove unwanted matchers and replace with string literals

6b1b68c

For the JSON assertion, removed matchers which can be replaced with string literals directly in the expected JSON string Signed-off-by: Aravind Parappil <aravindparappil@gmail.com>

nscuro added this to the 4.11 milestone Mar 26, 2024

nscuro approved these changes Mar 26, 2024

View reviewed changes

nscuro merged commit b764bc4 into DependencyTrack:master Mar 26, 2024
12 checks passed

aravindparappil46 deleted the feature/909-export-sarif branch March 26, 2024 15:48

github-actions bot locked as resolved and limited conversation to collaborators Apr 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generate SARIF File Of Project Vulnerability Findings #3561

Generate SARIF File Of Project Vulnerability Findings #3561

aravindparappil46 commented Mar 18, 2024 •

edited

Loading

codacy-production bot commented Mar 18, 2024 •

edited

Loading

nscuro commented Mar 19, 2024

aravindparappil46 commented Mar 20, 2024

aravindparappil46 commented Mar 21, 2024

nscuro commented Mar 25, 2024

aravindparappil46 commented Mar 25, 2024

codacy-production bot commented Mar 25, 2024

nscuro left a comment

Generate SARIF File Of Project Vulnerability Findings #3561

Generate SARIF File Of Project Vulnerability Findings #3561

Conversation

aravindparappil46 commented Mar 18, 2024 • edited Loading

Description

Addressed Issue

Additional Details

Feedback Needed

Checklist

codacy-production bot commented Mar 18, 2024 • edited Loading

Coverage summary from Codacy

See diff coverage on Codacy

See your quality gate settings Change summary preferences

nscuro commented Mar 19, 2024

aravindparappil46 commented Mar 20, 2024

aravindparappil46 commented Mar 21, 2024

nscuro commented Mar 25, 2024

aravindparappil46 commented Mar 25, 2024

codacy-production bot commented Mar 25, 2024

Coverage summary from Codacy

See diff coverage on Codacy

See your quality gate settings Change summary preferences

nscuro left a comment

Choose a reason for hiding this comment

aravindparappil46 commented Mar 18, 2024 •

edited

Loading

codacy-production bot commented Mar 18, 2024 •

edited

Loading