Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report individual findings rather than top-level file #81

Merged
merged 13 commits into from
Jan 25, 2022

Conversation

glynternet
Copy link
Contributor

@glynternet glynternet commented Jan 19, 2022

Closes: #72

This PR changes the way vulnerabilities are reported when not in
--file-path-only mode.

Each finding is now reported individually, rather than reporting
and aggregation of all findings with only the top-level file on
disk.

For example, a vulnerable jar nested inside an archive will now be
reported with the vulnerability findings, rather than reporting on
the archive with an aggregation of all findings from within it.
Multiple vulnerable jars found within an archive will be reported
separately.

The path reported with a vulnerability finding is the full path
to a finding with archive layers delimited by a "!".
i.e. /path/to/archive!path/to/finding.jar shows that an archive
at /path/to/archive contained a vulnerable jar at
path/to/finding.jar within it.


To do this we now accept a HandleFindingFunc into the Log4jIdentifier, which is called for every finding encountered whilst crawling.

Reporter.Collect has also been renamed to Report to reflect better the behaviour that it has.

@changelog-app
Copy link

changelog-app bot commented Jan 19, 2022

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

When not using --file-path-only, the following output behaviour has been modified.

Each finding is now reported individually, rather than reporting
an aggregation of all findings with only the top-level file on
disk.

For example, a vulnerable jar nested inside an archive will now be
reported with the vulnerability findings, rather than reporting on
the archive with an aggregation of all findings from within it.
Multiple vulnerable jars found within an archive will be reported
separately.

The path reported with a vulnerability finding is the full path
to a finding with archive layers delimited by a "!".
i.e. /path/to/archive!path/to/finding.jar shows that an archive
at /path/to/archive contained a vulnerable jar at
path/to/finding.jar within it.

When using --json mode, the path on disk is still reported
as the filePath field. An extra detailedPath field has been
added, containing the full path the the vulnerable content, which
may be nested in any number of archives.

Check the box to generate changelog(s)

  • Generate changelog entry

@glynternet glynternet changed the title Report individual findings Report individual findings rather than top-level file Jan 19, 2022
// we are in an archive and so can assume separator is '/'
jarVersion, jarNameMatch = pathMatchesLog4JVersion(path)
jarFinding = JarNameInsideArchive
if jarNameMatch {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour is carried over from previous implementation.

@hpryce do we want to log at info level when we match any log4j-core jar name or should we only do it when it is a vulnerable version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps controversial, but I think we should drop vulnerable versions from identify.go entirely. We should just find everything here, then filter out in the reporting if it's not vulnerable. This probably means we need a new found but not vulnerable log line in the report.go for the human friendly output.

@glynternet glynternet force-pushed the gh/report-individual-findings branch 5 times, most recently from 180e61f to bcd0f74 Compare January 19, 2022 15:20
@nmiyake
Copy link
Contributor

nmiyake commented Jan 19, 2022

@glynternet haven't looked at this implementation closely, but if it basically just sets the output path to something of the form /path/to/archive!path/to/finding.jar and outputs that in the same way as top-level findings, then presumably this would be printed in the mode that prints file paths. We already have consumers that pipe that output to another process to them remove the impacted files, so this would be a change in behavior that could cause issues there.

Is there another field (like "details") where this would make more sense? I think it would be more consistent/understandable if the "path" always mapped to a file on disk. If we want further information like a path within an archive, perhaps that would make sense as an additional metadata field instead?

@hpryce
Copy link
Contributor

hpryce commented Jan 19, 2022

This is a very good point @nmiyake, and one I hadn't included when I sketched this out with @glynternet. I think, in the name of avoiding needless breaks, we should add this as another field on the json object. The human readable output I think makes sense to use the paths as done here, and the file path listing should just list the top level path.

@glynternet
Copy link
Contributor Author

Yup, very valid point and sloppy on my part for missing this.
I will update to make sure we are not making breaks to those existing workflows.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@hpryce
Copy link
Contributor

hpryce commented Jan 25, 2022

This looks good - the comment thread discussion around logging/handling of non-vulnerable versions is going to be handled in a follow up PR.

This now also keeps the json format the same.

glynternet and others added 13 commits January 25, 2022 16:19
This PR changes the way vulnerabilities are reported.
Each finding is now reported individually, rather than reporting
and aggregation of all findings with only the top-level file on
disk.

For example, a vulnerable jar nested inside an archive will now be
reported with the vulnerability findings, rather than reporting on
the archive with an aggregation of all findings from within it.

The path reported with a vulnerability finding is the full path
to a finding with archive layers delimited by a "!".
i.e. /path/to/archive!path/to/finding.jar shows that an archive
 at /path/to/archive contained a vulnerable jar at
 path/to/finding.jar within it.
@hpryce
Copy link
Contributor

hpryce commented Jan 25, 2022

👍

@bulldozer-bot bulldozer-bot bot merged commit 264e87f into develop Jan 25, 2022
@bulldozer-bot bulldozer-bot bot deleted the gh/report-individual-findings branch January 25, 2022 16:30
@svc-autorelease
Copy link
Collaborator

Released 1.7.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Report on individual files rather than only resultant aggregation at top level
5 participants