Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: in some cases, try to use pom info to guess name and version to top level jar #2080

Merged
merged 7 commits into from
Aug 31, 2023

Conversation

willmurphyscode
Copy link
Contributor

Implements the following changes in the java archive parser's discoverMainPackage() method:
if there is exactly one pom.xml and pom.properties in the jar and if the filename is a prefix of the artifact ID on the pom.properties, prefer, in order, for name and version:

  1. A non-empty string on pom.properties
  2. A non-empty string on pom.xml
  3. A name based on the filename of the top-level jar
  4. Name info from the manifest.

Definitely open to feedback and refactoring here, but wanted to show a working fix before spending too much time making it look great.

Fixes #2077

Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Will Murphy <will.murphy@anchore.com>
All tests pass, but hacky.

Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Will Murphy <will.murphy@anchore.com>
@github-actions
Copy link

github-actions bot commented Aug 31, 2023

Benchmark Test Results

Benchmark results from the latest changes vs base branch
goos: linux%0Agoarch: amd64%0Apkg: github.com/anchore/syft/test/integration%0Acpu: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz%0A                                                              │ ./.tmp/benchmark-cfd6903.txt │%0A                                                              │            sec/op            │%0AImagePackageCatalogers/alpmdb-cataloger-2                                       12.96m ±  2%25%0AImagePackageCatalogers/apkdb-cataloger-2                                        739.3µ ±  3%25%0AImagePackageCatalogers/binary-cataloger-2                                       220.0µ ± 15%25%0AImagePackageCatalogers/dpkgdb-cataloger-2                                       621.4µ ±  1%25%0AImagePackageCatalogers/dotnet-portable-executable-cataloger-2                   23.16µ ±  1%25%0AImagePackageCatalogers/go-module-binary-cataloger-2                             101.0µ ±  1%25%0AImagePackageCatalogers/java-cataloger-2                                         21.72m ±  2%25%0AImagePackageCatalogers/graalvm-native-image-cataloger-2                         98.50µ ±  1%25%0AImagePackageCatalogers/javascript-package-cataloger-2                           398.1µ ±  2%25%0AImagePackageCatalogers/nix-store-cataloger-2                                    282.6µ ±  1%25%0AImagePackageCatalogers/php-composer-installed-cataloger-2                       799.9µ ±  2%25%0AImagePackageCatalogers/portage-cataloger-2                                      487.4µ ±  2%25%0AImagePackageCatalogers/python-package-cataloger-2                               3.408m ±  1%25%0AImagePackageCatalogers/r-package-cataloger-2                                    203.8µ ±  4%25%0AImagePackageCatalogers/rpm-db-cataloger-2                                       561.7µ ±  2%25%0AImagePackageCatalogers/ruby-gemspec-cataloger-2                                 925.8µ ±  1%25%0AImagePackageCatalogers/sbom-cataloger-2                                         121.1µ ±  1%25%0Ageomean                                                                         514.9µ%0A%0A                                                              │ ./.tmp/benchmark-cfd6903.txt │%0A                                                              │             B/op             │%0AImagePackageCatalogers/alpmdb-cataloger-2                                       5.138Mi ± 0%25%0AImagePackageCatalogers/apkdb-cataloger-2                                        184.4Ki ± 0%25%0AImagePackageCatalogers/binary-cataloger-2                                       30.78Ki ± 0%25%0AImagePackageCatalogers/dpkgdb-cataloger-2                                       141.4Ki ± 0%25%0AImagePackageCatalogers/dotnet-portable-executable-cataloger-2                   3.695Ki ± 0%25%0AImagePackageCatalogers/go-module-binary-cataloger-2                             9.891Ki ± 0%25%0AImagePackageCatalogers/java-cataloger-2                                         3.387Mi ± 0%25%0AImagePackageCatalogers/graalvm-native-image-cataloger-2                         8.594Ki ± 0%25%0AImagePackageCatalogers/javascript-package-cataloger-2                           83.82Ki ± 0%25%0AImagePackageCatalogers/nix-store-cataloger-2                                    38.94Ki ± 0%25%0AImagePackageCatalogers/php-composer-installed-cataloger-2                       155.2Ki ± 0%25%0AImagePackageCatalogers/portage-cataloger-2                                      109.8Ki ± 0%25%0AImagePackageCatalogers/python-package-cataloger-2                               986.0Ki ± 0%25%0AImagePackageCatalogers/r-package-cataloger-2                                    42.90Ki ± 0%25%0AImagePackageCatalogers/rpm-db-cataloger-2                                       170.9Ki ± 0%25%0AImagePackageCatalogers/ruby-gemspec-cataloger-2                                 123.3Ki ± 0%25%0AImagePackageCatalogers/sbom-cataloger-2                                         14.20Ki ± 0%25%0Ageomean                                                                         93.58Ki%0A%0A                                                              │ ./.tmp/benchmark-cfd6903.txt │%0A                                                              │          allocs/op           │%0AImagePackageCatalogers/alpmdb-cataloger-2                                        88.07k ± 0%25%0AImagePackageCatalogers/apkdb-cataloger-2                                         4.034k ± 0%25%0AImagePackageCatalogers/binary-cataloger-2                                         866.0 ± 0%25%0AImagePackageCatalogers/dpkgdb-cataloger-2                                        2.911k ± 0%25%0AImagePackageCatalogers/dotnet-portable-executable-cataloger-2                     132.0 ± 0%25%0AImagePackageCatalogers/go-module-binary-cataloger-2                               280.0 ± 0%25%0AImagePackageCatalogers/java-cataloger-2                                          44.28k ± 0%25%0AImagePackageCatalogers/graalvm-native-image-cataloger-2                           228.0 ± 0%25%0AImagePackageCatalogers/javascript-package-cataloger-2                            1.264k ± 0%25%0AImagePackageCatalogers/nix-store-cataloger-2                                      820.0 ± 0%25%0AImagePackageCatalogers/php-composer-installed-cataloger-2                        3.845k ± 0%25%0AImagePackageCatalogers/portage-cataloger-2                                       2.194k ± 0%25%0AImagePackageCatalogers/python-package-cataloger-2                                16.13k ± 0%25%0AImagePackageCatalogers/r-package-cataloger-2                                      851.0 ± 0%25%0AImagePackageCatalogers/rpm-db-cataloger-2                                        3.914k ± 0%25%0AImagePackageCatalogers/ruby-gemspec-cataloger-2                                  2.291k ± 0%25%0AImagePackageCatalogers/sbom-cataloger-2                                           394.0 ± 0%25%0Ageomean                                                                          2.009k

pomMatches := j.fileManifest.GlobMatch(pomXMLGlob)
var pomPropertiesObject pkg.PomProperties
var pomProjectObject pkg.PomProject
if len(pomPropertyMatches) == 1 && len(pomMatches) == 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's possible that we could have only a pom.xml and no properties, I think we should account for that as well (allow for 0 to 1 pom properties)


func TestWarCatalogedCorrectlyIfRenamed(t *testing.T) {
// install hudson-war@2.2.1 and renames the file to `/hudson.war`
// TODO: is this better expressed in syft/pkg/cataloger/java/archive_parser_test.go?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, I'm not against having this integration test at all, so consider this comment non-blocking, however, testing specific business logic should be pushed as low as possible test-level-wise (unit is preferred for these kinds of assertions). We do have some regression tests at the integration level to test against the specific asset where we found the regression, so keeping this test here is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I think I'll leave this here for now; it was reported as an end to end regression, and the parser was still finding the correct package, it was just also finding an incorrect extra package, so the end-to-end flow seems a good place of the test.

I'll push a commit to remove the TODO.

return &pkg.Package{
Name: selectName(manifest, j.fileInfo),
Version: selectVersion(manifest, j.fileInfo),
// TODO: maybe select name should just have a pom properties in it?
Copy link
Contributor

@wagoodman wagoodman Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe rename selectName to something more specific like selectNameFromManifest (same for the version function) -- the upside is that you can write a selectNameAndVersion wrapper that is called here and pushes some of this lower level logic to another function, and the existing tests for selectName don't really need to get modified from a logic sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I think I'll do this as a follow up.

Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Will Murphy <will.murphy@anchore.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Virtual path changes to java cataloger causing creation of extra incorrect packages when jars are renamed
2 participants