-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse license from the pom.xml if not contained in the manifest #2115
Conversation
Thanks for the contribution @coheigea -- NOTE: this modifies the Syft JSON, so you'd need to update the JSON schema. Basically: bump the patch version (since this is just addition) in https://github.com/anchore/syft/blob/main/internal/constants.go#L6, run |
…fest Signed-off-by: Colm O hEigeartaigh <coheigea@apache.org>
4d95e46
to
3f8c7ca
Compare
Done thanks, I rebased the PR. Please re-run the workflows so that I can check if all the tests are fixed. |
@coheigea thanks for the PR here! I really appreciate you doing the schema bump - would you mind if I added a commit that possibly makes it so we don't have to change the schema here? We might not have to add licenses to the metadata for this to work =) We can remove the commit if we don't like it, but I think it might be advantageous to not have duplicate values across I think we can take your code and curry the licenses up to where package creation happens and have them all accounted for on the WDYT? |
@spiffcs Sure, whatever works really! Let me know when the PR is updated and I'll review and test it again. The only thing to add is that I'm working on a few follow-ups, where if the license is not defined in the pom.xml, but in a parent pom, then it will go to maven central and extract the license from the parent. So for this I would need access to the pomPropertiesObject found in guessMainPackageNameAndVersionFromPomInfo. |
@spiffcs Any update on getting this merged? |
Maybe we could just merge this and look at refactoring later? It's blocking a number of contributions I want to make to Syft to improve license detection for Maven projects. |
@coheigea you're right - and huge apologies for dragging my feet on this - let me try and get these conflicts resolved and try to get this merged in for our next release |
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
cc @wagoodman for review so we can get this into tomorrow's planned release |
if name == "" { | ||
name = selectName(manifest, j.fileInfo) | ||
} | ||
if version == "" { | ||
version = selectVersion(manifest, j.fileInfo) | ||
} | ||
if len(pomLicenses) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @spiffcs ! I just have a query about this line, in my PR the pomLicenses were only added if we didn't already get licenses from the Bundle-License. I'm going to contribute PRs after this is merged to reach out to Maven Central to retrieve the license if not contained in the pom, which we probably want to avoid if we already have a license from the Manifest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting! cc @wagoodman for the morning - we discussed this yesterday. I thought that it might be best to surface all licenses found regardless of source to provide the most information to the user (manifest or pom). @coheigea's comment makes sense though with respect to the other PRs coming. Leaving this open for @wagoodman comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason to make this mutually exclusive? Typically we raise up as much information as is useful, and I feel that license info should be fairly unbiased. That is, if there are two licenses then knowing that they agree or disagree is useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a big deal for now, we can revisit it later if it's an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I take back my comment -- today we can't distinguish paths within the jar. We have a future enhancement to the be able to capture paths within archives, but without that enhancement the licenses captured appear to be from the same location (the top-level jar path) and not a manifest vs the pom.xml. I think the results would initially be confusing without this information.
I'll restore the behavior you originally put in @coheigea such that the pom licenses are only added if the license within the manifest cannot be found. When the Location objects get this additional pathing information we could enhance this to show both licenses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been talking about the Location enhancement for a while but I couldn't find an issue for it -- created #2211 to track (and linking with this comment 😄 )
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
I adjusted the parsing logic to preserve what the specific license data we are cataloging (name vs URL) since the |
anchore#2115) * Parse the Maven license from the pom.xml if not contained in the manifest Signed-off-by: Colm O hEigeartaigh <coheigea@apache.org> * chore: restore 10.0.2 schema Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> * chore: generate new 11.0.1 schema Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> * refactor: remove schema change Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> * test: update unit tests to align with new pattern Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> * chore: pr feedback Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> * chore: remove struct tags Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> * keep license name and url semantics preserved on the pkg object Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com> --------- Signed-off-by: Colm O hEigeartaigh <coheigea@apache.org> Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com> Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com> Co-authored-by: Christopher Angelo Phillips <32073428+spiffcs@users.noreply.github.com> Co-authored-by: Christopher Phillips <christopher.phillips@anchore.com> Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
Right now, the license is only found for a Maven artifact if it's an OSGi Bundle with "Bundle-License" declared in the manifest. This PR adds functionality to also check for the license information in the pom.xml.
Running syft before on https://repo1.maven.org/maven2/eu/neilalexander/jnacl/1.0.0/ does not output any license information, with this PR for cyclonedx-json it outputs correctly: