Support for using an SBOM as input #191

Open
sophokles73 opened this issue Nov 4, 2022 · 16 comments

Comments

@sophokles73
Member

With the recent uptake of SBOMs for governance checks, I wonder whether the dash tool should also support doing its work based on a BOM created by popular SBOM tools such as CycloneDX.

@waynebeaton
Member

Seems like a reasonable thing to do. While I have investigated generating SBOMs from the Eclipse Dash License Tool, it hadn't occurred to me to do the reverse. This will require some investigation.

Tools already exist to parse SBOMs, so that shouldn't be a big problem... The fundamental problem is that of turning references to third party content in an SBOM into a format the license tool understands.

@waynebeaton
Member

The tool can now interpret purl IDs. These are used in SPDX and CycloneDX to specify references to libraries (I believe that this is consistently true).
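
For illustration: a purl has the shape pkg:type/namespace/name@version, optionally followed by ?qualifiers and #subpath. A minimal Python sketch of splitting one into its parts might look like this. `Purl` and `parse_purl` are hypothetical names for this example, not part of the Dash License Tool, and the sketch does not cover every corner of the purl specification.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class Purl(NamedTuple):
    """Hypothetical container for the parts of a package URL."""
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> Purl:
    """Split a purl into type, namespace, name, and version.

    Qualifiers (?key=value) and subpaths (#path) are dropped.
    """
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    rest = purl[len("pkg:"):]
    rest = rest.split("#", 1)[0]  # drop the subpath
    rest = rest.split("?", 1)[0]  # drop the qualifiers
    segments = [s for s in rest.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"purl needs at least a type and a name: {purl!r}")
    # The version, if present, follows '@' in the last path segment.
    name, sep, version = segments[-1].partition("@")
    # Everything between the type and the name is the (optional) namespace.
    namespace = "/".join(unquote(s) for s in segments[1:-1]) or None
    return Purl(
        type=segments[0].lower(),
        namespace=namespace,
        name=unquote(name),
        version=unquote(version) if sep else None,
    )
```

The namespace is optional, so a purl such as pkg:pypi/requests@2.31.0 yields `namespace=None`, while a percent-encoded npm scope such as %40angular is decoded to @angular.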

Implementing an SBOM file reader is going to require a little restructuring of how we handle files. Currently, we decide what file reader to use based on the file name. Since SBOM formats don't make use of a consistent file name, we'll have to add a switch or something that tells the tool how to interpret the file.
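
One alternative to an explicit switch would be to sniff the content instead of the file name. The sketch below is an illustration only, not how the tool works today: it assumes JSON input and relies on the top-level `bomFormat` field that CycloneDX JSON documents carry and the `spdxVersion` field that SPDX JSON documents carry. The function name `detect_sbom_format` is hypothetical, and XML or tag-value SBOMs are not handled.

```python
import json


def detect_sbom_format(text: str) -> str:
    """Return 'cyclonedx', 'spdx', or 'unknown' for a JSON document."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return "unknown"  # not JSON at all (could be XML, tag-value, ...)
    if not isinstance(doc, dict):
        return "unknown"
    # CycloneDX JSON documents declare themselves via "bomFormat".
    if doc.get("bomFormat") == "CycloneDX":
        return "cyclonedx"
    # SPDX JSON documents carry a top-level "spdxVersion" such as "SPDX-2.3".
    if "spdxVersion" in doc:
        return "spdx"
    return "unknown"
```

A command-line switch would still be useful as an override when detection returns 'unknown'.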

In the meantime, an ugly workaround is to grep the SBOM:

$ cat mySBOM.json | grep -Poh 'pkg:(?<type>[^\/]+)\/(?<group>[^\/]+)\/(?<name>[^@]+)@(?<version>[^?]+)(?=.*")' \
| sort | uniq | \
java -jar org.eclipse.dash.licenses-<version>.jar -
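
A slightly less fragile variant of the same workaround is to parse the JSON rather than grep it. The following Python sketch assumes a CycloneDX JSON SBOM, where each entry of the top-level `components` array may carry a `purl` and may nest further `components`. The function names are hypothetical. Like the regular expression above, it drops qualifiers; unlike it, it does not require a group segment in the purl.

```python
import json
from typing import Iterator, List


def iter_purls(components: list) -> Iterator[str]:
    """Yield the purl of every component, including nested ones."""
    for component in components:
        purl = component.get("purl")
        if purl:
            # Strip the subpath and qualifiers, as the grep workaround does.
            yield purl.split("#", 1)[0].split("?", 1)[0]
        yield from iter_purls(component.get("components", []))


def purls_from_cyclonedx(text: str) -> List[str]:
    """Return the sorted, de-duplicated purls of a CycloneDX JSON SBOM."""
    bom = json.loads(text)
    return sorted(set(iter_purls(bom.get("components", []))))
```

Printing the returned list one entry per line gives input that can be piped to the tool in the same way as the grep output.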

@sbernard31
Contributor

sbernard31 commented Dec 15, 2023

I'm currently exploring ways to check dependencies for vulnerabilities (https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/issues/3949).

Looking at this, I see there is some effort to define standards for SBOMs and tooling to generate them.
I also understand that SPDX and CycloneDX are the most popular standards.

Thinking about dash-licenses, I came up with the same idea as both of you. ☝️

I'll go a bit further (maybe too far 😅).
Ideally, could we imagine that scanning/searching dependencies should not even be in the dash-licenses scope?
We would then get something like:

            ┌───────────────┐
            │code repository│
            └──────┬────────┘
                   │  Specific tooling
                   │  for different languages
                   ▼
                 ┌────┐
                 │SBOM│  CycloneDx or SPDX format
                 └─┬──┘
 security scanner  │    dash-licenses
        ┌──────────┴─────────────┐
        ▼                        ▼
┌──────────────────────┐  ┌───────────────┐
│Vulnerabilities report│  │licences report│
└──────────────────────┘  └───────────────┘

The dash-licenses effort could then focus only on:

  • reading standard SBOM formats
  • checking licences
  • creating IP Team Review Requests

Tools that scan a repository to generate an SBOM already exist for many languages, but all of this seems pretty new, so some ecosystems are not yet well covered. It would therefore be "better" to identify the generator projects the Eclipse community needs and contribute to them than to improve dash-licenses' own scanning.

I did warn that I might be going too far with this idea 😅

@sophokles73
Member Author

@sbernard31 I am not sure if I understand where you are going beyond what is already discussed in this issue. My intention had been to do exactly what you propose as well: let the dash tool read an SBOM and check the license info of all 3rd party deps declared in the SBOM.

Ideally, could we imagine that scanning/searching dependencies should not even be in the dash-licenses scope?

FMPOV this has never been in the scope of the dash tool. Instead, it relies on arbitrary (language/build system specific) mechanisms to create a list of deps which it then processes. The only thing to be done is adding the ability to read an SBOM that has been created by another tool upfront.

So, I guess we all want the same. As usual, the only thing left to do is creating a PR ;-)

@sbernard31
Contributor

sbernard31 commented Dec 18, 2023

My intention had been to do exactly what you propose as well: let the dash tool read an SBOM and check the license info of all 3rd party deps declared in the SBOM.

Yep I agree with you 👍

I am not sure if I understand where you are going beyond what is already discussed in this issue.

OK, I'll try to explain it better. 🙂

FMPOV this (scanning/searching dependencies) has never been in the scope of the dash tool.

Maybe you consider it wrong to say that "dash-licenses scans/searches dependencies" because it relies on other tooling to do that?
What I wanted to say is that specific code is currently needed to support the specifics of each language's tooling. E.g. dash-licenses has to know about "pom.xml", "yarn.lock" and more (see #10).

My point is maybe:
dash-licenses should focus only on standard, language-agnostic formats as input (SPDX and CycloneDX), and it's up to each programming language ecosystem to provide its own SPDX/CycloneDX SBOM generator. If an ecosystem used by the Eclipse community doesn't have an SPDX/CycloneDX generator (or some features are missing), the Eclipse community should help with that generator.

Let me know if that's clearer (or if you think I misunderstood something 🙏).

@sbernard31
Contributor

sbernard31 commented Jan 5, 2024

(Probably obvious, but the cyclonedx-core-java library should probably be used to add support for CycloneDX. It is used by cyclonedx-maven-plugin.)

@waynebeaton
Member

(Probably obvious, but the cyclonedx-core-java library should probably be used to add support for CycloneDX. It is used by cyclonedx-maven-plugin.)

Yup. There's no point in reinventing this.

By leveraging this library's ability to read existing SBOMs and write new ones, we might even be able to optionally output an equivalent SBOM with licence information taken from our sources.

@sbernard31
Contributor

What do you mean by :

we might even be able to optionally output an equivalent SBOM with licence information taken from our sources.

?

@waynebeaton
Member

@sbernard31 The tools that generate SBOMs grab licence information directly from the content. The Maven plug-in, for example, grabs licence information from the pom.xml files of dependencies. This licence information is frequently missing, specified inconsistently, or just plain wrong. This is one of the reasons why I've been pushing committers to ensure that their license information is specified consistently in metadata (e.g., in the pom.xml).

What I'm thinking is that we can walk through an SBOM and either add or replace the licence information for the various dependencies (and the project content) with our own.

We could then either overwrite the existing SBOM or generate a new one.

This is just a thought at this point.
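
As a rough illustration of the post-processing idea, the Python sketch below walks the components of a CycloneDX JSON document (already loaded into a dict) and replaces the `licenses` entry wherever a vetted licence is known. The `vetted` mapping from purl to SPDX identifier is a hypothetical stand-in for whatever the Dash License Tool would supply; nothing in this thread defines that interface. The sketch assumes single SPDX identifiers rather than licence expressions and returns a new document instead of overwriting the input.

```python
import copy
from typing import Dict


def apply_license_overrides(bom: dict, vetted: Dict[str, str]) -> dict:
    """Return a copy of a CycloneDX BOM with vetted licences applied.

    `vetted` maps a purl to an SPDX licence identifier (hypothetical input).
    Components whose purl is not in the map are left unchanged.
    """
    result = copy.deepcopy(bom)  # never modify the caller's document
    stack = list(result.get("components", []))
    while stack:
        component = stack.pop()
        spdx_id = vetted.get(component.get("purl"))
        if spdx_id is not None:
            # Replace whatever the generator recorded with the vetted value.
            component["licenses"] = [{"license": {"id": spdx_id}}]
        # Components may nest further components.
        stack.extend(component.get("components", []))
    return result
```

Whether the result overwrites the original file or is written next to it then becomes a choice for the caller.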

@sbernard31
Contributor

I think I get it, but it sounds strange to me that dash-licenses would update the SBOM files. At first sight, I think this is the SBOM generator's responsibility.

But maybe, rather than updating SBOM files, it could do some checks and raise an error if the SBOM doesn't contain the expected values for an Eclipse project (e.g. license information is missing or not recognized)?
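
Such a check could be sketched roughly as follows in Python, assuming a CycloneDX JSON document loaded into a dict, where a component's `licenses` array holds either `{"license": {"id" or "name": ...}}` entries or an `{"expression": ...}` entry. The function names are hypothetical, and the sketch only detects missing licence information; checking whether a value is recognized would need an additional list of accepted identifiers.

```python
from typing import List


def has_license(component: dict) -> bool:
    """True if the component declares at least one non-empty licence."""
    for entry in component.get("licenses", []):
        if entry.get("expression"):
            return True
        declared = entry.get("license", {})
        if declared.get("id") or declared.get("name"):
            return True
    return False


def components_missing_license(bom: dict) -> List[str]:
    """Return the purl (or name) of every component without a licence."""
    missing = []
    stack = list(bom.get("components", []))
    while stack:
        component = stack.pop()
        if not has_license(component):
            missing.append(component.get("purl") or component.get("name", "<unnamed>"))
        stack.extend(component.get("components", []))
    return sorted(missing)
```

A caller could raise an error, or fail a build, whenever the returned list is not empty.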

@waynebeaton
Member

I don't think that it's strange. We'd effectively be post-processing.

Another option is to sort out how to extend the SBOM generators to use our licence information.

@sbernard31
Contributor

I understand it is possible to set the right information in build configuration files (e.g. pom.xml or package.json).
So if dash-licenses validates that, then projects can fix their build configuration.

Or maybe you aren't talking about the project's own license information but about its dependencies' license information, which is not well set either?

@waynebeaton
Member

Yes. I'm mostly concerned with dependencies. We can coach our own project teams to get the metadata right. Moving forward, it looks like Sonatype is doing a better job of getting folks to specify good metadata before accepting content on Maven Central. I have no idea what sort of rigour is applied when adding stuff to npmjs. Regardless, there is still a lot of content already on these software repositories that has licence information that is missing, inconsistently specified, or wrong.

@sbernard31
Contributor

sbernard31 commented Jan 15, 2024

OK, I get your point now. Maybe SBOM generators could also emit warnings to get folks to specify good metadata?

(This way, not only the Eclipse community could improve the quality of its products.)

@waynebeaton
Member

OK, I get your point now. Maybe SBOM generators could also emit warnings to get folks to specify good metadata?

The biggest challenge here is that many of the libraries that end up in a dependency graph are old, and telling the person assembling an SBOM for their own content that the vast array of dependencies over which they have no control should use better metadata isn't all that helpful.

What would be helpful, I think, is to work with the folks creating the SBOM generators to make them pluggable so that the developer can leverage ClearlyDefined (or the Dash License Tool) to get vetted license content.

But... we've diverged considerably from the focus of this issue and should probably have this conversation somewhere else. I'll think about where that is and open an issue later today.

@waynebeaton
Member

I played around with this a bit tonight.

The CycloneDX folks produce a CLI Tool that can convert an SBOM into various formats, including CSV.

This seems to work:

$ cyclonedx-linux-x64 convert --input-file stuff-cyclonedx.json --output-file stuff.csv --output-format csv
$ cat stuff.csv | awk -F, '{print $14}' | tail -n +2 | java -jar org.eclipse.dash.licenses-1.1.1-SNAPSHOT.jar -
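
One caveat with `awk -F,` is that it splits on every comma, including commas inside quoted CSV fields such as descriptions. A minimal Python sketch of the same extraction with a real CSV parser is shown below. It assumes, as the command above does, that the wanted identifier sits in column 14 of the converted file; the function name is hypothetical.

```python
import csv
import io
from typing import List


def column_values(csv_text: str, column: int = 14) -> List[str]:
    """Return the non-empty values of a 1-based column, skipping the header."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row, like `tail -n +2`
    values = []
    for row in rows:
        # Ignore short rows and empty cells instead of emitting blank lines.
        if len(row) >= column and row[column - 1]:
            values.append(row[column - 1])
    return values
```

Printing the returned values one per line gives the same kind of input for the tool as the awk pipeline.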
