idea: maintaining an SBOM for cve-bin-tool #1646

Closed
terriko opened this issue Apr 27, 2022 · 18 comments

@terriko
Contributor

terriko commented Apr 27, 2022

We currently maintain two .csv files for scanning components needed or included by cve-bin-tool. Now that we have sbom support, we might want to consider providing an actual SBOM both for that scanning and as information for others intending to use the tool.

I have a minor preference for SPDX format but could be convinced something else is better.

We'd want to have some CI mechanics in place to ensure any SBOM created stays up to date.

terriko added this to the 3.2 milestone Apr 27, 2022
@anthonyharrison
Contributor

@terriko I have looked at the CycloneDX Python tool using the requirements.txt file. It doesn't do what I believe is needed as this report shows:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! Some of your dependencies do not have pinned version !!
!! numbers in your requirements.txt !!
!! !!
!! -> rich !!
!! -> plotly !!
!! -> beautifulsoup4 !!
!! -> toml !!
!! -> pytest !!
!! -> pytest-xdist !!
!! -> pytest-cov !!
!! -> pytest-asyncio !!
!! -> pytest-mock !!
!! -> zstandard !!
!! -> distro !!
!! -> defusedxml !!
!! -> xmlschema !!
!! -> importlib_metadata !!
!! -> requests !!
!! !!
!! The above will NOT be included in the generated !!
!! CycloneDX as version is a mandatory field. !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Running it with a Python virtualenv does produce a SBOM but it also includes the dependencies for the CycloneDX SBOM tool as well.
cve3.txt

@stevespringett

Version is an optional field in CycloneDX v1.4

https://cyclonedx.org/docs/1.4/json/#components_items_version

@anthonyharrison
Contributor

> Version is an optional field in CycloneDX v1.4
>
> https://cyclonedx.org/docs/1.4/json/#components_items_version

Thanks @stevespringett but we need the version string with the package name to allow us to query the vulnerability database. Whilst the version isn't explicitly specified in the requirements.txt file, I was expecting that the requirements.txt file would be used to find the version of the specified Python packages (as installed) and their dependencies.
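To illustrate the behaviour described above (looking up the installed version of each package named in requirements.txt), here is a minimal sketch. It is not how the CycloneDX tool or cve-bin-tool works; the function names are hypothetical, the name parsing is deliberately simplified, and it assumes Python 3.8+ for `importlib.metadata`. It only covers the packages listed in the file, not their dependencies.

```python
import re
from importlib import metadata

# Matches the leading distribution name of a requirement line.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_names(lines):
    """Yield the package name from each requirement line.

    Comments, blank lines and option lines (such as "-r other.txt") are skipped.
    """
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = _NAME.match(line)
        if match:
            yield match.group(1)


def installed_versions(lines, lookup=metadata.version):
    """Map each requirement name to its installed version (None if not installed)."""
    versions = {}
    for name in requirement_names(lines):
        try:
            versions[name] = lookup(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Calling `installed_versions(open("requirements.txt"))` inside the environment where the tool is installed would give name/version pairs that can be used to query a vulnerability database.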

@anthonyharrison
Contributor

cve.txt. The SBOM contains all the direct dependencies (specified in the requirements.txt file) together with the implicit dependencies from the included files.

There is no consistency between licence names (there are multiple ways of specifying an Apache 2.0 licence!). The file reports, as a comment, the licence as detected; note that there are multiple packages with no licence reported. I think it would be good if PyPI enforced SPDX licence names, as it would make life so much easier :-) and more consistent.
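As a rough illustration of the licence naming problem, the sketch below maps a few free-text licence strings to SPDX identifiers and falls back to `NOASSERTION` when nothing matches. The alias table and function name are hypothetical and far from complete; a real mapping would need many more entries.

```python
# Illustrative aliases only; keys are lower-cased, whitespace-normalised strings.
SPDX_ALIASES = {
    "apache 2.0": "Apache-2.0",
    "apache-2.0": "Apache-2.0",
    "apache license 2.0": "Apache-2.0",
    "mit": "MIT",
    "mit license": "MIT",
    "bsd-3-clause": "BSD-3-Clause",
}


def normalise_licence(raw):
    """Return an SPDX identifier for a free-text licence string, or NOASSERTION."""
    if not raw:
        return "NOASSERTION"
    # Lower-case, drop commas, collapse whitespace, and drop the word "version".
    key = " ".join(raw.lower().replace(",", " ").split())
    key = key.replace("version ", "")
    return SPDX_ALIASES.get(key, "NOASSERTION")
```

Unknown or missing licences deliberately stay as `NOASSERTION` rather than being guessed.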

There is probably more that should be added to the SBOM e.g. relationship with the version(s) of Python supported and also the development dependencies.

@terriko
Contributor Author

terriko commented May 24, 2022

Hm, I wonder if we should work on a PEP or tooling that encourages standard (SPDX) license names. Having the PyPI packaging tools spit out a warning might be helpful and probably wouldn't be too hard to write. I'll try to find time to look at it.

@terriko
Contributor Author

terriko commented Jun 16, 2022

A friend pointed me at https://peps.python.org/pep-0639/ -- I haven't read through the details yet, but we might want to read it and explicitly throw our support behind it if it does what we need. I know many PEPs don't get much in the way of comments, so commenting on the Discourse thread may be very impactful: https://discuss.python.org/t/pep-639-round-2-improving-license-clarity-with-better-package-metadata/12622

@anthonyharrison
Contributor

@terriko Do you want to try sbom4python? It seemed to work when I tested it on cve-bin-tool.

@terriko
Contributor Author

terriko commented Jul 21, 2022

@anthonyharrison ooh, that's a good idea. Do you just want me to run it and make sure you and I get the same results and then we can check in a file (or maybe two so we have multiple formats), or did you have something else in mind? It would be pretty handy if we could run it in CI and have github actions keep it up to date for us too.

@anthonyharrison
Contributor

anthonyharrison commented Jul 21, 2022

Here are the results when I run the generated SBOM for cve-bin-tool through cve-bin-tool

image

anthonyharrison added a commit to anthonyharrison/cve-bin-tool that referenced this issue Aug 24, 2022
anthonyharrison added a commit to anthonyharrison/cve-bin-tool that referenced this issue Aug 25, 2022
@terriko
Contributor Author

terriko commented Aug 25, 2022

There are some challenges:

  1. The versions change pretty rapidly, so we can't validate against a known sbom without ignoring versions.
  2. pip install --upgrade -r requirements.txt only updates dependencies explicitly listed in requirements.txt. We do have some workarounds where second-level (and deeper) dependencies are already listed explicitly in there to force updates and avoid known CVEs.

So I guess there's some question of what SBOM we should be maintaining. I was thinking "this SBOM should reflect what you get when you install CVE-bin-tool" but we don't have a way of knowing that precisely because your 2nd level dependencies could be different versions than ours.
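One way to picture the first challenge (validating against a known SBOM while ignoring versions) is to compare package names only. This is a sketch under the assumption that both SBOMs have already been reduced to `{name: version}` mappings; the function name is hypothetical.

```python
def package_changes(known, generated):
    """Compare two {name: version} mappings by package name only.

    Returns (added, removed) as sorted lists of lower-cased names;
    version differences are deliberately ignored.
    """
    known_names = {name.lower() for name in known}
    generated_names = {name.lower() for name in generated}
    added = sorted(generated_names - known_names)
    removed = sorted(known_names - generated_names)
    return added, removed
```

A CI check built on this would flag new or dropped packages without failing every time an upstream version changes.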

Some potential ideas:

  1. Give instructions or a script so people can generate their own personal SBOM for cve-bin-tool. We may want to do this one regardless, since we have no way of knowing what older dependencies they might have. And it might be neat to have some sort of way for people to get a "hey, this thing needs an upgrade" warning this way. cve-bin-tool --self-check or somesuch.
  2. Maintain the absolute latest possible. Don't try to validate versions against a checked in one (though we might want to highlight package changes). This might mean a lot of checkins, or maybe just having the sbom available through github actions.
  3. Maintain the version-at-release-time. Update main branch dependencies only in the case of known CVEs or required features, same as we do with requirements.txt now. Run the sbom generator again when I do a version bump commit. Maybe issue minor package updates in case of CVEs too?
  4. Use the sbom generator as a way to make a secondary dependencies list, and either aggressively update it or freeze it so that we can validate the generated version to match.
  5. Consider the merits of freezing packages. I don't feel like this really improves security because there's a gap of several days to a week before I approve the auto-updates (though it might be faster with core packages because we could run the tests; some of the delay is that I have to run black/pre-commit/etc. manually after I see the PR) and it then means we have to update the package in pypi more often, but it's worth discussing the tradeoffs between security and reproducibility and seeing if we need to strike a different balance.

Thoughts?

@anthonyharrison
Contributor

@terriko I am finding the SBOM journey fascinating as it is throwing up all sorts of interesting edge cases.

I think we should publish the SBOM with every release as part of the baseline (maybe include it in the distribution and then use a --self-check option to validate the install against the baseline). This is OPTION 3 above
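A minimal sketch of what such a check might do, assuming the baseline is shipped as an SPDX tag-value file and using its `PackageName` and `PackageVersion` tags. No `--self-check` option exists yet; the function names here are hypothetical, and the installed versions are passed in as a plain mapping.

```python
def parse_spdx_packages(text):
    """Extract {name: version} from SPDX tag-value text."""
    packages = {}
    current = None
    for line in text.splitlines():
        tag, _, value = line.partition(":")
        tag, value = tag.strip(), value.strip()
        if tag == "PackageName":
            current = value
            packages[current] = None
        elif tag == "PackageVersion" and current is not None:
            packages[current] = value
    return packages


def self_check(baseline, installed):
    """List (name, baseline_version, installed_version) where the install differs.

    A package missing from the install is reported with None as its version.
    """
    drift = []
    for name, expected in sorted(baseline.items()):
        actual = installed.get(name)
        if actual != expected:
            drift.append((name, expected, actual))
    return drift
```

An empty result would mean the installation matches the released baseline; anything else is a candidate for a "this needs an upgrade" warning.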

The issue I have found is with the hidden dependencies, so the only way I can think of to ensure that every install uses the same version of the dependent packages would be to deliver a frozen version of ALL of the packages used by cve-bin-tool in the requirements.txt file. This is OPTION 5 above. It only has to be frozen at the start of the release process, so a few days' lag shouldn't be too much of a problem. The dynamic ecosystem offered by Python and other languages means that there is a danger that some out-of-date packages (hidden dependencies) could still be used; I wonder if we should raise an issue on pip to upgrade the hidden dependencies when --upgrade is specified.

We should probably be running a job (daily?) to continually scan the SBOM to see if there are any new vulnerabilities being reported (this should be less frequent than updates to the packages) and if so we can then upissue the SBOM/requirements.txt as part of a maintenance release.

Getting users to generate SBOMs (OPTION 1 above) isn't going to happen soon, as I think the current thinking is that suppliers rather than consumers should be producing SBOMs, since an SBOM should capture the 'as built' position.

anthonyharrison added a commit to anthonyharrison/cve-bin-tool that referenced this issue Aug 26, 2022
@terriko
Contributor Author

terriko commented Aug 29, 2022

We should probably raise some kind of issue with pip. If they don't want to change the behaviour of --upgrade (which they may not, since it's way more likely to break things for folk) we could maybe work on a PR to make --upgrade-all as a new option.

@anthonyharrison
Contributor

@terriko There are additional options in pip to consider when upgrading the dependencies.

--upgrade-strategy <upgrade_strategy>
Determines how dependency upgrading should be handled [default: only-if-needed].
“eager” - dependencies are upgraded regardless of whether the currently installed version satisfies the requirements of the upgraded package(s).
“only-if-needed” - are upgraded only when they do not satisfy the requirements of the upgraded package(s).

There is also

--force-reinstall Reinstall all packages even if they are already up-to-date.

Looks like we need to specify --upgrade --upgrade-strategy eager to get the implicit dependencies updated.
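For anyone scripting this, a small sketch of the resulting command driven from Python. The pip options are the ones quoted above; the helper names and the default requirements.txt path are assumptions.

```python
import subprocess
import sys


def eager_upgrade_command(requirements="requirements.txt"):
    """Build a pip command that also upgrades indirect dependencies."""
    return [
        sys.executable, "-m", "pip", "install",
        "--upgrade", "--upgrade-strategy", "eager",
        "-r", requirements,
    ]


def run_eager_upgrade(requirements="requirements.txt"):
    """Run the upgrade; raises CalledProcessError if pip fails."""
    subprocess.run(eager_upgrade_command(requirements), check=True)
```

Using `sys.executable -m pip` keeps the upgrade inside the interpreter (or virtualenv) that runs the script.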

@terriko
Contributor Author

terriko commented Aug 31, 2022

Do you think we should recommend --upgrade-strategy eager in our docs too?

anthonyharrison added a commit to anthonyharrison/cve-bin-tool that referenced this issue Oct 25, 2022
@Molkree
Contributor

Molkree commented Jan 3, 2023

@anthonyharrison @terriko

Any news on this issue? The current workflow creates SBOM every week but doesn't print the new one and doesn't update SBOM in the repo.

I am not familiar with your release process but if we generate SBOM now it will have version 3.2.1.dev0 for cve-bin-tool. So you'd need to generate the new SBOM for distribution after bumping the version.

Another thing to consider is the fact that installations on different Python versions will have different SBOMs. Edit: we'd need 3 versions as of now, one for 3.7, one for 3.8 and another one for 3.9-3.11.
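To show how one might work out how many distinct SBOMs are needed, the sketch below groups Python versions whose resolved dependency sets are identical. It assumes the dependencies for each interpreter have already been collected into `{name: version}` mappings; the function name is hypothetical.

```python
def group_by_dependencies(deps_by_python):
    """Group Python versions whose resolved dependency sets are identical.

    deps_by_python maps a version string such as "3.8" to a {name: version} dict.
    Returns a list of version lists, in first-seen order.
    """
    groups = {}
    for py_version, deps in deps_by_python.items():
        # frozenset of (name, version) pairs is hashable and order-independent.
        key = frozenset(deps.items())
        groups.setdefault(key, []).append(py_version)
    return list(groups.values())
```

Each group in the result would correspond to one SBOM file that needs to be generated and kept up to date.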

I've played a bit with it in sbom-ci branch in my fork. I've modified the workflow to update SBOMs for each version of Python we support when there are changes. Would you be interested in this addition?

@anthonyharrison
Contributor

@Molkree interesting thought on separate SBOMs based on Python version. It isn't something I have seen done before as the SBOM relates to an instance of the deployment of the module and its dependencies. Certainly something to consider going forward.

@Molkree
Contributor

Molkree commented Jan 3, 2023

> Certainly something to consider going forward.

@anthonyharrison, you can see here that SBOMs for different Python versions differ due to the fact that packages have different dependencies on different versions.

I guess it's possible that the platform might affect dependencies as well but it's less common and I haven't checked it for cve-bin-tool.

@terriko
Contributor Author

terriko commented Jan 3, 2023

Honestly, I haven't thought deeply about this in a while given the chaos of the end of 2022.

I'm game for producing some SBOMs per Python version now that reflect 3.2 (even if we have to hand-edit things or be a bit hand-wavy about exactly what would have been installed on release day for us).

I still don't have a really satisfying answer about how to maintain all of this yet. From a release perspective: I probably should have run a script and checked in what I had as part of my 3.2 release artefacts. If we had the script checking in a set of new sboms daily/weekly/monthly then that would have happened automatically in that it would be stored as part of what was extant when the v3.2 release tag was created, so maybe just enabling this to happen is the easiest solution?

We should also have a more formally documented release checklist, which is a good idea for other reasons and wouldn't preclude automation. I'd like to have a public fuzzing policy (and automation) as part of that too.
