Propose os candidate type #161

Closed
wants to merge 7 commits into from

Contributor

@captn3m0 captn3m0 commented Apr 8, 2022

Member

@pombredanne pombredanne left a comment

Thanks, I think this could be fleshed out a bit? As it is I have no idea of what things would look like for this? A key question is: is an OS package-like?

@captn3m0
Contributor Author

captn3m0 commented Oct 7, 2022

Added more details, and some examples.

@bureado

bureado commented Oct 7, 2022

"Thanks, I think this could be fleshed out a bit? As it is I have no idea of what things would look like for this? A key question is: is an OS package-like?"

Generally, no. I can see an argument for an OS as a container image, or an embedded OS build as a flashable image and in both cases I would reference them by digest with existing purl types.
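
As a rough illustration of referencing an OS image by digest with existing purl types, the Python sketch below builds an `oci` purl for a container image and a `generic` purl for a flashable image. The helper names, image names, repository URL, and image bytes are all made up for the example; only the `oci` and `generic` types and their `repository_url` and `checksum` qualifiers are taken from the purl spec.

```python
import hashlib
from urllib.parse import quote


def oci_purl(name, digest, repository_url):
    # The version of an oci purl is the image digest; ':' gets percent-encoded.
    return f"pkg:oci/{quote(name)}@{quote(digest, safe='')}?repository_url={repository_url}"


def generic_image_purl(name, version, image_bytes):
    # Identify a flashable image by the SHA-256 of its contents.
    checksum = hashlib.sha256(image_bytes).hexdigest()
    return f"pkg:generic/{quote(name)}@{quote(version)}?checksum=sha256:{checksum}"


# Hypothetical inputs, for illustration only.
manifest = b"hypothetical image manifest"
digest = "sha256:" + hashlib.sha256(manifest).hexdigest()
print(oci_purl("debian", digest, "docker.io/library/debian"))
print(generic_image_purl("router-firmware", "1.0", b"hypothetical firmware image"))
```

Neither purl says which OS release the image contains; it only pins the artifact, which seems to be the point being made here.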

For looking up EOL information about an OS, I would accept CPE strings, User-Agent strings, relevant bits of the os-release specification, the strings tracked by Repology, the output of distro in popular tools like neofetch, facter, etc. I can also see how, given a sufficiently big array of sufficiently specific purls, an API could determine that those only exist in a given release of a certain OS and return the EOL status with some degree of confidence. For example, feeding it the purl for base-files of a Debian-based distro.

I can perhaps think about WSL environments as analogous to a package. They have a Windows Store identifier string specified in their manifest. But that sounds a bit like a specific namespace, not unlike snap. My 2e-2.

Member

@pombredanne pombredanne left a comment

@captn3m0 thanks! the update looks good to me!

Note that if this can be of some use I eventually amassed a large collection of /etc/os-release for testing in https://github.com/nexB/container-inspector/tree/main/tests/data/distro/os-release
They come from @natefoo , @zyga , @tas50 , @kmf and many others plus some extra additions. We could move this here to maintain and use as a reference?
See:

@bureado Thanks... your comments make sense. The notion of what's a package is a bit in the eye of the beholder, as basically "your product is a package to me" in a larger software supply chain sense. It is typical in material, physical BOMs to have assemblies and sub-assemblies, and there is a rich background to draw from there. I can see an OS as a big assembly, and there are times when this could be a valuable shortcut to have as a purl.

The key tests here would be IMHO:

  1. is this useful?
  2. is this package type obvious and can it be easily inferred from looking at the code?

I would answer yes to both.

@@ -337,11 +337,14 @@ os
- In case no CPE is available, the ``ID`` field from ``/etc/os-release`` can be used as both the namespace and name.
- The ``version`` field should be latest version (including patch) that the operating system has been updated to. This should closely match the ``VERSION_ID`` field in the ``/etc/os-release`` data.
- Both ``name`` and ``namespace`` are not case-sensitive and must be lowercased.
-- For rolling or testing distributions, the ``version`` should be set to the rolling channel identifier or branch name. Such as ``edge`` for alpine, or ``sid`` for debian. In case no such identifier is available, it should not be set.
+- For rolling or testing distributions, the ``version`` should be set to the rolling channel identifier or branch name. Such as ``edge`` for alpine, or ``sid`` for debian. In case no such identifier is available, no version should be set.
- For MacOS, the namespace should be set to ``apple`` and the name as ``macos``. The version string should match the ``ProductVersion`` returned by the ``sw_vers`` command.
Member

Do you know on which OS versions the sw_vers and winver utils are available?

Contributor Author

Couldn't find the exact details, but both appear to be old (the oldest references are from around 2000 for sw_vers and much older for winver) and well supported (e.g., availability in various Windows variants).

Contributor Author

Changing winver to ver, which is the text version, and reports the fields that are mentioned here.

Contributor

@jlb-bb jlb-bb Oct 11, 2022

If the group decides to proceed with this, some updates to https://github.com/package-url/purl-spec also need to be made. The descriptions for type, namespace, name, and version do not align with the proposed use. For example, type's description includes the keyword "Required": "the package "type" or package "protocol" such as maven, npm, nuget, gem, pypi, etc. Required." But this proposal suggests that an operating system is a package protocol? This is inconsistent and confusing; I am not sure that encoding an operating system in the purl structure will improve comprehension of this specification.

Contributor

The file /etc/os-release is not universally supported by Unix flavors. Even for Linux, it appears to have been introduced initially only by those distros supporting systemd. Perhaps this should be recognized in the text?

Contributor Author

Made a change for an /etc/os-release alternative. I don't think the spec can be 100% exhaustive and cover all possible operating systems; the best we can do is cover the majority of cases and offer guidance as to what goes in each field.

I'm unsure about how the description for fields can be aligned.

Contributor

We believe that clarity of the specifications helps its adoption and bug-free implementation.
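
For reference, the os-release rules in the hunk under review can be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the function names `parse_os_release` and `os_purl` are hypothetical, the `pkg:os/...` layout is what this PR proposes rather than anything in the purl spec, and only the no-CPE fallback (where `ID` serves as both namespace and name) is covered.

```python
import shlex
from urllib.parse import quote


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted; shlex.split strips the quotes.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


def os_purl(fields):
    """Build a purl of the proposed (hypothetical) `os` type."""
    # Name and namespace are lowercased, as the proposal requires.
    os_id = fields["ID"].lower()
    purl = f"pkg:os/{quote(os_id)}/{quote(os_id)}"
    # No VERSION_ID (e.g. some rolling distros): leave the version unset.
    version = fields.get("VERSION_ID")
    if version:
        purl += f"@{quote(version)}"
    return purl


sample = 'NAME="Debian GNU/Linux"\nID=debian\nVERSION_ID="11"\n'
print(os_purl(parse_os_release(sample)))  # pkg:os/debian/debian@11
```

The sketch does not map rolling channels such as ``edge`` or ``sid`` to a version, and it assumes the file exists, which, as noted above, is not the case on every Unix flavor.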

pombredanne previously approved these changes Oct 7, 2022
@captn3m0
Contributor Author

captn3m0 commented Oct 7, 2022

"Note that if this can be of some use I eventually amassed a large collection of /etc/os-release for testing"

This is very helpful - going to use this to guide the endoflife.date API reporting. Thanks 💯!

@bureado

bureado commented Oct 7, 2022

@pombredanne Thanks for bringing up physical BOM. It helped me expand my perspective on this. I don't know how purveyors of physical BOMs encode operating system in their BOMs today. My intuition tells me it'd be via firmware images, and they won't really meet the level of detail of an os-release string, e.g., they'll likely read either like a platform string or be a firmware reference that purl can encode with generic. I could also see the OS being a top-level attribute of the BOM rather than a component along with hardware ones. Maybe that will change in the future.

Conceptually I could see an argument for purl equivalence, since at least CycloneDX uses a kind of component for operating systems, and from my read of the spec it allows purl (alongside CPE and SWID). I would estimate most of the usage of this kind uses CycloneDX's native objects (name, supplier, etc.).

Reading SWID and SWIMA got me thinking about IT inventory. I wonder if operating system is encoded flat, next to installed MSIs or rpm packages, or if it's an attribute of a parent Asset.

In general, I think the upkeep and maintenance for purl to norm these strings is significant (see the examples with Windows) and that means backwards compatibility considerations.

Another unforeseen future scenario that worries me the most is one where, for example, a vulnerability scanner makes a determination using the operating system release as a shortcut: imagine an SBOM for a Debian system with 1.5K components, but I only pick the os one, query Debian's tracker with it, test the ~65 there and call it a day. This is the same for EOL determinations; think Python 2.7 in a rolling release.

Of course purl isn't in control of downstream, but these are a couple scenarios to keep in mind.

@captn3m0
Contributor Author

captn3m0 commented Oct 7, 2022

There are lots of ways SBOM tooling can make mistakes around distro vulnerabilities, and lots of them are already true today (such as scanners missing vulnerabilities in older distros because the distribution never released an advisory). The existence of an os PURL will at least make it easy to track these between various tools. Mistakes will still happen, but I don't think we can ascribe them to an OS type.

I'm very cognisant of distro vs. upstream support (wrt EOL) in particular, and I think any tooling that tries to determine this will have to take similar care - you just get useless results otherwise.

Tooling today already reports os versions in SBOMs (Syft at least does), but the non-standard nature of it (no PURL) makes it inaccessible to the rest of the ecosystem.

@bureado

bureado commented Oct 7, 2022

"Tooling today already reports os versions in SBOMs (Syft at least does), but the non-standard nature of it (no PURL) makes it inaccessible to the rest of the ecosystem."

Not sure I agree with this part. Some of the reporting is encoding for internal purposes (producer and consumer already have a contract) and some use cases for this will have nothing to do with SBOM. There is an ecosystem out there that goes from a regex on uname, a grep on neofetch and most notably CPEs. But that doesn’t mean it can’t be improved! Thanks for this discussion, I hope it can at least help downstream implementers as a reference.

@bureado

bureado commented Oct 7, 2022

I think it’s also important to mention that distros not releasing an advisory, tools that only look up NVD without looking at project security trackers, and scanners that lack contextual assertions such as exploitability or reachability are real problems that are orthogonal to standardized component naming. What I referred to earlier is less about using distro:release as a shortcut and more about facilitating this in a purl namespace next to the other components, potentially teeing up an antipattern. It’s speculative but meant to be cautionary. As a purl enthusiast I’m mostly excited that we get to consider and accommodate scenarios like this!

@captn3m0
Contributor Author

captn3m0 commented Oct 8, 2022

Thanks for all the feedback, think this should be good to merge now.

@noqcks

noqcks commented Jan 6, 2023

It seemed like there was consensus here; is there anything blocking a merge? Would love to implement os types in a project I'm working on.

Thank you 🙏

@stevespringett
Member

I would rather NOT accept this PR.

Reasons:

  • An os is a type of software. To date, we have not done this and if we do, firmware and others are sure to follow.
  • swid is already supported and is designed specifically for software identification (for non-packages, like operating systems)
    • So, in the case of apple, they should be creating SWID tags for OS releases.
    • SWID is supported by purl already.
    • Red Hat (possibly others) already support SWID (e.g. /usr/lib/swidtag/redhat.com/)
    • We need to encourage all Linux distros to provide SWID tags
    • SWID is supported by most enterprise CMDB and discovery systems
  • There are hundreds of operating system vendors. Supporting this PR means that we would need to centralize vendor and product naming within the purl project, essentially replicating portions of the CPE dictionary.
    • This is in direct conflict with the decentralized design of purl
    • SWID is also decentralized
    • Supporting this PR would require additional management from the purl project
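
To give a feel for the SWID route, here is a minimal Python sketch that reads a SWID tag and derives a purl of the existing `swid` type from it. The tag below is invented for the example (it is not copied from any vendor), `swid_purl` is a hypothetical helper, and taking the first `Entity` as the namespace is a simplification of the role-based selection the `swid` type describes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

SWID_NS = "{http://standards.iso.org/iso/19770/-2/2015/schema.xsd}"

# Hypothetical tag, for illustration only.
SAMPLE_TAG = """
<SoftwareIdentity xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Example OS" tagId="example.com.Example-OS-9" version="9.1">
  <Entity name="Example Corp" regid="example.com" role="tagCreator softwareCreator"/>
</SoftwareIdentity>
"""


def swid_purl(tag_xml):
    root = ET.fromstring(tag_xml.strip())
    segments = []
    entity = root.find(f"{SWID_NS}Entity")
    if entity is not None:
        # Simplification: use the first Entity rather than selecting by role.
        segments += [entity.get("name"), entity.get("regid")]
    segments.append(root.get("name"))
    path = "/".join(quote(s, safe="") for s in segments)
    purl = f"pkg:swid/{path}"
    version = root.get("version")
    if version:
        purl += f"@{quote(version, safe='')}"
    # The tag identifier travels in the tag_id qualifier.
    return f"{purl}?tag_id={quote(root.get('tagId'), safe='')}"


print(swid_purl(SAMPLE_TAG))
```

In this arrangement the vendor and product names come from the tag author, so nothing has to be registered with the purl project.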

Of interest to the Gitter post mentioned by @captn3m0, I think the guidance provided by the SBOM Forum should be taken into consideration for your project.

@captn3m0
Contributor Author

Going to close this PR, but noting that this is not really avoided so easily:

"that we would need to centralize vendor and product naming within the purl project"

There are already multiple types in the purl spec that require vendor naming to some degree, via qualifiers or namespaces:

  • deb type uses the distro qualifier, which is unspecified.
  • rpm type uses the namespace and distro to denote vendor, which is again unspecified. I filed [rpm] Add VMWare Photon Example #214 to provide some guidance for other distros, but this needs more thought.

Different applications might use different distro or namespace (or even repository_url) to denote the same package, and unless this information is maintained somewhere, as some guidance - it will lead to conflicts.
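
A small Python sketch of the kind of conflict meant here, assuming two tools that describe the same Debian package but pick different `distro` values. Both purls and both qualifier values are made up for the example, and `split_purl` is a hypothetical helper that ignores subpaths.

```python
from urllib.parse import parse_qsl


def split_purl(purl):
    """Split a purl into the part before '?' and a dict of qualifiers."""
    base, _, query = purl.partition("?")
    return base, dict(parse_qsl(query))


# Hypothetical output of two different tools for the same package.
a = "pkg:deb/debian/curl@7.74.0-1.3?distro=bullseye"
b = "pkg:deb/debian/curl@7.74.0-1.3?distro=debian-11"

base_a, quals_a = split_purl(a)
base_b, quals_b = split_purl(b)

same_coordinates = base_a == base_b
same_purl = same_coordinates and quals_a == quals_b
print(same_coordinates, same_purl)  # True False
```

Without shared guidance on what goes into `distro`, a consumer cannot tell whether the two strings refer to the same package.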

@captn3m0 captn3m0 closed this Jan 10, 2023
@bureado

bureado commented Jan 10, 2023

@captn3m0 I think the point being made is that this information is being maintained somewhere, but that somewhere is not purl.

You're right, though, that since "somewhere" isn't defined then the exercise is left to the user. The question then becomes whether that's by design, or a negative thing.

I'm eager to discuss this further, maybe in #214, because I think there is a middle ground: others can keep the actual data, but purl can help with conventions.

I see this particular PR as slightly orthogonal and more of a scope question on whether purl would represent things beyond packages.
