Ensure that distro advisories and aliasing work well together #249

Open
andrewpollock opened this issue Jul 10, 2024 · 4 comments

Comments

@andrewpollock
Collaborator

Raised by @luhring in google/osv.dev#2374 and capturing here:

> It looks like the aliases documentation line in question was updated in #193, and that was a great read. I share the concern expressed in that PR: There seems to be a "hole" in the OSV spec when it comes to distros' ability to participate. By moving to related, we're missing out on the opportunity to have strong, automation-usable links to the same vulnerability as described by our advisories. It seems like there should be a new field that's similar to aliases, but for strong "asymmetric" references, to help OSV better support vulnerability workflows beyond language ecosystems and into the world of distros. I can open an issue to capture this, and hopefully we'll have a good dialog there about potential improvements to the spec.

@luhring
Contributor

luhring commented Jul 10, 2024

Thanks for opening this!

As #193 points out, related isn't suitable for automation use cases, because the array items aren't guaranteed to have any particular relationship to the OSV record's vulnerability or its affected package(s).

With Linux distributions, we're consuming upstream software components and packaging them into our own distinct downstream software components.

So if Linux distributions provide OSV records to describe the effect of the vulnerability on their own packages, they cannot use aliases. This is because a consumer of the upstream software component is not guaranteed to also be consuming that distribution's downstream component, and thus the upstream OSV record (e.g. a CVE or GHSA record) would be relevant to them while the distribution's OSV record (e.g. a DSA or CGA record) would not.

So there's no good option for Linux distributions to store machine-discoverable links to upstream vulnerabilities.

I suggest adding a new field that's a stronger link than implied in related: similar to aliases, but for asymmetric relationships rather than symmetric. I don't know the best name for such a field, but perhaps inherits, consumes, upstream, or something.

To illustrate how this would work, imagine an OSV record from a Linux distribution like this:

{
  "modified": "2024-03-12T08:12:10Z",
  "id": "CGA-pc4f-g53c-c4gq",
  "upstream": [
    "GHSA-rr6r-cfgf-gc6h"
  ],
  // ...

This would have the following ideal outcomes:

  1. Processors of OSV data would not consider CGA-pc4f-g53c-c4gq and GHSA-rr6r-cfgf-gc6h to be the same thing.
  2. Automation systems wanting more information about CGA-pc4f-g53c-c4gq could now consider vulnerability data identified as GHSA-rr6r-cfgf-gc6h as directly applicable, albeit not the final say for the impact on the distro package.
  3. Multiple distros could use the same IDs in their OSV records' upstream field (like GHSA-rr6r-cfgf-gc6h), and while that would let consumers discover more information about the vulnerability's source, it would not link the distros' OSV records to one another in any way.
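
One way a consumer might honor these three outcomes, as a rough sketch only. The `upstream` field is the hypothetical one proposed above, `DISTRO-B-0001` is a made-up ID standing in for a second distro, and the helper names `same_vulnerability` and `upstream_context` are illustrative rather than part of any existing OSV tooling.

```python
# Illustrative records. "upstream" is the hypothetical field proposed in this
# thread; "DISTRO-B-0001" is a made-up ID standing in for a second distro.
RECORDS = [
    {"id": "GHSA-rr6r-cfgf-gc6h", "aliases": []},
    {"id": "CGA-pc4f-g53c-c4gq", "upstream": ["GHSA-rr6r-cfgf-gc6h"]},
    {"id": "DISTRO-B-0001", "upstream": ["GHSA-rr6r-cfgf-gc6h"]},
]
BY_ID = {record["id"]: record for record in RECORDS}


def same_vulnerability(a, b):
    """Symmetric identity: same ID, or either record lists the other in aliases."""
    return (
        a["id"] == b["id"]
        or b["id"] in a.get("aliases", [])
        or a["id"] in b.get("aliases", [])
    )


def upstream_context(record, by_id):
    """Asymmetric lookup: follow upstream links one way only, distro -> source."""
    return [by_id[u] for u in record.get("upstream", []) if u in by_id]


cga = BY_ID["CGA-pc4f-g53c-c4gq"]
ghsa = BY_ID["GHSA-rr6r-cfgf-gc6h"]
distro_b = BY_ID["DISTRO-B-0001"]

# Outcome 1: the distro record and the upstream record are not the same thing.
print(same_vulnerability(cga, ghsa))  # False
# Outcome 2: the upstream record is still discoverable as directly applicable context.
print([r["id"] for r in upstream_context(cga, BY_ID)])  # ['GHSA-rr6r-cfgf-gc6h']
# Outcome 3: two distros sharing an upstream ID are not linked to each other.
print(same_vulnerability(cga, distro_b))  # False
```

The lookup deliberately runs in one direction: the GHSA record gains no link back to either distro record, which is what keeps the relationship asymmetric.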

@andrewpollock
Collaborator Author

My initial reactive thought was includes or incorporates or even aggregates (which, to be fair, was my understanding of (at least one of) the intentions behind related).

@oliverchang
Contributor

Thank you for the feedback! One of the reasons we went with a more catch-all "related" was that it was hard to encapsulate all the different use cases/relationships between vulnerability records. Additionally, having all of these very similar but subtly different fields may complicate the schema and make it difficult to understand.

That said, if there is a clear, machine-automation use case for a field such as upstream, I think this is something we should add. Is the primary use case for automation systems here simply to answer the question: "Am I affected by CVE X in my distro?" And with the current related field, this would just give a "maybe" as an answer if it does live in any of the matched OSV records?
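
To make the "maybe" concrete, here is a minimal sketch of how a tool might grade the answer depending on which field carries the link. It assumes the hypothetical `upstream` field from this thread, and the IDs (`CVE-0000-0001`, `DISTRO-A-0001`, `DISTRO-A-0002`) and the function name `affected_answer` are made up for illustration.

```python
def affected_answer(vuln_id, matched_record):
    """Grade how confidently a matched distro record answers for vuln_id.

    Assumes the record has already been matched to an installed package.
    "upstream" is the hypothetical field discussed in this thread.
    """
    if vuln_id == matched_record["id"] or vuln_id in matched_record.get("aliases", []):
        return "yes"  # the record is the same vulnerability
    if vuln_id in matched_record.get("upstream", []):
        return "yes"  # the record describes this vulnerability's effect downstream
    if vuln_id in matched_record.get("related", []):
        return "maybe"  # related carries no guaranteed relationship
    return "no"


# Made-up records for illustration.
with_upstream = {"id": "DISTRO-A-0001", "upstream": ["CVE-0000-0001"]}
with_related = {"id": "DISTRO-A-0002", "related": ["CVE-0000-0001"]}

print(affected_answer("CVE-0000-0001", with_upstream))  # yes
print(affected_answer("CVE-0000-0001", with_related))   # maybe
```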

@luhring
Contributor

luhring commented Jul 11, 2024

> My initial reactive thought was includes or incorporates or even aggregates (which, to be fair, was my understanding of (at least one of) the intentions behind related).

👍 These names all sound good to me. And FWIW, I think related could work, but it'd require a substantial tightening of the definition of the field, which I would guess would be breaking and confusing for existing producers/consumers of that field.

> One of the reasons we went with a more catch-all "related" was it was hard to encapsulate all the different use cases/relationships between vulnerability records. Additionally, having all of these very similar but subtly different fields may complicate and make the schema difficult to understand.

This definitely makes sense. I wouldn't want to open the door to N more relationship types each getting their own field, and then it becomes impossible to give guidance on which type is the exact right one for each scenario. One of my favorite traits of the OSV schema is its simplicity, and I hesitate to suggest adding a new field; but I'm just not sure how else to solve this for participants outside of the "language ecosystems" category.

> Is the primary use case for automation systems here simply to answer the question: "Am I affected by CVE X in my distro?" And with the current related field, this would just give a "maybe" as an answer if it does live in any of the matched OSV records?

Exactly this! OSV's aliases field is really cool for consumers like vulnerability scanners and other security solutions, because it's a simple but powerful way to get more perspective on a vulnerability. By "JOIN"-ing to other aliased records, it's trivial to look up what the Go team has to say about affected symbols for a package matched to a GHSA record, just as an example. This also means that it's not necessary for every OSV record in the "alias set" to copy each other's data into their own record. The "JOIN"-ability lets each ecosystem state what it knows best about that vulnerability.
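
As a rough sketch of that "JOIN", assuming an in-memory dictionary of records keyed by ID: the IDs below are made up, the top-level `symbols` key is a simplification of where a real record would keep such data, and `alias_set` and `perspectives` are illustrative names, not existing OSV tooling.

```python
def alias_set(start_id, records_by_id):
    """Collect every ID reachable from start_id by following aliases."""
    seen = set()
    stack = [start_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        record = records_by_id.get(current)
        if record is not None:
            stack.extend(record.get("aliases", []))
    return seen


def perspectives(start_id, records_by_id):
    """Return the records we actually hold for the alias set, keyed by ID."""
    ids = sorted(alias_set(start_id, records_by_id))
    return {i: records_by_id[i] for i in ids if i in records_by_id}


# Made-up records: a GHSA record and a Go record that alias each other.
DB = {
    "GHSA-aaaa-bbbb-cccc": {
        "id": "GHSA-aaaa-bbbb-cccc",
        "aliases": ["CVE-0000-0001", "GO-0000-0001"],
        "summary": "Example advisory",
    },
    "GO-0000-0001": {
        "id": "GO-0000-0001",
        "aliases": ["CVE-0000-0001", "GHSA-aaaa-bbbb-cccc"],
        "symbols": ["Parse"],  # simplified stand-in for affected-symbol data
    },
}

joined = perspectives("GHSA-aaaa-bbbb-cccc", DB)
print(list(joined))                       # ['GHSA-aaaa-bbbb-cccc', 'GO-0000-0001']
print(joined["GO-0000-0001"]["symbols"])  # ['Parse']
```

Each record keeps only its own data; the combined view is assembled at lookup time, so no record has to copy another's fields.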

So, Linux distributions want to be a part of that! ...without causing disruptions to the alias set itself. Speaking on behalf of Wolfi, it would be great for us to be able to weigh in on how a given vulnerability (expressed as another OSV record like GHSA-..., PSF-..., etc.) affects packages in the ecosystem we control, where the affected ranges are different (because the packages are different) and we can add other ecosystem-specific data of our own to the overall story.

This enables security tools to use the distro OSV data for matching, and then other OSV records to do other useful things, like provide users with more context about the vulnerability itself and cross-check the distro's findings with upstream findings. Zooming out, this also makes it easier for general consumers of the OSV database to see how different distros have handled a given vulnerability (it gets very interesting to compare notes like this during triaging!).

3 participants