Add comments for references and evidence type #50

joshbuker · 2022-06-24T20:21:22Z

Opening this up for discussion

oliverchang · 2022-06-28T03:34:35Z

Thanks for opening this, @joshbuker!

As discussed in person, our goal with this schema was to keep as many fields machine readable as possible, and keep the core fields as minimal as possible (i.e. focus on the purpose of enabling vulnerability scanners and triage).

The "type" field was intended to provide this kind of context for references, with a consistent mapping to a text description for what they mean.

However, I recognize that different databases may want to track additional information for humans/triage purposes, e.g. @kurtseifried also suggested adding timestamps here as part of #25. Our mechanism for doing this was database_specific, but it wasn't very generalized.

Perhaps an alternative here might be to extend the places where database_specific could apply, i.e. it could go in any part of the OSV schema rather than only the explicitly listed locations. This might look like:

{
  "references": [
    {
      "type": "WEB",
      "url": "https://blah.com",
      "database_specific": {
        "timestamp": "....",
        "comment": "GSD specific comment"
      }
    }
  ]
}
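If database_specific were allowed anywhere in a record, consumers would need to locate those blocks at arbitrary depths. A minimal Python sketch of such a traversal follows, assuming records are plain parsed JSON. The function name `find_database_specific`, the record ID, and the `internal_note` field are hypothetical and not part of the OSV schema.

```python
from typing import Any, Dict, Iterator, Tuple


def find_database_specific(node: Any, path: str = "") -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (path, block) for every database_specific object nested in node."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if key == "database_specific" and isinstance(value, dict):
                # Treat the block as opaque: report it, do not descend into it.
                yield child_path, value
            else:
                yield from find_database_specific(value, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_database_specific(item, f"{path}[{index}]")


record = {
    "id": "EXAMPLE-0000",  # hypothetical ID, for illustration only
    "references": [
        {
            "type": "WEB",
            "url": "https://blah.com",
            "database_specific": {"comment": "GSD specific comment"},
        }
    ],
    "database_specific": {"internal_note": "top-level block"},  # hypothetical field
}

found = dict(find_database_specific(record))
for location, block in found.items():
    print(location, block)
```

A scanner that only cares about core fields can skip these blocks entirely, which is the property the proposal tries to preserve.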

We can also build out a more well specified way to define the database_specific specs from all the databases. e.g. linking to a JSON schema/spec page somewhere that describes all the fields that have been extended by that database.

This mechanism doesn't exclude any of these fields from being added as a core field in the future -- if enough databases use similar fields it makes a strong case to include this as a core field.

@rsc @chrisbloom7 thoughts?

kurtseifried · 2022-06-28T03:54:09Z

I think if we're going to have a tag that then contains JSON and is scattered all over the place in order to allow new/different data we should look at how to make this less painful. Some thoughts:

Assume we keep the name database_specific for now. I suggest we usually add some standard metadata to it, e.g.

data_format (e.g. OSV, GSD, CVE, CSAF, whatever)
data_version (e.g. 1.2.3, 5.0, etc.)

So people know what schema in turn to look at rather than having to write magic parsers/etc.
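As a rough sketch of how a consumer might use that metadata, the Python below picks a parser based on data_format and data_version. Only those two field names come from the suggestion above. The parser registry, the `parse_gsd_v1` function, and the "GSD" / "1.0.0" pairing are assumptions made up for illustration.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Block = Dict[str, Any]
Parser = Callable[[Block], Block]


def parse_gsd_v1(block: Block) -> Block:
    """Hypothetical parser for a hypothetical GSD 1.0.0 extension layout."""
    return {"comment": block.get("comment", "")}


# Registry keyed by (data_format, data_version); the entry is illustrative only.
PARSERS: Dict[Tuple[str, str], Parser] = {
    ("GSD", "1.0.0"): parse_gsd_v1,
}


def parse_extension(block: Block) -> Optional[Block]:
    """Return parsed extension data, or None if the format is unknown."""
    key = (block.get("data_format"), block.get("data_version"))
    parser = PARSERS.get(key)
    if parser is None:
        # Unknown or missing metadata: ignore the block rather than guess.
        return None
    return parser(block)


known = parse_extension(
    {"data_format": "GSD", "data_version": "1.0.0", "comment": "GSD specific comment"}
)
unknown = parse_extension({"comment": "no metadata attached"})
```

Blocks without recognized metadata are skipped rather than guessed at, which avoids the "magic parsers" mentioned above.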

We can also use it to ascertain if stuff should be included in the schema, e.g. if we see thousands of timestamp tags then maybe we should make that part of the official schema, e.g. how HTTP headers work (if enough people do it, it's a de facto standard).
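One way to measure that kind of adoption is to count how often each extension field name appears across collected database_specific blocks. The sketch below assumes the blocks have already been gathered into a list. The sample data, the `severity_note` field, and the threshold of 2 are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_extension_fields(blocks: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many blocks use each top-level extension field name."""
    counts: Counter = Counter()
    for block in blocks:
        counts.update(block.keys())
    return counts


# Hypothetical sample of database_specific blocks from several records.
blocks = [
    {"timestamp": "2022-06-28T00:00:00Z", "comment": "first"},
    {"timestamp": "2022-06-29T00:00:00Z"},
    {"severity_note": "internal only"},
]

counts = count_extension_fields(blocks)

# Fields seen in at least `threshold` blocks are candidates for the core schema.
threshold = 2  # arbitrary value for illustration
candidates = [name for name, seen in counts.items() if seen >= threshold]
```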

chrisbloom7 · 2022-06-28T14:17:30Z

While it might be nice to augment the machine-parsable data with info relevant to human readers, ultimately that's what the references are for: "go here for more info". I have been viewing the existing database_specific fields as info that is necessary for the publishing system to track and manage the vulnerability. So far we have resisted publishing parsing documentation for those fields because the relevant info should be represented by the core schema (already documented) and the extra info is only useful internally. It is of course possible to dump any structured data in those fields, for humans or machines, but I would worry that if we start nesting schemas inside schemas to map database-specific info, it might become easier for publishers to just fork the OSV schema and extend it with their own core fields, leading to fragmentation of the standard: OSV, GSD-OSV, CVE-OSV, etc.

joshbuker · 2022-06-28T15:36:45Z

An example of where a human focused optional description/comment for references would be useful is something like Log4Shell where there are massive amounts of references, a time pressure for the reader, and potentially non-security/dev folks consuming the ID.

The type field would help narrow down overall categories of links, and humans could manually review each reference to understand its relevance, but that approach would run counter to providing both machine-readable and human-readable interfaces (and at scale would also waste a lot of people-hours). In the same way that providing a human-centric description of the ID itself is valuable for humans but unnecessary from a machine point of view, I feel that the same is true for references. It's an optional field that can be added specifically with the intent of aiding humans viewing an ID, and generally will be ignored by automation.

All that being said, it might be that we can expand the type field to have enough nuance to provide both machine readability and a "good enough" experience for humans by attaching descriptions to the various types. It's too early to say one way or the other I think.

joshbuker · 2022-07-11T22:45:46Z

Having thought this through more over the last week or two, I think we should look at being very quick to iterate on adding new categories/reference types, and try to avoid adding a prose field for references until actual demand comes up from the folks using it.

Add comments for references and evidence type

ebf3b49

joshbuker closed this Jul 11, 2022

joshbuker deleted the feature/additional-reference-meta branch July 11, 2022 22:45

joshbuker mentioned this pull request Jul 11, 2022

Rename database_specific to experimental? #66

Closed

oliverchang mentioned this pull request Aug 30, 2022

Additional options for references field to support reproducers/exploit code #78

Closed
