Add comments for references and evidence type #50
Thanks for opening this @joshbuker! As discussed in person, our goal with this schema was to keep as many fields machine readable as possible, and keep the core fields as minimal as possible (i.e. focus on the purpose of enabling vulnerability scanners and triage). The "type" field was intended to provide this kind of context for references, with a consistent mapping to a text description for what they mean.

However, I recognize that different databases may want to track additional information for humans/triage purposes, e.g. @kurtseifried also suggested adding timestamps here as part of #25. Our mechanism for doing this was Perhaps an alternative here might be to extend the places where

    {
      "references": [
        {
          "type": "WEB",
          "url": "https://blah.com",
          "database_specific": {
            "timestamp": "....",
            "comment": "GSD specific comment"
          }
        }
      ]
    }

We can also build out a more well specified way to define the This mechanism doesn't exclude any of these fields from being added as a core field in the future -- if enough databases use similar fields it makes a strong case to include this as a core field.

@rsc @chrisbloom7 thoughts?
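To make the split between core and database-specific fields concrete, here is a minimal Python sketch of how two consumers might read the example record above. The function names `machine_view` and `human_view` are hypothetical and not part of the schema; the only assumption about the data is that `database_specific` is an optional object nested inside each reference, as in the example.

```python
import json

# The example record from the comment above, unchanged.
record = json.loads("""
{
  "references": [
    {
      "type": "WEB",
      "url": "https://blah.com",
      "database_specific": {
        "timestamp": "....",
        "comment": "GSD specific comment"
      }
    }
  ]
}
""")


def machine_view(record):
    """Return only (type, url) pairs, ignoring any database_specific data."""
    return [(ref["type"], ref["url"]) for ref in record.get("references", [])]


def human_view(record):
    """Return one display line per reference, appending the comment if present."""
    lines = []
    for ref in record.get("references", []):
        extra = ref.get("database_specific", {})  # optional extension object
        line = f'[{ref["type"]}] {ref["url"]}'
        if "comment" in extra:
            line += f' ({extra["comment"]})'
        lines.append(line)
    return lines


print(machine_view(record))
print(human_view(record))
```

A scanner that only calls something like `machine_view` is unaffected by whatever a database puts in the extension object, which is the property the minimal core schema is meant to preserve.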
I think if we're going to have a tag that then contains JSON and is scattered all over the place in order to allow new/different data, we should look at how to make this less painful. Some thoughts:

Assume we keep the name database_specific for now. I suggest we usually add some standard metadata to it, e.g. data_format (OSV, GSD, CVE, CSAF, whatever), so people know which schema to look at rather than having to write magic parsers, etc.

We can also use it to ascertain whether stuff should be included in the schema, e.g. if we see thousands of timestamp tags then maybe we should make that part of the official schema, similar to how HTTP headers work (if enough people do it, it's a de facto standard).
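The "count the tags and promote the popular ones" idea could be sketched roughly as follows. This is only an illustration: `data_format` is the metadata key proposed in this comment, not an existing schema field, and the function names and the threshold are hypothetical.

```python
from collections import Counter


def count_extension_keys(records):
    """Count how often each key appears in references' database_specific objects.

    Counts are keyed by (data_format, key); records that do not declare the
    proposed data_format metadata are grouped under "unknown".
    """
    counts = Counter()
    for record in records:
        for ref in record.get("references", []):
            extra = ref.get("database_specific", {})
            data_format = extra.get("data_format", "unknown")
            for key in extra:
                if key != "data_format":  # skip the metadata key itself
                    counts[(data_format, key)] += 1
    return counts


def promotion_candidates(counts, threshold):
    """Return keys whose total usage across all formats reaches the threshold."""
    totals = Counter()
    for (_data_format, key), n in counts.items():
        totals[key] += n
    return sorted(key for key, n in totals.items() if n >= threshold)
```

Run over a corpus of records, `promotion_candidates(count_extension_keys(records), threshold)` would list the extension keys used often enough to be worth discussing as core fields; where to set the threshold is a policy question this sketch does not answer.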
While it might be nice to augment the machine-parsable data with info relevant to human readers, ultimately that's what the references are for - "go here for more info". I have been viewing the existing
An example of where a human-focused optional description/comment for references would be useful is something like Log4Shell, where there are massive amounts of references, a time pressure for the reader, and potentially non-security/dev folks consuming the ID. The type field would help narrow down overall categories of links, and humans could manually review each reference to understand its relevance, but that approach would run counter to providing both machine-readable and human-readable interfaces (and at scale would also waste a lot of people-hours).

In the same way that providing a human-centric description of the ID itself is valuable for humans but unnecessary from a machine point of view, I feel that the same is true for references. It's an optional field that can be added specifically with the intent of aiding humans viewing an ID, and generally will be ignored by automation.

All that being said, it might be that we can expand the type field to have enough nuance to provide both machine readability and a "good enough" experience for humans by attaching descriptions to the various types. It's too early to say one way or the other, I think.
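As a rough sketch of the "attach descriptions to the various types" option, a viewer could keep a fixed mapping from reference type to a human-readable description and group a long reference list by type. Only `WEB` appears in this thread; the other type names and all of the description strings below are assumptions made for illustration, not wording taken from the schema.

```python
# Illustrative mapping only: the descriptions are placeholders, and every
# type name other than "WEB" is an assumption not taken from this thread.
REFERENCE_TYPE_DESCRIPTIONS = {
    "ADVISORY": "Published security advisory for the vulnerability",
    "ARTICLE": "Article or blog post describing the vulnerability",
    "REPORT": "Bug or issue tracker report",
    "FIX": "Source change that fixes the vulnerability",
    "PACKAGE": "Home page of the affected package",
    "WEB": "Web page that fits no other category",
}


def describe(ref):
    """Return the description for a reference's type, with a fallback."""
    return REFERENCE_TYPE_DESCRIPTIONS.get(ref["type"], "Unrecognized reference type")


def group_by_type(references):
    """Group reference URLs by type so a reader can skim one category at a time."""
    grouped = {}
    for ref in references:
        grouped.setdefault(ref["type"], []).append(ref["url"])
    return grouped
```

With many references on a single ID, `group_by_type` lets a reader jump straight to one category, and `describe` supplies the per-type text without adding a free-form prose field to each reference.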
Having thought this through more over the last week or two, I think we should look at being very quick to iterate on adding new categories/reference types, and try to avoid adding a prose field for references until actual demand for it comes up from folks using the schema.
Opening this up for discussion