Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion for new vulnerability report format #200

Closed
6 of 8 tasks
lirantal opened this issue Mar 29, 2018 · 42 comments
Closed
6 of 8 tasks

Suggestion for new vulnerability report format #200

lirantal opened this issue Mar 29, 2018 · 42 comments

Comments

@lirantal
Copy link
Member

lirantal commented Mar 29, 2018

Motivation

The current state of a report is not fully normalized which makes it hard to parse it as machine readable format. Moreover, the report has mixed formatting, for example: a newline indicated by \n and code emphasis indicated by markdown as ``

Recommended format

Title

The weakness description of the vulnerability.
Always a clear-text string, no formatting allowed.

Overview

Markdown parsable text

Author

Name would be a full name, username is as set in the hackerone report, and website as a social reference for the user (will be taken from the hackerone platform).

Author field will consist of an object with the following fields:

author: {
  name: "",
  username: "",
  website: ""
}

Dates

  • created_at: should be specified as ISO 8601 YYYY-MM-DD
  • updated_at: should be specified as ISO 8601 YYYY-MM-DD
  • publish_date: should be specified as ISO 8601 YYYY-MM-DD

References

Instead of a string, this should be an array of strings:

references: [
  "https://hackerone.com/...",
  "https://github.com/..."
]

Slug

Probably refers to the nodejs, or nsp blog where the vulnerability was disclosed.
Recommending to remove this entry.

Recommendation

I'm wondering if we can have some status enum thing here, where the path is easily understood such as either you should upgrade, or drop it completely due to no fix available. Something like:

recommendation: "UPGRADE|DROP|any other status we can have?"

If not this, then we should have a formatted string that follows a standard, such as:

  • for upgrading: "It is recommended to upgrade to a newer version"
  • for dropping support: "At the time of the report, there is no available upgrade path"
    They aren't the final suggestions, just to convey the point. Not sure that the module name or version should even be in them since that information is already available through other fields in the vulnerability report.

Existing format for reference:

{
  "id": 399,
  "title": "Command Injection - Generic",
  "overview": "`whereis` concatenates unsanitized input into exec() command",
  "created_at": "2018-02-25",
  "updated_at": "2018-03-28",
  "publish_date": "2018-03-28",
  "author": "Сковорода Никита Андреевич (https://github.com/ChALkeR)",
  "module_name": "whereis",
  "cves": [
    ""
  ],
  "vulnerable_versions": "<=0.4.0",
  "patched_versions": ">=0.4.1",
  "slug": "whereis-command-injection---generic",
  "recommendation": "use npm package `which` instead",
  "references": "- https://hackerone.com/reports/319476\n- https://github.com/vvo/node-whereis/commit/0f64e3780235004fb6e43bfd153ea3e0e210ee2b",
  "cvss_vector": "CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H",
  "cvss_score": 9.9,
  "coordinating_vendor": ""
}

Status

A reference to the proposed changes:

{
  "id": 399,
  "title": "Command Injection - Generic",
  "overview": "`whereis` concatenates unsanitized input into `exec()` command",
  "created_at": "2018-02-25",
  "updated_at": "2018-03-28",
  "publish_date": "2018-03-28",
  "author": {
    "name": "Сковорода Никита Андреевич",
    "username": "ChALkeR",
    "website": "https://github.com/ChALkeR"
  },
  "module_name": "whereis",
  "cves": [
  ],
  "vulnerable_versions": "<=0.4.0",
  "patched_versions": ">=0.4.1",
  "recommendation": "DROP: the package and use npm package `which` instead",
  "references": [
    "https://hackerone.com/reports/319476",
    "https://github.com/vvo/node-whereis/commit/0f64e3780235004fb6e43bfd153ea3e0e210ee2b"
  ],
  "cvss_vector": "CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H",
  "cvss_score": 9.9,
  "coordinating_vendor": ""
}

Relates to #115

@lirantal lirantal self-assigned this Mar 29, 2018
@mhdawson
Copy link
Member

mhdawson commented Apr 4, 2018

slug I think refers to the nodejs blog where the vulnerability was disclosed. This often has more info so a link is useful, although that could fit in the references.

Is this for core vulnerabilities or those in 3rd party modules?

@lirantal
Copy link
Member Author

lirantal commented Apr 5, 2018

Yeah, so I think the slug field can be removed entirely.

The scope of this issue is currently only ecosystem vulns. The core node vulns have a different format entirely, which we can consider aligning up to this but I'm not sure what it will break if we do so.

@mhdawson
Copy link
Member

mhdawson commented Apr 5, 2018

I'd agree that for ecosystem vulns, dropping slug and using the references makes the most sense to me.

@vdeturckheim
Copy link
Member

LGTM.
Also, when we have multiple versions range to handle, machine readability is not optimal.
Maybe having versions related field as

 "vulnerable_versions": [
    {"min": "=0.0.0", "max": "=4.0.0"} // the "=" means that the version is in the range
    {"min": "5.0.0", "max": "=5.1.0"}
]

would be clearer.

@lirantal
Copy link
Member Author

@vdeturckheim I'm not sure if this further simplifies it or not.
it is however more machine-friendly I guess.

@greysteil would be great if you could chime in here with feedback regarding your recent changes on the formatting through out the vulns (i.e: #216)

@vdeturckheim
Copy link
Member

@lirantal yeah, parsing versions ranges can be pretty complicated sometimes. Mostly trying to make my life easier here ^^

@greysteil
Copy link
Contributor

greysteil commented Apr 15, 2018

Interesting one. On requirement ranges I would go with either:

  1. An array of version requirement strings, the way Ruby does (so [">= 1.0.0 < 1.3.2", "^2.1.2"] would mean either ">= 1.0.0 < 1.3.2" or "^2.1.2").
  2. Use || as an or separator, as specified in JS's semver setup (read the bit about || here).

I think my preference is for (2), given that JS has support for that. Actually, looking at it, (2) is way cleverer than it needs to be. My recommendation would be to use the smallest amount of the language possible and go for (1).

(I mess about with these in a lot of languages in Dependabot. Everyone does it differently, but the || requirement is pretty easy to massage into other formats. If you're working in Ruby, I do it here.)

@greysteil
Copy link
Contributor

On the other suggestions, I think it's going to be easier to discuss these over individual PRs that change the validations. For starters, I'll put in PRs for:

  • title: validate that it is fewer than 100 characters, with no new lines (or something like that)
  • slug: drop it
  • dates: make created_at and updated_at YYYY-MM-DD (this one needs some discussion)

@pxlpnk
Copy link
Member

pxlpnk commented Jun 14, 2018

I think #171 can be closed in favour of this PR.

The changes proposed until now (checked are merged already):

I took the liberty updated the reference provided in the issue with the proposed and applied changes:

{
  "id": 399,
  "title": "Command Injection - Generic",
  "overview": "`whereis` concatenates unsanitized input into `exec()` command",
  "created_at": "2018-02-25",
  "updated_at": "2018-03-28",
  "publish_date": "2018-03-28",
  "author": {
    "name": "Сковорода Никита Андреевич",
    "username": "ChALkeR",
    "website": "https://github.com/ChALkeR"
  },
  "module_name": "whereis",
  "cves": [
  ],
  "vulnerable_versions": "<=0.4.0",
  "patched_versions": ">=0.4.1",
  "recommendation": "DROP: the package and use npm package `which` instead",
  "references": [
    "https://hackerone.com/reports/319476",
    "https://github.com/vvo/node-whereis/commit/0f64e3780235004fb6e43bfd153ea3e0e210ee2b"
  ],
  "cvss_vector": "CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H",
  "cvss_score": 9.9,
  "coordinating_vendor": ""
}

Overall those changes make all sense to me and would make working with the npm now and in the future way easier.

What needs some final discussion imoh:

After working through the PRs I think this should not be too much work, and I would like to spend some time on this.
WDYT?

@vdeturckheim
Copy link
Member

@pxlpnk I like that. What would be your take on version ranges format?

@pxlpnk
Copy link
Member

pxlpnk commented Jun 14, 2018

@vdeturckheim I like the [">= 1.0.0 < 1.3.2", "^2.1.2"] version more. It is a basic data type and does not need extra parsing as || would.
However I have no strong opinion on this.
The two versions:
2: [">= 1.0.0 < 1.3.2", "^2.1.2"]
2: ">= 1.0.0 || < 1.3.2 || ^2.1.2"

@greysteil Please correct me if I understood it wrong.

@greysteil
Copy link
Contributor

greysteil commented Jun 14, 2018

Agree on avoiding ||. In your example above, the second one would be ">= 1.0.0 < 1.3.2 || ^2.1.2", because npm treats the space as an AND which is what we want there. Avoiding using || avoids that kind of complexity entirely 🙂

@pxlpnk
Copy link
Member

pxlpnk commented Jun 14, 2018

I also think || is rather confusing.

So what are the next steps for this then?

@vdeturckheim
Copy link
Member

@pxlpnk would you have time to open a PR to iterate uppon?

@pxlpnk
Copy link
Member

pxlpnk commented Jun 14, 2018

I have time to work on this and move it forward.
I try to do it in the following order:

  • author
  • overview
  • references
  • vulnerable versions

and will propose individual PRs for this.

@lirantal
Copy link
Member Author

@pxlpnk do you mind if I edit the original issue with your summary and proposed final format so we have one source of reference instead of having to scroll for it in the comments?

@pxlpnk
Copy link
Member

pxlpnk commented Jun 15, 2018

Please go ahead. 👍

@pxlpnk
Copy link
Member

pxlpnk commented Jun 15, 2018

@lirantal @vdeturckheim I proposed two PRs with changes for the author and references fields.
Please review them when you have some time and let me know if this is the direction we should go.
At the moment this is mostly about data hygiene and getting the data into a good machine-readable format.

@lirantal
Copy link
Member Author

@nodejs/security-wg we'd like to move forward with 2 PRs to be merged and update the vulnerability db format according with the suggestions made in this issue, specifically they are:

  1. References field refactor(npm-db): update references field to be an array #316
  2. Author field nswg-reporter: initial proposition for hackerone reporter #234

We'd like to ask your feedback with regards to any tooling or otherwise dependency around the db incase this may affect you.

** This has been open for a while (both PRs and this issue) so we'll merge in a week time if there are no objections.

After the PRs will be merged you should expect the following format to be applied:

Author

Name would be a full name, username is as set in the hackerone report, and website as a social reference for the user (will be taken from the hackerone platform).

Author field will consist of an object with the following fields:

author: {
  name: "",
  username: "",
  website: ""
}

References

Instead of a string, this should be an array of strings:

references: [
  "https://hackerone.com/...",
  "https://github.com/..."
]

@mhdawson
Copy link
Member

@lirantal the changes are ok with me provided we are not going to cause undue breakage on those using the existing format.

@lirantal
Copy link
Member Author

Well we some what are... we'll be updating retroactively all the vuln db to the new format.
So if a tool used the vuln db and expected that references is a string with end of line control chars inside, then now it will be a JSON array with each reference URL.

@pxlpnk
Copy link
Member

pxlpnk commented Jun 27, 2018

@mhdawson How do you suggest providing an upgrade path could work here?
Changes to the data structure are something that I can and will happen in the future. I think it makes sense to come up with a strategy to support this in the future.

@mhdawson
Copy link
Member

@pxlpnk I'm not necessarily suggesting an upgrade path, it is more about being comfortable that we've done our due diligence around understanding who uses the existing data/format, understanding if they are concerned with the change and, if they are, considering what we can do to avoid undue impact to them.

For example instead of just throwing the switch maybe we need to publicise, give people time to react etc. #200 (comment) is a start at that, but I wonder if we need to do broader messaging (blogs, tweets etc.). If we know there are npm modules using then reach out to maintainers directly etc. We might also need to give people more time to get ready to adjust.

I don't think I know enough about the users of the existing data to make a good call on what we should do in advance.

I agree a strategy that is well-documented helps. It can set expectations about how we'll communicate changes will be made, how much lead time we'll give consumers (as they will need time to make adjustments).

@lirantal
Copy link
Member Author

Agree with Michael on this. We've had people chime in on building toolset around it before on the issue queue and sharing it so we know some people are definitely depended on it.

By the way, this connects to another topic we raised in the last working group - if you remember I mentioned that we can have a "who uses our db" kind of page, and while suggesting it at that time for better engagement and evangelism I now see it being useful for also understanding whom are our consumers. I definitely think now it is worth pursuing this and would love to get at it.

@mhdawson
Copy link
Member

Some key things in the strategy might be (just off the top of my head):

When changes are planned to the database format we will communicate through:

  • Issue in the security-wg repo
  • Email/@mention to all of those who have registered as data users
  • Tweet from the Node.js foundation twitter account

The community will include the date the format of the change to the data will be made.

The first communication will be sent out at least 2 months before the scheduled change and will be repeated every 2 weeks.

@mhdawson
Copy link
Member

@lirantal getting a list of users, for notifications is definitely a good idea.

@pxlpnk
Copy link
Member

pxlpnk commented Jun 28, 2018

@mhdawson This seems all very reasonable to me. I am happy to take this on drive this forward.

I think what we should do here is:

  • Open a new issue to discuss the matter.
  • Propose a process in this that is simple but covers the problems (pretty much exactly what you mentioned)
  • Start collecting users that want to be registered as data users.
  • Use refactor(npm-db): update author field to be more verbose #314 and the vulnerable versions to exercise it for the first time.

Another approach would be to try versioning the format. However, I have not too much of an idea how this could be done.

Can any of the consumers of the data here maybe chip in and let us know what would work for them.

@greysteil
Copy link
Contributor

greysteil commented Jun 28, 2018

I’m fine with just being notified, but that’s because Dependabot (my app) is just SaaS. Imagine if any other data users are working with CLIs etc they’ll want versioning. Actually, if I was building a deployed app that used this data, I'd definitely build a web proxy for it, which is where I'd handle changes in format. I think notifications should be totally fine - no need for versioning.

@mhdawson
Copy link
Member

@pxlpnk +1 to the next steps you proposed.

@lirantal
Copy link
Member Author

Regardless to that process we can probably continue with the proposed changes (one of which was merged) in a few days if no one jumps here to say otherwise.

@pxlpnk
Copy link
Member

pxlpnk commented Jul 3, 2018

@mhdawson do you have any strong opinions on getting #314 merged soonish?

@mhdawson
Copy link
Member

mhdawson commented Jul 4, 2018

@lirantal, not sure why #314 could progress without following a good process to make sure we are not breaking anybody? I was thinking we should notify/follow what we believe will be the process in the future.

@lirantal
Copy link
Member Author

lirantal commented Jul 4, 2018

In #200 which was already opened several months ago we touched all of these updates, and @pxlpnk's PR is just applying them. I would assume any significant service/user of the DB will monitor our issue queue and jump in already with all of the changes we've made so far (i.e: enforcing nulls instead of empty strings, and so on, which are breaking changes as well).

I do agree with you, and maybe we can use #314 as a "use case" to outline a process of db changes (cc @pxlpnk this might be where you could jump in and formalize a new process), and follow it through.

@pxlpnk
Copy link
Member

pxlpnk commented Jul 13, 2018

@lirantal I added an announcement issue #345
After this one has been exercised I will draft a document and propose it.

@mhdawson
Copy link
Member

Would it also make sense to update the header on the Node.js website to help get the message out?

@mhdawson
Copy link
Member

I also think we want to build a list of db users can we contact directly as well.

@lirantal
Copy link
Member Author

I'm not sure what's the policy on that (the Node.js website), but definitely any formal means of announcing such change would make it easier for subscribers to get alerts.

At the moment the only thing I can think of is the repo's issue queue and tweeting from our own personal accounts.

@waveywaves
Copy link
Contributor

waveywaves commented Sep 23, 2018

Hello guys !
Please do check the pull request I opened recently for the markdown parsable overview .
Currently working on the format of the vulnerable versions. #408

@sam-github
Copy link
Contributor

@lirantal, you have done a lot of work on the vuln json format, can this be closed, or is it still ongoing?

@lirantal
Copy link
Member Author

Let's leave open so I can further review the remaining open items

@fraxken
Copy link
Member

fraxken commented Jul 18, 2022

I guess we should close this one @lirantal ?

@lirantal
Copy link
Member Author

Yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants