Standardised formatting for OSV Schema #2135

Closed
yashrsharma44 opened this issue Apr 24, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@yashrsharma44

Is your feature request related to a problem? Please describe.
I was exploring the OSV database and found that OSV.dev exports a feed of vulnerabilities for packages hosted in git repositories: https://storage.googleapis.com/osv-vulnerabilities/index.html?prefix=GIT/.

I wanted to understand whether the OSV schema has a standardised format for the versions entry. Among the entries I have found, (1) some versions carry a v prefix, like v1.2.3, (2) some don't follow SemVer or any other versioning standard, and (3) some contain tags from the GitHub repositories.

Describe the solution you'd like
A guideline on the standard for the versions entry would be helpful for setting expectations, along with details on the different version formats that users of the feed can expect.

Describe alternatives you've considered
N/A as I am raising a feature request.

Additional context
N/A

Thanks for maintaining such an awesome feed!

@yashrsharma44 yashrsharma44 added the enhancement New feature or request label Apr 24, 2024
@oliverchang
Collaborator

oliverchang commented Apr 29, 2024

Hi @yashrsharma44 , thanks for the issue!

The versions entry follows the standard of the ecosystem specified in package. This is the same for versions specified in ranges.

For example, https://osv.dev/vulnerability/GHSA-9wmf-xf3h-r8pr is for the Maven ecosystem, and the OSV JSON looks like:

"affected": [
    {
      "package": {
        "name": "org.jberet:jberet-core",
        "ecosystem": "Maven",
        "purl": "pkg:maven/org.jberet/jberet-core"
      },
      "ranges": [
        {
          "type": "ECOSYSTEM",
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "2.2.1.Final"
            }
          ]
        }
      ],
      "versions": [
        "1.0.0.Alpha1",
        "1.0.0.Alpha2",
        "1.0.0.Alpha3",
        "1.0.0.Alpha4",
        "1.0.0.Beta1",
        "1.0.0.Beta2",
        "1.0.0.CR1",
        "1.0.0.CR2",
        "1.0.0.Final",
        "1.0.1.Beta",
        ...

Here, every version listed is a version in the Maven registry, following Maven's version numbering rules.
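As a rough illustration of the point above, the sketch below checks whether a version string appears verbatim in the enumerated versions list for a given package. The record is an abbreviated copy of the JSON quoted above (only three of the listed versions are kept), and the helper name is_listed is hypothetical, not part of any OSV tooling. It performs exact string matching only and does not evaluate the ECOSYSTEM range, which would need Maven's own version ordering.

```python
import json

# Abbreviated copy of the record quoted above; the real record lists more versions.
RECORD = json.loads("""
{
  "affected": [
    {
      "package": {"name": "org.jberet:jberet-core", "ecosystem": "Maven"},
      "ranges": [
        {"type": "ECOSYSTEM",
         "events": [{"introduced": "0"}, {"fixed": "2.2.1.Final"}]}
      ],
      "versions": ["1.0.0.Alpha1", "1.0.0.Beta1", "1.0.0.Final"]
    }
  ]
}
""")


def is_listed(record, ecosystem, name, version):
    """Return True if `version` appears verbatim in a matching `versions` list.

    Hypothetical helper: exact string comparison, no range evaluation.
    """
    for affected in record.get("affected", []):
        package = affected.get("package", {})
        if package.get("ecosystem") == ecosystem and package.get("name") == name:
            if version in affected.get("versions", []):
                return True
    return False


# The ecosystem's own format matches; a "v"-prefixed variant does not.
print(is_listed(RECORD, "Maven", "org.jberet:jberet-core", "1.0.0.Final"))
print(is_listed(RECORD, "Maven", "org.jberet:jberet-core", "v1.0.0.Final"))
```

Because the comparison is a plain string match, any normalisation (such as stripping a v prefix) would have to follow the rules of the ecosystem named in package.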

This is a bit more complicated when no well-defined packaging ecosystem is specified, as for general C/C++ libraries (i.e. there are only GIT version ranges). In this case, the formats of the values are technically undefined according to the spec, but in OSV.dev's case they are typically the upstream git version tags derived from the given GIT commit ranges. In these cases, the GIT commit ranges should be used to match git commit hashes to vulnerabilities.
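A minimal sketch of matching a commit hash against a GIT range follows. It is a simplification, not OSV.dev's actual matching logic: it handles only introduced and fixed events, ignores last_affected and limit, and takes the ancestry check as a caller-supplied function. The commit hashes, the HISTORY list and both function names are hypothetical.

```python
def is_commit_affected(commit, events, is_ancestor):
    """Return True if `commit` falls inside a GIT range described by `events`.

    `is_ancestor(a, b)` must return True when commit `a` is `b` itself or an
    ancestor of `b`. Simplified: only `introduced` and `fixed` are considered.
    """
    introduced = [e["introduced"] for e in events if "introduced" in e]
    fixed = [e["fixed"] for e in events if "fixed" in e]
    # "0" means the vulnerability is present from the start of history.
    reached = any(i == "0" or is_ancestor(i, commit) for i in introduced)
    patched = any(is_ancestor(f, commit) for f in fixed)
    return reached and not patched


# Hypothetical linear history, oldest commit first.
HISTORY = ["aaa111", "bbb222", "ccc333", "ddd444"]


def linear_is_ancestor(a, b):
    """Ancestry check for the toy linear history above."""
    return HISTORY.index(a) <= HISTORY.index(b)


events = [{"introduced": "bbb222"}, {"fixed": "ddd444"}]
for commit in HISTORY:
    print(commit, is_commit_affected(commit, events, linear_is_ancestor))
```

Against a real repository, the ancestry check could be backed by `git merge-base --is-ancestor <a> <b>`, which exits with status 0 when the first commit is an ancestor of the second.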

Does this answer your question?

@andrewpollock
Contributor

Additionally, thank you for your feedback, and if you have any data quality observations on the records (at https://storage.googleapis.com/osv-vulnerabilities/index.html?prefix=GIT/ in particular) please file issues to capture them.

@yashrsharma44
Author

Thanks for the response. I agree with the versions returned; it would be nice to have this documented somewhere, ideally in the OSV Schema itself. The git tags are indeed useful, as they can be used to match the version (or to determine that it doesn't exist).

Thanks for maintaining this awesome archive 😄

@yashrsharma44
Author

My pleasure. I will file more GitHub issues as I find discrepancies in the data.

andrewpollock added a commit to andrewpollock/osv-schema that referenced this issue Apr 30, 2024
Based on query raised in google/osv.dev#2135

Signed-off-by: Andrew Pollock <apollock@google.com>
@andrewpollock
Contributor

@yashrsharma44 Please take a look at ossf/osv-schema#238 and provide any feedback on how well you feel it addresses this deficiency in the schema documentation.

andrewpollock added a commit to andrewpollock/osv-schema that referenced this issue May 1, 2024
Based on query raised in google/osv.dev#2135

Signed-off-by: Andrew Pollock <apollock@google.com>
oliverchang pushed a commit to ossf/osv-schema that referenced this issue Jul 22, 2024
Based on query raised in google/osv.dev#2135

---------

Signed-off-by: Andrew Pollock <apollock@google.com>