Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ingest https://github.com/psf/advisory-database/tree/main/advisories #1552

Closed
oliverchang opened this issue Aug 10, 2023 · 19 comments
Closed
Labels
datasource Requests for new data sources

Comments

@oliverchang
Copy link
Collaborator

No description provided.

@oliverchang oliverchang added the datasource Requests for new data sources label Aug 10, 2023
@sethmlarson
Copy link
Contributor

@oliverchang Would be happy to help out implementing this. I saw in the contributing guide it mentioned creating an issue but not much else, is there anything I can help with?

@oliverchang
Copy link
Collaborator Author

Thanks for reaching out!

https://github.com/psf/advisory-database/tree/main/advisories appears to all be keyed on commits, so there's not much to do here other than for us to set up the infra config to start importing. We'll get to it this week :)

@andrewpollock
Copy link
Contributor

https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-9 got on our radar because it's invalid to have an ECOSYSTEM range without an affected[].package field.

OSV.dev will attempt to enumerate affected[].versions for GIT ranges as part of the importing process, so there's no need to supply them.

@oliverchang
Copy link
Collaborator Author

https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-9 got on our radar because it's invalid to have an ECOSYSTEM range without an affected[].package field.

OSV.dev will attempt to enumerate affected[].versions for GIT ranges as part of the importing process, so there's no need to supply them.

+1 see the example linked:

image

These are dervied from git tags that match the provided git ranges.

@sethmlarson
Copy link
Contributor

Got it, one hitch to that is it seems that historical git branches get deleted from the GitHub repository after they're no longer supported, so advisories affecting versions before 3.8.0 might not get populated?

@sethmlarson
Copy link
Contributor

Also looks like the advisory https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-8 isn't rendering correctly, I'm assuming that I'm encoding the newlines incorrectly.

@oliverchang
Copy link
Collaborator Author

Got it, one hitch to that is it seems that historical git branches get deleted from the GitHub repository after they're no longer supported, so advisories affecting versions before 3.8.0 might not get populated?

Hmm, we managed to extract older versions (including 2.x) for https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-8. Perhaps that's a non-issue?

Also looks like the advisory https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-8 isn't rendering correctly, I'm assuming that I'm encoding the newlines incorrectly.

https://github.com/psf/advisory-database/blob/4d5ebbc62f10b00e10f10e165587aedfd85134da/advisories/python/PSF-2023-8.json#L10 looks to have double quoted newlines.

@sethmlarson
Copy link
Contributor

Got it, one hitch to that is it seems that historical git branches get deleted from the GitHub repository after they're no longer supported, so advisories affecting versions before 3.8.0 might not get populated?

Hmm, we managed to extract older versions (including 2.x) for https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-8. Perhaps that's a non-issue?

That is missing a lot of tags for v2.x so it's not a complete import, the 2.7.x series went until v2.7.18.

@oliverchang
Copy link
Collaborator Author

Got it, one hitch to that is it seems that historical git branches get deleted from the GitHub repository after they're no longer supported, so advisories affecting versions before 3.8.0 might not get populated?

Hmm, we managed to extract older versions (including 2.x) for https://oss-vdb-test.wl.r.appspot.com/vulnerability/PSF-2023-8. Perhaps that's a non-issue?

That is missing a lot of tags for v2.x so it's not a complete import, the 2.7.x series went until v2.7.18.

Hmm, that's rather unfortunate. Without historical branches the analysis is purely on the main and available branches, which is probably why we're missing later versions in older CPython branches that branched off main. Is there any way to advocate for keeping these branches?

In any case, having ECOSYSTEM ranges is not valid when there is no defined ecosystem, and will break our import sadly.

@andrewpollock
Copy link
Contributor

and will break our import sadly.

More to the point it currently doesn't break our import and we're winding up with vulnerabilities that don't meet certain assumptions and that causes all sorts of issues :-(

@oliverchang
Copy link
Collaborator Author

@sethmlarson any chance we can remove the ECOSYSTEM ranges for now to unblock our ingestion? As @andrewpollock correctly pointed out, technically it doesn't fully break our ingestion, but it does break other parts of our UI/infra in subtle ways.

Assuming that the branch heads to all the historical released versions of cpython still exist (e.g. via tags), there's likely a technical solution that will enable us to enumerate every version correctly (i.e. correctly enumerating all the way up to v2.7.18 per #1552 (comment)). The challenge for us is doing this efficiently, but I've filed #1590 to track this.

@sethmlarson
Copy link
Contributor

I can remove the ecosystem version ranges, it would be good to somehow capture the correct affected versions though. Would listing out the tags in "affected[].versions" work for that purpose?

@sethmlarson
Copy link
Contributor

I've updated the records to remove the ECOSYSTEM type: psf/advisory-database#21

@andrewpollock
Copy link
Contributor

Would listing out the tags in "affected[].versions" work for that purpose?

https://ossf.github.io/osv-schema/#affectedversions-field

OSV.dev, as part of the importing process will do its best to enumerate the versions between the introduced and fixed (and soon to be last_affected) GIT ranges. I'd have to go back and review the code involved to refresh my understanding of how it treats a pre-supplied affected[].versions field.

@oliverchang
Copy link
Collaborator Author

I can remove the ecosystem version ranges, it would be good to somehow capture the correct affected versions though. Would listing out the tags in "affected[].versions" work for that purpose?

This would work. Our own tag enumeration would only add new tags that we match that don't already exist to the original record. We don't ever delete things already in affected[].versions.

@sethmlarson
Copy link
Contributor

sethmlarson commented Aug 31, 2023

@oliverchang I talked with Release Manager Łukasz and he mentioned that while branches are deleted all of the commits are kept in the repo, so in theory by cloning the entire repository all of the tags should be discoverable. Maybe this is causing an issue in the discovery process?

@oliverchang
Copy link
Collaborator Author

@oliverchang I talked with Release Manager Łukasz and he mentioned that while branches are deleted all of the commits are kept in the repo, so in theory by cloning the entire repository all of the tags should be discoverable. Maybe this is causing an issue in the discovery process?

Thanks for checking! Yep, this is what we will be trying as part of #1590

@oliverchang
Copy link
Collaborator Author

These are now ingested in our prod instance. e.g. https://osv.dev/vulnerability/PSF-2019-7

#1590 tracks the remaining git enumeration improvement

@sethmlarson
Copy link
Contributor

Awesome, thanks @oliverchang and @andrewpollock! 💪

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasource Requests for new data sources
Projects
None yet
Development

No branches or pull requests

3 participants