Description
I would like to have a flexible library to find the corresponding source code for a package.
The core features for this are already part of the find source module. What I would like here is a specific, reusable and documented library that wraps this feature so it can be reused in ScanCode.io, VulnerableCode and PurlDB.
The library can be stored in the purldb repo (like the purldb toolkit) but should be released as its own PyPI package for reuse.
For PurlDB, an additional outcome would be to update or create the package set once the source repo is found, and to ensure it is further indexed.
Some notes:
Finding the repo and the commit for a version is sometimes difficult: in many cases the information is not directly available in the package archive metadata, and we may need to dive deeper into key files, other files, or other packages in the set.
The typical flow, assuming PURL inputs, would be:
- Given a binary package, find the source repos
- Given a source package archive, find the source repos
- Given a package set, find the source repos
- Then once the repo is found, find the matching commit for a package version.
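To make the flow above concrete, here is a rough sketch of what such a library could look like. It is not the existing find source module and not a proposed API: the function names, the metadata keys in `REPO_KEYS`, the host list, and the tag naming patterns are all assumptions made for illustration. The tags are passed in as a plain mapping of tag name to commit so the sketch does not depend on any Git or forge client.

```python
import re
from urllib.parse import urlparse

# Hypothetical metadata keys that may carry a repository URL, most specific first.
REPO_KEYS = ("vcs_url", "code_view_url", "homepage_url", "bug_tracking_url")
# Illustrative list of code hosts; a real library would need a broader approach.
KNOWN_HOSTS = ("github.com", "gitlab.com", "bitbucket.org")


def purl_name_and_version(purl):
    """Return (name, version) from a PURL string.

    Simplified reader: ignores qualifiers and subpath and does no
    percent-decoding, so it is only suitable for illustration.
    """
    if not purl.startswith("pkg:"):
        raise ValueError(f"Not a PURL: {purl!r}")
    body = purl[4:].split("#", 1)[0].split("?", 1)[0]
    path, sep, version = body.rpartition("@")
    if not sep:
        path, version = body, ""
    return path.rsplit("/", 1)[-1], version


def normalize_repo_url(url):
    """Return 'https://host/owner/repo' for a known host, else None."""
    if not url:
        return None
    # Drop common VCS prefixes such as "git+" before parsing.
    url = re.sub(r"^(git\+|scm:git:)", "", url.strip())
    parsed = urlparse(url)
    if parsed.hostname not in KNOWN_HOSTS:
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"https://{parsed.hostname}/{owner}/{repo}"


def find_source_repo(package_set):
    """Return the first repo URL found in any package of the set, else None.

    `package_set` is a list of metadata dicts (binary package, source
    archive, or related packages); the dict layout is an assumption.
    """
    for package in package_set:
        for key in REPO_KEYS:
            repo = normalize_repo_url(package.get(key))
            if repo:
                return repo
    return None


def find_commit_for_version(version, tags, name=None):
    """Return (tag, commit) for the tag matching `version`, else None.

    `tags` maps tag names to commit ids. The candidate patterns below
    are common conventions, not an exhaustive list.
    """
    candidates = [version, f"v{version}"]
    if name:
        candidates += [f"{name}-{version}", f"{name}-v{version}", f"{name}_{version}"]
    for candidate in candidates:
        if candidate in tags:
            return candidate, tags[candidate]
    return None
```

A package set with a single entry covers the binary package and source archive cases. When no metadata key yields a repo, a real implementation would fall back to the deeper inspection of key files and other files mentioned in the notes, which this sketch leaves out.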