Create library to find the corresponding source code for a package  #374

@pombredanne

I would like to have a flexible library to find the corresponding source code for a package.

Core features for this are already part of the find source module. What I would like is a specific, reusable, and documented library that wraps this feature so it can be reused in ScanCode.io, VulnerableCode, and PurlDB.

The library can be stored in the purldb repo (like the purldb toolkit) but should be released as its own PyPI package for reuse.

For PurlDB, another outcome would be to update or create the package set once the source repo is found, and to ensure it is indexed further.

Some notes:
Finding the repo and the commit for a version is sometimes difficult: in many cases the information is not directly available in the package archive metadata, and we may need to dive deeper into key files, other files, or other packages in the set.

The typical flow, assuming PURL inputs, would be:

  • Given a binary package, find the source repos
  • Given a source package archive, find the source repos
  • Given a package set, find the source repos
  • Then once the repo is found, find the matching commit for a package version.
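
To make the flow above more concrete, here is a minimal sketch of what such a library's entry point could look like. Everything in it is an assumption for illustration: the names `SourceMatch`, `find_source_repo`, `find_matching_commit`, and `find_source`, the metadata keys in `REPO_KEYS`, and the tag naming conventions are all hypothetical and are not taken from the existing find source module. The PURL version parsing is deliberately simplified; a real implementation would likely use a proper PURL parser.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence

# Hypothetical metadata keys that may carry a repository URL.
REPO_KEYS = ("vcs_url", "repository_url", "homepage_url")


@dataclass
class SourceMatch:
    """Result of a lookup: the input PURL plus whatever was found."""
    purl: str
    repo_url: Optional[str] = None
    commit: Optional[str] = None


def find_source_repo(metadata_candidates: Sequence[Mapping[str, str]]) -> Optional[str]:
    """Return the first repo URL found in an ordered list of metadata mappings.

    The order models the fallbacks described in the notes: the package's own
    archive metadata first, then key files, then other packages in the set.
    """
    for metadata in metadata_candidates:
        for key in REPO_KEYS:
            url = metadata.get(key)
            if url:
                return url
    return None


def find_matching_commit(version: str, tags: Mapping[str, str]) -> Optional[str]:
    """Map a package version to a commit using tag names (assumed conventions)."""
    for candidate in (version, f"v{version}"):
        if candidate in tags:
            return tags[candidate]
    return None


def find_source(
    purl: str,
    metadata_candidates: Sequence[Mapping[str, str]],
    tags_by_repo: Mapping[str, Mapping[str, str]],
) -> SourceMatch:
    """Find the repo first, then the commit matching the PURL version."""
    match = SourceMatch(purl=purl)
    match.repo_url = find_source_repo(metadata_candidates)
    if match.repo_url is None:
        return match
    # Simplified version extraction: drop subpath and qualifiers, then
    # take the text after the last "@" (not a full PURL parser).
    core = purl.split("#", 1)[0].split("?", 1)[0]
    _, sep, version = core.rpartition("@")
    if sep and version:
        tags = tags_by_repo.get(match.repo_url, {})
        match.commit = find_matching_commit(version, tags)
    return match
```

In this sketch the same `find_source` call covers binary packages, source archives, and package sets: only the list of metadata candidates passed in differs. The tag lookup is injected as plain data so the example stays self-contained; an actual library would fetch tags from the repository.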
