Refine and simplify Package models #275

Closed
pombredanne opened this issue Jun 26, 2016 · 6 comments

pombredanne commented Jun 26, 2016

Things that need to be corrected:

  • the use of subclassing for Package does not make sense as implemented. We need composition instead, IMHO: the current approach smells funny.
  • And we need a plugin approach such that new package formats are implemented as external libraries, eventually using languages other than Python to reuse the many parsers natively available (Ruby, JS, Java, etc). This would likely make the pure subclassing vanish naturally.
  • schematics is likely not the right library for our use case: attrs or marshmallow need to be reviewed. The schematics-based code smells funny too.
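
To make the composition-plus-plugin idea concrete, here is a minimal sketch using only the standard library. All names (`Package`, `register_parser`, `parse_manifest`, `PARSERS`) are hypothetical and are not the ScanCode API: a single plain data class holds the package data, and each format contributes a parser function through a registry instead of a `Package` subclass.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


# Hypothetical model: one plain data holder, no per-format subclasses.
@dataclass
class Package:
    type: str
    name: str
    version: Optional[str] = None
    extra: Dict[str, str] = field(default_factory=dict)


# A parser is any callable that turns manifest text into a Package.
Parser = Callable[[str], Package]

# Hypothetical plugin registry, keyed by package type.
PARSERS: Dict[str, Parser] = {}


def register_parser(package_type: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser for one package format."""
    def decorator(func: Parser) -> Parser:
        PARSERS[package_type] = func
        return func
    return decorator


@register_parser("npm")
def parse_npm(text: str) -> Package:
    """Parse the text of a package.json manifest."""
    data = json.loads(text)
    return Package(type="npm", name=data["name"], version=data.get("version"))


def parse_manifest(package_type: str, text: str) -> Package:
    """Dispatch to whichever parser was registered for this type."""
    return PARSERS[package_type](text)
```

A parser written in another language could be plugged in the same way by registering a small wrapper that calls out to it and maps its output onto `Package`.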

We need a few more types and change some types:

  • Project: several packages are part of a larger project that defines many package attributes
  • Repo: a package name/version is typically unique only within the confines of a repo
  • FQN: a fully qualified name such as repo:name:version (and possibly os/arch for many packages beyond debs and RPMs) that defines some kind of global identification
  • Grouping: several packages are often derived for different architectures/platforms (Java for Ruby, egg/wheels/sdist/bdist on Python, various os/arch on RPMs and debs, etc.) or different artifacts are available for a package (src/bin/doc/pom for Maven)
  • Dependencies: while they live in some group, a flat sequence is likely better than a by-group tree. Also, there is always a notion of potential deps vs. resolved, locked deps. This needs to be modelled more simply.
  • Downloads: we have a 1-to-1 between a download and a package (even though each download can have several URLs, these are for the same exact file). This could be refined into one package -> many downloads, where each download is for the same package version but packaged differently (arch/os/platform/source/binary/etc)
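
A rough sketch of how some of these types could fit together, assuming plain dataclasses. Every class and field name here is hypothetical and only illustrates the shape: an FQN of the form repo:name:version, a flat dependency list where the group is just an attribute, and one package with many downloads.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class FQN:
    """Fully qualified name: repo:name:version (hypothetical layout)."""
    repo: str
    name: str
    version: str

    def __str__(self) -> str:
        return f"{self.repo}:{self.name}:{self.version}"


@dataclass
class Download:
    """One packaging of a package version; several URLs, same exact file."""
    urls: List[str]
    filename: str
    platform: Optional[str] = None  # e.g. an os/arch pair, or "source"


@dataclass
class Dependency:
    """A flat dependency entry; the group is an attribute, not a tree level."""
    name: str
    requirement: str                        # constraint as declared
    scope: str                              # the group this dep lives in
    resolved_version: Optional[str] = None  # set only once locked/resolved

    @property
    def is_resolved(self) -> bool:
        return self.resolved_version is not None


@dataclass
class Package:
    fqn: FQN
    downloads: List[Download] = field(default_factory=list)
    dependencies: List[Dependency] = field(default_factory=list)

    def dependencies_in(self, scope: str) -> List[Dependency]:
        """Recover a by-group view from the flat sequence when needed."""
        return [d for d in self.dependencies if d.scope == scope]
```

Keeping dependencies flat means the by-group view is a simple filter, while potential vs. resolved is a single optional field rather than a second structure.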

Some extra data fields are needed too:

  • documentation_url
  • the web page if any for a package release, as is typical for many package repos
  • beyond keywords, support for Trove-like classifiers (as on sf.net or PyPI)
  • download stats when provided?
  • file size and filename
  • release or upload date

Or should be retired:

  • keywords_doc_url
@pombredanne
Member Author

From #241
So now that we have used schematics, I am not super happy with the way this works. In particular, anything that would be stored in a DB in the near future would require yet another schema defined in yet another library. I made a few local tests storing things as raw JSON in a DB, but that raises another problem: schema migrations.
I am considering using Django's ORM to define the models instead. I will try out some tests in a branch. The semantics are essentially the same as schematics or marshmallow (or the other way around), and there could be some side benefits, such as storing an in-flight scan in a temp sqlite DB. I will work out something.
Feedback welcome
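
For illustration only, this is roughly what the "raw JSON in a DB" experiment could look like with the standard library `sqlite3` and `json` modules. The table layout and function names are assumptions, not code from the local tests mentioned above. Note that the few real columns are already a second schema that would need migrating if the model changes, which is the problem described.

```python
import json
import sqlite3


def open_scan_db(path=":memory:"):
    """Open a (temporary) sqlite DB holding an in-flight scan."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS package ("
        " id INTEGER PRIMARY KEY,"
        " type TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " version TEXT,"
        " data TEXT NOT NULL)"  # the full package serialized as raw JSON
    )
    return conn


def save_package(conn, package):
    """Store a package mapping; a few keys are duplicated as columns."""
    cursor = conn.execute(
        "INSERT INTO package (type, name, version, data) VALUES (?, ?, ?, ?)",
        (
            package["type"],
            package["name"],
            package.get("version"),
            json.dumps(package, sort_keys=True),
        ),
    )
    conn.commit()
    return cursor.lastrowid


def load_packages(conn, package_type):
    """Return all stored packages of one type, in insertion order."""
    rows = conn.execute(
        "SELECT data FROM package WHERE type = ? ORDER BY id", (package_type,)
    )
    return [json.loads(data) for (data,) in rows]
```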

pombredanne added a commit that referenced this issue Sep 22, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 22, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 23, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 4, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 4, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 4, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 8, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 8, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 8, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 15, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 15, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 15, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Dec 29, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Dec 29, 2016
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Feb 23, 2017
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Feb 23, 2017
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne modified the milestones: v2.0, v2.1 Mar 24, 2017
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * Add new PackageIdentifier class for #805 as Package property and as
   discrete type:namespace/name@version?qualifiers#path fields
 * Improved DependentPackage definitions using a package identifier
   and simpler flags. Do not use a mapping per scope anymore.
 * Improve related packages definitions with a PackageRelationship
   class using from/to package identifiers
 * Add OrderedDictType for schematics
 * Remove unused Package methods for versions

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * Support PackageIdentifier class for #805 as Package property and as
   discrete type:namespace/name@version?qualifiers#path fields
 * Improved DependentPackage definitions using a package identifier
   and simpler flags. Do not use a mapping per scope anymore.
 * Improve related packages definitions with a PackageRelationship
   class using from/to package identifiers

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * is_runtime defaults to True
 * is_resolved and is_optional default to False
 * also improve scope handling for maven poms and other cosmetics

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
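
As a sketch of what these defaults mean in practice, assuming a plain dataclass rather than the schematics model actually used in the commit: only the three flag names and their defaults come from the commit message above. The other fields and the Maven scope mapping are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DependentPackage:
    purl: str                           # identifier of the dependency (assumed field)
    requirement: Optional[str] = None   # version constraint as declared (assumed field)
    scope: Optional[str] = None         # original scope name, kept as-is (assumed field)
    is_runtime: bool = True             # default from the commit message
    is_optional: bool = False           # default from the commit message
    is_resolved: bool = False           # default from the commit message


def from_maven_scope(purl, scope="compile", optional=False):
    """Map a Maven scope to flags.

    Assumption for illustration: only the 'compile' and 'runtime' scopes
    are treated as needed at runtime; this is not ScanCode's mapping.
    """
    return DependentPackage(
        purl=purl,
        scope=scope,
        is_runtime=scope in ("compile", "runtime"),
        is_optional=optional,
    )
```

With simple flags, a flat list of `DependentPackage` objects can replace the earlier mapping per scope.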
pombredanne added a commit that referenced this issue Apr 11, 2018
 * migrated from scancode-toolkit-contrib

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * migrated from scancode-toolkit-contrib

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * migrated from scancode-toolkit-contrib

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * migrated from scancode-toolkit-contrib

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * this is otherwise gitignored

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
* also renamed the info function to pe_info

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * as specified at https://github.com/package-url/purl-spec
 * at the package, dependencies and related levels
 * rename path to subpath to conform with purl
 * not directly tested for now

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
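
A simplified sketch of assembling such an identifier from its discrete fields, following the `pkg:type/namespace/name@version?qualifiers#subpath` shape described in the purl-spec. The `build_purl` helper is hypothetical: it skips the percent-encoding and per-type normalization that the spec requires, so real code should use a proper purl library instead.

```python
from typing import Dict, Optional


def build_purl(
    type: str,
    name: str,
    namespace: Optional[str] = None,
    version: Optional[str] = None,
    qualifiers: Optional[Dict[str, str]] = None,
    subpath: Optional[str] = None,
) -> str:
    """Join purl components into a string (no encoding, no validation)."""
    purl = f"pkg:{type}/"
    if namespace:
        purl += f"{namespace}/"
    purl += name
    if version:
        purl += f"@{version}"
    if qualifiers:
        # Qualifiers are emitted sorted by key so the output is stable.
        pairs = sorted(qualifiers.items())
        purl += "?" + "&".join(f"{key}={value}" for key, value in pairs)
    if subpath:
        purl += f"#{subpath}"
    return purl
```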
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * This is a first rough implementation using
   https://github.com/package-url/packageurl-python
 * Based on package-url/purl-spec#1

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * this is an early release that implements the to-be spec
  at package-url/purl-spec#31
 * update tests accordingly

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
 * handle namespace for npm scoped packages properly
 * add missing description to npm
 * Add new package attributes default_web_baseurl,
   default_download_baseurl and default_api_baseurl and the
   corresponding methods to override in subclasses
 * Improve npm URLs computation and implement these methods.
 * Other minor refinements.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
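
To illustrate the npm changes in this commit, here is a hedged sketch of scoped-name handling and a web URL built from a default base URL. Only the attribute name `default_web_baseurl` comes from the commit message. The class, the `split_npm_name` helper, the `web_url` method and the choice to keep the leading "@" in the namespace are assumptions for illustration.

```python
from typing import Optional, Tuple


def split_npm_name(full_name: str) -> Tuple[Optional[str], str]:
    """Split '@scope/name' into (namespace, name); unscoped names have no namespace."""
    if full_name.startswith("@") and "/" in full_name:
        namespace, _, name = full_name.partition("/")
        return namespace, name
    return None, full_name


class NpmPackage:
    # Attribute name from the commit message; this URL value is an assumption.
    default_web_baseurl = "https://www.npmjs.com/package"

    def __init__(self, full_name: str, version: Optional[str] = None):
        self.namespace, self.name = split_npm_name(full_name)
        self.version = version

    def web_url(self, baseurl: Optional[str] = None) -> str:
        """Build the package web page URL, letting callers override the base."""
        baseurl = baseurl or self.default_web_baseurl
        if self.namespace:
            return f"{baseurl}/{self.namespace}/{self.name}"
        return f"{baseurl}/{self.name}"
```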
pombredanne added a commit that referenced this issue Apr 11, 2018
 * these come from the latest rebase on develop

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 11, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Apr 16, 2018
 Using "licensing" rather than license makes it clear that this is not a
 normalized scancode "license" but is instead the original, as
 "asserted" or "declared" licensing in a package manifest.
 Declared is the term used by SPDX.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Member Author

This is completed and merged in develop.

@jhgoebbert

This commit adds the following to setup.py:
'pefile == 1.2.10-132',
which makes it impossible to install on Linux with pip, as there is no such version on PyPI:
https://pypi.org/project/pefile/#history

@pombredanne
Member Author

pombredanne commented Sep 19, 2018

@jhgoebbert good point. That has never been a released version on PyPI!
Would you mind creating a separate ticket specifically for this issue?
Thank you!

@pombredanne
Member Author

Thanks ... This is now tracked in #1183
