RFC: Primary License. Create and maintain a primary declared_license_expression field. #2065

Closed
chinyeungli opened this issue Jun 10, 2020 · 8 comments

Comments

@chinyeungli
Contributor

chinyeungli commented Jun 10, 2020

Note that package.declared_license can be any data structure. The name has confused several people because "declared license" is used differently by other organizations, such as SPDX.
Rename to: extracted_license_statement
The original text or data structure in a software package manifest that indicates the applicable license. This value is not necessarily a key in any license list, and it is not validated.

package.license_expression is a single detected license expression using our keys.
Rename to: package.declared_license_expression
This is the primary license expression as determined from the declaration(s) of the authors of the package.

Create a parallel field called
package.declared_license_expression_spdx
where the expression uses SPDX identifiers.

We should also update the license field names on the Resource model to reflect the changes we are making to the license field names on the Package model. This would mean that:

Resource.licenses should be renamed to Resource.license_detections

Resource.license_expressions should be renamed Resource.detected_license_expressions

Resource.detected_license_expressions_spdx should be added. This field contains the same data as Resource.detected_license_expressions but with SPDX identifiers.

There is also a new codebase-level field we should add named license_references. This would be a list of unique license records detected during a scan.
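The renames proposed above can be sketched as a simple key mapping. This is only an illustration over plain dictionaries: `PACKAGE_RENAMES`, `RESOURCE_RENAMES`, and `rename_fields` are hypothetical names for this sketch and not part of the scancode-toolkit code.

```python
# Hypothetical helper: maps the old field names to the new names proposed
# in this issue. Only the renames are covered, not the newly added fields.
PACKAGE_RENAMES = {
    "declared_license": "extracted_license_statement",
    "license_expression": "declared_license_expression",
}

RESOURCE_RENAMES = {
    "licenses": "license_detections",
    "license_expressions": "detected_license_expressions",
}


def rename_fields(record, renames):
    """Return a copy of `record` with keys renamed; other keys are kept as-is."""
    return {renames.get(key, key): value for key, value in record.items()}


# Example with made-up package data.
old_package = {
    "name": "example",
    "declared_license": {"type": "MIT"},
    "license_expression": "mit",
}
new_package = rename_fields(old_package, PACKAGE_RENAMES)
```

Here `new_package` keeps the same values under `extracted_license_statement` and `declared_license_expression`, while `name` is untouched.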

Older comments here:

It may be a good idea to keep track of the primary license for a package.

For instance,
a package is under mit but contains test/not-deployed code that is under gpl-2.0.
It may be a good idea to report this package's primary license as mit instead of mit AND gpl-2.0,
or better, as mit AND (gpl-2.0), so that users know the primary license is mit and the secondary license is gpl-2.0 (meaning code that doesn't affect the primary license).

Another example:
think of a Debian copyright file:
adwaita-icon-theme/copyright

This package was originally debianized by Takuo KITAME <kitame@debian.org> on
Fri, 17 Jan 2003 14:57:28 +0900.
Andreas Henriksson <andreas@fatal.se> later reused the gnome-icon-theme
packaging for the new adwaita-icon-theme package name.

It was downloaded from <http://download.gnome.org/sources/adwaita-icon-theme/>

Files: *
Copyright:
 © 2002-2014:
 .
  Full Color Icons
  ================
 .
  Ulisse Perusin <uli.peru@gmail.com>
  Riccardo Buzzotta <raozuzu@yahoo.it>
  Josef Vybíral <cornelius@vybiral.info>
  Hylke Bons <h.bons@gmail.com>
  Ricardo González <rick@jinlabs.com>
  Lapo Calamandrei <calamandrei@gmail.com>
  Rodney Dawes <dobey@novell.com>
  Luca Ferretti <elle.uca@libero.it>
  Tuomas Kuosmanen <tigert@gimp.org>
  Andreas Nilsson <nisses.mail@home.se>
  Jakub Steiner <jimmac@novell.com>
 .
  Some external 3D Assets used:
  Geraldo Cockerhan - http://www.blendswap.com/blends/view/40495 CCBYSA
 .
  Symbolic Icons
  ==============
 .
  Metaphors
  ---------
  Claire Alexander <claire.alexander@intel.com>
  Hylke Bons <hylke.bons@intel.com>
  Darren Wilson <darren.wilson@intel.com>
 .
  Artwork
  -------
  Jakub Steiner <jimmac@novell.com>
  Lapo Calamandrei <calamandrei@gmail.com>
  Hylke Bons <hylke.bons@intel.com>
 .
License: CC-BY-SA-3.0 or LGPL-3
 This work is licenced under the terms of either the GNU LGPL v3 or
 Creative Commons Attribution-Share Alike 3.0 United States License.
 .
 To view a copy of the CC-BY-SA licence, visit
 http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative
 Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.
 .
 When attributing the artwork, using "GNOME Project" is enough.
 Please link to http://www.gnome.org where available.
Comment:
 See below for the full text of the CC-BY-SA-3.0.
 .
 On Debian GNU/Linux systems, the complete text of the GNU Lesser General
 Public License can be found in `/usr/share/common-licenses/LGPL-3'.

Files:
 po/*
Copyright:
 © 2004 Abdulaziz Al-Arfaj
.
.
.
 © 2004 Åsmund Skjæveland
 © 2004-2014 Žygimantas Beručka
License: CC-BY-SA-3.0-US or LGPL-3
 This work is licenced under the terms of either the GNU LGPL v3 or
 Creative Commons Attribution-Share Alike 3.0 United States License.
 .
 To view a copy of the CC-BY-SA licence, visit
 http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative
 Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.
 .
 When attributing the artwork, using "GNOME Project" is enough.
 Please link to http://www.gnome.org where available.

Files:
 po/tk.po
Copyright:
 © 2004 Free Software Foundation
 © 2004 Gurban Mühemmet Tewekgeli and Kakilik - Turkmen free software developers community
License: GPL-unspecified
 This file is distributed under the terms of GNU General Public License (GPL)
Comment:
 On Debian systems, the complete text of the GNU General
 Public License can be found in `/usr/share/common-licenses/GPL'.

Files:
 src/fullcolor/accessories-dictionary.svg
Copyright:
 © Ulisse Perusin
 © Lapo Calamandrei
 © SoylentGreen
 © Luigi Chiesa
 © unknown contributor to FreeSeamlessTextures.com
License: GFDL-1.2+ or CC-BY-SA-3.0-Unported or CC-BY-SA-2.0-IT, and CC-BY-3.0-US

License: CC-BY-SA-3.0-Unported
 This file is licensed under the Creative Commons Attribution-Share
 Alike 3.0 Unported license.
 .
 You are free:
 .
 • to share – to copy, distribute and transmit the work
 • to remix – to adapt the work
 .
 Under the following conditions:
 • attribution – You must attribute the work in the manner specified
   by the author or licensor (but not in any way that suggests that they
   endorse you or your use of the work).
 • share alike – If you alter, transform, or build upon this work,
   you may distribute the resulting work only under the same or similar
   license to this one.

As we can see from the above example, we should capture the primary license and report it as, for instance, CC-BY-SA-3.0 or LGPL-3 (GPL-unspecified AND GFDL-1.2+ or CC-BY-SA-3.0-Unported or CC-BY-SA-2.0-IT, and CC-BY-3.0-US AND CC-BY-SA-3.0-Unported) instead of reporting all of the following as a flat list:

CC-BY-SA-3.0 or LGPL-3
GPL-unspecified
GFDL-1.2+ or CC-BY-SA-3.0-Unported or CC-BY-SA-2.0-IT, and CC-BY-3.0-US
CC-BY-SA-3.0-Unported

That way, users can easily identify which is the primary license for the package.

@mjherzog
Member

One key question is whether we try to capture this idea with parentheses or with a multi-level approach where we call out the "summary" vs. "detail" license data in separate fields, i.e. collect the base data and then optionally report a summary.

@DennisClark
Member

Alternative implementation idea:

Proposed new relationship operator in license expression: PLUS.

Everything to the left of the PLUS is a "core" (aka "primary") license expression, typically found in a root level file in a project (such as LICENSE, NOTICE, README, etc.) or a package manifest or similar.

Everything to the right of the PLUS belongs to the group of "other" licenses that can be found in the lower level files of a software package.

Such an enhancement must of course be coordinated with updates to the license expression libraries. (tickets needed)

For compatibility purposes the PLUS operator can be simply translated to an AND operator when exporting information to another application. (a bit of analysis needed here)

How does the PLUS get into the expression? scancode-toolkit should be able to figure it out when it scans a project.

  • Note that any license expression has zero-to-one instances of the PLUS operator; it is optional, and there can be at most one instance of a PLUS.
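The PLUS proposal above could be sketched roughly as follows. This is a plain-string illustration, not the license-expression library: `split_primary_and_other` and `export_with_and` are hypothetical names, and the sketch assumes PLUS appears as a whitespace-separated keyword at the top level of the expression.

```python
def split_primary_and_other(expression):
    """Split an expression at the optional PLUS keyword.

    Returns (primary, other), where `other` is None when there is no PLUS.
    Assumption: PLUS is a standalone, whitespace-separated token.
    """
    tokens = expression.split()
    if tokens.count("PLUS") > 1:
        raise ValueError("at most one PLUS operator is allowed")
    if "PLUS" not in tokens:
        return " ".join(tokens), None
    index = tokens.index("PLUS")
    return " ".join(tokens[:index]), " ".join(tokens[index + 1:])


def export_with_and(expression):
    """Translate PLUS to AND for applications that do not know PLUS."""
    primary, other = split_primary_and_other(expression)
    if other is None:
        return primary
    # Parentheses keep each side grouped once PLUS becomes a plain AND.
    return f"({primary}) AND ({other})"


primary, other = split_primary_and_other("mit PLUS gpl-2.0")
exported = export_with_and("mit PLUS gpl-2.0")  # "(mit) AND (gpl-2.0)"
```

Wrapping both sides in parentheses on export is a design choice of this sketch: it avoids changing operator precedence when either side already contains OR.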

@pombredanne
Member

See also nexB/license-expression#50

@AyanSinhaMahapatra AyanSinhaMahapatra added this to the v31.0 milestone Oct 13, 2021
@DennisClark DennisClark changed the title from "potential primary license field ?" to "Primary License. Create and maintain a declared_license_expression field." Feb 1, 2022
@pombredanne pombredanne changed the title from "Primary License. Create and maintain a declared_license_expression field." to "RFC: Primary License. Create and maintain a declared_license_expression field." Feb 17, 2022
@pombredanne pombredanne changed the title from "RFC: Primary License. Create and maintain a declared_license_expression field." to "RFC: Primary License. Create and maintain a primary declared_license_expression field." Feb 17, 2022
AyanSinhaMahapatra added a commit that referenced this issue Jun 9, 2022
Return LicenseDetection in packages and rename license attribute
names.

Reference: #2065

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@pombredanne pombredanne modified the milestones: v31.0, v32.0 Jun 14, 2022
@DennisClark
Member

pending resolution of some other related issues to complete this

@AyanSinhaMahapatra
Member

So we have the file-level license field name changes in this commit, and all the package-level field name changes are in this commit. But the latter one is WIP, as there are a lot of other related changes and improvements that I'm working on.

AyanSinhaMahapatra added a commit that referenced this issue Jul 8, 2022
Return LicenseDetection in packages and rename license attribute
names.

Reference: #2065

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
AyanSinhaMahapatra added a commit that referenced this issue Jul 12, 2022
* Rename `declared_license` to `extracted_license_statement`
* Add `license_detections`
* Rename `license_expressions` to `declared_license_expression`
* Add `detected_license_expression_spdx`

Also add related functions to populate these fields after
package parsing.

Reference: #2065
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@pombredanne
Member

For package licenses, we have a PR in progress to use "declared_license_expression" and "declared_license_expression_spdx", but Debian licenses (and the long list we get from copyright files) pose a specific problem: primary vs. other or "secondary" licenses.

  • All package types except Debian tend to have a terse declared license that we can use as a primary license
  • Debian, in contrast, has a primary license plus a long list of secondary or other licenses that we detect and that need to be tracked

After a chat, we agreed to use other_licenses_* (expression, detection, etc.) as the name for these extra licenses. This is the same name (e.g. other_license_expressions) used in the summary.
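The primary vs. other split for Debian copyright files could look roughly like the sketch below. It is not the scancode-toolkit implementation: `summarize_debian_licenses` is a hypothetical name, the input is a pre-parsed list of (files pattern, license) pairs, and treating the `Files: *` paragraph as the primary license is an assumption of this sketch.

```python
def summarize_debian_licenses(paragraphs):
    """Split per-paragraph licenses into a primary and an "other" expression.

    `paragraphs` is a list of (files_pattern, license_expression) tuples.
    Assumption: the paragraph with the "*" pattern carries the primary license.
    """
    primary = next((lic for files, lic in paragraphs if files == "*"), None)

    others = []
    for files, lic in paragraphs:
        # Skip the primary paragraph and avoid repeating the same license.
        if files != "*" and lic != primary and lic not in others:
            others.append(lic)

    # Parenthesize compound expressions so that joining with AND is unambiguous.
    grouped = [f"({lic})" if " " in lic else lic for lic in others]
    return {
        "declared_license_expression": primary,
        "other_license_expression": " AND ".join(grouped) or None,
    }


# Licenses taken from the adwaita-icon-theme example earlier in this issue.
result = summarize_debian_licenses([
    ("*", "CC-BY-SA-3.0 OR LGPL-3"),
    ("po/*", "CC-BY-SA-3.0-US OR LGPL-3"),
    ("po/tk.po", "GPL-unspecified"),
])
```

With this input, the `*` license ends up in `declared_license_expression`, and the two remaining licenses are joined with AND in `other_license_expression`.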

@pombredanne
Member

This is not yet merged, but it is completed in PR #2961.

AyanSinhaMahapatra added a commit that referenced this issue Aug 8, 2022
- Add `other_license_expression` attribute
- Add `other_license_expression_spdx` attribute
- Add `other_license_detections` attribute

For debian copyright files `declared_license_expression` is
set to the primary license, if present, and other licenses
are set to the `other_license_expression` introduced here.

Reference: #2065
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra
Member

We have this merged, closing!
