Add identifiers and splits #3

Closed
wants to merge 14 commits into from

goneall
Contributor

@goneall goneall commented Oct 21, 2017

Made some progress going through the fsf web pages and adding SPDX identifiers.

The attached file has a list of FSF IDs that do not have obvious matches in SPDX, followed by a list of code snippets that need to be investigated and completed.
list.txt

Signed-off-by: Gary O'Neall <gary@sourceauditor.com>
Signed-off-by: Gary O'Neall <gary@sourceauditor.com>
Signed-off-by: Gary O'Neall <gary@sourceauditor.com>
@goneall
Contributor Author

goneall commented Oct 22, 2017

Amended PR with identifiers up through the letter N - I'll add some more later tonight or tomorrow

pull.py Outdated
'Condor': {'spdx': 'Condor-1.1'},
'ECL2.0': {'spdx': 'ECL-2.0'},
'eCos11': {'spdx': 'RHeCos-1.1'},
'eCos2.0': {'spdx': 'eCos-2.0'},
Owner

This SPDX ID is deprecated in favor of GPL-2.0+ WITH eCos-exception-2.0. I'd rather use the modern license expression.

Contributor Author

Do we want to allow license expressions for the 'spdx' tag or restrict it to the license ID's only?

Owner

> Do we want to allow license expressions for the 'spdx' tag...

Yes, I think we do. If someone finds GPL-2.0+ WITH eCos-exception-2.0 content, they should be able to determine its FSF tags without going through a deprecated SPDX ID.
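
To illustrate the point, here is a minimal sketch of a reverse lookup when the `spdx` value may hold a full license expression. The `IDENTIFIERS` shape follows the snippets in this PR; the `fsf_ids_for` helper and the specific entries are assumptions for illustration, not code from pull.py.

```python
# Hypothetical excerpt: a value under 'spdx' may be a plain license ID
# or a full SPDX license expression.
IDENTIFIERS = {
    'ECL2.0': {'spdx': 'ECL-2.0'},
    'eCos2.0': {'spdx': 'GPL-2.0+ WITH eCos-exception-2.0'},
}


def fsf_ids_for(spdx, identifiers=IDENTIFIERS):
    """Return the FSF IDs whose 'spdx' entry equals the given string."""
    return sorted(
        fsf_id
        for fsf_id, ids in identifiers.items()
        if ids.get('spdx') == spdx
    )


print(fsf_ids_for('GPL-2.0+ WITH eCos-exception-2.0'))
```

With this layout, someone starting from the modern expression finds the FSF ID directly, and the deprecated `eCos-2.0` ID simply has no entry.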

Contributor Author

OK - I'll add it to the amended PR - coming up soon

Contributor Author

I changed eCos2.0, but left ecos11 since RHeCos-1.1 is not marked as deprecated.

pull.py Outdated
'ModifiedBSD': {'spdx': 'BSD-3-Clause'},
'MPL': {'spdx': 'MPL-1.1'},
'MPL-2.0': {'spdx':'MPL-2.0'},
'ms-pl': {'spdx': 'MPL-1.1'},
Owner

Contributor Author

Oops - that must have been a typo. Good catch.

pull.py Outdated
'NCSA': {'spdx':'NCSA'},
'newOpenLDAP': {'spdx': 'OLDAP-2.7'},
'Nokia': {'spdx': 'Nokia'},
'NoLicense': {'spdx': 'UNSPECIFIED'},
Owner

@wking wking Oct 22, 2017

I read the FSF's NoLicense entry as "all rights reserved". NOASSERTION is different, and I don't see UNSPECIFIED in the spec. I expect NoLicense has no SPDX equivalent at the moment. I think the equivalent SPDX expression is NONE.

Contributor Author

Agree - I'll amend the PR

@wking
Owner

wking commented Oct 22, 2017

Also, can you drop the Signed-off-by from your commit messages? (or I can squash when I merge). For those to mean anything, you need repository-local documentation for them (e.g. see https://github.com/wking/signed-off-by), and I expect this project to be small enough that that doesn't matter.

@goneall
Contributor Author

goneall commented Oct 22, 2017

@wking Hold off on merging the PR - I made a mistake and I'm currently fighting git to get the fix added

@goneall
Contributor Author

goneall commented Oct 22, 2017

OK - Everything should be added and all review comments addressed.

Please review - especially those SPDX identifiers with comments.

I also added a file to track the FSF licenses which do not have associated SPDX license ID's or expressions.

@goneall
Contributor Author

goneall commented Oct 22, 2017

BTW - you may want to squash the commits - I added quite a few commits as part of the PR to make it easier to find specific changes during the review.

@@ -0,0 +1,47 @@
# The following FSF license tags did not have any obvious match to an SPDX license
Owner

I don't think we need this in version control, because you can generate it from licenses.json with jq. Can we remove it here? And if you like open an issue with a list of unchecked boxes?

Owner

jq for this is:

$ jq -r 'to_entries[] | select(.value.identifiers.spdx | not) | .key' licenses.json
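
For anyone without jq at hand, a rough Python equivalent of the filter above might look like the following. It assumes licenses.json is a top-level object keyed by FSF ID with an optional `identifiers.spdx` member, which is what the jq expression implies; the `missing_spdx` name and the sample data are made up for illustration.

```python
import json


def missing_spdx(licenses):
    """Return the keys whose entry has no identifiers.spdx value."""
    missing = []
    for key, value in licenses.items():
        spdx = (value.get('identifiers') or {}).get('spdx')
        # jq's `not` is true only for null and false, so mirror that here.
        if spdx is None or spdx is False:
            missing.append(key)
    return missing


# Tiny stand-in for licenses.json; the real file is assumed to share this shape.
sample = json.loads('''
{
  "Zlib": {"identifiers": {"spdx": "Zlib"}},
  "Zope": {"identifiers": {}},
  "Ruby": {}
}
''')
print(missing_spdx(sample))
```

To run it against the real file, replace the sample with `json.load(open('licenses.json'))`.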

Contributor Author

I'll replace it with an issue since some of the items have comments that would not be retained by using jq.

pull.py Outdated
'Zend': {'spdx': 'Zend-2.0'},
'Zimbra': {'spdx': 'Zimbra-1.3'},
'ZLib': {'spdx': 'Zlib'},
'Zope': {'spdx': 'ZPL-1.1'}, # Note the FSF refers to version 1.0 and SPDX uses version 1.1 - it should be verified that 1.1 should be included
Owner

I'd rather leave these inexact matches off. If someone is looking up GPL-compat in this API, for example, false negatives are much less problematic than false positives.

Contributor Author

If we want to take a more conservative approach, I would remove the SPDX ID's for Zope and Ruby. I consider SISSL to be an internal inconsistency on the FSF website and I believe they intend it to match the referenced SPDX license.

In your review comment, you included Zlib, Zimbra and Zend. I'm assuming your comment only referred to Zope - let me know if that is incorrect or if there are other license matches that concern you.

Owner

> In your review comment, you included Zlib, Zimbra and Zend. I'm assuming your comment only referred to Zope - let me know that is incorrect or if there are other license matches that concern you.

I'd rather have this bulk PR only cover exact matches. Can you drop anything you consider questionable and file those in per-license(-group) PRs?

Contributor Author

Done - I already removed the matches I consider questionable and added them to the list of FSF license ids without an associated SPDX license ID.

pull.py Outdated
@@ -47,6 +47,20 @@
'CC-BY-ND-3.0',
'CC-BY-ND-4.0',
],
'CC-BY': [ # any version
Owner

I don't see any CC-BY slug on their page. This link does not take you to any specific entry: https://www.gnu.org/licenses/license-list.html#CC-BY

Owner

I've added checks in master's pull.py to watch out for this sort of thing.
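
The actual check lives in master's pull.py and is not shown in this thread. As a rough sketch of the idea only, assuming the IDs scraped from the FSF page are available as a set (called `page_ids` here, a hypothetical name), configured keys with no matching page entry could be flagged like this. All data values below are illustrative.

```python
def unknown_keys(configured, page_ids):
    """Return configured FSF IDs that do not appear on the scraped page."""
    return sorted(set(configured) - set(page_ids))


# Illustrative values only: 'CC-BY' is configured but has no page anchor.
SPLITS = {
    'CC-BY-ND': ['CC-BY-ND-3.0', 'CC-BY-ND-4.0'],
    'CC-BY': ['CC-BY-3.0', 'CC-BY-4.0'],
}
page_ids = {'CC-BY-ND', 'Zlib'}

bad = unknown_keys(SPLITS, page_ids)
if bad:
    print('no FSF entry for: {}'.format(', '.join(bad)))
```

Running such a check on every pull would catch a slug like `CC-BY` before it reaches the published data.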

Contributor Author

I just pushed a correction for this and CC-BY-SA to the pull request branch

…se matches to the spdx license list and remove the unassociated-license-ids.md file per pull request review
@goneall
Contributor Author

goneall commented Oct 22, 2017

Amended PR to address last 2 review comments.

'OSL-2.1',
'OSL-3.0',
],
'RPL': [ # any version - Note that FSF website does not state any version, but references version 1.3 in the URL. It is assumed that it also covers version 1.1 and 1.5, but this should be verified with FSF.
Owner

The FSF at least occasionally says “any version” when that's their intention. Until we get clarity on this, can we drop it from this PR (and the associated RPL-* IDENTIFIERS entries) and come back to them in follow-up work?

],
'Unicode': [ # any version
'Unicode-DFS-2015',
'Unicode-DFS-2016',
Owner

Here is another case where the FSF doesn't say “any version”. The version they link is from 2012 (based on the copyright) and differs from Unicode-DFS-2015 at least in:

the above copyright notice(s) and this permission notice appear with all copies of the Data Files or Software,

vs. Unicode-DFS-2015's

this copyright and permission notice appear with all copies of the Data Files or Software,

Until we get clarity on this, can we drop it from this PR (and the associated Unicode-DFS-* IDENTIFIERS entries) and come back to them in follow-up work?

],
'W3C': [ # any version
'W3C',
'W3C-20150513',
Owner

Here is another case where the FSF doesn't say “any version”. The version they link is from 2002-12-31 (based on the URI) which matches our W3C. Our W3C-19980720, on the other hand, has “This W3C Work”, and our W3C-20150513 has “Permission to copy, modify, and distribute this work”. Until we get clarity on this, can we drop the SPLITS entry from this PR (and all but the W3C IDENTIFIERS entries) and come back to them in follow-up work?

'W3C-20150513',
'W3C-19980720',
],
'Zope2.0': [ # Versions 2.0 and later
Owner

The FSF wording for this isn't “Versions 2.0 and later”, it's “versions 2.0 and 2.1”.

'FreeBSDDL': ['FreeBSD'], # unify (multi-tag)
# FIXME: still working through this
'NPL': [ #any version
Owner

The FSF wording for this isn't “any version”, it's “versions 1.0 and 1.1”.

@@ -57,8 +57,40 @@
'FDLv1.2',
'FDLv1.3',
],
'FreeArt': [ # any version
'LAL-1.2',
Owner

Here is another case where the FSF doesn't say “any version”. The version they link is 1.3. Until we get clarity on this, can we drop the SPLITS entry (and the associated LAL-1.2 IDENTIFIERS entry) and come back to them in follow-up work?

They also link the English translation, while we currently include only the canonical French version. Until we sort out translations (spdx/license-list-XML#438), we should remove the LAL-1.3 IDENTIFIERS entry as well.

@goneall
Contributor Author

goneall commented Oct 23, 2017

@wking I'm not going to have much time to work on this today - do you mind taking the PR and adding your recommended changes above? They all seem reasonable.

@wking
Owner

wking commented Oct 23, 2017

> … do you mind taking the PR and adding your recommended changes above?

Done (with a few more as well) in my splits-identifiers branch. Differences vs. your current tip:

$ git diff -U0 origin/pr/3..splits-identifiers
diff --git a/pull.py b/pull.py
index b37d874..6edc611 100755
--- a/pull.py
+++ b/pull.py
@@ -60,6 +60,2 @@ SPLITS = {
-    'FreeArt': [ # any version
-        'LAL-1.2',
-        'LAL-1.3',
-    ],
-    'FreeBSDDL': ['FreeBSD'],  # unify (multi-tag)
-    'NPL': [ #any version
+    'FreeBSDDL': ['FreeBSD'], # unify (multi-tag)
+    'NPL': [ # versions 1.0 and 1.1
@@ -76,17 +72,3 @@ SPLITS = {
-    'RPL': [ # any version - Note that FSF website does not state any version, but references version 1.3 in the URL.  It is assumed that it also covers version 1.1 and 1.5, but this should be verified with FSF.
-        'RPL-1.1',
-        'RPL-1.3',
-        'RPL-1.5',
-    ],
-    'Unicode': [ # any version
-        'Unicode-DFS-2015',
-        'Unicode-DFS-2016',
-    ],
-    'W3C': [ # any version
-        'W3C',
-        'W3C-20150513',
-        'W3C-19980720',
-    ],
-    'Zope2.0': [ # Versions 2.0 and later
-        'ZPL-2.0',
-        'ZPL-2.1',
+    'Zope2.0': [ # versions 2.0 and 2.1
+        'Zope2.0',
+        'Zope2.1',
@@ -147,3 +129,3 @@ IDENTIFIERS = {
-    'FDL1.1': {'spdx': 'GFDL-1.1'},
-    'FDL1.2': {'spdx': 'GFDL-1.2'},
-    'FDL1.3': {'spdx': 'GFDL-1.3'},
+    'FDLv1.1': {'spdx': 'GFDL-1.1'},
+    'FDLv1.2': {'spdx': 'GFDL-1.2'},
+    'FDLv1.3': {'spdx': 'GFDL-1.3'},
@@ -165,2 +146,0 @@ IDENTIFIERS = {
-    'LAL-1.2': {'spdx':'LAL-1.2'},
-    'LAL-1.3': {'spdx':'LAL-1.3'},
@@ -196 +175,0 @@ IDENTIFIERS = {
-    'Python': {'spdx': 'Python-2.0'}, # Note: references 'later versions which are not in the SPDX license list
@@ -198,2 +176,0 @@ IDENTIFIERS = {
-    'RPL-1.1': {'spdx': 'RPL-1.1'},
-    'RPL-1.5': {'spdx': 'RPL-1.5'},
@@ -203 +179,0 @@ IDENTIFIERS = {
-    'SISSL': {'spdx': 'SISSL'}, # Note that the header on the 'FSF website states version 1.0, but the link points to 'version 1.1.  The SPDX license is version 1.1
@@ -206,2 +181,0 @@ IDENTIFIERS = {
-    'Unicode-DFS-2015': {'spdx': 'Unicode-DFS-2015'},
-    'Unicode-DFS-2016': {'spdx': 'Unicode-DFS-2016'},
@@ -212,2 +185,0 @@ IDENTIFIERS = {
-    'W3C-20150513': {'spdx': 'W3C-20150513'},
-    'W3C-19980720': {'spdx': 'W3C-19980720'},
@@ -223,4 +195,2 @@ IDENTIFIERS = {
-    'ZPL-2.0': {'spdx': 'ZPL-2.0'},
-    'ZPL-2.1': {'spdx': 'ZPL-2.1'},
-
-    # FIXME: still working through this
+    'Zope2.0': {'spdx': 'ZPL-2.0'},
+    'Zope2.1': {'spdx': 'ZPL-2.1'},
@@ -236,0 +207 @@ def extract(root, base_uri=None):
…

If that looks acceptable to you, let me know, and I'll push it and close this PR. Then we can open follow-up issues/PRs for the questionable splits and identifiers.
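
For readers unfamiliar with the two tables in the diff, the sketch below shows one way they could fit together: a `SPLITS` entry expands one FSF tag into per-version IDs, and `IDENTIFIERS` maps each resulting ID to SPDX. The entries are copied from the diff above; the `spdx_for_tag` helper is an assumption for illustration and is not part of pull.py.

```python
SPLITS = {
    'Zope2.0': ['Zope2.0', 'Zope2.1'],  # versions 2.0 and 2.1
}
IDENTIFIERS = {
    'Zope2.0': {'spdx': 'ZPL-2.0'},
    'Zope2.1': {'spdx': 'ZPL-2.1'},
    'FDLv1.3': {'spdx': 'GFDL-1.3'},
}


def spdx_for_tag(tag):
    """Expand an FSF tag through SPLITS, then collect the SPDX values."""
    ids = SPLITS.get(tag, [tag])
    return [IDENTIFIERS[i]['spdx'] for i in ids if i in IDENTIFIERS]


print(spdx_for_tag('Zope2.0'))
print(spdx_for_tag('FDLv1.3'))
print(spdx_for_tag('NoSuchTag'))
```

A tag with no exact match yields an empty list, which fits the preference stated earlier in the review for false negatives over false positives.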

@wking
Owner

wking commented Oct 23, 2017

I pushed 50e1556ef88572 to my branch fixing the oldOpenLDAP association to be SPDX's OLDAP-2.3.

@wking
Owner

wking commented Nov 11, 2017

@goneall, have you had time to look over my splits-identifiers reroll yet? It's still at ef88572.

@goneall
Contributor Author

goneall commented Nov 12, 2017

@wking Just reviewed it, looks fine to me.

@wking
Owner

wking commented Nov 12, 2017

Merged and published the ef88572 reroll.
