This repository has been archived by the owner. It is now read-only.

Clarify license metadata guidelines #8179

Closed
wants to merge 5 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@kemitchell
Contributor

kemitchell commented May 5, 2015

This PR changes the documentation visible with npm help 7 package.json to make a few clarifications about license metadata in package.json:

  1. "license" should be a string SPDX license expression. Thus "GNU Public License" is not good metadata, while "GPL-3.0" is fine. So is "(MIT OR GPL-2.0)" for a "multi-licensed" project.
  2. Old-style {type:..., url:...} objects and "licenses" arrays are deprecated.
  3. If you're non-standard enough that SPDX expressions don't do it, leave "license" out, and make sure you include a LICENSE file in the package.

There are a couple of open issues on point, both of which would benefit by an official word: #6241 and #4473.

I have also opened PRs to make SPDX expression validation part of npm init and metadata normalization: npm/init-package-json#42 and npm/normalize-package-data#61.

There are several ways to to specify machine-readable license metadata that would seem "right". Past affirmatively requiring a valid SPDX identifier, it's a bikeshed, since so few projects are multi-licensed. Some important ones are, but they are few.

I only bring this up early because so many of the most-used npm packages are older, and haven't had their metadata updated since the licenses: [{...}] days. Normalizing those packages would help spread the word, and give us a shot at making machine-readable licensing the norm on npm.

@kemitchell kemitchell force-pushed the kemitchell:update-license-guidelines branch 2 times, most recently from ddf797e to 5da3f5a May 5, 2015

@kemitchell

This comment has been minimized.

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 5, 2015

The vast majority of url properties are to LICENSE files in-repo on GitHub. A great many others are to opensource.org or apache.org. I count about 10 oddballs pointing to Wikipedia, gnu.org, or project- or person-specific domains.

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 5, 2015

Of the 1,000 most-depended-upon npm modules, only the following have licenses arrays with more than one element:

  • amdefine ("BSD", MIT)
  • fs.extra (MIT, Apache-2.0)
  • json-schema (AFL-2.0, "BSD")
  • node.extend (MIT, "GPL")
  • promised-io (AFL-2.1, "BSD")
  • rc ("BSD", MIT, Apache-2.0)
  • requirejs ("BSD", MIT)
  • walk (MIT, Apache-2.0)
@sindresorhus

This comment has been minimized.

sindresorhus commented May 5, 2015

👍 This is so much better. @kemitchell thanks for taking the lead on cleaning up this mess :)

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 5, 2015

@sindresorhus: Thanks for your work on spdx-license-list! Thanks to @shinnn, it's is now an indirect dependency of spdx.js.

I'd love to make this metadata guidance part of npm@3 if at all possible. Given the growth rate of npm packages, I get the feeling it's early or never for compliance as the norm.

@kemitchell kemitchell force-pushed the kemitchell:update-license-guidelines branch from 5da3f5a to e5be5f2 May 6, 2015

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 6, 2015

I force-pushed a small change that makes clear that license identifiers should come from the most recent SPDX license identifier list (http://spdx.org/licenses), while the syntax for license expressions is version 2.0 of the spec.

@kemitchell kemitchell force-pushed the kemitchell:update-license-guidelines branch from e5be5f2 to e4e508a May 6, 2015

@kemitchell kemitchell referenced this pull request May 6, 2015

Merged

Validate license fields #61

]
}
Those styles are now deprecated.

This comment has been minimized.

@othiym23

othiym23 May 6, 2015

Contributor

Can you add a little to this deprecation, including a terse explanation of how to rewrite the above into SPDX syntax?

This comment has been minimized.

@othiym23

othiym23 May 6, 2015

Contributor

It might be useful to make the second example an actual array with two licenses so you can show another example of using a conjunction in an SPDX expression.

or MIT, is to just specify the standard SPDX ID of the license you're using,
like this:
If you're using a common license such as BSD-2-Clause or MIT, add a
current SPDX license identifier for the license you're using, like this:

This comment has been minimized.

@othiym23

othiym23 May 6, 2015

Contributor

Can you make this a link to whichever piece of documentation of SPDX you think is most approachable for developers who are unfamiliar with it? I had never heard of it before you got in touch with me, and I have an enterprise background, so I imagine many other people will be confused as well.

If you are using a license that hasn't been assigned an SPDX identifier, or if
you are using an uncommon or custom license, do not include a "license" string
in package.json. In those cases especially, but also more generally, it's a good
idea to include a LICENSE file at the top level of the package.

This comment has been minimized.

@othiym23

othiym23 May 6, 2015

Contributor

Will they still get a warning in these cases, or are your changes to normalize-package-data aware of the existence of a LICENSE file? i.e. how can we make sure that users don't get a warning when they're doing what these docs recommend?

This comment has been minimized.

@kemitchell

kemitchell May 6, 2015

Contributor

Great point. It didn't think about making sure warning scenarios are coextensive with guidelines noncompliance.

I take it normalize-package-data should remain a pure function of the package.json object, so hitting the file system or the API to stat a LICENSE is out.

That leaves us at least three ways to go at the problem:

  1. SPDX does provide for a way to reference licenses that don't have assigned identifiers, LicenseRef-X, but they are designed as references to other RDF objects within a larger package definition, of which license IDs and expressions are just a part. npm could define a special value, like LicenseRef-LICENSE that means "check the LICENSE file" in the ecosystem.
  2. Acknowledge that failure to indicate common license terms is a warning-worthy offense. But ... private modules.
  3. Don't issue warnings when license is missing. Accept a far greater number of false negatives (should have license, doesn't see a warning) for fewer false positives (shouldn't have license for lack of a way to specify it as an SPDX expression, but warn anyway).

This comment has been minimized.

@othiym23

othiym23 May 6, 2015

Contributor

I think it may make sense to do some special filtering of the warnings reported by init-package-json in npm itself to do something commonsense, like looking for "see LICENSE file" (or some other, bikesheddable phrase) in that field and then verifying that ./LICENSE does in fact exist. I've been meaning to do some work on how those warnings are handled for a while now, because I'm not sure it's useful to print the warnings for dependencies on install.

I don't think those changes need to be part of landing this change, though. We can get to them later. It just means that npm's install warnings are going to be noisier than usual for a while.

This comment has been minimized.

@kemitchell

kemitchell May 6, 2015

Contributor

I am down for getting this done right the first time. LicenseRef-LICENSE is Magic Words, but it works.

Alas, even detecting whether a LICENSE file exists is messy. Sometimes LICENSE, other times LICENSE.md, LICENSE.markdown, LICENSE.BSD, &c. That's ignoring case and licenses pasted at the end of README, which has the same problem.

This comment has been minimized.

@kemitchell

kemitchell May 6, 2015

Contributor

Even if npm checked for LICENSE, the guidelines recommend that all packages have a LICENSE file. It won't be clear when a LICENSE file exists whether package.json should have an SPDX expression or not, since the LICENSE file could contain a standard or non-standard license.

Though GitHub and others are trying it, detection of standard licenses in plaintext is non-trivial. I may go there someday, but I doubt npm really wants to, or should.

This comment has been minimized.

@othiym23

othiym23 May 7, 2015

Contributor

I also don't think that getting fancy and trying to parse things out of LICENSE is a good idea. I think we should follow the lead of fstream-npm and use a similar regexp or glob pattern to just verify that if "license" in package.json says LicenseRef-LICENSE, that glob pattern matches a file. It's heuristic, but all of this behavior must be documented anyway, so as long as those docs exist, I see only upside to making the rules very simple and deterministic.

@kemitchell kemitchell force-pushed the kemitchell:update-license-guidelines branch 2 times, most recently from c69986d to fb16d7b May 6, 2015

@kemitchell kemitchell force-pushed the kemitchell:update-license-guidelines branch from fb16d7b to 266771a May 6, 2015

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 7, 2015

@othiym23, I have pushed a commit explaining the use of "LicenseRef-LICENSE".

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 8, 2015

@othiym23, would it help if I sent a first-stab PR to implement the if (LicenseRef-LICENSE) { glob(...) } logic?

@othiym23

This comment has been minimized.

Contributor

othiym23 commented May 8, 2015

I went ahead and rebased, squashed, and landed what you had so far as 8669f7d and b01ba1a. I also landed eb18245, so npm no longer warns on missing READMEs or invalid license stanzas on transitive dependencies (it logs them at info level instead, so you can see the full list by using either -d or --loglevel=info on the command line). This should make this feature both useful and non-annoying while everybody converts to using SPDX license stanzas.

If you want to take a shot at implementing the logic behind LicenseRef-LICENSE, I wouldn't object, but this is definitely enough functionality to get things started. Thank you for putting this together, being so responsive, and making spdx.js in the first place!

@othiym23 othiym23 closed this May 8, 2015

@kemitchell

This comment has been minimized.

Contributor

kemitchell commented May 8, 2015

@othiym23 thanks for landing the PR! This probably closes #4473 and #6241, too.

@othiym23

This comment has been minimized.

Contributor

othiym23 commented May 8, 2015

#4473 is definitely closed, but there's a bunch of extra discussion of other features and documents in #6241 that I don't think we've quite finished. Regardless, thanks for the reminder!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.