Deal with "This file is distributed under the same license as the..." #1379

Closed
pombredanne opened this issue Feb 16, 2019 · 4 comments

@pombredanne
Member

pombredanne commented Feb 16, 2019

This unfortunate and hard-to-parse license reference is common. It is promoted by the GNU Gettext tool, which generates translation templates that eventually yield pretty weird things such as this:
https://github.com/sugarlabs/physics/blob/5e73470fc3c730c05f3bdcae830f630fd5ceb45c/po/mk.po

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.

... and so repeated over and over.

Or this kind of hard-to-dereference statement:
# This file is distributed under the same license as the cpufrequtils package.
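
To make the two cases above concrete, here is a minimal sketch of how such statements could be spotted and the referenced package name pulled out. The regex and the function name `find_same_license_references` are hypothetical and only for illustration; this is not how ScanCode's license rules are written.

```python
import re

# Hypothetical pattern for the gettext boilerplate; not an actual ScanCode rule.
SAME_LICENSE_RE = re.compile(
    r"This file is distributed under the same license as the\s+"
    r"(?P<name>\S+)\s+package",
    re.IGNORECASE,
)


def find_same_license_references(text):
    """
    Return a list of (package_name, is_placeholder) tuples, deduplicated and
    in order of first appearance. ``is_placeholder`` is True when the template
    word "PACKAGE" was never replaced with a real package name.
    """
    seen = []
    for match in SAME_LICENSE_RE.finditer(text):
        name = match.group("name")
        entry = (name, name == "PACKAGE")
        if entry not in seen:
            seen.append(entry)
    return seen
```

A header that repeats the unfilled template many times collapses to a single placeholder entry, while a filled-in header yields the package name that still has to be dereferenced to an actual license.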

The culprit code is here:
http://git.savannah.gnu.org/cgit/gettext.git/tree/gettext-tools/src/xgettext.c#n1907

@ueno @bhaible this is a source of less accurate license detection for scancode and of a lot of noise, and there are 100K+ examples of such license references out there... I might be able to deal with it somehow. But maybe there would be a way to fix it upstream in gettext proper, to avoid propagating weak and inconclusive licensing statements in the future? That would be really nice!

@ian-kelling ping... has this been an issue when you scan code at the FSF too?

@bhaible

bhaible commented Feb 18, 2019

I moved this question to the gettext tracker: https://savannah.gnu.org/bugs/index.php?55733 . Please discuss it there.

@pombredanne
Member Author

@bhaible thanks a lot for taking the time to answer!

pombredanne added a commit that referenced this issue Feb 18, 2019
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne pushed a commit that referenced this issue Jul 14, 2021
* Add CDDL lexer (thanks to Fabian Neumann)

* Add CDDL to mappings

* Fix inline flag in CDDL regex

* Update AUTHORS

* Fix explosive backtracking

* Comment invalid CDDL syntax for automated tests

* Update following Georg Brandl's review

* Update tests for CDDL to new framework

* Pylint pass

* Update links to CDDL RFC

* Update copyright header

* Solve regexlint issues in CDDL parser

* Add link to CDDL in documentation
pombredanne pushed a commit that referenced this issue Sep 8, 2022
AyanSinhaMahapatra added a commit that referenced this issue Oct 13, 2022
There are unknown license statements like "This file is the same
license as the package django" which refers to a package which this
file is a part of. This is fixed by extending the dereferencing logic
to look for packages that the file belongs to and using the detected
licenses from the package.

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
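
The dereferencing idea described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `dereference_to_package` and the `package_licenses` mapping (package root directory to detected license expression) are hypothetical and do not reflect ScanCode's real data model or API.

```python
from pathlib import PurePosixPath


def dereference_to_package(file_path, package_licenses):
    """
    Return the license expression of the innermost package that contains
    ``file_path``, or None if the file belongs to no known package.

    ``package_licenses`` is a hypothetical mapping of package root
    directory -> detected license expression.
    """
    path = PurePosixPath(file_path)
    best_root = None
    for root in package_licenses:
        root_path = PurePosixPath(root)
        # The file belongs to a package if the package root is one of its parents.
        if root_path in path.parents:
            # Prefer the deepest (most specific) enclosing package root.
            if best_root is None or len(root_path.parts) > len(
                PurePosixPath(best_root).parts
            ):
                best_root = root
    if best_root is None:
        return None
    return package_licenses[best_root]
```

Picking the deepest enclosing root is a design choice of this sketch, so that a file inside a vendored sub-package resolves to that sub-package's license and not to the outer one.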
AyanSinhaMahapatra added a commit that referenced this issue Oct 18, 2022
* in case of unknown references being present without top-level
  detected, dereference using license detections in legalese/readme
  files at codebase root.
* add example cases from samba/samba, sugarlabs/physics, debian fusiondirectory,
  and paddlenlp as tests.

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
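
The fallback described in this commit message might look something like the sketch below. Everything here is an assumption made for illustration: the list of legalese/readme file name prefixes, the `file_detections` mapping (path relative to the codebase root to list of detected license keys), the `" AND "` joining of several licenses, and the function names are all hypothetical and not taken from ScanCode itself.

```python
from pathlib import PurePosixPath

# Assumed name prefixes for legalese/readme files; the real list may differ.
LEGALESE_PREFIXES = ("license", "licence", "copying", "notice", "readme")

# License key used in this sketch to mark an unresolved reference.
UNKNOWN_REFERENCE = "unknown-license-reference"


def is_root_legalese(path):
    """Return True for a legalese/readme file sitting directly at the codebase root."""
    p = PurePosixPath(path)
    return len(p.parts) == 1 and p.name.lower().startswith(LEGALESE_PREFIXES)


def dereference_unknown(file_detections, top_level_license=None):
    """
    Resolve an unknown license reference.

    Use the top-level license when one was detected; otherwise fall back to
    the licenses detected in legalese/readme files at the codebase root.
    Return None when nothing usable is found.
    """
    if top_level_license:
        return top_level_license
    found = []
    for path in sorted(file_detections):
        if not is_root_legalese(path):
            continue
        for key in file_detections[path]:
            # Skip unresolved references and keep each license key once.
            if key != UNKNOWN_REFERENCE and key not in found:
                found.append(key)
    return " AND ".join(found) if found else None
```

Only files directly at the root are considered here, so a `LICENSE` nested in a subdirectory does not influence the result for unrelated files.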
@AyanSinhaMahapatra
Member

This is now fixed in PR #2961.

@pombredanne
Member Author

Closing!

@pombredanne pombredanne added this to the v32.0 milestone Jan 4, 2023