Copyright in LICENSE file not detected (email issue?) #3764

Open
vw-anton opened this issue May 3, 2024 · 1 comment
vw-anton commented May 3, 2024

Description

Copyright in the following file is not detected by ScanCode: https://github.com/videojs/vhs-utils/blob/main/LICENSE

Detected:

{
  "path": "vhs-utils-main/LICENSE",
  "type": "file",
  "name": "LICENSE",
  "status": "application-package",
  "tag": "",
  "extension": "",
  "size": 1078,
  "md5": "b5e2dbf622c44f93baf779123b4c1cc7",
  "sha1": "cb191af7ec58c84aae40b49d4be2646239ab087d",
  "sha256": "83c604241478b9801530198a4dd46fca5fc422015d5a01eeacf96162401ab31a",
  "sha512": "",
  "mime_type": "text/plain",
  "file_type": "ASCII text",
  "programming_language": "",
  "is_binary": false,
  "is_text": true,
  "is_archive": false,
  "is_media": false,
  "is_key_file": true,
  "detected_license_expression": "mit",
  "detected_license_expression_spdx": "MIT",
  "license_detections": [
 ...
  ],
  "license_clues": [],
  "percentage_of_license_text": 96.41,
  "compliance_alert": "",
  "copyrights": [],
  "holders": [],
  "authors": [],
  "package_data": [],
  "for_packages": [
    "pkg:npm/%40videojs/vhs-utils@4.1.0?uuid=7803ef84-762b-4555-b5cf-0935ad59f1f5"
  ],
  "emails": [
    {
      "email": "brandonocasey@gmail.com",
      "end_line": 1,
      "start_line": 1
    }
  ],
  "urls": [],
  "extra_data": {}
}

Expected:
brandonocasey <brandonocasey@gmail.com> or at least brandonocasey is returned as copyright
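
A minimal sketch of how this kind of miss could be flagged automatically in scan output: it looks at a single file entry shaped like the JSON above and reports the detected emails when the `copyrights` list is empty. The helper name `missing_copyright_candidates` is hypothetical and not part of ScanCode; only the `copyrights` and `emails` keys are taken from the reported result.

```python
def missing_copyright_candidates(file_result):
    """Return the emails of a scanned file entry when no copyright was detected.

    An email found in a file without any copyright is only a hint that a
    copyright statement may have been missed; it is not proof.
    """
    if file_result.get("copyrights"):
        return []
    return [entry["email"] for entry in file_result.get("emails", [])]


# Trimmed copy of the file entry reported above.
result = {
    "path": "vhs-utils-main/LICENSE",
    "copyrights": [],
    "holders": [],
    "emails": [
        {"email": "brandonocasey@gmail.com", "start_line": 1, "end_line": 1}
    ],
}

print(missing_copyright_candidates(result))
```

Entries flagged this way still need a manual look, since an email can legitimately appear in a file that has no copyright statement.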

How To Reproduce

scancode.io config below

System configuration

  "tool_name": "scanpipe",
  "tool_version": "34.2.0",
  "other_tools": [
    "pkg:pypi/scancode-toolkit@32.1.0"
vw-anton added the bug label May 3, 2024
pombredanne added a commit that referenced this issue May 6, 2024
Reference: #3764
Reported-by: Anton Augsburg @vw-anton
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne (Member) commented

@vw-anton Thanks for the report!
See the fix in ab6699f ... the issue was not about the email, but rather about the lack of a year in this copyright form, which uses a (somewhat uncommon) all-lowercase, made-up name.
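
To illustrate the cause described above, here is a simplified regex sketch. It is not how ScanCode detects copyrights and it is not the code in the fix; the pattern, the `detect` function and the sample line are assumptions made for illustration. It shows a matcher that requires a year missing a "name plus email" statement, and a relaxed variant that accepts a year-less statement only when an email backs up the name.

```python
import re

# Illustrative pattern only: optional "(c)", optional 4-digit year,
# a holder name, and an optional <email> at the end of the line.
COPYRIGHT = re.compile(
    r"copyright\s+(?:\(c\)\s+)?(?:(?P<year>\d{4})\s+)?"
    r"(?P<holder>[^<\n]+?)\s*(?:<(?P<email>[^>\s]+@[^>\s]+)>)?\s*$",
    re.IGNORECASE,
)


def detect(line, require_year=False):
    """Return (holder, email) for a copyright-looking line, else None."""
    match = COPYRIGHT.search(line)
    if not match:
        return None
    year, email = match.group("year"), match.group("email")
    if require_year and year is None:
        return None
    # Without a year, accept only when an email accompanies the name.
    # This keeps license boilerplate from being treated as a holder.
    if year is None and email is None:
        return None
    return match.group("holder"), email


# Hypothetical line shaped like the statement discussed in this issue.
line = "Copyright (c) brandonocasey <brandonocasey@gmail.com>"
print(detect(line, require_year=True))   # missed when a year is mandatory
print(detect(line))                      # found once the year is optional
print(detect("The above copyright notice and this permission notice"))
```

Relaxing a rule like this widens what can match, which is presumably why a follow-up commit makes detection of a single lowercase name more specific.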

pombredanne added a commit that referenced this issue May 9, 2024
Make detection of copyright with a single lowercase name more specific

Reference: #3764
Reported-by: Anton Augsburg @vw-anton
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
AyanSinhaMahapatra added a commit that referenced this issue Jun 26, 2024
* Detect odd name in copyright #3655

Reported-by: Anton Augsburg @vw-anton
Reference: #3655
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Do not detect trailing Distributed in copyright #3735

Reported-by:  Dimitris Iliou @dimitris-iliou
Reference: #3735
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve misc. copyright detections

Spotted in some common python libraries such as numpy and scipy

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Add new script to generate copyright tests

Use an input file where each line is either:
- a URL to fetch
- a text to test

Then generate a test data files pair accordingly

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright detection

- Start detecting "is held by"
- Do not include some trailing junk

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Detect NN/EMAIL copyright combo #3764

Reference: #3764
Reported-by: Anton Augsburg @vw-anton
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Detect NN/EMAIL copyright combo #3764

Make detection of copyright with a single lowercase name more specific

Reference: #3764
Reported-by: Anton Augsburg @vw-anton
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Align license with improved copyrights

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright detection of "distributed"

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Do not detect some words as NNP

This makes copyright detection more specific

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright tests

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Detect OpenStreetMap correctly

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Add new copyright detection tests

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright detection side-effects

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Enable generation of copyright test file

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright debug tracing

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Detect new form of copyright

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Do not add arbitrary space around markup

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve handle of parens in copyright

Also improve NOTICEs, and other misc. variants
Do not detect "The Initial Developer"

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Correctly filter copyrights in licenses #3797

Reference: #3797
Reported-by: Jörg Arndt @Joerki
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright detection

Handle corner cases with markup
Detect new copyright forms.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Rename README file

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright detection

* Handle better various parens, markup and quotes

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Improve copyright detection

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Refine copyright detection

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Use latest commoncode

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Enable generation of copyright test data files

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Do not regen demarkup tests

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-authored-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>

---------

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Co-authored-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
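
The "Add new script to generate copyright tests" commit above describes an input file where each line is either a URL to fetch or a text to test, and a pair of test data files generated for each line. A rough sketch of that idea follows. It is not the script from the commit: the function names, the `case_N` file naming and the empty expected-results placeholder are all assumptions.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(line):
    """Treat a line as a URL only if it has an http(s) scheme and a host."""
    parsed = urlparse(line)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def fetch_text(url):
    """Download a URL and decode it as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def generate_pairs(input_file, out_dir, fetcher=fetch_text):
    """Write one (test text, expected results) file pair per input line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = Path(input_file).read_text(encoding="utf-8").splitlines()
    entries = [line.strip() for line in lines if line.strip()]
    pairs = []
    for index, entry in enumerate(entries):
        text = fetcher(entry) if is_url(entry) else entry
        data_file = out_dir / f"case_{index}.txt"
        expected_file = out_dir / f"case_{index}.expected.txt"
        data_file.write_text(text, encoding="utf-8")
        # Hypothetical placeholder: expected detections are filled in later.
        expected_file.write_text("", encoding="utf-8")
        pairs.append((data_file, expected_file))
    return pairs
```

Calling `generate_pairs("inputs.txt", "out")` would fetch any URL lines over the network; passing a custom `fetcher` keeps the function testable offline.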