Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surprising false positives of mit_or_gpl-3.0_17.RULE #3738

Open
leslielazzarino opened this issue Apr 12, 2024 · 2 comments · May be fixed by #3750
Open

Surprising false positives of mit_or_gpl-3.0_17.RULE #3738

leslielazzarino opened this issue Apr 12, 2024 · 2 comments · May be fixed by #3750
Labels

Comments

@leslielazzarino
Copy link

Description

With variations of the copyright header in the test file, the rule mit_or_gpl-3.0_17.RULE is matched for a file that is clearly MIT and Apache 2.0. The first line of the text is not marked as matched by the rule, but if not present will prevent the false positive.

How To Reproduce

Create the test file test.py with the following content:

# Copyright (c) Someone and affiliates.
#
# This source code is licensed under both the MIT license found in the
# LICENSE_MIT file in the root directory of this source tree and the Apache
# License, Version 2.0 found in the LICENSE_APACHE file in the root directory
# of this source tree.

Run any scancode 32, including the latest one, with the following flags:
-clpu --only-findings -n35 --license-text --verbose --json results.json ./test.py

The output, formatted, will look more or less (I've removed my system information) like this:

{
 "headers": [
   {
     "tool_name": "scancode-toolkit",
     "tool_version": "32.0.6",
     "options": {
       "input": [
         "/test_sc.py"
       ],
       "--copyright": true,
       "--json": "test.json",
       "--license": true,
       "--license-text": true,
       "--only-findings": true,
       "--package": true,
       "--processes": "35",
       "--url": true,
       "--verbose": true
     },
     "notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
     "start_timestamp": "2024-04-12T093906.516396",
     "end_timestamp": "2024-04-12T093910.030508",
     "output_format_version": "3.0.0",
     "duration": 3.5141329765319824,
     "message": null,
     "errors": [],
     "warnings": [],
     "extra_data": {
       "spdx_license_list_version": "3.21",
       "files_count": 1
     }
   }
 ],
 "packages": [],
 "dependencies": [],
 "license_detections": [
   {
     "identifier": "mit_or_gpl_3_0__and_apache_2_0-c8dec2e8-7c20-200b-983a-ad0cf4c9dbbb",
     "license_expression": "(mit OR gpl-3.0) AND apache-2.0",
     "detection_count": 1
   }
 ],
 "files": [
   {
     "path": "test_sc.py",
     "type": "file",
     "package_data": [],
     "for_packages": [],
     "detected_license_expression": "(mit OR gpl-3.0) AND apache-2.0",
     "detected_license_expression_spdx": "(MIT OR GPL-3.0-only) AND Apache-2.0",
     "license_detections": [
       {
         "license_expression": "(mit OR gpl-3.0) AND apache-2.0",
         "matches": [
           {
             "score": 60.87,
             "start_line": 3,
             "end_line": 4,
             "matched_length": 14,
             "match_coverage": 60.87,
             "matcher": "3-seq",
             "license_expression": "mit OR gpl-3.0",
             "rule_identifier": "mit_or_gpl-3.0_17.RULE",
             "rule_relevance": 100,
             "rule_url": ["https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/mit_or_gpl-3.0_17.RULE"](https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/mit_or_gpl-3.0_17.RULE),
             "matched_text": "# This source code is licensed under both the MIT license found in the\n# LICENSE_MIT file in the root directory of this source tree and the Apache"
           },
           {
             "score": 100,
             "start_line": 4,
             "end_line": 5,
             "matched_length": 6,
             "match_coverage": 100,
             "matcher": "2-aho",
             "license_expression": "apache-2.0",
             "rule_identifier": "apache-2.0_182.RULE",
             "rule_relevance": 100,
             "rule_url": ["https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_182.RULE"](https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_182.RULE),
             "matched_text": "# LICENSE_MIT file in the root directory of this source tree and the Apache\n# License, Version 2.0 found in the LICENSE_APACHE file in the root directory"
           },
           {
             "score": 90,
             "start_line": 5,
             "end_line": 5,
             "matched_length": 2,
             "match_coverage": 100,
             "matcher": "2-aho",
             "license_expression": "apache-2.0",
             "rule_identifier": "apache-2.0_161.RULE",
             "rule_relevance": 90,
             "rule_url": ["https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_161.RULE"](https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_161.RULE),
             "matched_text": "# License, Version 2.0 found in the LICENSE_APACHE file in the root directory"
           }
         ],
         "identifier": "mit_or_gpl_3_0__and_apache_2_0-c8dec2e8-7c20-200b-983a-ad0cf4c9dbbb"
       }
     ],
     "license_clues": [],
     "percentage_of_license_text": 44,
     "copyrights": [
       {
         "copyright": "Copyright (c) Someone and affiliates",
         "start_line": 1,
         "end_line": 1
       }
     ],
     "holders": [
       {
         "holder": "Someone and affiliates",
         "start_line": 1,
         "end_line": 1
       }
     ],
     "authors": [],
     "urls": [],
     "scan_errors": []
   }
 ]
}

System configuration

For bug reports, it really helps us to know:

  • What OS are you running on? Linux
  • What version of scancode-toolkit was used to generate the scan file? 32.0.6 (same results when trying with 32.1.0)
  • What installation method was used to install/run scancode? source download
@pombredanne
Copy link
Member

Thanks!
Are you sure you tried with version 32.1.0?

@pombredanne
Copy link
Member

Never mind ... I can see this too. PR welcomed!

vasily-pozdnyakov added a commit to vasily-pozdnyakov/scancode-toolkit that referenced this issue Apr 26, 2024
Signed-off-by: Vasily Pozdnyakov <vasily.pozdnyakov@tngtech.com>
@vasily-pozdnyakov vasily-pozdnyakov linked a pull request Apr 26, 2024 that will close this issue
3 tasks
vasily-pozdnyakov added a commit to vasily-pozdnyakov/scancode-toolkit that referenced this issue Apr 26, 2024
Signed-off-by: Vasily Pozdnyakov <vasily.pozdnyakov@tngtech.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants