Description
Hi! We're using the GitHub Advanced Security license enforcement tooling, which uses ClearlyDefined under the hood, which in turn uses ScanCode to detect licenses in Maven packages.
Recently, we've seen a major increase in the number of popular packages detected with a LicenseRef-scancode-unknown-license-reference license. Examples are Google's errorprone annotations and jetty-util, both of which have long been licensed under Apache-2.0. The raw ScanCode output is easy to inspect through the ClearlyDefined API.
The scan results show that the rule getting a match is license-intro_72.RULE.
I tried to debug this locally, initially thinking it might be a ScanCode version problem, since ClearlyDefined runs a fairly old version (32.1.0). However, I get the same result with the latest ScanCode release and with the latest version built from source.
I then thought that Google might have changed their pom.xml, because v2.33.0 does not show an invalid license but v2.34.0 does. Looking at the source code, though, nothing has changed, and I corroborated this by unzipping the v2.33.0 and v2.34.0 jars, as ClearlyDefined does.
Can you help me figure it out? Thanks!
How To Reproduce
This is easy to reproduce locally (the output below is from the tip of the main branch, built from source on Ubuntu 24.04):
wget https://repo1.maven.org/maven2/com/google/errorprone/error_prone_annotations/2.35.0/error_prone_annotations-2.35.0.pom
scancode --license error_prone_annotations-2.35.0.pom --json-pp -
Complete output
Setup plugins...
Collect file inventory...
Scan files for: licenses with 1 process(es)...
[####################] 2
{
"headers": [
{
"tool_name": "scancode-toolkit",
"tool_version": "v32.3.3-9-g4b57a7fe86",
"options": {
"input": [
"error_prone_annotations-2.35.0.pom"
],
"--json-pp": "-",
"--license": true
},
"notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
"start_timestamp": "2025-04-02T201803.563720",
"end_timestamp": "2025-04-02T201810.073061",
"output_format_version": "4.0.0",
"duration": 6.50935697555542,
"message": null,
"errors": [],
"warnings": [],
"extra_data": {
"system_environment": {
"operating_system": "linux",
"cpu_architecture": "64",
"platform": "Linux-6.8.0-1024-aws-x86_64-with-glibc2.35",
"platform_version": "#26~22.04.1-Ubuntu SMP Wed Feb 19 06:54:57 UTC 2025",
"python_version": "3.10.12 (main, Feb 4 2025, 14:57:36) [GCC 11.4.0]"
},
"spdx_license_list_version": "3.26",
"files_count": 1
}
}
],
"license_detections": [
{
"identifier": "apache_2_0-4a7f6517-9601-5209-8528-3707be4a550e",
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"detection_count": 1,
"reference_matches": [
{
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"from_file": "error_prone_annotations-2.35.0.pom",
"start_line": 39,
"end_line": 43,
"matcher": "3-seq",
"score": 11.83,
"matched_length": 11,
"match_coverage": 11.83,
"rule_relevance": 100,
"rule_identifier": "apache-2.0_1270.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_1270.RULE"
}
]
},
{
"identifier": "apache_2_0-c4e30bcd-ccfd-bbc3-d2f1-196ab911e47d",
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"detection_count": 1,
"reference_matches": [
{
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"from_file": "error_prone_annotations-2.35.0.pom",
"start_line": 5,
"end_line": 15,
"matcher": "2-aho",
"score": 100.0,
"matched_length": 85,
"match_coverage": 100.0,
"rule_relevance": 100,
"rule_identifier": "apache-2.0_7.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_7.RULE"
}
]
},
{
"identifier": "unknown_license_reference-4b2709b8-15aa-b2be-e8fc-8043272b18ce",
"license_expression": "unknown-license-reference",
"license_expression_spdx": "LicenseRef-scancode-unknown-license-reference",
"detection_count": 1,
"reference_matches": [
{
"license_expression": "unknown-license-reference",
"license_expression_spdx": "LicenseRef-scancode-unknown-license-reference",
"from_file": "error_prone_annotations-2.35.0.pom",
"start_line": 39,
"end_line": 41,
"matcher": "2-aho",
"score": 16.0,
"matched_length": 3,
"match_coverage": 100.0,
"rule_relevance": 16,
"rule_identifier": "license-intro_72.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/license-intro_72.RULE"
}
]
}
],
"files": [
{
"path": "error_prone_annotations-2.35.0.pom",
"type": "file",
"detected_license_expression": "apache-2.0 AND unknown-license-reference",
"detected_license_expression_spdx": "Apache-2.0 AND LicenseRef-scancode-unknown-license-reference",
"license_detections": [
{
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"matches": [
{
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"from_file": "error_prone_annotations-2.35.0.pom",
"start_line": 5,
"end_line": 15,
"matcher": "2-aho",
"score": 100.0,
"matched_length": 85,
"match_coverage": 100.0,
"rule_relevance": 100,
"rule_identifier": "apache-2.0_7.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_7.RULE"
}
],
"identifier": "apache_2_0-c4e30bcd-ccfd-bbc3-d2f1-196ab911e47d"
},
{
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"matches": [
{
"license_expression": "apache-2.0",
"license_expression_spdx": "Apache-2.0",
"from_file": "error_prone_annotations-2.35.0.pom",
"start_line": 39,
"end_line": 43,
"matcher": "3-seq",
"score": 11.83,
"matched_length": 11,
"match_coverage": 11.83,
"rule_relevance": 100,
"rule_identifier": "apache-2.0_1270.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/apache-2.0_1270.RULE"
}
],
"identifier": "apache_2_0-4a7f6517-9601-5209-8528-3707be4a550e"
},
{
"license_expression": "unknown-license-reference",
"license_expression_spdx": "LicenseRef-scancode-unknown-license-reference",
"matches": [
{
"license_expression": "unknown-license-reference",
"license_expression_spdx": "LicenseRef-scancode-unknown-license-reference",
"from_file": "error_prone_annotations-2.35.0.pom",
"start_line": 39,
"end_line": 41,
"matcher": "2-aho",
"score": 16.0,
"matched_length": 3,
"match_coverage": 100.0,
"rule_relevance": 16,
"rule_identifier": "license-intro_72.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/license-intro_72.RULE"
}
],
"identifier": "unknown_license_reference-4b2709b8-15aa-b2be-e8fc-8043272b18ce"
}
],
"license_clues": [],
"percentage_of_license_text": 22.51,
"scan_errors": []
}
]
}
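To make output like the above easier to triage, here is a minimal Python sketch that lists each unknown-license-reference match together with any known-license match whose line range overlaps it in the same file. It embeds a trimmed copy of the "files" section above so it runs on its own. The helper names (`iter_matches`, `overlapping_known_matches`) are hypothetical and not part of ScanCode, and the sketch assumes only the JSON fields visible in this output.

```python
import json

# Trimmed copy of the "files" section of the scan above,
# keeping only the fields the helpers below read.
SAMPLE_SCAN = json.loads("""
{
  "files": [
    {
      "path": "error_prone_annotations-2.35.0.pom",
      "license_detections": [
        {"matches": [{"license_expression": "apache-2.0",
                      "start_line": 5, "end_line": 15, "score": 100.0,
                      "rule_identifier": "apache-2.0_7.RULE"}]},
        {"matches": [{"license_expression": "apache-2.0",
                      "start_line": 39, "end_line": 43, "score": 11.83,
                      "rule_identifier": "apache-2.0_1270.RULE"}]},
        {"matches": [{"license_expression": "unknown-license-reference",
                      "start_line": 39, "end_line": 41, "score": 16.0,
                      "rule_identifier": "license-intro_72.RULE"}]}
      ]
    }
  ]
}
""")

UNKNOWN = "unknown-license-reference"


def iter_matches(scan):
    """Yield (path, match) for every license match in a scan result dict."""
    for file_entry in scan.get("files", []):
        for detection in file_entry.get("license_detections", []):
            for match in detection.get("matches", []):
                yield file_entry["path"], match


def overlapping_known_matches(scan):
    """Pair each unknown-reference match with overlapping known-license rules."""
    matches = list(iter_matches(scan))
    report = []
    for path, unknown in matches:
        if unknown["license_expression"] != UNKNOWN:
            continue
        overlaps = [
            known["rule_identifier"]
            for other_path, known in matches
            if other_path == path
            and known["license_expression"] != UNKNOWN
            # Two inclusive line ranges overlap when each starts before the other ends.
            and known["start_line"] <= unknown["end_line"]
            and unknown["start_line"] <= known["end_line"]
        ]
        report.append((path, unknown["rule_identifier"], overlaps))
    return report


if __name__ == "__main__":
    for path, rule, overlaps in overlapping_known_matches(SAMPLE_SCAN):
        print(f"{path}: {rule} overlaps {overlaps}")
```

On this scan, the sketch pairs license-intro_72.RULE (lines 39 to 41) with apache-2.0_1270.RULE (lines 39 to 43), the Apache-2.0 match that scored only 11.83. To run it on a full result, replace `SAMPLE_SCAN` with the output of `json.load` on a saved scan file.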
For bug reports, it really helps us to know:
- What OS are you running on? Ubuntu 24.04
- ScanCode version: latest from source and latest release (32.3.3)
- Installation method: tar + from source
Related PR: clearlydefined/curated-data#29516