-
Notifications
You must be signed in to change notification settings - Fork 24
release/1.41.0 #170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release/1.41.0 #170
Conversation
WalkthroughThis PR introduces configurable license source filtering for copyleft inspection, integrates OSADL dataset-backed license checking, updates to version 1.41.0, refactors spinner lifecycle management across multiple scanning paths using context managers, and adds comprehensive test coverage. Changes
Sequence DiagramsequenceDiagram
actor User
participant CLI
participant Copyleft
participant ScanResultProcessor
participant LicenseUtil
participant Osadl
User->>CLI: scanoss inspect copyleft --license-sources component_declared
CLI->>Copyleft: __init__(license_sources=['component_declared'])
Copyleft->>ScanResultProcessor: __init__(license_sources=['component_declared'])
loop For each scan result
ScanResultProcessor->>LicenseUtil: Process component licenses
LicenseUtil->>Osadl: is_copyleft(spdx_id)
Osadl-->>LicenseUtil: bool (from cached OSADL data)
LicenseUtil->>ScanResultProcessor: copyleft status
alt license_sources filtering mode
ScanResultProcessor->>ScanResultProcessor: _select_licenses(filters by source)
else priority mode (no filtering)
ScanResultProcessor->>ScanResultProcessor: _select_licenses(default priority order)
end
ScanResultProcessor-->>ScanResultProcessor: append filtered licenses
end
Copyleft-->>CLI: inspection results
CLI-->>User: formatted output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas requiring extra attention:
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Tip 📝 Customizable high-level summaries are now available in beta!You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.
Example:
Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (9)
src/scanoss/filecount.py (1)
109-109: Minor: Unnecessary change from f-string to regular string.The change from
f'ERROR: Please specify a folder to scan'to a regular string literal is cosmetic since no variables are interpolated. While not harmful, it's an optional refactor.src/scanoss/scanners/scanner_hfh.py (1)
113-124: Consider logging the exception details for debugging.The exception handler at line 123 prints an error message but doesn't include the full traceback. For debugging gRPC issues, trace-level logging with full exception details would be helpful.
def _execute_grpc_scan(self, hfh_request: Dict) -> None: """ Execute folder hash scan. Args: hfh_request: Request dictionary for the gRPC call """ try: self.scan_results = self.client.folder_hash_scan(hfh_request, self.use_grpc) except Exception as e: self.base.print_stderr(f'Error during folder hash scan: {e}') + self.base.print_trace(f'Full traceback: {e}', exc_info=True) self.scan_results = Nonesrc/scanoss/scanner.py (1)
29-31: Spinner/nullcontext refactor is sound; consider a small helper to reduce duplicationUsing
Spinner('Fingerprinting ')only whennot self.quiet and self.isatty, withnullcontext()otherwise, pluswith spinner_ctx as spinner:andif spinner: spinner.next()keeps behavior correct and fixes cleanup on exceptions/CTRL+C. The same pattern now appears inscan_folder,scan_files, andwfp_folder, so a private helper like_spinner_ctx(label: str)could DRY this up if you decide to touch this again.Also applies to: 367-425, 624-707, 1058-1078
src/scanoss/inspection/policy_check/scanoss/copyleft.py (1)
29-30: License source wiring looks correct; empty-list behavior is worth confirmingImporting
DEFAULT_COPYLEFT_LICENSE_SOURCES, storingself.license_sources = license_sources or DEFAULT_COPYLEFT_LICENSE_SOURCES, and passing that intoScanResultProcessorlines up cleanly with_select_licenses’ filtering mode, so the new default (“component_declared,license_fileplusunknown) and-ls ...` overrides should work as intended.Note that using
license_sources or DEFAULT_COPYLEFT_LICENSE_SOURCESmeans an explicit empty list will also fall back to the default set. If you ever need to distinguish “no filtering, use priority mode” from “use default sources”, you’ll want a sentinel (e.g.,None) instead of relying on truthiness. If treating[]as “use defaults” is intentional, this is fine as-is.Also applies to: 68-69, 92-101
src/scanoss/cli.py (2)
704-713: Clarify/verify argparseextendusage for--license-sourcesThe option wiring is logically sound (multiple
-lsvalues, validated againstVALID_LICENSE_SOURCES, default handled in downstream logic viaNone). Two things to verify/consider:
action='extend'is only available in Python 3.8+. If this project still supports 3.7, this will break argument parsing; otherwise it’s fine but should match the documented supported versions.- Right now the default behavior (no
-ls) is implemented in business logic rather than viadefault=.... That’s OK, but the help string is the only place where the default list appears; if the constant changes, help stays correct, but tests/docs should cover the implied default behavior.If 3.7 support is still required, switching to
action='append'plus flattening later would avoid the compatibility concern.
1756-1769: Defensive access tolicense_sourceswhen callingCopyleftPassing
license_sources=args.license_sourcescleanly threads the CLI option through toCopyleft. However,inspect_copyleftcan be called programmatically with a customNamespacethat may not definelicense_sources, which would raiseAttributeError.To make this more robust without affecting CLI behavior, consider:
- license_sources=args.license_sources, # License sources to check (list) + license_sources=getattr(args, "license_sources", None), # License sources to check (list)Also consider documenting
license_sourcesin the function docstring alongsideinclude/exclude/explicit.src/scanoss/inspection/utils/scan_result_processor.py (2)
65-82: Constructor change is reasonable; consider normalizinglicense_sourcesAdding
license_sourcestoScanResultProcessor.__init__and storing it on the instance is fine and backwards compatible when callers passNone.You might want to normalize it once here (e.g., convert to a
setor ensure it’sNoneor a non‑empty list) so_select_licensesdoesn’t have to worry about unexpected types or accidental[]vsNonedifferences:- self.license_sources = license_sources + self.license_sources = list(license_sources) if license_sources else NoneThis keeps the “priority mode when not provided” behavior while guarding against non‑iterables being passed in.
316-362: License selection logic: semantics are sound, but tighten edge‑case handlingThe two‑mode behavior is clear and useful:
- Filtering mode when
license_sourcesis provided, including'unknown'andNone.- Priority mode otherwise, preferring
component_declared → license_file → file_header → scancode, with a full fallback list.A couple of refinements to consider:
Empty
license_sourcesvsNone
Currently any truthy value triggers filtering mode; an explicit empty list (e.g.,license_sources=[]from non‑CLI callers) falls back to priority mode. That’s slightly surprising. If you want “explicitly no sources” to mean “no licenses”, you could handle it explicitly:if self.license_sources is not None: if not self.license_sources: return [] ...Minor readability tweak
The filter comprehension can be slightly clearer:
if self.license_sources:sources_to_include = set(self.license_sources) | {'unknown'}return [lic for lic in licenses_dataif lic.get('source') in sources_to_include or lic.get('source') is None]
if self.license_sources is not None:sources_to_include = set(self.license_sources) | {'unknown'}return [lic for lic in licenses_dataif lic.get('source') in sources_to_include or lic.get('source') is None]Functionally you’re in good shape; these are just to make behavior around edge cases more explicit.
src/scanoss/osadl.py (1)
108-131:is_copyleftbehavior is sensible; consider explicit “data not loaded” guardThe lookup and normalization logic is straightforward:
- Empty/None SPDX IDs return
False.- Unknown IDs log a debug message and return
False.- Status values are treated as copyleft when they start with
"Yes"(covers"Yes"and"Yes (restricted)").One optional robustness improvement:
- status = Osadl._shared_copyleft_data.get(spdx_id_lc) + if not Osadl._data_loaded: + self.print_debug('OSADL copyleft data not loaded; treating as non-copyleft') + return False + + status = Osadl._shared_copyleft_data.get(spdx_id_lc)This makes it explicit when results are falling back due to a load failure rather than a missing SPDX ID.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
CHANGELOG.md(1 hunks)LICENSE(1 hunks)README.md(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(4 hunks)src/scanoss/constants.py(1 hunks)src/scanoss/data/osadl-copyleft.json(1 hunks)src/scanoss/filecount.py(2 hunks)src/scanoss/inspection/policy_check/scanoss/copyleft.py(4 hunks)src/scanoss/inspection/utils/license_utils.py(1 hunks)src/scanoss/inspection/utils/scan_result_processor.py(3 hunks)src/scanoss/osadl.py(1 hunks)src/scanoss/scanner.py(5 hunks)src/scanoss/scanners/folder_hasher.py(1 hunks)src/scanoss/scanners/scanner_hfh.py(2 hunks)src/scanoss/threadedscanning.py(3 hunks)tests/test_osadl.py(1 hunks)tests/test_policy_inspect.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (10)
src/scanoss/scanner.py (3)
src/scanoss/file_filters.py (2)
get_filtered_files_from_folder(311-352)get_filtered_files_from_files(354-404)src/scanoss/threadedscanning.py (3)
stop_scanning(154-158)queue_add(141-149)run(168-191)src/scanoss/scanners/scanner_hfh.py (1)
scan(126-152)
src/scanoss/inspection/policy_check/scanoss/copyleft.py (1)
src/scanoss/inspection/utils/scan_result_processor.py (1)
ScanResultProcessor(53-412)
src/scanoss/scanners/scanner_hfh.py (2)
src/scanoss/scanossgrpc.py (1)
folder_hash_scan(429-447)src/scanoss/scanossbase.py (1)
print_stderr(45-49)
src/scanoss/filecount.py (2)
src/scanoss/scanossbase.py (2)
print_msg(51-56)print_trace(65-70)src/scanoss/cli.py (1)
file_count(1310-1341)
src/scanoss/scanners/folder_hasher.py (2)
src/scanoss/scanossbase.py (1)
print_debug(58-63)src/scanoss/utils/crc64.py (2)
CRC64(29-96)get_hash_buff(82-96)
src/scanoss/inspection/utils/scan_result_processor.py (1)
src/scanoss/inspection/utils/license_utils.py (2)
LicenseUtil(30-116)init(47-74)
tests/test_policy_inspect.py (2)
src/scanoss/inspection/policy_check/scanoss/copyleft.py (2)
Copyleft(50-240)run(220-240)src/scanoss/inspection/policy_check/policy_check.py (2)
run(87-102)PolicyStatus(33-44)
src/scanoss/osadl.py (1)
src/scanoss/inspection/utils/license_utils.py (1)
is_copyleft(76-107)
src/scanoss/inspection/utils/license_utils.py (2)
src/scanoss/osadl.py (3)
Osadl(31-130)print_debug(62-67)is_copyleft(108-130)src/scanoss/scanossbase.py (2)
ScanossBase(28-107)print_debug(58-63)
tests/test_osadl.py (2)
src/scanoss/osadl.py (2)
Osadl(31-130)is_copyleft(108-130)src/scanoss/inspection/utils/license_utils.py (1)
is_copyleft(76-107)
🪛 GitHub Actions: Lint
src/scanoss/filecount.py
[error] 100-100: PLR0912 Too many branches (15 > 12).
[error] 100-100: PLR0915 Too many statements (54 > 50).
[error] 152-152: PLC0206 Extracting value from dictionary without calling .items().
🪛 LanguageTool
README.md
[uncategorized] ~140-~140: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...vecommons.org/licenses/by/4.0/) by the [Open Source Automation Development Lab (OSADL) eG](...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[uncategorized] ~142-~142: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...g/). Attribution: A project by the Open Source Automation Development Lab (OSADL) eG. ...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
CHANGELOG.md
[uncategorized] ~17-~17: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... - Copyleft detection now uses [OSADL (Open Source Automation Development Lab)](https://ww...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (15)
src/scanoss/__init__.py (1)
25-25: Version bump is appropriate for the release scope.The minor version bump from 1.40.1 to 1.41.0 aligns with the feature additions and improvements introduced in this PR (new CLI option, OSADL data, progress bar refactors, etc.).
src/scanoss/filecount.py (1)
114-148: LGTM! Clean refactoring to context-managed spinner.The refactoring properly uses
nullcontext()when no spinner is needed and ensures cleanup across all paths. The conditionalif spinner:check before callingspinner.next()is appropriate.README.md (1)
139-142: LGTM! Proper attribution for OSADL dataset.The license notice correctly attributes the OSADL copyleft license dataset with appropriate licensing information (CC-BY-4.0) and source link.
LICENSE (1)
21-31: LGTM! Proper dual-license attribution.The addition clearly separates the MIT-licensed code from the CC-BY-4.0 licensed OSADL dataset, with appropriate references to the data file location.
src/scanoss/data/osadl-copyleft.json (1)
1-133: LGTM! Well-structured OSADL copyleft dataset.The JSON file includes proper metadata with license attribution (CC-BY-4.0), copyright notice, disclaimer, and timestamp. The copyleft mapping covers 121 license identifiers with appropriate status values.
src/scanoss/constants.py (1)
20-22: LGTM! Well-defined license source constants.The constants appropriately define valid license sources and sensible defaults for copyleft inspection (focusing on component-declared and license file sources).
tests/test_policy_inspect.py (2)
31-31: LGTM! Proper import of new constants for testing.
393-645: Excellent comprehensive test coverage for license source filtering!The test suite thoroughly covers:
- Default behavior and explicit None handling
- Individual license sources (component_declared, license_file, file_header, scancode)
- Multiple source combinations
- All valid sources
- Integration with markdown output
- Interaction with include/exclude filters
- No-copyleft scenarios
The tests validate both the filtering logic and integration with other features.
src/scanoss/scanners/folder_hasher.py (1)
160-191: Context-managed progress bar looks correctWrapping the hashing loop in
with Bar(...) as bar:keeps behavior the same while ensuring the bar is cleaned up on exceptions/KeyboardInterrupt, andbar.next()is still called once per file (even on errors). No issues spotted in the refactor.src/scanoss/scanner.py (1)
785-852: Bar/nullcontext handling inscan_wfp_fileis correct
bar_ctx = Bar('Scanning', max=file_count) if (not self.quiet and self.isatty) else nullcontext()pluswith bar_ctx as bar:andif bar: bar.next(...)gives you a progress bar only in interactive, non-quiet runs and guarantees teardown on errors. The batching/scan logic around it is unchanged, so behavior should remain identical aside from the more robust bar lifecycle.CHANGELOG.md (1)
10-28: Unreleased notes accurately describe the new copyleft behaviorThe
--license-sourcesoption, OSADL switch, expanded copyleft coverage, new defaults, and the Ctrl+C cursor fix are clearly explained and aligned with the implementation. No changes needed here.src/scanoss/inspection/utils/license_utils.py (1)
25-27: OSADL-based copyleft logic and filter precedence look correctInitializing
Osadlonce perLicenseUtil, resetting the include/exclude/explicit sets ininit(), and applying precedence inis_copyleftas:
- explicit-only if provided,
- include overrides (force copyleft),
- exclude overrides (force non-copyleft),
- otherwise defer to OSADL,
matches the intended behavior and plays nicely with repeated
init()calls. This is a clear improvement over the previous hard-coded list approach.Also applies to: 34-36, 40-47, 47-76, 76-107
src/scanoss/cli.py (1)
56-69: New copyleft constants import looks consistentImporting
DEFAULT_COPYLEFT_LICENSE_SOURCESandVALID_LICENSE_SOURCEShere is appropriate and keeps all inspect‑related defaults centralized inconstants.py. No issues from this change alone.src/scanoss/inspection/utils/scan_result_processor.py (1)
161-190: Use of_select_licensesin_append_license_to_componentis correctThe new call to
_select_licenses(new_component['licenses'])is placed after the early‑return whenlicensesis missing, so it won’t seeNoneand won’t regress the existing warning path. Looping overselected_licensespreserves the previous counting behavior while allowing the new filtering/priority semantics.No functional issues spotted here.
src/scanoss/osadl.py (1)
31-54: Overall Osadl design and shared caching look goodThe
Osadlclass cleanly encapsulates OSADL copyleft data, and the class‑level_shared_copyleft_data/_data_loadedpattern avoids repeated file I/O across instances. Debug printing viaprint_debugandprint_stderris consistent with the rest of the codebase.No structural issues here.
6db2eac to
a42511a
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/test_osadl.py (1)
83-99: Fix the vacuous assertion to verify shared data.Line 98 contains
self.assertIs(Osadl._shared_copyleft_data, Osadl._shared_copyleft_data)which always passes. To properly test that multiple instances share the same class-level data, use:- # Verify both instances reference the same class-level data - self.assertIs(Osadl._shared_copyleft_data, Osadl._shared_copyleft_data) + # Verify both instances reference the same class-level shared data + self.assertIs(osadl1._shared_copyleft_data, osadl2._shared_copyleft_data)src/scanoss/osadl.py (1)
73-73: Critical: Fix resource loading path.Using
__name__(which resolves to"scanoss.osadl") will look for the data file atscanoss/osadl/data/osadl-copyleft.json, but the actual file is atscanoss/data/osadl-copyleft.json. This will cause data loading to fail silently.Use
__package__instead to anchor at the package level:- f_name = importlib_resources.files(__name__) / 'data/osadl-copyleft.json' + f_name = importlib_resources.files(__package__) / 'data/osadl-copyleft.json'
🧹 Nitpick comments (1)
src/scanoss/scanner.py (1)
450-450: Consider fixing typos in error messages.While reviewing, I noticed two typos in error messages (pre-existing, but worth fixing):
- Line 450: "encounted" → "encountered"
- Line 940: "uncounted" → "encountered"
Apply this diff:
- self.print_stderr('Warning: Some errors encounted while scanning. Results might be incomplete.') + self.print_stderr('Warning: Some errors encountered while scanning. Results might be incomplete.')And:
- 'Warning: Some errors uncounted while scanning. Results might be incomplete.' + 'Warning: Some errors encountered while scanning. Results might be incomplete.'Also applies to: 940-940
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
CHANGELOG.md(2 hunks)README.md(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/data/osadl-copyleft.json(1 hunks)src/scanoss/inspection/utils/license_utils.py(1 hunks)src/scanoss/osadl.py(1 hunks)src/scanoss/scanner.py(5 hunks)tests/test_osadl.py(1 hunks)tests/test_policy_inspect.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/scanoss/data/osadl-copyleft.json
🧰 Additional context used
🧬 Code graph analysis (5)
src/scanoss/osadl.py (2)
src/scanoss/scanossbase.py (1)
ScanossBase(28-107)src/scanoss/inspection/utils/license_utils.py (1)
is_copyleft(76-108)
tests/test_policy_inspect.py (2)
src/scanoss/inspection/policy_check/scanoss/copyleft.py (2)
Copyleft(50-240)run(220-240)src/scanoss/inspection/policy_check/policy_check.py (2)
run(87-102)PolicyStatus(33-44)
src/scanoss/inspection/utils/license_utils.py (2)
src/scanoss/osadl.py (2)
Osadl(33-120)is_copyleft(97-120)src/scanoss/scanossbase.py (2)
ScanossBase(28-107)print_debug(58-63)
src/scanoss/scanner.py (3)
src/scanoss/file_filters.py (2)
get_filtered_files_from_folder(311-352)get_filtered_files_from_files(354-404)src/scanoss/threadedscanning.py (3)
stop_scanning(154-158)queue_add(141-149)run(168-191)src/scanoss/scanossapi.py (1)
scan(134-244)
tests/test_osadl.py (1)
src/scanoss/osadl.py (2)
Osadl(33-120)is_copyleft(97-120)
🪛 GitHub Actions: Lint
src/scanoss/osadl.py
[error] 26-26: F401 'sys' imported but unused. Remove unused import.
🪛 LanguageTool
README.md
[uncategorized] ~140-~140: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...vecommons.org/licenses/by/4.0/) by the [Open Source Automation Development Lab (OSADL) eG](...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[uncategorized] ~142-~142: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...g/). Attribution: A project by the Open Source Automation Development Lab (OSADL) eG. ...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
CHANGELOG.md
[uncategorized] ~19-~19: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... - Copyleft detection now uses [OSADL (Open Source Automation Development Lab)](https://ww...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (11)
README.md (1)
139-142: LGTM! Dataset attribution properly documented.The license notice clearly documents the CC-BY-4.0 licensed OSADL data and provides proper attribution as required. The static analysis hint about hyphenating "Open Source" is a false positive—this is the official name of the organization.
src/scanoss/__init__.py (1)
25-25: LGTM! Version correctly updated.Version bump to 1.41.0 is consistent with the release branch and CHANGELOG.md.
tests/test_policy_inspect.py (2)
31-31: LGTM! Correct imports for license source filtering.The imports of
DEFAULT_COPYLEFT_LICENSE_SOURCESandVALID_LICENSE_SOURCESalign with the new license source filtering feature.
393-646: Excellent test coverage for license source filtering.The test suite comprehensively covers:
- Default behavior with
DEFAULT_COPYLEFT_LICENSE_SOURCES- Explicit
Nonehandling- Individual source filtering (component_declared, license_file, file_header)
- Multiple source combinations
- All valid sources
- Integration with output formats (json, md, jira_md)
- Interaction with include/exclude filters
- Empty result scenarios
The tests properly validate both the filtering behavior and the expected component counts for each scenario.
CHANGELOG.md (1)
11-31: LGTM! Comprehensive changelog documentation.The 1.41.0 release notes clearly document:
- New
--license-sourcesoption with usage details- Migration to OSADL authoritative data with expanded coverage (21→32 licenses)
- Default behavior change to reduce noise from file-level detections
- Bug fix for terminal cursor disappearing
The documentation is clear and will help users understand the changes and migration path.
src/scanoss/scanner.py (1)
29-29: Excellent refactoring of spinner lifecycle management.The migration to context-managed spinners and progress bars using
withstatements andnullcontext()as a fallback ensures proper cleanup regardless of how the code exits (normal completion, exception, or Ctrl+C). This directly addresses the terminal cursor issue mentioned in the CHANGELOG.The pattern is consistently applied across all scanning paths:
scan_folder(lines 367-424)scan_files(lines 637-694)scan_wfp_file(lines 785-851)wfp_folder(lines 1060-1069)This is a solid improvement in resource management.
Also applies to: 367-424, 637-694, 785-851, 1060-1069
src/scanoss/inspection/utils/license_utils.py (3)
25-45: Well-designed integration of OSADL data.The refactored
LicenseUtilclass cleanly integrates OSADL authoritative data through the newOsadlinstance while maintaining backward compatibility through filter sets. The class-level design allows for efficient data sharing across instances.
47-74: Clear and well-documented filter initialization.The
init()method properly implements the filter precedence:
- Explicit licenses (when provided) ignore OSADL data entirely
- Include/exclude filters overlay on top of OSADL data
- All license IDs are normalized to lowercase for consistent comparison
The early return when explicit licenses are provided is a good optimization.
76-108: Improved copyleft determination logic with clear precedence.The refactored
is_copyleft()method implements a clear and well-documented precedence order:
- Explicit filter (if set) → uses only those licenses
- Include filter → forces copyleft=True
- Exclude filter → forces copyleft=False
- No filters → delegates to OSADL authoritative data
This design provides flexibility while maintaining deterministic behavior, and the comprehensive docstring clearly explains the logic flow.
src/scanoss/osadl.py (2)
33-55: LGTM! Well-designed singleton-like data loading.The class-level caching approach (
_shared_copyleft_dataand_data_loaded) ensures the JSON file is loaded only once and shared across all instances, which is efficient and appropriate for read-only reference data.
97-120: LGTM! Robust copyleft determination logic.The
is_copyleft()method properly:
- Handles None/empty inputs
- Normalizes license IDs to lowercase for case-insensitive comparison
- Treats both "Yes" and "Yes (restricted)" as copyleft (case-insensitive)
- Provides debug feedback for missing data
The implementation aligns well with OSADL's data structure.
| """ | ||
|
|
||
| import json | ||
| import sys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused sys import.
The sys module is imported but never used in this file. This is flagged by the linter and should be removed.
Apply this diff:
import json
-import sys
import importlib_resources📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import sys | |
| import json | |
| import importlib_resources |
🧰 Tools
🪛 GitHub Actions: Lint
[error] 26-26: F401 'sys' imported but unused. Remove unused import.
🤖 Prompt for AI Agents
In src/scanoss/osadl.py around line 26, the file imports the unused module
`sys`; remove the `import sys` line to satisfy the linter and eliminate the
unused import.
Summary by CodeRabbit
New Features
--license-sourcesoption to filter copyleft inspection by source types (component_declared, license_file, file_header, file_spdx_tag, scancode)Bug Fixes
Documentation