feat(python): collect SOURCES-listed package files #321

Merged: mstykow merged 3 commits into main from feat/python-sources-collection-followup on Mar 12, 2026

mstykow (Owner) commented Mar 12, 2026

Summary

  • collect entries from the SOURCES.txt file that sits next to a source-layout .egg-info/PKG-INFO file, so that standalone source-package metadata exposes explicit ancillary file references
  • resolve those SOURCES.txt references against the scanned source-tree files and assign the matches to the assembled Python package, without broad whole-tree harvesting
  • update the Python improvement docs and workboard to record this PR as the narrowed follow-up for #144, while #149 stays open for a later, evidence-driven wheel-vs-sdist slice
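
For readers who have not opened the diff, here is a minimal sketch of the first two points. It is not the code in this PR: the function names are hypothetical, and it assumes the caller already knows which directory the SOURCES.txt entries are relative to (`project_root`). It covers finding the sibling file, reading one path per line, and rejecting entries that would leave the project root.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: path of the SOURCES.txt beside a given PKG-INFO file.
fn sibling_sources_path(pkg_info: &Path) -> Option<PathBuf> {
    pkg_info.parent().map(|dir| dir.join("SOURCES.txt"))
}

/// Read SOURCES.txt content as one path per line, skipping blank lines.
fn parse_sources_entries(content: &str) -> Vec<String> {
    content
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Join `entry` onto `project_root` lexically (no filesystem access).
/// Returns None for absolute paths and for `..` segments that would climb
/// above the root. Assumption: the caller supplies the correct root.
fn resolve_within_root(project_root: &Path, entry: &str) -> Option<PathBuf> {
    let mut parts: Vec<&OsStr> = Vec::new();
    for component in Path::new(entry).components() {
        match component {
            Component::Normal(part) => parts.push(part),
            Component::CurDir => {}
            Component::ParentDir => {
                // Popping from an empty stack means the entry escapes the root.
                parts.pop()?;
            }
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    let mut resolved = project_root.to_path_buf();
    resolved.extend(parts);
    Some(resolved)
}
```

Under these assumptions, `resolve_within_root(root, "../outside.py")` returns `None`, so the entry is never matched against a scanned file. That is the behaviour the test name `test_python_sources_file_references_do_not_escape_project_root` suggests, though the real implementation may enforce it differently.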

Notes

  • this PR is intentionally narrow: it supports .egg-info/PKG-INFO + sibling SOURCES.txt and file-reference-driven source-layout assignment
  • it does not implement generic whole-tree Python file harvesting
  • it does not attempt speculative wheel-vs-sdist parity work; #149 remains open
  • installed-layout metadata assignment from RECORD / installed-files.txt was handled separately in #320 (fix(python): assign installed metadata files in scans) and is not reworked here
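
To make the "file-reference-driven, not whole-tree" scope concrete, here is a small illustrative sketch. The types and names are hypothetical and much simpler than whatever the assembler actually uses. A scanned file is assigned to the package only if a file reference names it; everything else in the tree is left alone.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Map each referenced path that was actually scanned to the package id.
/// Scanned files that no reference names stay unassigned, and references
/// to files that were not scanned are ignored.
/// All identifiers here are illustrative, not the project's API.
fn assign_referenced_files(
    scanned_files: &BTreeSet<String>,
    package_uid: &str,
    file_references: &[String],
) -> BTreeMap<String, String> {
    file_references
        .iter()
        .filter(|path| scanned_files.contains(path.as_str()))
        .map(|path| (path.clone(), package_uid.to_owned()))
        .collect()
}
```

With scanned files `setup.py`, `src/demo/__init__.py` and `tests/test_demo.py`, and references to only the first two, the result holds two assignments and `tests/test_demo.py` stays unassigned.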

Verification

  • cargo fmt --all -- --check
  • cargo test test_extract_pkg_info_reads_sibling_sources_txt --lib
  • cargo test python_pkg_info_scan_assigns_sources_entries --bin scancode-rust
  • cargo test test_python_sources_file_references_do_not_escape_project_root --lib
  • cargo test test_extract_pkg_info_reads_sibling_installed_files_txt --lib
  • cargo test test_extract_metadata_reads_sibling_record_csv --lib
  • cargo test test_resolve_python_metadata_file_references --lib
  • cargo test test_resolve_python_pkg_info_installed_files_references --lib
  • cargo test test_resolve_python_metadata_file_references_in_dist_packages --lib
  • cargo test test_python_metadata_file_references_do_not_assign_outside_packages_dirs --lib
  • cargo test python_metadata_scan_assigns_referenced_site_packages_files --bin scancode-rust
  • cargo test python_pkg_info_scan_assigns_installed_files_entries --bin scancode-rust
  • cargo test python --lib
  • cargo test --features golden-tests python_golden --lib
  • cargo build
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test assembly::assemblers::tests::test_every_datasource_id_is_accounted_for --lib
  • npm run check:docs

Closes #144

mstykow merged commit 3c224b8 into main on Mar 12, 2026
6 checks passed
mstykow deleted the feat/python-sources-collection-followup branch on March 12, 2026 at 20:03

Linked issue: Improve Python: Follow-up ancillary package file collection
