refactor(python): split monolithic python.rs into nested submodules #725

Merged
abraemer merged 2 commits into main from refactor/python-parser-submodules
Apr 17, 2026

@abraemer
Collaborator

Summary

  • Split the 5,407-line python.rs into a python/ directory with surface-focused submodules, following ADR 0009
  • Resolve circular dependencies between groups by relocating cross-cutting helpers (e.g., extract_setup_cfg_dependency_name → utils, pip-origin-json items → archive, detect_pkg_info_datasource_id → utils, apply_project_url_mappings/parse_setup_cfg_keywords → utils)
  • Co-locate test files inside the python/ directory per ADR 0009 (python_test.rs → python/test.rs, python_scan_test.rs → python/scan_test.rs)

Changes

| New File | Contents |
| --- | --- |
| python/mod.rs | PythonParser struct, PackageParser dispatcher, re-exports, register_parser! |
| python/utils.rs | Shared helpers: dependency building, normalization, build_pypi_urls, read_toml_file, has_private_classifier, etc. |
| python/archive.rs | sdist/wheel/egg archive extraction, pip-origin-json parsing |
| python/rfc822_meta.rs | PKG-INFO/METADATA RFC 822 parsing |
| python/pyproject.rs | pyproject.toml + Poetry parsing |
| python/setup_py.rs | AST and regex setup.py parsing |
| python/setup_cfg.rs | setup.cfg INI parsing |
| python/pypi_json.rs | pypi.json + pip-inspect parsing |

Clean dependency order: utils → setup_cfg → archive → rfc822_meta → pyproject → setup_py → pypi_json → dispatcher
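
For readers who want a concrete picture of the layering, here is a minimal sketch. It uses inline modules as stand-ins for the files under python/, and every function name, enum, and file-name matching rule in it (`dependency_name`, `dependency_names`, `first_dependency_name`, `Surface`, `surface_for`) is hypothetical and not taken from the real code. The point it illustrates is only that a helper needed by more than one surface sits in `utils`, so no surface module has to import from a sibling, and that the dispatcher sits on top of all of them.

```rust
// Sketch only: inline modules stand in for the files under python/.
// Every function name and body here is hypothetical, not the real parser code.
pub mod python {
    /// Bottom layer: helpers shared by several surfaces.
    pub mod utils {
        /// Hypothetical helper: keep only the package name of a requirement
        /// string such as "requests>=2.0" or "rich[jupyter]; python_version<'3.9'".
        pub fn dependency_name(requirement: &str) -> String {
            requirement
                .trim()
                .split(|c: char| "<>=!~;[ ".contains(c))
                .next()
                .unwrap_or("")
                .to_string()
        }
    }

    /// Imports from `utils` only.
    pub mod setup_cfg {
        use super::utils;

        /// Hypothetical: collect non-empty dependency names from requirement lines.
        pub fn dependency_names(lines: &[&str]) -> Vec<String> {
            lines
                .iter()
                .map(|line| utils::dependency_name(line))
                .filter(|name| !name.is_empty())
                .collect()
        }
    }

    /// Also imports from `utils` only, so it never needs to reach into `setup_cfg`.
    pub mod pyproject {
        use super::utils;

        /// Hypothetical: name of the first declared dependency, if any.
        pub fn first_dependency_name(deps: &[&str]) -> Option<String> {
            deps.first().map(|dep| utils::dependency_name(dep))
        }
    }

    /// Which surface module would handle a file (assumed mapping).
    #[derive(Debug, PartialEq, Eq, Clone, Copy)]
    pub enum Surface {
        SetupCfg,
        SetupPy,
        Pyproject,
        Rfc822Meta,
        PypiJson,
        Archive,
    }

    /// Top layer: a dispatcher that routes by file name.
    pub fn surface_for(file_name: &str) -> Option<Surface> {
        match file_name {
            "setup.cfg" => Some(Surface::SetupCfg),
            "setup.py" => Some(Surface::SetupPy),
            "pyproject.toml" => Some(Surface::Pyproject),
            "PKG-INFO" | "METADATA" => Some(Surface::Rfc822Meta),
            "pypi.json" => Some(Surface::PypiJson),
            name if [".whl", ".egg", ".tar.gz"]
                .iter()
                .any(|ext| name.ends_with(*ext)) =>
            {
                Some(Surface::Archive)
            }
            _ => None,
        }
    }
}
```

In this sketch, `python::surface_for("pyproject.toml")` selects the `Pyproject` surface, and both `setup_cfg` and `pyproject` call `utils::dependency_name` rather than each other, which is the shape the relocations above aim for.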

Testing

  • All 128 Python tests pass (116 unit + 12 scan tests)
  • cargo clippy --all-targets -- -D warnings clean
  • cargo check --tests clean
  • Pre-commit hooks all pass (clippy, markdown-lint, prettier, rustfmt)

ADR

  • ADR 0009 added and marked Accepted — documents the nested-submodule convention, ~1,500 line threshold, visibility rules, and test co-location
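
As a rough illustration of what visibility rules and test co-location can look like in a nested-submodule layout, here is a small sketch. ADR 0009 is not quoted in this PR, so the specific choices below (a private `utils` module, `pub(super)` helpers, a `#[cfg(test)] mod test` declared by the parent) are assumptions, and `normalize_name` / `normalized_name` are hypothetical names. In a real directory layout the `mod test;` declaration would point at python/test.rs; it is inlined here to keep the example self-contained.

```rust
// Sketch only: assumed visibility and test wiring, not the text of ADR 0009.
pub mod python {
    // Private to `python`: callers outside the parser cannot reach it.
    mod utils {
        /// Hypothetical helper, visible to the parent module only.
        /// Lowercases a package name and collapses runs of '-', '_' and '.'
        /// into a single '-' (the PEP 503 normalization rule).
        pub(super) fn normalize_name(name: &str) -> String {
            let mut out = String::with_capacity(name.len());
            let mut prev_sep = false;
            for c in name.chars() {
                if matches!(c, '-' | '_' | '.') {
                    if !prev_sep {
                        out.push('-');
                    }
                    prev_sep = true;
                } else {
                    out.extend(c.to_lowercase());
                    prev_sep = false;
                }
            }
            out
        }
    }

    /// Public entry point re-exposing the helper's behaviour.
    pub fn normalized_name(name: &str) -> String {
        utils::normalize_name(name)
    }

    // Co-located tests: compiled only for `cargo test`, and able to see
    // the parent's private items through `use super::*`.
    #[cfg(test)]
    mod test {
        use super::*;

        #[test]
        fn collapses_separator_runs() {
            assert_eq!(utils::normalize_name("Foo__Bar.baz"), "foo-bar-baz");
        }
    }
}
```

Keeping the test module as a child of `python` is what lets it exercise non-public helpers without widening their visibility.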

Split the 5,407-line python.rs into python/ directory with surface-focused
submodules following ADR 0009:

- mod.rs: PythonParser struct, PackageParser dispatcher, re-exports
- utils.rs: shared helpers (dependency building, normalization, etc.)
- archive.rs: sdist/wheel/egg extraction, pip-origin-json parsing
- rfc822_meta.rs: PKG-INFO/METADATA RFC 822 parsing
- pyproject.rs: pyproject.toml + Poetry parsing
- setup_py.rs: AST and regex setup.py parsing
- setup_cfg.rs: setup.cfg INI parsing
- pypi_json.rs: pypi.json + pip-inspect parsing

Circular dependencies resolved by relocating cross-cutting helpers:
- extract_setup_cfg_dependency_name → utils
- pip-origin-json items → archive
- detect_pkg_info_datasource_id → utils
- apply_project_url_mappings, parse_setup_cfg_keywords → utils

Test files co-located in python/ per ADR 0009:
- python_test.rs → python/test.rs
- python_scan_test.rs → python/scan_test.rs
@abraemer abraemer force-pushed the refactor/python-parser-submodules branch from 9df34e3 to 7762aeb Compare April 17, 2026 09:54
Update parser structure guidance to reflect the nested-submodule
convention for large ecosystems:

- HOW_TO_ADD_A_PARSER.md: reference ADR 0009 for directory-structured
  parsers, update templates, module wiring, test file paths, and done
  checklist to include both flat and nested options
- ADR 0009: replace speculative python/ layout with actual structure
- add-parser SKILL.md: same updates, plus add ADR 0009 to references
@abraemer abraemer merged commit dfaf2de into main Apr 17, 2026
14 checks passed
@abraemer abraemer deleted the refactor/python-parser-submodules branch April 17, 2026 10:31