feat: add Ruff code quality guardrails (replaces black + isort)#999
feat: add Ruff code quality guardrails (replaces black + isort)#999sergio-sisternes-epam wants to merge 8 commits intomainfrom
Conversation
299cb3e to
076efaa
Compare
APM Review Panel VerdictDisposition: REQUEST_CHANGES (two required pre-merge actions: SHA-pin Per-persona findingsPython Architect: This is a toolchain modernization PR -- no class hierarchy is restructured, no new abstract base is introduced, and no execution-path fork is added. The one semantic change is an f-string variable extraction in 1. OO / class diagramclassDiagram
direction TB
class pyproject_toml {
<<Configuration>>
+line-length : 100
+target-version : py310
+select : 10 rule sets
+lint.pylint thresholds
+lint.mccabe thresholds
}
class ci_yml {
<<IOBoundary>>
+lint job
+build-and-test job
}
class ruff_action {
<<ExternalDep>>
astral-sh/ruff-action@v3
}
class AuditReporter {
<<Pure>>
+generate_report() str
-_format_cell(x str) str
}
class pre_commit_config {
<<Configuration>>
+ruff hook
+ruff-format hook
}
pyproject_toml ..> ci_yml : governs thresholds
ci_yml ..> ruff_action : invokes
pre_commit_config ..> ruff_action : mirrors
class pyproject_toml:::touched
class ci_yml:::touched
class AuditReporter:::touched
class pre_commit_config:::touched
classDef touched fill:#fff3b0,stroke:#d47600
2. Execution flow diagramflowchart TD
PR[PR push / merge_group] --> LINT[lint job -- new]
LINT --> RL["[EXEC] astral-sh/ruff-action@v3\nruff check src/ tests/"]
LINT --> RF["[EXEC] astral-sh/ruff-action@v3\nruff format --check src/ tests/"]
LINT --> YE["[I/O] YAML encoding safety\ngrep for non-ASCII bytes"]
LINT --> FL["[I/O] File length guardrail\nawk max 2600 lines"]
LINT --> NRS["[I/O] No raw str(relative_to)\ngrep check"]
RL -->|fail -- ~3s| FAIL[CI fails fast]
RF -->|fail -- ~3s| FAIL
YE -->|fail| FAIL
FL -->|fail| FAIL
NRS -->|fail| FAIL
RL -->|pass| BT
RF -->|pass| BT
YE -->|pass| BT
FL -->|pass| BT
NRS -->|pass| BT
BT["build-and-test job\nuv run pytest tests/unit tests/test_console.py -n auto --dist worksteal\n4029 tests pass"]
BT -->|pass| DONE[Ready for merge]
style FAIL fill:#ffcccc
style DONE fill:#ccffcc
style LINT fill:#fff3b0,stroke:#d47600
Design patterns
Thresholds set just above current worst-case maximums is architecturally sound for a first-pass gate. The Python 3.10 compatibility bug caught by CLI Logging Expert: DevX UX Expert: No user-facing CLI surface changes. The PR is developer-infrastructure only. Positive contributor DX signals: Minor concern: No quickstart, README, or Supply Chain Security Expert: Critical finding -- # Required form:
uses: astral-sh/ruff-action@f6dc84c3be43b9ba22f1d1e0bd3bd11b96fb3f25 # v3.x.xLook up the current SHA with Secondary (informational): Positive findings:
Auth Expert: Fast-path trigger fired: Reviewed changes:
AuthResolver precedence chain: unchanged. Token fallback semantics: unchanged. Host classification: unchanged. Credential fill behavior: unchanged. No auth regression. No credential exposure surface added. OSS Growth Hacker: Three growth-relevant signals for CEO consideration:
Side-channel to CEO: The strangler-fig roadmap table (Stages 1-4 with exact file counts) is an unusually contributor-friendly artifact -- it tells future contributors exactly which files to target and what the expected complexity reduction is. Worth calling out in the release note as "we published our technical debt backlog." This lowers the barrier for a first contribution. CEO arbitrationAll six specialists agree on the substance: this is a well-executed toolchain modernization from an external contributor that improves code quality without introducing regressions. The specialists do not disagree with each other -- the only actionable finding is the Supply Chain Security expert's SHA-pin requirement, which is correct and non-negotiable. The fix is mechanical (30 seconds to look up the SHA and substitute it in two places in The CHANGELOG omission is also required: per repo convention, every PR changing code, tests, docs, or dependencies gets a CHANGELOG entry. This PR replaces two dev dependencies, adds a CI job, and changes the line-length policy -- all three are developer-visible. The entry should credit the external contributor. The strangler-fig threshold approach (thresholds at current-worst-plus-one, published tightening roadmap) is the correct strategic call for a growing OSS codebase: it stops quality regression immediately without requiring a big-bang refactor that would block this PR for months. Stage 1 (8 files, dominated by The Final call: Two mechanical pre-merge actions (SHA-pin, CHANGELOG). Everything else merges as-is. Required actions before merge
Optional follow-ups
|
2a4474d to
dc6553c
Compare
Replace black + isort with Ruff as the sole linter and formatter. Configure comprehensive rule set (E, F, I, UP, B, SIM, RUF, S, PLR, C90) with empirically-derived complexity thresholds set just above current codebase maximums to prevent regression while allowing existing code. Changes: - pyproject.toml: Replace [tool.black]/[tool.isort] with [tool.ruff], remove black/isort from dev deps, add ruff>=0.11.0 - .github/workflows/ci.yml: Add fast lint job using astral-sh/ruff-action@v3, move grep-based checks from build-and-test to lint job - .pre-commit-config.yaml: Add optional pre-commit hooks (ruff + ruff-format) - CONTRIBUTING.md, docs: Update coding style references from black/isort to Ruff - src/, tests/: Reformat to line-length=100, apply safe auto-fixes, add noqa directives for existing violations to address in future PRs - src/apm_cli/security/audit_report.py: Fix Python 3.10 compat bug (backslash in f-string only valid in 3.12+) All 4029 tests pass. Ruff check and format are fully clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add awk-based file-length check in CI lint job (max 2600 lines; current worst: 2527 in github_downloader.py) - Enable B904 (raise-without-from-inside-except): add 60 targeted noqa directives for existing violations - Enable E402 (module-import-not-at-top): add 8 targeted noqa directives for intentional lazy imports - Reorganise ignore list into permanent vs deferred categories Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop astral-sh/ruff-action@v3 in favour of uv run ruff, which resolves the pinned version from uv.lock instead of querying the GitHub Releases API at runtime. This eliminates a transient failure mode (API outages) and removes a third-party action from the trust surface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
uv run without --extra dev only installs runtime dependencies. Ruff lives under [project.optional-dependencies] dev. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
publisher.py and discovery.py use yaml.safe_dump/yaml.dump to produce strings (no file handle argument). The grep-based YAML encoding safety check false-positives on the keyword arguments. Add yaml-io-exempt comments to suppress. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… 2917) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d0b1f26 to
a68d11c
Compare
- Add CHANGELOG.md entry under [Unreleased] crediting @sergio-sisternes-epam - Update pre-commit ruff rev from v0.11.12 to v0.15.12 (matches uv.lock) - Clarify in CONTRIBUTING.md that CI is authoritative gate Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Panel feedback -- all items addressed
|
Main was refactored since original thresholds were set. Update to current empirical maximums: - max-statements: 300 -> 250 (was mcp_integrator 285, now install.py 244) - max-complexity: 110 -> 65 (was mcp_integrator 104, now validation.py 62) - max-branches: 130 -> 70 (was mcp_integrator 128, now install.py 67) - max-returns: 20 -> 18 (was target_detection 19, now publisher.py 16) - max-args: 16 -> 18 (was services.py 15, now install.py 16) Remove 4 now-unnecessary noqa: PLR0913 directives. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Replace
black+isortwith Ruff as the sole linter and formatter for the APM codebase. Add a deterministic CI lint gate, configure 10 rule sets with empirically-derived complexity thresholds, and establish a "strangler fig" roadmap for progressively tightening standards over time.523 files changed | 6564 tests pass | CI lint job runs in ~3s
Motivation
APM had
blackandisortas dev dependencies but neither was enforced in CI -- they were purely advisory. This PR introduces a proper quality gate that:pyproject.toml, no.flake8, no.isort.cfg, no.pylintrcWhy Ruff, not black + isort + pylint?
Irule, isort-compatible)Erules)F401)B-- bugbear)S-- bandit)PLR,C90)Ruff is a strict superset of what we had. The formatting output is black-compatible, the import sorting is isort-compatible. We gain 7 additional rule categories and a proper CI gate. Nothing is lost.
What changed
Configuration (
pyproject.toml)Removed
[tool.black]and[tool.isort]sectionsRemoved
blackandisortfrom dev dependenciesAdded
ruff>=0.11.0to dev dependenciesAdded
[tool.ruff]with 10 rule sets:EFIUPBSIMRUFSPLRC90Line length: 88 --> 100
Changed from black's default 88 to 100. This is the primary cause of the large diff (345 files reformatted). The wider line length reduces unnecessary line wrapping in a CLI codebase with many long path strings and f-string messages.
CI (
ci.yml)Added a new
lintjob that runs in parallel withbuild-and-test:The lint job uses
uv run --extra dev ruffwhich runs the lockfile-pinned Ruff version (currently 0.15.12) -- same trust chain as the build job, no third-party GitHub Action needed, no GitHub API calls at runtime.Pre-commit (
.pre-commit-config.yaml)Added optional pre-commit hooks for local development. Install with
uv run pre-commit install. This is optional -- CI is the authoritative gate. The pre-commit hook rev may lag behind the CI version.Bug fix (
audit_report.py)Fixed a real Python 3.10 compatibility bug discovered by Ruff's
target-version = "py310"setting. A backslash inside an f-string (f"...{x.replace('|', '\\|')}...") is only valid in Python 3.12+. Extracted to a local variable.Complexity thresholds -- the "strangler fig" approach
Current thresholds (set in this PR)
We ran Ruff against the entire codebase with
max-*=1and extracted every function's actual metric values using JSON output. Thresholds were set just above the current worst offenders -- existing code passes, but new code cannot make things worse:awkgithub_downloader.pyPLR0915commands/install.pyC901install/validation.pyPLR0912commands/install.pyPLR0913commands/install.pyPLR0911marketplace/publisher.pyTightening roadmap (simulation results)
We simulated progressively stricter thresholds to identify which files would break at each stage:
Stage 1 -- quick wins (8 files)
Only the worst outliers. See follow-up issue #1004 for details.
commands/install.pyinstall/validation.pyintegration/mcp_integrator.pycommands/compile/cli.pydeps/github_downloader.pycommands/audit.pymarketplace/publisher.pycore/target_detection.pyStage 2 -- meaningful refactoring (34 files)
Hits real structural debt. Top offender clusters:
install.py,deps/cli.py,audit.py,uninstall/cli.py(large Click handlers doing too much)mcp_integrator.py,skill_integrator.py,hook_integrator.py(god-class integrators)github_downloader.py(2917 lines),plugin_parser.pyStage 3 -- disciplined codebase (58 files)
Industry best practices territory (50 stmts, 20 complexity). Nearly every command and integrator needs splitting. Each file becomes genuinely readable and testable in isolation.
Target -- gold standard (79 files)
400-line files, 30-statement functions, complexity 15. This is what a mature, well-decomposed CLI looks like.
Key insight:
commands/install.pyis the current single worst outlier across statements, complexity, branches, and args. Refactoring it alone would unlock Stage 1 across multiple metrics.Globally ignored rules
Permanently ignored (intentional patterns in a CLI tool)
PLR2004E501SIM117withstatements -- style preference, 65 violationsS110try-except-pass-- intentional graceful degradation in CLI error handlingS602subprocesswithshell=True-- intentional in a CLI wrapping git/pipDeferred to future PRs
RUF003+RUF002SIM102ifstatementsExisting violations --
# noqadirectives908
# noqadirectives across the codebase. Each names the specific rule code for searchability. Top categories:F401UP035typing.Optional)RUF013Optional(e.g.def f(x: str = None))F841B904raisewithoutfromin except (needs manualfrom err/from Nonedecision)RUF059These are all pre-existing patterns. The noqa directives make them visible and searchable -- each can be addressed in a dedicated cleanup PR by grepping for the rule code.
Follow-up issues
plugin_parser.py(pre-existing)good first issue)How to reproduce the simulation
Testing
ruff check src/ tests/-- all checks passedruff format --check src/ tests/-- 546 files cleanuv run pytest tests/unit tests/test_console.py -x-- 6564 passed, 0 failed