fix(ci): migrate sonarcloud to org workflow and pin @main to SHA#27
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (31)
WalkthroughPins many org-level reusable GitHub Actions workflows to a fixed commit SHA, refactors SonarCloud to call a pinned reusable workflow, standardizes UTC usage to timezone.utc across source/tests, updates dependencies, and adds template feedback entries. ChangesGitHub Actions Workflow Updates
Python UTC compatibility
Dependencies and Documentation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Dependency ReviewThe following issues were found:
License Issuespyproject.toml
OpenSSF Scorecard
Scanned Files
|
There was a problem hiding this comment.
Pull request overview
This PR updates the repository’s CI security posture and SonarCloud integration by migrating SonarCloud scanning to an org-level reusable workflow and pinning previously floating reusable workflow references to immutable commit SHAs.
Changes:
- Replaced the inline SonarCloud workflow implementation with a thin caller to the org reusable
python-sonarcloud.ymlworkflow (includingskip-if-no-tokenand non-blocking quality gate behavior). - Pinned org reusable workflow references that previously used
@mainto a specific commit SHA to reduce supply-chain risk. - Updated the main CI caller to disable inline SonarCloud handling and set the correct SonarCloud organization value.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/sonarcloud.yml | Migrates SonarCloud analysis to org reusable workflow with configured inputs/secrets. |
| .github/workflows/ci.yml | Pins org CI workflow to SHA; disables SonarCloud in CI reusable workflow and sets org name. |
| .github/workflows/codecov.yml | Pins org Codecov reusable workflow reference to SHA. |
| .github/workflows/container-security.yml | Pins org container security reusable workflow reference to SHA. |
| .github/workflows/coverage.yml | Pins org Qlty coverage reusable workflow reference to SHA. |
| .github/workflows/docs.yml | Pins org docs reusable workflow reference to SHA. |
| .github/workflows/mutation-testing.yml | Pins org mutation testing reusable workflow reference to SHA. |
| .github/workflows/performance-regression.yml | Pins org performance regression reusable workflow reference to SHA. |
| .github/workflows/publish-pypi.yml | Pins org PyPI publish reusable workflow reference to SHA. |
| .github/workflows/python-compatibility.yml | Pins org compatibility matrix reusable workflow reference to SHA. |
| .github/workflows/qlty.yml | Pins org Qlty coverage reusable workflow reference to SHA. |
| .github/workflows/release.yml | Pins org release reusable workflow reference to SHA. |
| .github/workflows/sbom.yml | Pins org SBOM reusable workflow reference to SHA. |
| .github/workflows/scorecard.yml | Pins org Scorecard reusable workflow reference to SHA. |
| .github/workflows/security-analysis.yml | Pins org security analysis reusable workflow reference to SHA. |
| .github/workflows/slsa-provenance.yml | Pins org SLSA reusable workflow reference to SHA. |
✅ FIPS Compatibility Check
Status: ✅ PASSED What is FIPS?FIPS 140-2/140-3 is a US government standard for cryptographic modules. Common issues:
|
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
Two more template defaults discovered while resolving SonarCloud and Python compatibility matrix failures on PR #27.
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
🎉 Performance Regression CheckStatus: PERFORMANCE IMPROVED
Threshold: +/-10% allowed regression ✅ Great work!: Performance has improved. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
|
@copilot resolve the merge conflicts in this pull request |
Resolved in
|
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
.github/workflows/mutation-testing.yml (1)
43-53:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid broad secret inheritance in this caller.
Line 53 inherits all repository secrets into the reusable workflow. Restrict to explicit secret mappings required by
python-mutation.yml.As per coding guidelines,
.github/workflows/**: Review GitHub Actions workflows for “Proper secret handling”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/mutation-testing.yml around lines 43 - 53, The workflow currently uses a broad secret inheritance via the secrets: inherit setting when calling the reusable workflow (python-mutation.yml); replace that with explicit secret mappings by identifying which secrets the reusable workflow actually needs (e.g., tokens or API keys referenced inside python-mutation.yml) and add only those as named mappings (for example GITHUB_TOKEN, PYPI_TOKEN, CUSTOM_API_KEY) under the call instead of inherit so only required secrets are passed to the reusable workflow; update the caller to remove secrets: inherit and add the minimal secrets: block with the exact secret names used by python-mutation.yml..github/workflows/container-security.yml (1)
42-55:⚠️ Potential issue | 🟠 Major | ⚡ Quick winLimit secret scope for reusable workflow calls.
Line 55 uses
secrets: inherit, which grants all repository secrets to the called workflow. Replace with explicit secret mapping. The reusable workflow only requiresDHI_USERNAMEandDHI_PAT:secrets: DHI_USERNAME: ${{ secrets.DHI_USERNAME }} DHI_PAT: ${{ secrets.DHI_PAT }}This follows the least-privilege principle per GitHub Actions security best practices.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/container-security.yml around lines 42 - 55, The workflow call using the reusable workflow (the uses: ByronWilliamsCPA/... line) currently uses "secrets: inherit" which exposes all repo secrets; replace that with an explicit secret mapping that only passes DHI_USERNAME and DHI_PAT (map DHI_USERNAME: ${{ secrets.DHI_USERNAME }} and DHI_PAT: ${{ secrets.DHI_PAT }}) so the reusable workflow receives only the needed credentials..github/workflows/python-compatibility.yml (1)
44-69:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove
secrets: inheritfrom this external reusable workflow call.This testing workflow has no apparent need for repository secrets. Inheriting all secrets (via
secrets: inherit) unnecessarily expands the blast radius if the external workflow is ever compromised. Follow least-privilege: explicitly pass only required secrets using thesecretskey, or omit it entirely if none are needed.Suggested change
uses: ByronWilliamsCPA/.github/.github/workflows/python-compatibility.yml@1b2d33c47cc11a96b9757b49f41873c54e75f57c # main with: @@ -66,7 +66,8 @@ # rag-processor uses hatchling as build backend, so editable installs # require the build step. The reusable defaults no-build: true; override. no-build: false - secrets: inherit + # Omit secrets entirely unless explicitly required by the reusable workflow. + # secrets: + # REQUIRED_SECRET: ${{ secrets.REQUIRED_SECRET }}As per coding guidelines,
.github/workflows/**should follow "Proper secret handling" and "Security best practices (minimal permissions)".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/python-compatibility.yml around lines 44 - 69, The call to the external reusable workflow currently uses "secrets: inherit", which exposes all repository secrets unnecessarily; remove the "secrets: inherit" line from the reusable workflow invocation (the block using "uses: ByronWilliamsCPA/.github/.github/workflows/python-compatibility.yml") or replace it with an explicit minimal mapping of only the specific secrets that workflow requires (e.g., "secrets: { REQUIRED_SECRET: ${{ secrets.REQUIRED_SECRET }}") if any are actually needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/template_feedback.md`:
- Around line 53-55: The fenced code block containing the error message "error:
Distribution `<project>==0.1.0 @ editable+.` can't be installed because it is
marked as `--no-build` but has no binary distribution" should include a language
specifier for proper syntax highlighting; update the opening triple-backtick for
that block to "```text" (and keep the closing "```") so the snippet is rendered
with the text language identifier.
---
Outside diff comments:
In @.github/workflows/container-security.yml:
- Around line 42-55: The workflow call using the reusable workflow (the uses:
ByronWilliamsCPA/... line) currently uses "secrets: inherit" which exposes all
repo secrets; replace that with an explicit secret mapping that only passes
DHI_USERNAME and DHI_PAT (map DHI_USERNAME: ${{ secrets.DHI_USERNAME }} and
DHI_PAT: ${{ secrets.DHI_PAT }}) so the reusable workflow receives only the
needed credentials.
In @.github/workflows/mutation-testing.yml:
- Around line 43-53: The workflow currently uses a broad secret inheritance via
the secrets: inherit setting when calling the reusable workflow
(python-mutation.yml); replace that with explicit secret mappings by identifying
which secrets the reusable workflow actually needs (e.g., tokens or API keys
referenced inside python-mutation.yml) and add only those as named mappings (for
example GITHUB_TOKEN, PYPI_TOKEN, CUSTOM_API_KEY) under the call instead of
inherit so only required secrets are passed to the reusable workflow; update the
caller to remove secrets: inherit and add the minimal secrets: block with the
exact secret names used by python-mutation.yml.
In @.github/workflows/python-compatibility.yml:
- Around line 44-69: The call to the external reusable workflow currently uses
"secrets: inherit", which exposes all repository secrets unnecessarily; remove
the "secrets: inherit" line from the reusable workflow invocation (the block
using "uses:
ByronWilliamsCPA/.github/.github/workflows/python-compatibility.yml") or replace
it with an explicit minimal mapping of only the specific secrets that workflow
requires (e.g., "secrets: { REQUIRED_SECRET: ${{ secrets.REQUIRED_SECRET }}") if
any are actually needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6d79777c-ffe3-404d-85a7-97c5f4dc2a98
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock,!**/*.lock
📒 Files selected for processing (30)
.github/workflows/ci.yml.github/workflows/codecov.yml.github/workflows/container-security.yml.github/workflows/coverage.yml.github/workflows/docs.yml.github/workflows/mutation-testing.yml.github/workflows/performance-regression.yml.github/workflows/pr-validation.yml.github/workflows/publish-pypi.yml.github/workflows/python-compatibility.yml.github/workflows/qlty.yml.github/workflows/release.yml.github/workflows/sbom.yml.github/workflows/scorecard.yml.github/workflows/security-analysis.yml.github/workflows/slsa-provenance.yml.github/workflows/sonarcloud.ymldocs/template_feedback.mdpyproject.tomlsrc/rag_processor/auth/cloudflare.pysrc/rag_processor/auth/models.pysrc/rag_processor/models/batch.pysrc/rag_processor/models/job.pysrc/rag_processor/queue/jobs.pysrc/rag_processor/queue/redis_store.pysrc/rag_processor/utils/time_utils.pysrc/rag_processor/websocket/events.pytests/unit/test_auth_middleware.pytests/unit/test_models.pytests/unit/test_websocket.py
ci.yml passed `sonarcloud-organization: 'williaby'` while sonarcloud.yml passes `sonar-organization: 'byronwilliamscpa'`. The conflict is the same issue documented in docs/template_feedback.md (entry "Stale sonar-organization default"): williaby is a leftover personal handle and the project actually lives under the byronwilliamscpa SonarCloud org. Although `enable-sonarcloud: false` in ci.yml currently makes the value unused at runtime, leaving the documented-wrong value in code contradicts the PR's stated goal of fixing the SonarCloud organization and will mislead future readers when ci.yml's enable flag flips back to true. Surface findings origin: PR #27 self-review (Critical tier). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address CodeRabbit's three outside-diff comments on PR #27 about `secrets: inherit` granting all repository secrets to org reusable workflows: - container-security.yml: replace `secrets: inherit` with an explicit `DHI_USERNAME` + `DHI_PAT` mapping. The org python-container-security.yml only consumes those two credentials. - mutation-testing.yml: drop `secrets: inherit`. python-mutation.yml does not need repository secrets for the mutation run; re-add a specific mapping if a future change adds an authenticated step. - python-compatibility.yml: drop `secrets: inherit`. The compatibility matrix runs pure unit tests against multiple Python versions and has no apparent secret dependency. Follows the least-privilege principle in OWASP / GitHub Actions security best practices: minimize the blast radius if an org reusable workflow is ever compromised. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…wing them The IGNORE_ARGS step in pr-validation.yml previously used `$(python3 -c "..." 2>/dev/null || true)` which silently swallowed any tomllib parse failure. If pyproject.toml became unreadable, pip-audit would run without the intended `[tool.pip-audit].ignore-vuln` flags and re-flag advisories that were deliberately ignored, causing a confusing "why is this suddenly failing" diagnosis with no audit-log trail. Replace the silent fallback with an explicit `if ! IGNORE_ARGS=$(...)` branch that: - Lets stderr from python3 surface in the workflow log - Emits a `::warning::` annotation so reviewers see the parse failure on the run summary - Resets IGNORE_ARGS to an empty string so pip-audit still runs and surfaces the real vulnerabilities loudly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The `UTC = timezone.utc # noqa: UP017` pattern is repeated across 11 files (8 src + 3 tests). The previous module docstring claimed "Python 3.12 idioms" which contradicted the backport. Without an inline explanation, future maintainers see only a suppressed ruff rule with no hint of when it can be removed. Add a comment on time_utils.py (the canonical home of UTC for the project) that: - States why the alias exists: `datetime.UTC` was added in Python 3.11 and `requires-python = ">=3.10"` still supports 3.10 - States the removal condition: when the project minimum moves to 3.11, switch to `from datetime import UTC` and drop the noqa annotations Also correct the module docstring from "Python 3.12 idioms" to "Python 3.10+ compatible" so the file's own description matches its actual constraint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- pr-validation.yml: build pip-audit --ignore-vuln flags as a bash array
and pass them quoted instead of relying on unquoted $IGNORE_ARGS
expansion. Current PYSEC-* IDs are shell-safe, but the array pattern
is robust against future entries with shell-significant characters.
- template_feedback.md: hard-wrap the new safety-removal section at the
120-char limit called out by the repo's markdown guideline.
Verified locally: pip-audit clean ('No known vulnerabilities found, 3
ignored').
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
d03effd to
c15bd5f
Compare
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
PR Fix SummaryRebased onto current Note on rebase: The original 14 PR commits were replayed onto the post-PR#33 / post-PR#20 Fixes appliedCritical
Important (CodeRabbit's 3 outside-diff comments)
Suggested
Rebase conflict resolutions
Skipped (already addressed in PR before review)
Remaining (manual follow-up recommended)
Pre-commit hooks passed locally on each commit. CI will re-run on push. Generated with Claude Code |
The rebase replay reintroduced a duplicate `no-build: false` entry at the end of the `with:` block (one came from PR #33's mainline edit, the other from this PR's earlier no-build sweep). YAML silently keeps the last duplicate, but Actions emits a warning and lint tools flag it. Drop the trailing duplicate so the key appears exactly once.
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
The org python-ci.yml reusable workflow at SHA 1b2d33c4 does not expose `enable-sonarcloud` or `sonarcloud-organization` inputs (see the `workflow_call.inputs` schema in the org repo at that pin). GitHub Actions rejects the entire workflow run at load time when a caller passes unknown inputs, producing `startup_failure` rather than a real job run. The cascading effect: the top-level `ci-gate` job never runs, which means no status is reported for the org ruleset's required `CI Gate` check, which leaves the PR in `mergeStateStatus: BLOCKED` indefinitely. The previous commits (this PR and a follow-up) tried to fix the wrong problem by changing the value of `sonarcloud-organization` from 'williaby' to 'byronwilliamscpa'. Neither value is recognized; the inputs themselves are not in the schema. The PR's stated intent (delegate SonarCloud to the dedicated sonarcloud.yml caller) is achieved by the absence of any inline SonarCloud configuration in ci.yml, not by passing flags the reusable workflow does not expose. Remove both lines entirely. Surface findings origin: PR #27 self-review CI Gate stuck-required-check diagnosis. Matches memory entry `project_reusable_input_schema_sha_pin`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address CodeRabbit's inline comment on PR #27 (pr-validation.yml:164): the previous version built a single space-separated string and expanded it unquoted with `# shellcheck disable=SC2086`. While the values (PYSEC-* IDs) are whitespace and glob-free in practice, the array form is the idiomatic shell pattern: each `--ignore-vuln` flag and its value arrives at pip-audit's argv as a discrete element, no word-splitting, no shellcheck disable needed. The python heredoc now emits two lines per vuln ID (one for the flag, one for the value); `mapfile -t IGNORE_ARGS` reads them into the array. The error-handling branch from the prior commit is preserved: a tomllib parse failure emits a `::warning::` annotation rather than silently running pip-audit without ignores. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address CodeRabbit's inline comment on PR #27 (docs/template_feedback.md): several lines in the newly-added template-feedback entries exceeded the 120-character convention CodeRabbit applies, even though the project's own `.markdownlint.json` disables MD013/line-length checking. - Hard-wrap the prose lines in the four new entries (no-build, sonar-organization, test-command word-splitting, atheris-3.10 wheel) to a 120-char column. - Convert the long brace-expansion file-path on the no-build entry to a bulleted list under a shared parent path; the resulting list is more readable and stays well under 120 chars per line. - Leave the literal pip error message inside the `text` code block unwrapped: wrapping inside a fenced code block would corrupt the reproduction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Follow-up: Root cause of stuck
|
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
…ter repos CodeRabbit was configured with `request_changes_workflow: true`, which makes CodeRabbit submit formal CHANGES_REQUESTED reviews. Combined with the default GitHub branch protection behavior (any reviewer with write access blocks merge until their review is dismissed or addressed), this caused PR #27 to stay in `mergeStateStatus: BLOCKED` after CodeRabbit filed its second review pass. Sister repos under ByronWilliamsCPA (audio-processor, fragrance-rater, cookiecutter-python-template, reference-library) all run with this setting disabled or omitted (default false). With the setting off, CodeRabbit findings still appear as inline comments and a review summary, but the review state is COMMENTED rather than CHANGES_REQUESTED, so it does not gate merge. Humans and Copilot keep their veto. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All actionable findings have been addressed in subsequent commits: bash-array refactor (commit 92b2473) for pr-validation.yml, prose hard-wrap (commit 747b9f9) for template_feedback.md, and the underlying ci.yml startup_failure fix (commit 25f42ef). CodeRabbit config also updated to disable request_changes_workflow (commit 8f9af6c) so future PRs do not block on CR reviews. Dismissing this stale review to clear the merge gate.
✅ Performance Regression CheckStatus: PERFORMANCE OK
Threshold: +/-10% allowed regression ✅ Performance is within acceptable range. Additional Metrics
About Performance Regression TestingThis automated check compares
To reproduce locally: uv run --frozen python scripts/benchmark.py --iterations 1000 |
|
PR #26 review (Important #1 and #2). #1 (bypass identity mismatch): verify_ws_token returned {"email": "anonymous@local", "user_id": "local"} in dev-bypass mode, while CloudflareAuthMiddleware._get_bypass_user returned email="dev@localhost" / user_id="dev-user-001". After batch_is_owned_by was introduced, a batch created over the HTTP path (creator = dev@localhost) could no longer be subscribed to over WebSocket (caller = anonymous@local) — ownership check failed. Align verify_ws_token to return the same identity the HTTP path uses, and (Suggested #4) emit the same CRITICAL log so production misconfiguration is loud on every WS upgrade too. #2 (unhandled network errors): verify_cloudflare_token can raise httpx.HTTPStatusError / ConnectError / TimeoutException and json.JSONDecodeError when the JWKS fetch fails. These aren't subclasses of jwt.InvalidTokenError, so they previously propagated out of verify_ws_token, closing the socket with 1011 (internal error) instead of 1008 (policy violation). The HTTP middleware already handles this with a broad except at cloudflare.py:145-147; add the matching fallback in the WS path. #6 (PII in success log): swap user_email out of the success-path WebSocket connection log for parity with the redacted unauthorized path. Tests: - Updated test_verify_ws_token_bypass_mode (test_websocket.py) + test_verify_ws_token_auth_disabled (test_websocket_router.py) to assert the dev@localhost / dev-user-001 identity. - New test_verify_ws_token_swallows_network_errors covering httpx.ConnectError → returns None. Docs: - SECURITY-FINDINGS.md §6 (Summary of fixes): note that sonarcloud.yml has been migrated on main (#27, 4055d57) so the SHA-pin from this PR's first commit will be overwritten on merge. - SECURITY-FINDINGS.md follow-ups: track tightening the batch_is_owned_by email fallback once all batches have user_id. 334 passed, 1 skipped (pre-existing). ruff/format clean.
* fix(security): close batch IDOR, harden CORS, pin CI actions
- api/batch.py, websocket/router.py: require ownership (created_by_user_id /
created_by_email) on GET /batch/{id}, GET /batch/job/{id}, and the
/ws/batch/{id} upgrade; non-owners get 404 / WS_1008_POLICY_VIOLATION so
batch existence is not leaked.
- core/config.py, main.py: read CORS origins from
RAG_PROCESSOR_CORS_ALLOWED_ORIGINS; refuse to start on "*" with
credentials; replace allow_methods/allow_headers wildcards with explicit
lists.
- auth/cloudflare.py: log CRITICAL on every request while bypass mode
(cloudflare_enabled=false) is active so a prod misconfig is loud.
- workflows/sonarcloud.yml: pin sonarsource/sonarqube-quality-gate-action
from @master to v1.2.0 SHA; add harden-runner to both jobs.
- workflows/dependency-review.yml: pin actions/dependency-review-action
from @v4 to v4.9.0 SHA; add harden-runner.
- SECURITY-FINDINGS.md: full review with severities and follow-ups.
* test(security): cover batch/WS ownership branches; fix ruff format
- ruff format ran across the new ownership condition in
websocket/router.py; commit the formatter output.
- New tests/unit/test_batch_authz.py exercises _user_owns_batch and the
owner / non-owner / missing-batch / missing-job / orphaned-job branches of
GET /batch/{batch_id} and GET /batch/job/{job_id}.
- Added a WebSocket non-owner test alongside the existing
rejects-unauthenticated / rejects-invalid-batch cases.
Total coverage 81.12% -> 82.76%. api/batch.py coverage 57% -> ~95%,
websocket/router.py 91% -> 96%. SonarCloud "new code coverage" should
now clear the 80% gate.
* ci: re-trigger checks after org workflow update
* security(cors): drop hardcoded http:// dev defaults, fail closed
SonarCloud python:S5332 (correctly) flagged the previous
cors_allowed_origins default list of `http://localhost:*` URLs as
insecure plaintext origins baked into production source. The defaults
also weakened the secure-by-default posture the rest of the PR was
aiming for: a misconfigured prod deploy would silently trust dev
origins.
- core/config.py: default cors_allowed_origins to an empty list. Empty
combined with the existing wildcard guard in main.py means the
backend fails closed; CORS preflights are denied until the operator
sets RAG_PROCESSOR_CORS_ALLOWED_ORIGINS explicitly.
- .env.example: document the recommended local-dev value so reviewers
/ new contributors aren't stuck guessing what the dev origins are.
- SECURITY-FINDINGS.md §2.2: updated to reflect the new default and
to note the SonarCloud rule that's now resolved.
* test(cors): set dev origins in conftest after empty-default change
The previous commit moved cors_allowed_origins to empty-by-default
(SonarCloud S5332 + secure-by-default), which broke the existing
TestCORSHeaders.test_cors_headers_present_on_get integration test:
the test sends Origin: http://localhost:3000 and asserts the
Access-Control-Allow-Origin header comes back, but with no origins
configured the middleware (correctly) returns no CORS headers.
CORSMiddleware reads allow_origins at app construction, so the env
var has to be set in conftest.py before the app module is imported -
the same pattern already used for RAG_PROCESSOR_RATE_LIMITING_ENABLED
and RAG_PROCESSOR_CLOUDFLARE_ENABLED.
* style(batch): use Annotated[] form for FastAPI Depends parameters
SonarCloud python:S8410 flagged the legacy
`user: CloudflareUser = Depends(get_current_user)` form on the two
endpoints added in this PR. The Annotated[] form is FastAPI's
recommended pattern and (per Sonar) avoids subtle issues with mutable
default values in function signatures.
No behavior change; tests/unit/test_batch_authz.py still passes its
mocked user via the same kwarg name.
* security(authz): consolidate ownership check + redact PII from logs
Addresses 6 CodeRabbit review comments on PR #26.
1. (security) src/rag_processor/websocket/router.py: the WS ownership
check used `owns_by_id OR owns_by_email`, so a caller whose email
coincidentally matched the batch owner's email could authenticate
as the owner even with a different Cloudflare user_id. The REST
path in api/batch.py had the correct user_id-takes-precedence
logic; the two paths could not diverge silently.
Extract the helper to auth/dependencies.py::batch_is_owned_by
(keyword-only requester_user_id / requester_email so both the REST
path with a CloudflareUser and the WS path with a verify-token
dict can call it). Both endpoints now share the same enforcement.
2. (privacy) api/batch.py + websocket/router.py: unauthorized-access
warnings previously logged owner_email, requester_email, and the
requester user_id. owner_email lets an attacker probing batch IDs
harvest owner identities from logs; requester_email is PII we
don't need for incident response when we already have
requester_user_id. Strip both fields; keep batch_id / job_id /
requester_user_id only.
3. (tests) tests/unit/test_batch_authz.py: add
test_user_id_mismatch_does_not_fall_back_to_email regression
covering the bypass described above.
4. (tests) tests/unit/test_websocket_router.py: add the WebSocket
variant — same email, mismatched user_id, assert close with
WS_1008_POLICY_VIOLATION and connection_manager.connect never
called.
5. (style) SECURITY-FINDINGS.md §"Summary of fixes applied in this
PR": replace the wide markdown table with a bulleted list so no
line exceeds the project's 120-char limit.
Tests: 333 passed, 1 skipped (pre-existing). ruff check + format
both clean.
* security(ws): align bypass identity + catch JWKS network errors
PR #26 review (Important #1 and #2).
#1 (bypass identity mismatch): verify_ws_token returned
{"email": "anonymous@local", "user_id": "local"}
in dev-bypass mode, while CloudflareAuthMiddleware._get_bypass_user
returned email="dev@localhost" / user_id="dev-user-001". After
batch_is_owned_by was introduced, a batch created over the HTTP
path (creator = dev@localhost) could no longer be subscribed to
over WebSocket (caller = anonymous@local) — ownership check failed.
Align verify_ws_token to return the same identity the HTTP path
uses, and (Suggested #4) emit the same CRITICAL log so production
misconfiguration is loud on every WS upgrade too.
#2 (unhandled network errors): verify_cloudflare_token can raise
httpx.HTTPStatusError / ConnectError / TimeoutException and
json.JSONDecodeError when the JWKS fetch fails. These aren't
subclasses of jwt.InvalidTokenError, so they previously propagated
out of verify_ws_token, closing the socket with 1011 (internal
error) instead of 1008 (policy violation). The HTTP middleware
already handles this with a broad except at cloudflare.py:145-147;
add the matching fallback in the WS path.
#6 (PII in success log): swap user_email out of the success-path
WebSocket connection log for parity with the redacted unauthorized
path.
Tests:
- Updated test_verify_ws_token_bypass_mode (test_websocket.py) +
test_verify_ws_token_auth_disabled (test_websocket_router.py) to
assert the dev@localhost / dev-user-001 identity.
- New test_verify_ws_token_swallows_network_errors covering
httpx.ConnectError → returns None.
Docs:
- SECURITY-FINDINGS.md §6 (Summary of fixes): note that
sonarcloud.yml has been migrated on main (#27, 4055d57) so the
SHA-pin from this PR's first commit will be overwritten on merge.
- SECURITY-FINDINGS.md follow-ups: track tightening the
batch_is_owned_by email fallback once all batches have user_id.
334 passed, 1 skipped (pre-existing). ruff/format clean.
---------
Co-authored-by: Claude <noreply@anthropic.com>



Summary
Addresses two CI issues in one PR:
Closes #7 — Migrate SonarCloud to org reusable workflow + fix organization
sonarcloud.yml(which usedsonarqube-scan-actionat an old SHA / v4.2.2) with a thin caller to the org reusable workflowpython-sonarcloud.yml@6bad2f898be1d387b8424e9deddefa519674cb19(uses sonarqube-scan-action v7.2.0 and resolves the/analysis/analyses404 bug on new projects).skip-if-no-token: true,fail-on-quality-gate: false, andsonar-organization: 'williaby'(the actual SonarCloud account name).ci.yml, addedenable-sonarcloud: falseandsonarcloud-organization: 'williaby'so the CI reusable workflow delegates SonarCloud responsibility tosonarcloud.yml.Closes #8 — Pin all
@mainorg reusable workflow references to SHAReplaced every
@mainsuffix onByronWilliamsCPA/.githubreusable workflow references with@e067cdb7294f6221dbde74ef1f4c3ca735eed570 # main. Floating on@mainis a supply chain risk; any commit pushed to the org.githubrepo would be immediately live in this repository's CI without review.Files updated (15 callers + sonarcloud rewrite):
ci.yml,codecov.yml,container-security.yml,coverage.yml,docs.ymlmutation-testing.yml,performance-regression.yml,publish-pypi.ymlpython-compatibility.yml,qlty.yml,release.yml,sbom.ymlscorecard.yml,security-analysis.yml,slsa-provenance.ymlsonarcloud.yml(rewritten as caller topython-sonarcloud.yml)Verified no
@mainreferences remain in any YAML workflow file.Test plan
SONAR_TOKENis unsetcijob no longer attempts inline SonarCloud (delegated tosonarcloud.yml)Closes #7
Closes #8
Generated by Claude Code
Summary by CodeRabbit
Chores
Bug Fixes
Dependencies
Documentation