feat: security hardening, collector refactor, async orchestrator#57
Conversation
Traffmonetizer now enforces reCAPTCHA on login, permanently blocking programmatic email/password auth. Switch collector to browser-extracted JWT (same pattern as bytelixir/grass). Simpler implementation: token-only constructor, clear error messages guiding users to Local Storage. Add delete_volumes flag to remove_service that cleans up named Docker volumes when removing a container. Flows end-to-end: UI → main API → worker API → orchestrator. Bind mounts and anonymous volumes are skipped.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughPR refactors collectors to share an async HTTP client and retry helper, migrates Traffmonetizer to token-only JWT auth, adds cached collector lifecycle and close_all_collectors, extends container removal with optional Docker volume deletion (with per-volume error reporting), hardens auth/session handling, adds per-IP login rate limiting and concurrent collection, validates deploy specs, and updates DB PRAGMA, templates, tests, and security headers. ChangesCollectors refactor & Traffmonetizer JWT
Server infra, orchestrator, auth, DB, and API
🎯 4 (Complex) | ⏱️ ~45 minutes Possibly Related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
- Use hmac.compare_digest for admin/fleet API key comparison (timing attack) - Add Content-Security-Policy header to security middleware - Add login rate limiting (5 attempts per 5 minutes per IP) - Block privileged containers and dangerous volume mounts on worker - Escape error messages in fleet.html to prevent XSS - Add PRAGMA busy_timeout=5000 to prevent SQLITE_BUSY on contention - Parallelize collector execution with asyncio.gather - Add collection lock to prevent overlapping runs - Replace silent contextlib.suppress with try/except+logging on worker - Upgrade background task error logging from debug to warning
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/main.py`:
- Around line 53-59: The current _check_login_rate function records every
attempt before authentication, causing successful logins to count toward the
limit; change the flow so only failed attempts update the _login_attempts bucket
and ensure the bucket is cleared on successful authentication. Concretely,
remove the pre-auth append in _check_login_rate (or add an argument to make it a
read-only check) and instead call a new/updated function to record failures
(e.g., increment_failed_login or _record_failed_login) when authentication
fails; additionally, on successful auth clear _login_attempts[client_ip] (or
reset its list) in the success path. Apply the same change to the other
occurrence referenced (lines ~394-410) so only failed logins are recorded and
successful logins reset the bucket.
In `@app/worker_api.py`:
- Around line 278-287: In _validate_deploy_spec, the volume check only rejects
exact matches so nested/relative mounts slip through; normalize each volume
source (use os.path.realpath or pathlib.Path(source).resolve(strict=False)) and
normalize all entries in _BLOCKED_VOLUME_SOURCES the same way, then reject if
the resolved source is equal to a blocked path or is a subpath (i.e., startswith
blocked_path + os.sep) or is root "/"; update the loop over spec.volumes
(variable source) to perform this normalized comparison and raise the same
HTTPException when matched.
In `@tests/test_main_deploy_routes.py`:
- Around line 272-285: The assertion is too weak; in
test_remove_with_delete_volumes replace the fuzzy check of
mock_client.delete.call_args with a direct assertion that the forwarded query
params are exactly as expected: inspect
mock_client.delete.call_args.kwargs["params"] and assert it equals
{"delete_volumes": "true"} so the test verifies the precise params forwarded by
the code under test (refer to test_remove_with_delete_volumes and
mock_client.delete.call_args).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fe4b2ff5-7de4-4c21-a8bf-31fc12df17ac
📒 Files selected for processing (11)
app/auth.pyapp/collectors/traffmonetizer.pyapp/database.pyapp/main.pyapp/orchestrator.pyapp/templates/fleet.htmlapp/worker_api.pyservices/bandwidth/traffmonetizer.ymltests/test_collectors_deep.pytests/test_coverage_gaps.pytests/test_main_deploy_routes.py
CI environment doesn't install docker SDK (runtime-only dependency). Use pytest.importorskip to gracefully skip these tests.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #57 +/- ##
==========================================
- Coverage 96.58% 92.04% -4.55%
==========================================
Files 24 24
Lines 2375 2526 +151
==========================================
+ Hits 2294 2325 +31
- Misses 81 201 +120
🚀 New features to boost your workflow:
|
…nc orchestrator - BaseCollector: add _get_client() for HTTP connection reuse, _retry() for transient failure recovery, close() for lifecycle management - Factory: cache collector instances across cycles (tokens persist), evict stale instances on config change, close_all on shutdown - All 14 collectors: use shared client, add exc_info=True to error logs, add super().__init__(), use self._retry() for balance fetches - Grass: replace magic sentinel floats with _GrassStatus enum - IPRoyal: stable device_id across logins, token caching with 401 retry - EarnApp: configurable API version via class attribute - Bytelixir: path-based URL validation instead of string "login" check - Orchestrator: wrap all Docker SDK calls in asyncio.to_thread() - Auth: add session iat (issued-at) + SESSION_EPOCH for mass invalidation - Database: add password_changed_at column migration + update function - Pipeline: close stale collectors, sanitize alert errors (200 char max) - Password policy: minimum raised from 8 to 10 characters - Remove dead bytes_uploaded field from EarningsResult
EarnApp dashboard API now requires version 1.627.783. Traffmonetizer needs Origin/Referer headers matching the browser app to avoid 403s.
- Rate limiter only counts failed logins (not successful ones) - Volume validation blocks subpaths of sensitive mounts - Strengthen delete_volumes test assertion to check exact params
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
All 3 CodeRabbit findings from the previous review have been addressed in commit af6e093:
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/collectors/__init__.py (1)
79-147:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale collectors not closed after
make_collectorsreturns.
make_collectorsis synchronous but populates_stalewith evicted collectors. These only get closed whenclose_all_collectors()is called (typically at shutdown). If config changes frequently, stale collectors accumulate with open httpx clients.Consider returning or scheduling async cleanup, or making the caller responsible for awaiting
_close_stale()after eachmake_collectorscall.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/collectors/__init__.py` around lines 79 - 147, make_collectors currently accumulates evicted collectors in _stale but never closes them until shutdown; after building collectors, if _stale is non-empty schedule asynchronous cleanup instead of deferring to shutdown: import asyncio and call asyncio.create_task(_close_stale()) (or similar non-blocking scheduling) at the end of make_collectors when _stale has items, ensuring _close_stale() consumes/clears _stale and that _cached_collectors/_cached_kwargs remain consistent; alternatively, if you prefer caller-driven cleanup, change make_collectors to return a token or the stale list so callers can await _close_stale() themselves (refer to make_collectors, _stale, _close_stale(), close_all_collectors(), _cached_collectors, _cached_kwargs).
🧹 Nitpick comments (2)
app/collectors/grass.py (1)
106-108: 💤 Low value
_estimate_from_active_devicesstill returns-1.0sentinel while_get_settled_pointsuses enum.For consistency, consider returning
_GrassStatus.RATE_LIMITEDfrom_estimate_from_active_deviceson 429, matching the enum pattern in_get_settled_points.Also applies to: 184-189
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/collectors/grass.py` around lines 106 - 108, The function _estimate_from_active_devices returns a float sentinel -1.0 on HTTP 429 but the rest of the module (e.g., _get_settled_points) uses the _GrassStatus enum; change the 429 return path(s) in _estimate_from_active_devices (and the other occurrence around the second 429 handling) to return _GrassStatus.RATE_LIMITED instead of -1.0, and update any callers or type annotations as needed to expect an _GrassStatus value from this function.app/collectors/base.py (1)
36-42: 💤 Low valueClient kwargs not stored; subsequent calls with different kwargs silently reuse stale config.
If a collector calls
_get_client(cookies=X)then later_get_client(timeout=60), the second call reuses the existing client (which has the old cookies but ignores the new timeout). This works only because collectors currently use consistent kwargs.If intentional, document that the first call's kwargs win. Otherwise, store
_client_kwargsand recreate on mismatch.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/collectors/base.py` around lines 36 - 42, The _get_client method currently reuses self._client even when callers pass different kwargs, causing stale configuration; update the class to record the initial client kwargs (e.g., self._client_kwargs) and in _get_client compare incoming kwargs to self._client_kwargs and recreate self._client with the new combined defaults when they differ (also update self._client_kwargs), or if you prefer the first-call-wins behavior, add a clear docstring on _get_client and a comment stating that the first call's kwargs are authoritative; reference the _get_client method and the self._client attribute when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/collectors/iproyal.py`:
- Around line 86-96: When handling a 401 you refresh the token then call
_fetch_balance directly, which bypasses the retry/backoff wrapper used for the
initial fetch; change the re-login path to invoke the same retry-enabled fetch
logic instead of calling _fetch_balance directly (either call the existing
higher-level method that applies retries/backoff or wrap _fetch_balance in the
same retry/backoff helper) so transient network errors after token refresh get
the same retry behavior; update the 401 branch where self._token is set and resp
is re-fetched to use that retry-enabled call.
---
Outside diff comments:
In `@app/collectors/__init__.py`:
- Around line 79-147: make_collectors currently accumulates evicted collectors
in _stale but never closes them until shutdown; after building collectors, if
_stale is non-empty schedule asynchronous cleanup instead of deferring to
shutdown: import asyncio and call asyncio.create_task(_close_stale()) (or
similar non-blocking scheduling) at the end of make_collectors when _stale has
items, ensuring _close_stale() consumes/clears _stale and that
_cached_collectors/_cached_kwargs remain consistent; alternatively, if you
prefer caller-driven cleanup, change make_collectors to return a token or the
stale list so callers can await _close_stale() themselves (refer to
make_collectors, _stale, _close_stale(), close_all_collectors(),
_cached_collectors, _cached_kwargs).
---
Nitpick comments:
In `@app/collectors/base.py`:
- Around line 36-42: The _get_client method currently reuses self._client even
when callers pass different kwargs, causing stale configuration; update the
class to record the initial client kwargs (e.g., self._client_kwargs) and in
_get_client compare incoming kwargs to self._client_kwargs and recreate
self._client with the new combined defaults when they differ (also update
self._client_kwargs), or if you prefer the first-call-wins behavior, add a clear
docstring on _get_client and a comment stating that the first call's kwargs are
authoritative; reference the _get_client method and the self._client attribute
when making this change.
In `@app/collectors/grass.py`:
- Around line 106-108: The function _estimate_from_active_devices returns a
float sentinel -1.0 on HTTP 429 but the rest of the module (e.g.,
_get_settled_points) uses the _GrassStatus enum; change the 429 return path(s)
in _estimate_from_active_devices (and the other occurrence around the second 429
handling) to return _GrassStatus.RATE_LIMITED instead of -1.0, and update any
callers or type annotations as needed to expect an _GrassStatus value from this
function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a366f8dd-e5a2-403e-bd6c-04d3fbabd5bb
📒 Files selected for processing (25)
app/auth.pyapp/collectors/__init__.pyapp/collectors/base.pyapp/collectors/bitping.pyapp/collectors/bytelixir.pyapp/collectors/earnapp.pyapp/collectors/earnfm.pyapp/collectors/grass.pyapp/collectors/honeygain.pyapp/collectors/iproyal.pyapp/collectors/mystnodes.pyapp/collectors/packetstream.pyapp/collectors/proxyrack.pyapp/collectors/repocket.pyapp/collectors/salad.pyapp/collectors/storj.pyapp/collectors/traffmonetizer.pyapp/database.pyapp/main.pyapp/worker_api.pytests/test_collectors.pytests/test_collectors_deep.pytests/test_collectors_extended.pytests/test_coverage_gaps.pytests/test_main_deploy_routes.py
💤 Files with no reviewable changes (1)
- tests/test_collectors.py
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/test_main_deploy_routes.py
- tests/test_coverage_gaps.py
- app/collectors/traffmonetizer.py
Ensures transient network errors after token refresh get the same retry/backoff protection as the initial fetch.
Summary
Major release combining security fixes, collector infrastructure refactor, and performance improvements identified through an extensive multi-agent code review (8 specialized reviewers covering security, architecture, performance, collector quality, API design, error handling, concurrency, and test coverage).
Security Hardening
==withhmac.compare_digestfor admin and fleet key checks inauth.py/,/etc,/proc,/sys,/root, Docker socket) at worker leveliattimestamp;CASHPILOT_SESSION_EPOCHenv var enables mass session revocationCollector Base Refactor
BaseCollector._get_client()creates persistenthttpx.AsyncClientper collector (eliminates 14+ TLS handshakes per cycle)BaseCollector._retry()with exponential backoff forTimeoutException/NetworkErrorclose_all_collectors()on shutdown prevents unclosed client warningsexc_info=Truebytes_uploadedfield removed fromEarningsResultCollector-Specific Improvements
_GrassStatusenum_get_clientoverride for session cookiesimport jsonto module levelPerformance & Reliability
asyncio.gather()runs all 14 collectors concurrently (was sequential: 42-70s → ~5-10s)asyncio.to_thread()on worker (was blocking event loop 10-20s per heartbeat with 10 containers)PRAGMA busy_timeout=5000preventsSQLITE_BUSYon write contentioncontextlib.suppress(Exception)with try/except + loggingInfrastructure
delete_volumesflag on container removal endpointpytest.importorskip("docker")for CI where SDK is unavailableTest plan
Summary by CodeRabbit
New Features
Bug Fixes
Security
Performance