security: fix 12 audit findings — v0.3.0#11
Critical:
- Worker auth no longer disabled when API key is empty; generates ephemeral key and logs a warning if CASHPILOT_API_KEY is unset
- Secret key auto-generated and persisted to .secret_key if env var is missing or set to a known default like "changeme-..."
- Fleet API key endpoint restricted to owner role (was any auth)
- Credentials encrypted at rest with Fernet; existing plaintext values decrypted transparently (backward compatible)

High:
- /settings page, /api/config, and /api/collectors/meta restricted to owner role (viewers could read all credentials)
- Worker proxy helpers now check resp.status_code and raise on 4xx/5xx instead of silently returning error responses as 200
- presearch.yml volumes changed from mapping to colon-delimited string so the deploy code can actually parse them

Medium:
- PacketStream collector returns error when HTML parsing fails instead of silently reporting balance=0
- Traffmonetizer factory wires email/password as optional args so the collector's login fallback is actually reachable
- Chart.js race condition fixed with async guard before new Chart()
- CI updated to Python 3.14 (matching shipped Docker images)
- New catalog tests: docker section required, volumes must be strings
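The secret-key item above could look roughly like the sketch below. It is a minimal illustration, not the project's code: the function name `resolve_secret_key`, the prefix list, and the file permissions are assumptions.

```python
import os
import secrets
from pathlib import Path
from typing import Optional

# Hypothetical list of placeholder prefixes treated as "not a real secret".
KNOWN_DEFAULT_PREFIXES = ("changeme",)


def resolve_secret_key(env_value: Optional[str], key_file: Path) -> str:
    """Return a usable secret key, generating and persisting one if needed."""
    # A real, non-placeholder value from the environment always wins.
    if env_value and not env_value.startswith(KNOWN_DEFAULT_PREFIXES):
        return env_value
    # Reuse a previously persisted key so sessions survive restarts.
    if key_file.exists():
        stored = key_file.read_text().strip()
        if stored:
            return stored
    # Otherwise generate a fresh key and persist it with owner-only access.
    key = secrets.token_hex(32)
    key_file.write_text(key)
    os.chmod(key_file, 0o600)
    return key
```

Persisting the generated key matters because a key that changes on every restart would invalidate all existing session cookies.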
- Separate API key roles: CASHPILOT_ADMIN_API_KEY for owner, fleet key gives writer only
- Fix default compose path: skip fleet auth when no key configured instead of rejecting
- Remove ephemeral key generation from worker (caused key mismatch)
- Add secret_key to encrypted config key suffixes
- Add status code check in api_worker_command proxy
- Tighten PacketStream zero-balance detection
- Require writer role for log access (was viewer)
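As a rough illustration of the key-to-role split in the first item, the sketch below maps a bearer token to a role. The function name `role_for_bearer` and its signature are hypothetical; only the role names and the admin/fleet distinction come from the commit message.

```python
import hmac
from typing import Optional


def role_for_bearer(token: str, admin_key: Optional[str], fleet_key: Optional[str]) -> Optional[str]:
    """Map a bearer token to a role, or None if it matches no configured key."""
    # compare_digest avoids leaking key contents through timing differences.
    if admin_key and hmac.compare_digest(token.encode(), admin_key.encode()):
        return "owner"
    if fleet_key and hmac.compare_digest(token.encode(), fleet_key.encode()):
        return "writer"
    return None
```

Checking the admin key first means a deployment that accidentally reuses one value for both keys resolves to owner, which may or may not be the desired behavior.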
Follow-up fixes (79fd161)
Addressed all findings from the second review:
Critical
High
Medium
Answers to open questions
1. Is Resolved in 79fd161; it's now worker-only. The fleet key (
2. Are "Environment Variables" in Settings actual runtime config or stored metadata? They're stored metadata: a convenience view so the user can see/document what env vars are set, but they don't override
3. v0.3.0 tagging: Correct,
   git tag v0.3.0 && git push origin v0.3.0
   The release workflow will pick up the tag and build the Docker images.
- Remove host port binding for worker (expose-only within Docker network)
- Start heartbeat loop when UI_URL is set regardless of API_KEY
- Track parse success in PacketStream separately from balance value
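The last item separates "the page parsed" from "the balance is zero". A minimal sketch of that idea follows; the regex, the `ParseResult` type, and the function name are assumptions and do not reflect PacketStream's real markup.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern; the real dashboard markup is not shown in this PR.
BALANCE_RE = re.compile(r"\$\s*(\d+(?:\.\d+)?)")


@dataclass
class ParseResult:
    parsed: bool                  # True only if a balance was found in the page
    balance: float = 0.0          # meaningful only when parsed is True
    error: Optional[str] = None


def parse_balance(html: str) -> ParseResult:
    """Extract a balance, reporting parse failure separately from a zero value."""
    match = BALANCE_RE.search(html)
    if match is None:
        return ParseResult(parsed=False, error="balance not found in page")
    return ParseResult(parsed=True, balance=float(match.group(1)))
```

With this shape a genuine $0.00 balance is a successful parse, while a login page or changed layout surfaces as an error.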
Follow-up fixes (323d775)
- Critical: Worker port exposure
- High: Heartbeat loop never starting
- Medium: PacketStream false positive on $0.00
In unauthenticated local compose mode, derive the worker URL from request.client.host instead of trusting the caller-supplied URL. Prevents URL injection via heartbeat spoofing on the Docker network.
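The pinning rule described here can be sketched as a small pure function. The name `effective_worker_url`, the default port, and the `http` scheme are assumptions made for illustration.

```python
def effective_worker_url(supplied_url: str, client_host: str,
                         authenticated: bool, port: int = 8081) -> str:
    """Pick the worker URL to record for a heartbeat.

    The default port is a hypothetical value, not taken from the project.
    """
    if authenticated:
        # Keyed (fleet) mode: the caller proved its identity, trust its URL.
        return supplied_url
    # No-key mode: ignore the supplied URL and use the connection's peer address.
    host = f"[{client_host}]" if ":" in client_host else client_host  # bracket IPv6
    return f"http://{host}:{port}"
```

For example, an unauthenticated heartbeat from 172.18.0.5 claiming `http://evil:9999` would be recorded as `http://172.18.0.5:8081`.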
Follow-up fix (d100520)
Medium: Worker URL injection in no-key mode
In unauthenticated local compose mode, the heartbeat handler now derives the worker URL from request.client.host. In keyed mode (fleet), the caller is authenticated so the supplied URL is trusted as before.
Add app/fleet_key.py: both UI and worker resolve the fleet API key from CASHPILOT_API_KEY env var or a shared /fleet volume. First container to start generates the key atomically (O_EXCL); the second reads it. No-key mode is eliminated; heartbeats always require authentication.

- Add cashpilot_fleet shared volume in docker-compose.yml
- Remove all skip-auth fallbacks from _verify_fleet_api_key and _verify_api_key
- Remove URL pinning workaround (auth makes it unnecessary)
- Fix ruff format on test_catalog.py for CI
Follow-up fix (112376e)
High: Worker impersonation in no-key mode (eliminated no-key mode entirely)
Added
Changes:
Default
After FileExistsError (other container won the O_EXCL create), poll for file content with 100ms backoff (2s max) instead of a single read that can observe the empty file before the writer finishes.
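The create-or-poll behavior described here can be sketched with the standard library alone. This is an illustrative version, not the contents of app/fleet_key.py: the function signature, the token format, and the error raised on timeout are assumptions.

```python
import os
import secrets
import time
from pathlib import Path


def resolve_fleet_key(key_path: Path, env_value: str = "",
                      timeout: float = 2.0, interval: float = 0.1) -> str:
    """Return the fleet key from the env value, or create/read it at key_path."""
    if env_value:
        return env_value
    try:
        # O_EXCL makes creation atomic: exactly one container wins the race.
        fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        # Another container created the file; it may still be empty, so poll.
        deadline = time.monotonic() + timeout
        while True:
            content = key_path.read_text().strip()
            if content:
                return content
            if time.monotonic() >= deadline:
                raise RuntimeError(f"fleet key file {key_path} stayed empty")
            time.sleep(interval)
    key = secrets.token_urlsafe(32)
    with os.fdopen(fd, "w") as handle:
        handle.write(key)
    return key
```

The polling loop exists because the file becomes visible as soon as it is created, which is before the winning container has written the key into it.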
Follow-up fix (1206d39)
High: First-boot race on shared fleet key
The
9 tests covering: env var priority, file read, key generation, persistence stability, O_EXCL race simulation, empty file handling, unwritable directory fallback, and file permissions.
…xir fallback

- Replace _require_worker_id with async _resolve_worker_id: auto-picks the sole online worker when worker_id is omitted, fixing service detail page controls and legacy API callers
- Port parsing now keys on container_port/protocol (e.g. 28967/tcp, 28967/udp) per Docker SDK, preserving protocol-specific bindings
- auth.py bearer check uses fleet_key.resolve_fleet_key() instead of os.getenv, matching the auto-generated shared key in zero-config mode
- Bytelixir API fallback flags balance as withdrawable-only since /api/v1/user does not return total earned amount
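To illustrate the port-parsing item, the sketch below builds a mapping keyed on container_port/protocol, the key format the Docker SDK for Python accepts for port bindings. The input format (compose-style `host:container/proto` strings) and the function name `parse_ports` are assumptions.

```python
def parse_ports(specs: list[str]) -> dict[str, int]:
    """Turn 'host:container/proto' strings into {'container/proto': host_port}."""
    bindings: dict[str, int] = {}
    for spec in specs:
        mapping, _, proto = spec.partition("/")
        proto = proto or "tcp"                 # protocol defaults to tcp
        host, _, container = mapping.rpartition(":")
        if not host:                           # bare '8080' means same port on both sides
            host = container
        # Keying on port AND protocol keeps tcp and udp bindings distinct.
        bindings[f"{container}/{proto}"] = int(host)
    return bindings
```

Keying only on the container port would let the `28967/udp` entry overwrite `28967/tcp`, which is the kind of dropped duplicate the commit describes.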
Follow-up fixes (6926380)
- High: Service detail controls broken without worker_id
- High: Port parsing drops protocol duplicates
- Medium: auth.py fleet key mismatch
- Medium: Bytelixir balance type
- Add CASHPILOT_WORKER_URL env var for explicit worker URL override in complex network topologies; document in fleet compose example
- Fleet page Copy button auto-fetches API key before copying instead of copying masked ******** value
- Reveal/Copy/Remove controls on Fleet page hidden for non-owner users via _isOwner JS flag from Jinja2 template context
Follow-up fixes (0bf91b7)
- Medium: Worker URL override for fleet deployments
- Medium: Fleet page Copy copies masked key
- Low: Viewer access to owner controls on Fleet page
…istence

- catalog.get_services/get_service return shallow copies so per-request fields (deployed, node_count) don't pollute the shared cache
- Inject _userRole from Jinja2 into JS; hide restart/stop/deploy controls for viewer users (backend already enforces, now UI matches)
- Enable PRAGMA foreign_keys=ON so ON DELETE CASCADE works for user_preferences
- Setup wizard persists category selections and timezone to /api/preferences on reaching step 4
Follow-up fixes (d1a65c8)
- Medium: Catalog cache mutation
- Medium: Viewer UI shows writer-only controls
- Low: SQLite foreign keys not enforced
- Low: Setup wizard doesn't persist preferences
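For the foreign-key finding, SQLite leaves foreign key enforcement off by default on each new connection, so ON DELETE CASCADE does nothing until the pragma is set. The sketch below demonstrates this with an in-memory database; the table layout is a simplified assumption, not the project's schema.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a connection with foreign key enforcement switched on."""
    conn = sqlite3.connect(path)
    # Must be set per connection; it is not stored in the database file.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def demo_cascade() -> int:
    """Delete a user and return how many of their preference rows remain."""
    conn = open_db()
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY);
        CREATE TABLE user_preferences (
            user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
            setup_mode TEXT
        );
        """
    )
    conn.execute("INSERT INTO users (id) VALUES (1)")
    conn.execute("INSERT INTO user_preferences VALUES (1, 'guided')")
    conn.execute("DELETE FROM users WHERE id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM user_preferences").fetchone()[0]
    conn.close()
    return remaining
```

Without the pragma the same script would leave the orphaned preference row in place.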
- Gate wizard deploy button and service detail modal instance controls (restart/stop/logs) behind _canWrite
- PreferencesUpdate fields now nullable; POST /api/preferences merges with existing prefs so partial saves don't reset setup_mode
- Fix var(--danger) → var(--error) for deploy failure status styling
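The merge behavior in the second item can be sketched as a dictionary merge that ignores unset fields. The function name `merge_preferences` and the example field values are assumptions; the real handler works on a PreferencesUpdate model.

```python
def merge_preferences(existing: dict, update: dict) -> dict:
    """Overlay only the fields the client actually sent (non-None) on stored prefs."""
    merged = dict(existing)
    merged.update({key: value for key, value in update.items() if value is not None})
    return merged
```

For example, saving only a timezone against stored prefs of `{"setup_mode": "fleet", "timezone": "UTC"}` keeps `setup_mode` intact. One consequence of this design is that a client cannot clear a field by sending null.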
Follow-up fixes (d7644db)
- Medium: Viewer UI gating incomplete
- Medium: Wizard preference save overwrites setup_mode
- Low: CSS --danger undefined
- Prevent owner from demoting themselves or removing the last owner
- Move logs button inside _canWrite gate (single + multi-instance)
- Hide Settings sidebar link for non-owner roles
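A guard like the one in the first item might be written as below. This is a sketch under assumptions: the names are hypothetical, roles are held in a plain dict, and removing oneself is treated the same as demoting oneself.

```python
from typing import Dict, Optional


class RoleChangeError(ValueError):
    """Raised when a role change would lock owners out."""


def check_role_change(roles: Dict[str, str], actor: str, target: str,
                      new_role: Optional[str]) -> None:
    """Validate a role change; new_role=None means the target is being removed."""
    if roles.get(target) != "owner" or new_role == "owner":
        return  # not a demotion or removal of an owner, nothing to guard
    if actor == target:
        raise RoleChangeError("owners cannot demote or remove themselves")
    owners = [user for user, role in roles.items() if role == "owner"]
    if len(owners) <= 1:
        raise RoleChangeError("cannot demote or remove the last owner")
```

The function only checks the lock-out conditions; whether the actor is allowed to manage users at all is assumed to be enforced elsewhere.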
Addressed in d8dc16f
- Medium: Owner self-demotion / last-owner guard
- Medium: Logs button visible to viewers
- Low: Settings sidebar visible to non-owners
- Fix chained comparison that made min_amount=0 services permanently ineligible; now eligible when balance > 0 and balance >= min_amount
- Mirror same fix in frontend dashboard row eligibility check
- Make storj api_url optional so built-in default works out of the box
- Collector alerts: non-owners see alerts but clicks don't route to owner-only settings page
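The eligibility rule in the first item is small enough to show directly. The fixed condition comes from the commit message; the buggy variant is only one plausible shape of such a chained comparison, since the original expression is not shown here.

```python
def buggy_is_eligible(balance: float, min_amount: float) -> bool:
    """Hypothetical reconstruction of the bug: 0 < 0 is False, so a
    zero threshold can never be satisfied."""
    return 0 < min_amount <= balance


def is_eligible(balance: float, min_amount: float) -> bool:
    """Fixed rule: any positive balance that meets the threshold is eligible."""
    return balance > 0 and balance >= min_amount
```

With `min_amount=0` and a balance of 5.0, the buggy form returns False while the fixed form returns True; a zero balance is ineligible under both.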
Addressed in a8fe31d
- Medium: Zero-threshold payout eligibility
- Medium: Storj redundant config requirement
- Low: Non-owner alert dead-end
Addressed in 9153a74
- Low: Onboarding /settings dead-end for non-owners
- 12 tests for payout eligibility: zero-threshold, normal threshold, empty cashout, edge cases (parametrized)
- 3 tests for storj: api_url marked optional, collector created without config using default URL, custom URL passed through
Addressed in 55d8385
Onboarding /settings dead-end was fixed in the previous commit (9153a74). Test coverage for the residual gaps the reviewer flagged:
Full suite: 423 tests green,
The non-owner collector alert and onboarding CTA behaviors are client-side JS; those would need an integration/E2E framework to cover, which is outside the scope of this PR.
Rewrite test_eligibility.py to call the actual api_earnings_breakdown handler with mocked DB/catalog/auth deps instead of a mirrored boolean expression. Tests exercise real route wiring and response assembly. Skips gracefully in minimal local envs (no fastapi), runs in CI where full deps are installed via requirements.txt.
Addressed in 117f5c4
Test quality: eligibility tests now exercise the real handler
Rewrote test_eligibility.py to call the actual api_earnings_breakdown handler with mocked DB/catalog/auth deps.
14 tests cover: zero-threshold eligible/ineligible, normal thresholds (above/exact/below), missing cashout section, unknown service fallback, full response structure validation, delta computation, and parametrized edge cases.
Skips gracefully in minimal local envs (no fastapi).
Local: 411 passed, 1 skipped. CI: expect 425 passed (14 eligibility tests will run).
CI installs pytest but not pytest-asyncio. Convert async test methods to sync methods that call asyncio.run() to invoke the async handler.
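The conversion described here follows a simple pattern: a plain synchronous test drives the coroutine with `asyncio.run()`, so no async test plugin is needed. In the sketch below `fake_earnings_handler` is a hypothetical stand-in for the real async route handler.

```python
import asyncio


async def fake_earnings_handler(balance: float, min_amount: float) -> dict:
    """Hypothetical stand-in for an async route handler under test."""
    await asyncio.sleep(0)  # simulate awaiting a DB or catalog call
    return {"eligible": balance > 0 and balance >= min_amount}


def test_zero_threshold_is_eligible() -> None:
    # A plain sync test: pytest collects it without pytest-asyncio installed.
    result = asyncio.run(fake_earnings_handler(1.5, 0))
    assert result["eligible"] is True
```

Each `asyncio.run()` call creates and closes its own event loop, which keeps tests isolated but means objects bound to one loop cannot be shared between tests.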
Comprehensive security audit and hardening (PR #11):
- Fleet key bootstrap, RBAC enforcement, role-aware UI gating
- Zero-threshold payout eligibility, Storj default URL
- Integration test coverage for eligibility and collectors
Summary
Addresses all findings from the security audit — 4 critical, 3 high, 5 medium.
Critical
- _verify_api_key() no longer skips auth when CASHPILOT_API_KEY is empty. Generates an ephemeral random key and logs a warning.
- CASHPILOT_SECRET_KEY is auto-generated and persisted to .secret_key if unset or set to a known default ("changeme-..."). Prevents forged session cookies.
- /api/fleet/api-key now requires owner role (was any authenticated user).
High
- /settings, /api/config, and /api/collectors/meta restricted to owner role.
- _proxy_worker_command/deploy/logs now check resp.status_code and raise HTTPException on 4xx/5xx.
Medium
- PacketStream collector returns an error when HTML parsing fails instead of silently reporting balance=0.0.
- Traffmonetizer factory wires email/password as optional args so login fallback works.
- Chart.js race condition fixed with an async guard before the new Chart() call.
- New catalog tests: docker section required, volumes must be strings.
Test plan
- ruff check app/ passes
- ruff format --check app/ passes
- pytest tests/ -v: 399 tests pass