Skip to content

security: fix 12 audit findings — v0.3.0#11

Merged
GeiserX merged 18 commits intomainfrom
fix/security-audit
Mar 30, 2026
Merged

security: fix 12 audit findings — v0.3.0#11
GeiserX merged 18 commits intomainfrom
fix/security-audit

Conversation

@GeiserX
Copy link
Copy Markdown
Owner

@GeiserX GeiserX commented Mar 30, 2026

Summary

Addresses all findings from the security audit — 4 critical, 3 high, 5 medium.

Critical

  • Worker auth bypass: _verify_api_key() no longer skips auth when CASHPILOT_API_KEY is empty. Generates an ephemeral random key and logs a warning.
  • Static secret key: CASHPILOT_SECRET_KEY is auto-generated and persisted to .secret_key if unset or set to a known default ("changeme-..."). Prevents forged session cookies.
  • Fleet API key privilege escalation: /api/fleet/api-key now requires owner role (was any authenticated user).
  • Plaintext credentials: Config values matching secret key patterns are encrypted at rest with Fernet. Existing plaintext values are decrypted transparently (backward compatible).

High

  • Viewer reads secrets: /settings, /api/config, and /api/collectors/meta restricted to owner role.
  • Worker proxy silent failures: _proxy_worker_command/deploy/logs now check resp.status_code and raise HTTPException on 4xx/5xx.
  • presearch.yml broken volumes: Changed from YAML mapping to colon-delimited string so the deploy code can parse it.

Medium

  • PacketStream silent zero: Returns error when HTML parsing fails instead of balance=0.0.
  • Traffmonetizer factory gap: Wires email/password as optional args so login fallback works.
  • Chart.js race condition: Added async guard before new Chart() call.
  • CI Python mismatch: Updated from 3.12 to 3.14 (matching shipped Docker images).
  • New catalog tests: docker section required, volumes must be strings.

Test plan

  • ruff check app/ passes
  • ruff format --check app/ passes
  • pytest tests/ -v — 399 tests pass
  • Deploy and verify credentials encrypt/decrypt correctly
  • Verify viewer users get 403 on /settings
  • Verify worker rejects requests when API key is unset

GeiserX added 2 commits March 30, 2026 17:52
Critical:
- Worker auth no longer disabled when API key is empty; generates
  ephemeral key and logs a warning if CASHPILOT_API_KEY is unset
- Secret key auto-generated and persisted to .secret_key if env var
  is missing or set to a known default like "changeme-..."
- Fleet API key endpoint restricted to owner role (was any auth)
- Credentials encrypted at rest with Fernet; existing plaintext
  values decrypted transparently (backward compatible)

High:
- /settings page, /api/config, and /api/collectors/meta restricted
  to owner role (viewers could read all credentials)
- Worker proxy helpers now check resp.status_code and raise on 4xx/5xx
  instead of silently returning error responses as 200
- presearch.yml volumes changed from mapping to colon-delimited string
  so the deploy code can actually parse them

Medium:
- PacketStream collector returns error when HTML parsing fails instead
  of silently reporting balance=0
- Traffmonetizer factory wires email/password as optional args so the
  collector's login fallback is actually reachable
- Chart.js race condition fixed with async guard before new Chart()
- CI updated to Python 3.14 (matching shipped Docker images)
- New catalog tests: docker section required, volumes must be strings
- Separate API key roles: CASHPILOT_ADMIN_API_KEY for owner, fleet key gives writer only
- Fix default compose path: skip fleet auth when no key configured instead of rejecting
- Remove ephemeral key generation from worker (caused key mismatch)
- Add secret_key to encrypted config key suffixes
- Add status code check in api_worker_command proxy
- Tighten PacketStream zero-balance detection
- Require writer role for log access (was viewer)
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fixes (79fd161)

Addressed all findings from the second review:

Critical

  • Separated API key roles — New CASHPILOT_ADMIN_API_KEY env var for owner-level programmatic access. CASHPILOT_API_KEY bearer now gives writer role only, so a leaked fleet key no longer grants owner access.
  • Fixed default compose path_verify_fleet_api_key() and worker's _verify_api_key() now skip auth when no key is configured (local single-compose setup), instead of rejecting heartbeats. Removed ephemeral key generation from worker that caused key mismatch.

High

  • Env settings encryption gap — Added secret_key to SECRET_CONFIG_KEYS suffixes so CASHPILOT_SECRET_KEY values stored in the DB config table are encrypted at rest.

Medium

  • api_worker_command status check — Added resp.status_code >= 400 check before returning resp.json(), matching the pattern already used in the proxy helper functions.
  • PacketStream parse hardening — Simplified the zero-balance guard: any balance == 0.0 now triggers the parse-failure error, regardless of whether userData appears in the HTML. Prevents silent 0.0 when the page structure changes but still contains the string.
  • Viewer log accessapi_service_logs() now requires writer role instead of auth_api (viewer), since log contents can expose credentials and deployment details.

@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Answers to open questions

1. Is CASHPILOT_API_KEY a human/admin token or a worker secret?

Resolved in 79fd161 — it's now worker-only. The fleet key (CASHPILOT_API_KEY) is demoted to writer role when used as Bearer token, which is enough for Home Assistant integrations (read earnings, control services) but can't change settings or manage users. A new CASHPILOT_ADMIN_API_KEY env var is available for programmatic owner-level access when needed.

2. Are "Environment Variables" in Settings actual runtime config or stored metadata?

They're stored metadata — a convenience view so the user can see/document what env vars are set, but they don't override os.getenv() at runtime. The app reads env vars at startup from the process environment; changing them in the UI has no effect until the container is restarted with updated env vars. This is by design — hot-reloading env vars like CASHPILOT_SECRET_KEY mid-flight would break active sessions. Could add a note in the UI to make this clearer, but that's cosmetic, not a security issue.

3. v0.3.0 tagging

Correct, release.yml only increments patch. After merging this PR, I'll manually tag v0.3.0:

git tag v0.3.0 && git push origin v0.3.0

The release workflow will pick up the tag and build the Docker images.

- Remove host port binding for worker (expose-only within Docker network)
- Start heartbeat loop when UI_URL is set regardless of API_KEY
- Track parse success in PacketStream separately from balance value
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fixes (323d775)

Critical: Worker port exposure
Worker no longer binds to host 0.0.0.0:8081 in the default compose. Changed ports to expose — the worker is only reachable within the Docker network. The "skip auth when no key" path is now safe since no external traffic can reach the worker. Fleet deployments on separate servers will use docker-compose.fleet.yml where the port IS exposed and CASHPILOT_API_KEY is required.

High: Heartbeat loop never starting
Changed the condition from if UI_URL and API_KEY to if UI_URL. The worker now sends heartbeats whenever a UI URL is configured, regardless of whether an API key is set. Logs a warning when running without auth.

Medium: PacketStream false positive on $0.00
Added a parsed boolean that tracks whether any extraction method succeeded, independent of the balance value. Only triggers the parse-failure error when not parsed, so legitimate $0.00 balances are returned normally.

In unauthenticated local compose mode, derive the worker URL from
request.client.host instead of trusting the caller-supplied URL.
Prevents URL injection via heartbeat spoofing on the Docker network.
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fix (d100520)

Medium: Worker URL injection in no-key mode

In unauthenticated local compose mode, the heartbeat handler now derives the worker URL from request.client.host instead of trusting the caller-supplied body.url. The scheme and port are preserved from the body, but the host is pinned to the actual TCP source. This prevents any container on the Docker network from registering a fake worker URL that would redirect proxy requests to an arbitrary destination.

In keyed mode (fleet), the caller is authenticated so the supplied URL is trusted as before.

Add app/fleet_key.py: both UI and worker resolve the fleet API key
from CASHPILOT_API_KEY env var or a shared /fleet volume. First
container to start generates the key atomically (O_EXCL); the
second reads it. No-key mode is eliminated — heartbeats always
require authentication.

- Add cashpilot_fleet shared volume in docker-compose.yml
- Remove all skip-auth fallbacks from _verify_fleet_api_key and
  _verify_api_key
- Remove URL pinning workaround (auth makes it unnecessary)
- Fix ruff format on test_catalog.py for CI
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fix (112376e)

High: Worker impersonation in no-key mode — eliminated no-key mode entirely

Added app/fleet_key.py — a shared fleet key resolver used by both UI and worker. When CASHPILOT_API_KEY is not set via env var, both containers auto-generate and share a key through a cashpilot_fleet Docker volume mounted at /fleet. The first container to start creates the key atomically (O_CREAT | O_EXCL); the second reads it.

Changes:

  • docker-compose.yml: Added cashpilot_fleet shared volume mounted in both services
  • _verify_fleet_api_key() and _verify_api_key(): Removed all skip-auth fallbacks — always require a valid key, return 503 if key resolution failed
  • Removed the URL pinning workaround from the heartbeat handler (auth makes it unnecessary)
  • Fixed ruff format on test_catalog.py so CI lint passes

Default docker compose up now works securely out of the box — no env var configuration needed.

After FileExistsError (other container won the O_EXCL create),
poll for file content with 100ms backoff (2s max) instead of a
single read that can observe the empty file before the writer
finishes.
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fix (1206d39)

High: First-boot race on shared fleet key

The FileExistsError handler in resolve_fleet_key() now polls for file content with 100ms backoff (2s max) instead of a single read. This closes the window between os.open (file created empty) and os.write (key written) where the losing container could observe an empty file and cache an empty key permanently.

GeiserX added 2 commits March 30, 2026 19:23
9 tests covering: env var priority, file read, key generation,
persistence stability, O_EXCL race simulation, empty file handling,
unwritable directory fallback, and file permissions.
…xir fallback

- Replace _require_worker_id with async _resolve_worker_id: auto-picks
  the sole online worker when worker_id is omitted, fixing service
  detail page controls and legacy API callers
- Port parsing now keys on container_port/protocol (e.g. 28967/tcp,
  28967/udp) per Docker SDK, preserving protocol-specific bindings
- auth.py bearer check uses fleet_key.resolve_fleet_key() instead of
  os.getenv, matching the auto-generated shared key in zero-config mode
- Bytelixir API fallback flags balance as withdrawable-only since
  /api/v1/user does not return total earned amount
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fixes (6926380)

High: Service detail controls broken without worker_id
Replaced _require_worker_id() with async _resolve_worker_id(). When worker_id is omitted and only one worker is online, it auto-resolves to that worker. Returns 503 if none are online, 400 if multiple are online (caller must choose). This fixes the server-rendered template, the legacy loadDetailLogs JS function, and any external API caller that doesn't know the worker_id.

High: Port parsing drops protocol duplicates
Changed port dict key from host port to container_port/protocol (e.g. 28967/tcp, 28967/udp) per Docker SDK format. Storj and other services with same-port TCP+UDP bindings now deploy both.

Medium: auth.py fleet key mismatch
Bearer auth in get_current_user() now calls fleet_key.resolve_fleet_key() instead of os.getenv("CASHPILOT_API_KEY"), so the auto-generated shared key from the /fleet volume is honored.

Medium: Bytelixir balance type
API fallback now returns the balance with error="Withdrawable balance only (HTML scrape failed, using API fallback)" since /api/v1/user returns withdrawable balance, not total earned.

- Add CASHPILOT_WORKER_URL env var for explicit worker URL override
  in complex network topologies; document in fleet compose example
- Fleet page Copy button auto-fetches API key before copying instead
  of copying masked ******** value
- Reveal/Copy/Remove controls on Fleet page hidden for non-owner users
  via _isOwner JS flag from Jinja2 template context
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fixes (0bf91b7)

Medium: Worker URL override for fleet deployments
Added CASHPILOT_WORKER_URL env var. When set, the worker uses this URL in heartbeats instead of auto-detecting via the 8.8.8.8 UDP trick. Documented in docker-compose.fleet.yml for both the main-server and remote-server examples.

Medium: Fleet page Copy copies masked key
copyWorkerEnv() is now async — it auto-fetches the API key via /api/fleet/api-key before building the env block, so the copied text always has the real key.

Low: Viewer access to owner controls on Fleet page
Added _isOwner JS flag from Jinja2 user.r. Remove Worker buttons and Reveal/Copy API Key controls are now hidden for non-owner users. The backend APIs already enforce owner-only, but the UI no longer shows controls that would fail.

…istence

- catalog.get_services/get_service return shallow copies so per-request
  fields (deployed, node_count) don't pollute the shared cache
- Inject _userRole from Jinja2 into JS; hide restart/stop/deploy
  controls for viewer users (backend already enforces, now UI matches)
- Enable PRAGMA foreign_keys=ON so ON DELETE CASCADE works for
  user_preferences
- Setup wizard persists category selections and timezone to
  /api/preferences on reaching step 4
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fixes (d1a65c8)

Medium: Catalog cache mutation
catalog.get_services() and get_service() now return shallow copies (dict(s)) so per-request enrichment fields (deployed, manual_only, node_count) don't accumulate on the shared in-memory cache.

Medium: Viewer UI shows writer-only controls
Injected window._userRole from Jinja2 user.r in base.html. app.js uses _canWrite (owner or writer) to conditionally render restart/stop buttons and gate deploy buttons with disabled for viewers. Logs remain visible (read-only) since viewers should be able to observe.

Low: SQLite foreign keys not enforced
Added PRAGMA foreign_keys=ON to _get_db() so ON DELETE CASCADE on user_preferences.user_id actually fires.

Low: Setup wizard doesn't persist preferences
wizardNext() now calls POST /api/preferences when reaching step 4 (summary), matching the same save behavior as the onboarding flow.

- Gate wizard deploy button and service detail modal instance controls
  (restart/stop/logs) behind _canWrite
- PreferencesUpdate fields now nullable; POST /api/preferences merges
  with existing prefs so partial saves don't reset setup_mode
- Fix var(--danger) → var(--error) for deploy failure status styling
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Follow-up fixes (d7644db)

Medium: Viewer UI gating incomplete

  • Setup wizard deploy button now gated with disabled for non-writers
  • Service detail modal per-instance controls (Restart/Stop/Logs) now hidden entirely for non-writers via _canWrite

Medium: Wizard preference save overwrites setup_mode
PreferencesUpdate fields are now Optional (default None). The handler reads existing preferences and merges — only fields explicitly sent are updated. The wizard save at step 4 no longer overwrites setup_mode since it doesn't send that field.

Low: CSS --danger undefined
Changed var(--danger) to var(--error) in the deploy failure status styling, matching the actual CSS custom property name.

- Prevent owner from demoting themselves or removing the last owner
- Move logs button inside _canWrite gate (single + multi-instance)
- Hide Settings sidebar link for non-owner roles
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Addressed in d8dc16f

Medium — Owner self-demotion / last-owner guard
PATCH /api/users/{id} now blocks:

  • Self-demotion (owner changing their own role to non-owner)
  • Removing the last owner (counts existing owners before allowing demotion)

Medium — Logs button visible to viewers
Moved the viewLogs button inside the _canWrite ternary in both single-instance and multi-instance dashboard rows. Viewers no longer see restart, stop, or logs buttons.

Low — Settings sidebar visible to non-owners
Wrapped the Settings nav link in {% if user and user.r == 'owner' %} so only owners see it.

- Fix chained comparison that made min_amount=0 services permanently
  ineligible; now eligible when balance > 0 and balance >= min_amount
- Mirror same fix in frontend dashboard row eligibility check
- Make storj api_url optional so built-in default works out of the box
- Collector alerts: non-owners see alerts but clicks don't route to
  owner-only settings page
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Addressed in a8fe31d

Medium — Zero-threshold payout eligibility
Backend: replaced chained comparison balance >= min_amount > 0 with bool(cashout) and balance > 0 and balance >= min_amount. Services with min_amount: 0 are now eligible when they have a positive balance. Frontend dashboard row mirrors the same fix (balance > 0 && balance >= minAmount). The claim modal already reads co.eligible from the API response, so it inherits the backend fix automatically.

Medium — Storj redundant config requirement
Changed "storj": ["api_url"]"storj": ["?api_url"] in _COLLECTOR_ARGS. The ? prefix makes it optional, so StorjCollector falls back to its built-in default (http://localhost:14002) when no config is set.

Low — Non-owner alert dead-end
Collector alerts are still visible to all roles (informational value), but the onclick handler and pointer cursor are only attached for owners. Non-owners see the alert text without being routed to the owner-only /settings page.

@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Addressed in 9153a74

Low — Onboarding /settings dead-end for non-owners
Step 4 CTAs are now role-aware via window._userRole. For non-owners:

  • Monitoring mode: description says "An owner will configure…" and only shows a Dashboard button (no /settings link)
  • Mixed mode: description drops the Settings reference and shows Dashboard + "Deploy New Services" (/setup) instead
  • Fresh mode: unchanged (already links to /catalog and /setup, both accessible to all roles)

- 12 tests for payout eligibility: zero-threshold, normal threshold,
  empty cashout, edge cases (parametrized)
- 3 tests for storj: api_url marked optional, collector created without
  config using default URL, custom URL passed through
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Addressed in 55d8385

Onboarding /settings dead-end was fixed in the previous commit (9153a74).

Test coverage for the residual gaps the reviewer flagged:

  • tests/test_eligibility.py — 12 tests covering zero-threshold (min_amount=0) positive/zero balance, normal thresholds (above/exact/below), empty cashout dict, and parametrized edge cases
  • tests/test_collectors.py — 3 new tests: ?api_url marker present, make_collectors creates StorjCollector with default URL when no config, custom URL passes through

Full suite: 423 tests green, ruff check . and ruff format --check . clean.

The non-owner collector alert and onboarding CTA behaviors are client-side JS — those would need an integration/E2E framework to cover, which is outside the scope of this PR.

Rewrite test_eligibility.py to call the actual api_earnings_breakdown
handler with mocked DB/catalog/auth deps instead of a mirrored boolean
expression. Tests exercise real route wiring and response assembly.

Skips gracefully in minimal local envs (no fastapi), runs in CI where
full deps are installed via requirements.txt.
@GeiserX
Copy link
Copy Markdown
Owner Author

GeiserX commented Mar 30, 2026

Addressed in 117f5c4

Test quality — eligibility tests now exercise the real handler

Rewrote tests/test_eligibility.py to call the actual api_earnings_breakdown route handler with mocked database.get_earnings_per_service, catalog.get_service, and auth.get_current_user. No more mirrored boolean expression — a regression in route wiring or response assembly will now be caught.

14 tests cover: zero-threshold eligible/ineligible, normal thresholds (above/exact/below), missing cashout section, unknown service fallback, full response structure validation, delta computation, and parametrized edge cases.

Skips gracefully in minimal local envs (try/except ImportErrorpytest.skip), runs in CI where pip install -r requirements.txt provides the full dep chain.

Local: 411 passed, 1 skipped. CI: expect 425 passed (14 eligibility tests will run).

GeiserX added 2 commits March 31, 2026 01:02
CI installs pytest but not pytest-asyncio. Convert async test methods
to sync methods that call asyncio.run() to invoke the async handler.
Comprehensive security audit and hardening (PR #11):
- Fleet key bootstrap, RBAC enforcement, role-aware UI gating
- Zero-threshold payout eligibility, Storj default URL
- Integration test coverage for eligibility and collectors
@GeiserX GeiserX merged commit d2e29ec into main Mar 30, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant