feat(api): add PAKE pairing and AES-256-GCM WebSocket encryption by wizzomafizzo · Pull Request #630 · ZaparooProject/zaparoo-core

wizzomafizzo · 2026-04-09T08:04:59Z

Summary

PAKE-based client pairing (schollz/pake/v3, P-256) with 6-digit PIN and HMAC confirmation
AES-256-GCM per-session encryption for WebSocket with HKDF-derived directional keys
Non-WS transports (HTTP POST, SSE, REST GET) locked to localhost by default
Paired client management: Clients table, clients/clients.delete RPC methods
mDNS TXT records stripped (no version/platform/id broadcast)
Client SDK docs at docs/api/encryption.md

Encryption defaults to off. Localhost is always exempt. This is PR1 (server); TUI pairing menu is PR2.

Transport policy changes

When AllowedIPs is empty (default) and encryption is off (default):

Transport	Localhost	Remote (old)	Remote (new)
WebSocket	Open	Open (API key)	Open (API key) — no change
HTTP POST (`/api`)	Open	Open (API key)	Blocked (403)
SSE (`/api/events`)	Open	Open (API key)	Blocked (403)
REST GET (`/run/*` etc)	Open	Open (API key)	Blocked (403)
Pairing (`/api/pair/*`)	Open	N/A (new)	Open (rate-limited)
Health (`/health`)	Open	Open	Open — no change
App (`/app/*`)	Open	Open	Open — no change

Remote HTTP POST, SSE, and REST GET users who currently rely on an empty AllowedIPs must add their IP/CIDR to restore access. WebSocket (the primary transport for all clients) is unaffected.

When AllowedIPs is populated, listed IPs can access all transports as before.

New dependencies

github.com/schollz/pake/v3 v3.1.1 (PAKE protocol)
github.com/google/uuid (already indirect, now direct for client IDs)

Summary by CodeRabbit

New Features
- End-to-end WebSocket encryption with a PIN-based pairing flow (PAKE), per-session keys, and encrypted JSON-RPC framing.
- Pairing HTTP endpoints (/api/pair/start, /api/pair/finish) with rate limiting and a one-time PIN flow.
- Localhost-only paired-client management: list and delete paired devices; server emits a "clients.paired" notification.
- Config toggle to enable/disable WebSocket encryption and automatic last-seen tracking for paired clients.
- Non-WebSocket HTTP/SSE routes are now restricted by allowlist; pairing endpoints remain remotely reachable.
Documentation
- New comprehensive guide documenting the WebSocket encryption protocol, pairing flow, and client examples.
Bug Fixes / Reliability
- Background cleanup, rate-limit and replay protections, and counter exhaustion guards to improve session robustness.

Add application-layer encryption for WebSocket connections using PAKE-based pairing (schollz/pake/v3 P-256) and AES-256-GCM with HKDF-derived per-session keys. Pairing flow: device displays a 6-digit PIN, client runs PAKE exchange via /api/pair/start + /api/pair/finish with HMAC confirmation, server persists the paired client with a 32-byte long-term key. Encrypted sessions derive directional keys from the pairing key + a per-connection random salt. Transport policy: WebSocket supports both encrypted and plaintext (controlled by Service.Encryption config). Non-WS transports (HTTP POST, SSE) are locked to localhost by default via NonWSIPFilterMiddleware. Pairing, health, and app routes remain remote-accessible. Also includes: - Clients table migration + CRUD (UserDBI interface) - clients/clients.delete RPC methods (localhost-only) - Per-session failed-decrypt rate limiter with exponential backoff - Salt replay deduplication - LastSeenAt tracker with batched DB flush - Melody buffer overflow handler (force-close to prevent counter desync) - mDNS TXT record stripping (no version/platform/id broadcast) - Client SDK documentation (docs/api/encryption.md) Encryption defaults to off. Localhost is always exempt from encryption requirements.

coderabbitai · 2026-04-09T08:05:14Z

📝 Walkthrough

Walkthrough

Adds PAKE2 HTTP pairing, HKDF/AES-256-GCM per-session WebSocket encryption with first-frame handshake, paired-client persistence and management, IP-filter and rate-limit changes, last-seen batching, DB migration and SQL for clients, config toggle for encryption, comprehensive tests, and encryption documentation.

Changes

Cohort / File(s)	Summary
Documentation `docs/api/encryption.md`	New specification for PAKE pairing, encrypted WebSocket first-frame/subsequent-frame protocol, HKDF session derivation, nonce/counter rules, limits, rate-limits, error semantics, platform deps, and a JS example.
Module deps `go.mod`	Added `github.com/schollz/pake/v3 v3.1.1` (plus indirect curve deps).
Crypto primitives & tests `pkg/api/crypto/crypto.go`, `pkg/api/crypto/crypto_test.go`	HKDF-based session key derivation, AES-256-GCM AEAD helpers, nonce-by-counter XOR, counter-exhaustion sentinel errors, and unit tests for round-trips, directional separation, AAD/tamper checks, and exhaustion.
Pairing manager & handlers `pkg/api/pairing.go`, `pkg/api/pairing_test.go`	New `PairingManager` with `/api/pair/start` and `/api/pair/finish` implementing PAKE+PIN flow, session/PIN/attempt limits, HKDF-derived pairingKey, client persistence, notifications, cleanup loop, and extensive tests (including concurrency and error cases).
Encryption gateway & middleware `pkg/api/middleware/encryption.go`, `pkg/api/middleware/encryption_test.go`, `pkg/api/middleware/export_test.go`	New `EncryptionGateway` and `ClientSession` managing salt dedup, failure/backoff tracking, per-connection AEADs/counters/AAD, first-frame establishment and subsequent-frame decryption/encryption, cleanup, and broad tests; test-only exports expose internal state.
Server WebSocket integration `pkg/api/server.go`, `pkg/api/server_encryption.go`, `pkg/api/server_encryption_test.go`	WS handler updated to support encrypted first-frame/frames, integrate pairing/encryption/last-seen components, plaintext/error builders, send/close semantics for encrypted sessions, and tests for pong/encryption shapes and remote-IP parsing.
IP filtering & wiring tests `pkg/api/middleware/ipfilter.go`, `pkg/api/middleware/ipfilter_test.go`, `pkg/api/router_wiring_test.go`	Replaced IPFilter with `NonWSIPFilterMiddleware` (loopback allowed, non-WS require allowlist), updated tests and router wiring to ensure pairing & WS reachability and pairing-specific rate-limiting placement.
Client management API & tests `pkg/api/methods/clients.go`, `pkg/api/methods/clients_test.go`	New localhost-only handlers to list and delete paired clients, with validation, DB calls, and unit tests for permission/DB error paths.
Models, notifications & JSON tests `pkg/api/models/models.go`, `pkg/api/models/responses.go`, `pkg/api/models/responses_test.go`, `pkg/api/notifications/notifications.go`	Added `PairedClient`, `ClientsResponse`, `ClientsDeleteParams`, `ClientsPairedNotification`, `NotificationClientsPaired` constant; added `ErrorObject.Data` field; `ClientsPaired` dispatcher; JSON-shape tests to prevent sensitive-field leakage.
Last-seen batching & tests `pkg/api/middleware/lastseen.go`, `pkg/api/middleware/lastseen_test.go`	New `LastSeenTracker` with Touch/Flush/StartFlushLoop to batch UpdateClientLastSeen, requeue on cancellation, and tests for behavior and final flush on shutdown.
Rate limiter changes & tests `pkg/api/middleware/ratelimit.go`, `pkg/api/middleware/ratelimit_test.go`	`IPRateLimiter` supports configurable rate/burst via `NewIPRateLimiterWithLimits`; WebSocket handler now closes connection on rate exceed; tests added/updated.
Database: client model, SQL & migrations `pkg/database/database.go`, `pkg/database/userdb/.go`, `pkg/database/userdb/migrations/..._create_clients_table.sql`, `pkg/database/userdb/_test.go`, `pkg/database/userdb/sql.go`, `pkg/database/userdb/sql_test.go`, `pkg/database/userdb/userdb_integration_test.go`	Added `Client` type and `UserDBI` client CRUD/lookup/count/lastSeen methods, SQL implementations with auth-token validation and sentinel errors, migration to create `Clients` table with constraints, and unit/integration tests.
Testing & mocks `pkg/api/integration_encryption_test.go`, `pkg/testing/helpers/db_mocks.go`	Integration test for pairing → encrypted-session flow (including revoked and wrong-key cases); extended `MockUserDBI` with client CRUD methods for tests.
Configuration API & tests `pkg/config/configservice.go`, `pkg/config/configencryption.go`, `pkg/config/configencryption_test.go`	Added `Service.Encryption bool` config field and thread-safe getters/setters (`EncryptionEnabled`/`SetEncryptionEnabled`) with tests and TOML round-trip.
Service discovery changes & tests `pkg/service/discovery/discovery.go`, `pkg/service/discovery/discovery_test.go`, `pkg/service/service.go`	Removed platform ID from discovery TXT records (default TXT now empty), changed `discovery.New` signature, and updated call sites/tests.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant HTTP as HTTP (Pairing)
    participant PM as PairingManager
    participant Crypto as Crypto/PAKE
    participant DB as UserDB
    participant Gateway as EncryptionGateway

    Client->>HTTP: POST /api/pair/start (base64 msgA)
    HTTP->>PM: HandlePairStart()
    PM->>Crypto: PAKE responder -> compute msgB
    Crypto-->>PM: msgB, sessionID
    PM-->>HTTP: {msgB, sessionID}
    HTTP-->>Client: {msgB, sessionID}

    Client->>HTTP: POST /api/pair/finish (HMAC, sessionID)
    HTTP->>PM: HandlePairFinish()
    PM->>PM: Verify HMAC, derive pairingKey (HKDF)
    PM->>DB: CreateClient {ClientID, AuthToken, PairingKey}
    DB-->>PM: success
    PM-->>HTTP: {AuthToken, ServerHMAC}
    HTTP-->>Client: {AuthToken, ServerHMAC}

    Client->>Gateway: EncryptedFirstFrame {salt, authToken, ciphertext}
    Gateway->>DB: GetClientByToken(authToken)
    DB-->>Gateway: Client{PairingKey,...}
    Gateway->>Crypto: DeriveSessionKeys(pairingKey, salt)
    Crypto-->>Gateway: SessionKeys
    Gateway->>Crypto: Decrypt(ciphertext)
    Crypto-->>Gateway: plaintext -> session established

sequenceDiagram
    participant Client
    participant WS as WebSocket Handler
    participant Gateway as EncryptionGateway
    participant Tracker as LastSeenTracker
    participant Crypto as Crypto/AEAD
    participant DB as UserDB

    Client->>WS: EncryptedFrame {ciphertext, counter}
    WS->>Gateway: DecryptSubsequent(frame)
    Gateway->>Crypto: Decrypt(ciphertext, counter, AAD)
    Crypto-->>Gateway: plaintext JSON-RPC
    Gateway-->>WS: decrypted request
    WS->>Tracker: Touch(authToken, now)
    WS->>WS: Handle request -> response
    WS->>Crypto: Encrypt(response, counter, AAD)
    Crypto-->>WS: ciphertext
    WS-->>Client: EncryptedFrame {ciphertext}
    Tracker->>DB: Periodic Flush -> UpdateClientLastSeen(authToken, ts)
    DB-->>Tracker: OK

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐰 I hopped a PIN into the moonlit key,

PAKE sang softly — secrets sown with glee.
Salts and counters tumble, coins that clink,
C2S, S2C: I guard the ciphered link.
Encrypted carrots snug — a bunny's wink.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 45.90% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'feat(api): add PAKE pairing and AES-256-GCM WebSocket encryption' accurately and concisely summarizes the main changes: introduction of PAKE-based client pairing and AES-256-GCM encryption for WebSocket connections. It is specific, clear, and captures the primary feature additions from the changeset.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/api-encryption

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sentry · 2026-04-09T08:11:05Z

Codecov Report

❌ Patch coverage is 68.39080% with 330 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
pkg/api/server.go	24.43%	127 Missing and 6 partials ⚠️
pkg/api/middleware/encryption.go	77.89%	45 Missing and 16 partials ⚠️
pkg/api/pairing.go	78.36%	44 Missing and 17 partials ⚠️
pkg/api/server_encryption.go	32.50%	26 Missing and 1 partial ⚠️
pkg/database/userdb/sql.go	77.02%	9 Missing and 8 partials ⚠️
pkg/api/crypto/crypto.go	77.04%	7 Missing and 7 partials ⚠️
pkg/api/middleware/lastseen.go	80.85%	7 Missing and 2 partials ⚠️
pkg/api/methods/clients.go	90.00%	2 Missing and 1 partial ⚠️
pkg/api/middleware/ratelimit.go	75.00%	1 Missing and 1 partial ⚠️
pkg/api/notifications/notifications.go	0.00%	2 Missing ⚠️
... and 1 more

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (6)

pkg/api/notifications/notifications.go (1)
127-129: Consider classifying clients.paired as a critical notification.

This helper adds a new user-visible state transition, but the method is not included in criticalNotifications, so it will still be silently downgraded to a WARN-level drop when the channel is full. If pairing UX depends on this event, that failure mode will be hard to spot.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/notifications/notifications.go` around lines 127 - 129, The
ClientsPaired helper sends a user-visible transition
(models.NotificationClientsPaired) but isn’t in the criticalNotifications set,
so drops are logged at WARN when the notifications channel is full; add
models.NotificationClientsPaired to the criticalNotifications collection (the
same variable used to determine criticality in sendNotification) so
ClientsPaired events are treated as critical and logged/handled accordingly.
pkg/database/userdb/sql.go (1)
827-836: Consider whether silent no-op on missing token is intentional.

sqlUpdateClientLastSeen doesn't check RowsAffected(), so updating a non-existent token silently succeeds. This may be intentional for the batched last-seen tracker (where the client might have been deleted between touch and flush), but differs from sqlDeleteClient which returns "not found" on zero rows.

If the silent behavior is intentional (which seems reasonable for the async last-seen use case), a brief comment would clarify this design choice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/database/userdb/sql.go` around lines 827 - 836, sqlUpdateClientLastSeen
currently executes an UPDATE and returns success even if no rows were modified
(it does not call RowsAffected), which can silently ignore missing tokens; if
this silent no-op is intentional for the async/batched last-seen tracker, add a
concise clarifying comment above sqlUpdateClientLastSeen stating that missing
clients are ignored on purpose (mentioning the async flush/touch use-case and
that sqlDeleteClient deliberately differs by returning "not found"), otherwise
modify sqlUpdateClientLastSeen to check result.RowsAffected() and return a
not-found error when zero rows are affected.
pkg/api/server_encryption_test.go (1)
233-239: Use EncryptionProtoVersion here instead of hardcoding 1.

This test is pinning the wire shape. Hardcoding the version means a legitimate protocol bump breaks the test even if unsupportedEncryptionVersionResponse() is still correct.
Proposed change
-	assert.InDelta(t, float64(1), supported[0], 0,
+	assert.InDelta(t, float64(apimiddleware.EncryptionProtoVersion), supported[0], 0,
 		"only protocol version 1 is currently supported")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/server_encryption_test.go` around lines 233 - 239, The test currently
hardcodes the protocol version as float64(1); replace that literal with the
package constant EncryptionProtoVersion so the test tracks the actual protocol
constant (e.g., use float64(EncryptionProtoVersion)) and keep the same InDelta
check and message; update the assertion line that calls assert.InDelta(t, ...)
to convert EncryptionProtoVersion to float64 and use it instead of 1 so a legit
protocol bump won't break the test while unsupportedEncryptionVersionResponse()
remains validated.
pkg/api/middleware/lastseen.go (1)
91-97: Re-queue failed last-seen writes instead of dropping them.

A transient UpdateClientLastSeen error is only logged here, so that timestamp is lost permanently unless the client happens to send more traffic later. Re-queueing the failed entry for the next flush would make this path much more resilient.
Possible fix
 		if err := t.db.UpdateClientLastSeen(token, ts); err != nil {
 			log.Warn().
 				Err(err).
 				Str("auth_token", redactToken(token)).
 				Msg("encryption: failed to persist client last seen")
+			t.mu.Lock()
+			if existing, ok := t.dirty[token]; !ok || existing < ts {
+				t.dirty[token] = ts
+			}
+			t.mu.Unlock()
 		}
 		delete(snapshot, token)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/middleware/lastseen.go` around lines 91 - 97, The current loop drops
last-seen timestamps on transient failures because it always deletes the
snapshot entry after calling t.db.UpdateClientLastSeen; change the logic so that
when t.db.UpdateClientLastSeen(token, ts) returns an error you do not delete the
entry but instead re-queue it for the next flush (e.g., leave it in snapshot or
push it into a retry queue), and only delete(snapshot, token) on success; ensure
the retry path is concurrency-safe and bounded (avoid unbounded growth) and keep
the existing log (log.Warn().Err(err).Str("auth_token",
redactToken(token)).Msg(...)) when re-queuing so failures are visible while
preserving the timestamp for the next flush.
pkg/api/pairing_test.go (1)
161-172: Consider potential flakiness in time-based tests.

Tests using short TTLs (10-15ms) with time.Sleep can be flaky under CI load. The margins here (20ms sleep for 10ms TTL) seem reasonable, but if these become flaky, consider using a clock abstraction.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/pairing_test.go` around lines 161 - 172, TestStartPairing_AfterExpiry
uses a very short real-time TTL and time.Sleep which can be flaky on CI; update
the test to be robust by either using the test clock abstraction (replace real
time with a controllable clock in newPairingHarness and advance the clock
instead of sleeping) or, if a clock abstraction isn’t available, increase the
TTL and sleep margins (e.g., use WithPairingPINTTL(50ms) and sleep 100ms) so
h.mgr.StartPairing calls reliably observe expiry; ensure changes reference
TestStartPairing_AfterExpiry, newPairingHarness, WithPairingPINTTL, and
h.mgr.StartPairing when locating and updating the test.
pkg/api/integration_encryption_test.go (1)
364-367: Consider asserting the specific error type.

The test verifies an error occurs but doesn't distinguish between decryption failure and other errors (e.g., rate limiting). Consider checking that the error wraps a GCM authentication failure to ensure the test validates the intended security property.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/integration_encryption_test.go` around lines 364 - 367, The test
currently only checks that encGateway.EstablishSession returns an error via
require.Error; change it to assert the specific decryption/GCM-authentication
failure so other errors (rate limiting, network) don't pass. Replace the generic
require.Error(t, err, ...) with a targeted check such as require.ErrorIs(t, err,
expectedAuthErr) if your code exposes a sentinel (e.g., ErrGCMAuth), or assert
strings.Contains(err.Error(), "message authentication failed") / errors.Is(err,
cipher.ErrOpen) (or use errors.As to unwrap a wrapped AEAD/Open error) after
calling encGateway.EstablishSession to ensure the error is the GCM auth failure.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/api/encryption.md`:
- Around line 29-77: Replace the unlabeled fenced code blocks in the
encryption.md protocol transcript with labeled fences using the text language
tag (i.e., change ``` to ```text) so markdownlint warnings stop and rendering
stays consistent; update every similar unlabeled fence in the file (including
the visual TUI/Server/Client transcript and the other protocol blocks referenced
in the review) to use ```text.

In `@pkg/api/middleware/ipfilter_test.go`:
- Around line 216-356: Add a new unit test in the ipfilter_test suite that
exercises the code path where ParseRemoteIP(...) returns nil by supplying a
malformed req.RemoteAddr (e.g. "bad-addr" or missing port) to
NonWSIPFilterMiddleware with a non-empty allowlist; assert that the middleware
does not call the next handler and returns http.StatusForbidden. This should
reference the existing test pattern (use next http.HandlerFunc, wrapped :=
NonWSIPFilterMiddleware(provider/ipsProvider(...)), set req.RemoteAddr to the
malformed value, call wrapped.ServeHTTP, and assert called==false and
recorder.Code==http.StatusForbidden so the ParseRemoteIP == nil branch is
covered.

In `@pkg/database/userdb/clients_test.go`:
- Around line 131-145: The tests like TestSqlGetClientByToken_NotFound set
sqlmock expectations but never verify them; after exercising the code (e.g.,
after calling sqlGetClientByToken) add a call to mock.ExpectationsWereMet() and
assert no error (e.g., require.NoError(t, mock.ExpectationsWereMet())) to ensure
the expected query was actually executed; apply the same change to the other
expectation-based tests in this file (the tests around the other
sqlGetClientByToken/insert/update test cases) so each test verifies mock
expectations before returning.

In `@pkg/database/userdb/migrations/20260407062033_create_clients_table.sql`:
- Around line 11-13: The clients table currently allows any non-NULL BLOB for
PairingKey; update the column definition to enforce the fixed byte length used
by your pairing flow by adding a CHECK on length(PairingKey) (e.g. PairingKey
BLOB NOT NULL CHECK(length(PairingKey)=32)); modify the CREATE TABLE migration
that defines PairingKey to include this CHECK so malformed/truncated keys are
rejected at insert time.

---

Nitpick comments:
In `@pkg/api/integration_encryption_test.go`:
- Around line 364-367: The test currently only checks that
encGateway.EstablishSession returns an error via require.Error; change it to
assert the specific decryption/GCM-authentication failure so other errors (rate
limiting, network) don't pass. Replace the generic require.Error(t, err, ...)
with a targeted check such as require.ErrorIs(t, err, expectedAuthErr) if your
code exposes a sentinel (e.g., ErrGCMAuth), or assert
strings.Contains(err.Error(), "message authentication failed") / errors.Is(err,
cipher.ErrOpen) (or use errors.As to unwrap a wrapped AEAD/Open error) after
calling encGateway.EstablishSession to ensure the error is the GCM auth failure.

In `@pkg/api/middleware/lastseen.go`:
- Around line 91-97: The current loop drops last-seen timestamps on transient
failures because it always deletes the snapshot entry after calling
t.db.UpdateClientLastSeen; change the logic so that when
t.db.UpdateClientLastSeen(token, ts) returns an error you do not delete the
entry but instead re-queue it for the next flush (e.g., leave it in snapshot or
push it into a retry queue), and only delete(snapshot, token) on success; ensure
the retry path is concurrency-safe and bounded (avoid unbounded growth) and keep
the existing log (log.Warn().Err(err).Str("auth_token",
redactToken(token)).Msg(...)) when re-queuing so failures are visible while
preserving the timestamp for the next flush.

In `@pkg/api/notifications/notifications.go`:
- Around line 127-129: The ClientsPaired helper sends a user-visible transition
(models.NotificationClientsPaired) but isn’t in the criticalNotifications set,
so drops are logged at WARN when the notifications channel is full; add
models.NotificationClientsPaired to the criticalNotifications collection (the
same variable used to determine criticality in sendNotification) so
ClientsPaired events are treated as critical and logged/handled accordingly.

In `@pkg/api/pairing_test.go`:
- Around line 161-172: TestStartPairing_AfterExpiry uses a very short real-time
TTL and time.Sleep which can be flaky on CI; update the test to be robust by
either using the test clock abstraction (replace real time with a controllable
clock in newPairingHarness and advance the clock instead of sleeping) or, if a
clock abstraction isn’t available, increase the TTL and sleep margins (e.g., use
WithPairingPINTTL(50ms) and sleep 100ms) so h.mgr.StartPairing calls reliably
observe expiry; ensure changes reference TestStartPairing_AfterExpiry,
newPairingHarness, WithPairingPINTTL, and h.mgr.StartPairing when locating and
updating the test.

In `@pkg/api/server_encryption_test.go`:
- Around line 233-239: The test currently hardcodes the protocol version as
float64(1); replace that literal with the package constant
EncryptionProtoVersion so the test tracks the actual protocol constant (e.g.,
use float64(EncryptionProtoVersion)) and keep the same InDelta check and
message; update the assertion line that calls assert.InDelta(t, ...) to convert
EncryptionProtoVersion to float64 and use it instead of 1 so a legit protocol
bump won't break the test while unsupportedEncryptionVersionResponse() remains
validated.

In `@pkg/database/userdb/sql.go`:
- Around line 827-836: sqlUpdateClientLastSeen currently executes an UPDATE and
returns success even if no rows were modified (it does not call RowsAffected),
which can silently ignore missing tokens; if this silent no-op is intentional
for the async/batched last-seen tracker, add a concise clarifying comment above
sqlUpdateClientLastSeen stating that missing clients are ignored on purpose
(mentioning the async flush/touch use-case and that sqlDeleteClient deliberately
differs by returning "not found"), otherwise modify sqlUpdateClientLastSeen to
check result.RowsAffected() and return a not-found error when zero rows are
affected.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 60ffb82e-4a91-4975-b3ea-f041fa19b0ba

📥 Commits

Reviewing files that changed from the base of the PR and between 235f6ea and 2f01125.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (39)

docs/api/encryption.md
go.mod
pkg/api/crypto/crypto.go
pkg/api/crypto/crypto_test.go
pkg/api/integration_encryption_test.go
pkg/api/methods/clients.go
pkg/api/methods/clients_test.go
pkg/api/middleware/encryption.go
pkg/api/middleware/encryption_test.go
pkg/api/middleware/export_test.go
pkg/api/middleware/ipfilter.go
pkg/api/middleware/ipfilter_test.go
pkg/api/middleware/lastseen.go
pkg/api/middleware/lastseen_test.go
pkg/api/middleware/ratelimit.go
pkg/api/models/models.go
pkg/api/models/responses.go
pkg/api/models/responses_test.go
pkg/api/notifications/notifications.go
pkg/api/pairing.go
pkg/api/pairing_test.go
pkg/api/router_wiring_test.go
pkg/api/server.go
pkg/api/server_encryption.go
pkg/api/server_encryption_test.go
pkg/config/configencryption.go
pkg/config/configencryption_test.go
pkg/config/configservice.go
pkg/database/database.go
pkg/database/userdb/clients.go
pkg/database/userdb/clients_test.go
pkg/database/userdb/migrations/20260407062033_create_clients_table.sql
pkg/database/userdb/sql.go
pkg/database/userdb/sql_test.go
pkg/database/userdb/userdb_integration_test.go
pkg/service/discovery/discovery.go
pkg/service/discovery/discovery_test.go
pkg/service/service.go
pkg/testing/helpers/db_mocks.go

Add ratelimit_test.go covering IPRateLimiter, HTTP middleware, and WebSocket handler (real melody upgrade). Fix review findings: add markdownlint text fences in encryption.md, cover ParseRemoteIP nil branch in ipfilter tests, add ExpectationsWereMet to 6 client mock tests, add PairingKey length CHECK constraint to migration, widen flaky pairing TTL margins, document intentional no-op in sqlUpdateClientLastSeen.

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (3)

pkg/api/middleware/ratelimit_test.go (2)

207-213: Avoid fixed sleeps in WS async assertions to reduce CI flakiness.

The two fixed delays make this test timing-sensitive. Prefer polling-with-timeout (eventually pattern) for handler-call and close assertions.

⏱️ Suggested test hardening

-	time.Sleep(20 * time.Millisecond)
-	assert.Equal(t, int32(1), handlerCalls.Load(), "first message should be handled")
+	deadline := time.Now().Add(500 * time.Millisecond)
+	for time.Now().Before(deadline) && handlerCalls.Load() < 1 {
+		time.Sleep(5 * time.Millisecond)
+	}
+	assert.Equal(t, int32(1), handlerCalls.Load(), "first message should be handled")
 
 	// Second message should trigger rate limit and close the connection.
 	require.NoError(t, conn.WriteMessage(websocket.TextMessage, []byte("again")))
-	time.Sleep(50 * time.Millisecond)
 	assert.Equal(t, int32(1), handlerCalls.Load(), "second message should not reach handler")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/api/middleware/ratelimit_test.go` around lines 207 - 213, Replace the
fixed time.Sleep calls in the test with polling-with-timeout assertions: instead
of sleeping 20ms and 50ms, poll until handlerCalls.Load() equals the expected
value (use a short interval loop with a deadline) after
conn.WriteMessage(websocket.TextMessage, []byte("again")), and similarly poll
for the connection close condition; update assertions that reference
handlerCalls.Load() and the post-write close check to use this
eventually/polling pattern to avoid timing flakiness.

98-101: StartCleanup test is non-verifying and may miss regressions.

This test currently can pass even if cancel handling regresses (sleep-only, no observable assertion). Please assert a concrete effect (or introduce a test hook) so cancellation behavior is actually validated.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/api/middleware/ratelimit_test.go` around lines 98 - 101, The test for
StartCleanup currently only sleeps and doesn't verify cancellation; update it to
observe a concrete effect from StartCleanup (e.g., the cleanup goroutine
exiting) by adding a test hook or synchronization primitive: modify StartCleanup
to accept (or use) an optional done channel or call a package-level test-only
callback (e.g., cleanupDone chan struct{} or onCleanupExited func()) that is
triggered when the goroutine ends, then in the test (ratelimit_test.go) replace
time.Sleep(10 * time.Millisecond) with waiting on that channel/callback with a
timeout and assert it fires before the timeout so cancellation is actually
validated. Ensure references to StartCleanup and the chosen test hook
(cleanupDone or onCleanupExited) are used so the test deterministically observes
goroutine exit.

pkg/api/middleware/ipfilter_test.go (1)

216-377: Consider a small test helper to reduce setup duplication.

The repeated next/mw/wrapped/request/recorder scaffolding can be extracted to keep future additions easier to maintain.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/api/middleware/ipfilter_test.go` around lines 216 - 377, Extract the
duplicated test scaffolding into a small helper (e.g., setupIPTest or
newIPTestHelper) that creates the called flag and next http.HandlerFunc,
constructs the middleware via NonWSIPFilterMiddleware(ipsProvider(...)) or with
a provider func, builds the httptest.Request with a provided RemoteAddr, and
returns the wrapped handler, recorder, request, and a pointer or accessor to the
called boolean so tests like TestNonWSIPFilterMiddleware_LoopbackAlwaysAllowed,
_RemoteEmptyAllowlistDenied, _RemoteAllowlistedIP, _RemoteAllowlistedCIDR,
_RemoteNotInAllowlist, _MalformedRemoteAddr and _HotReload can call the helper
with different allowlists/addresses and assert on called and recorder.Code to
remove repeated next/mw/wrapped/request/recorder setup.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/api/encryption.md`:
- Line 216: The docs row currently suggests "hand-rolled EC math" for Swift;
remove that wording and replace it with recommendations for vetted,
platform-approved libraries only (e.g., Swift Crypto/CryptoKit, OpenSSL
bindings, or other well-maintained crypto libraries) so the table entry for
Swift reads something like "OpenSSL binding or vetted crypto library
(CryptoKit/Swift Crypto)" — update the cell text where it currently contains
"hand-rolled EC math" and ensure the wording explicitly discourages
custom/hand-rolled EC implementations.
- Line 11: Update the "Per-session keys" sentence in docs/api/encryption.md to
correctly describe the HKDF inputs: state that the server uses the pairing key
as the HKDF input keying material (IKM) and the session salt as the HKDF salt
(not a concatenated pairingKey || salt), so SDK authors derive keys the same way
as pkg/api/crypto/crypto.go; replace the current wording to explicitly mention
IKM and salt roles and that counters reset to 0 each session.
- Around line 299-345: connectEncrypted currently returns immediately so callers
can call sendRPC before the WebSocket is open and hit InvalidStateError; fix by
delaying exposure of a usable sendRPC until ws is open (or making sendRPC await
an openPromise). Specifically, inside connectEncrypted create an openPromise
resolved in ws.onopen and rejected in ws.onclose/onerror, then either await that
promise before returning { sendRPC } or change sendRPC to await openPromise (and
reject if the socket closes), and ensure ws.onopen/onerror/onclose handlers are
added so sendRPC only calls ws.send when ws.readyState === WebSocket.OPEN.

In `@pkg/api/pairing_test.go`:
- Around line 606-610: The tests (e.g.,
TestHandlePairFinish_AuditLogsHMACMismatch and the similar block at lines
680-684) are mutating the global log.Logger which races with parallel tests;
change the tests to stop swapping the global logger by either (A) injecting a
logger into the code under test: add a logger parameter/field to PairingManager
and the relevant handler constructors/Call sites so tests can pass a
zerolog.Logger instance and capture output, or (B) if injection is too invasive,
serialize the global swap by using a package-level test mutex to guard any test
that sets log.Logger; update the tests to use one of these approaches and remove
direct assignments to log.Logger in TestHandlePairFinish_AuditLogsHMACMismatch
(and the other affected test) so no global mutation occurs during t.Parallel()
runs.
- Around line 686-695: The test currently hardcodes a "wrong" PIN ("000000") but
doesn't compare it to the PIN returned by StartPairing(), so if StartPairing()
returns "000000" the test can falsely pass; update the test around
newPairingHarness/StartPairing to capture the actual PIN returned (e.g., pin, _,
err := h.mgr.StartPairing()), then ensure the wrong PIN used for pake.InitCurve
is different (if pin == "000000" pick a different value such as "000001" or
generate until different) before calling pake.InitCurve and creating wrongPake
so the handshake is guaranteed to fail and the exhaustion/audit path is
exercised.

---

Nitpick comments:
In `@pkg/api/middleware/ipfilter_test.go`:
- Around line 216-377: Extract the duplicated test scaffolding into a small
helper (e.g., setupIPTest or newIPTestHelper) that creates the called flag and
next http.HandlerFunc, constructs the middleware via
NonWSIPFilterMiddleware(ipsProvider(...)) or with a provider func, builds the
httptest.Request with a provided RemoteAddr, and returns the wrapped handler,
recorder, request, and a pointer or accessor to the called boolean so tests like
TestNonWSIPFilterMiddleware_LoopbackAlwaysAllowed, _RemoteEmptyAllowlistDenied,
_RemoteAllowlistedIP, _RemoteAllowlistedCIDR, _RemoteNotInAllowlist,
_MalformedRemoteAddr and _HotReload can call the helper with different
allowlists/addresses and assert on called and recorder.Code to remove repeated
next/mw/wrapped/request/recorder setup.

In `@pkg/api/middleware/ratelimit_test.go`:
- Around line 207-213: Replace the fixed time.Sleep calls in the test with
polling-with-timeout assertions: instead of sleeping 20ms and 50ms, poll until
handlerCalls.Load() equals the expected value (use a short interval loop with a
deadline) after conn.WriteMessage(websocket.TextMessage, []byte("again")), and
similarly poll for the connection close condition; update assertions that
reference handlerCalls.Load() and the post-write close check to use this
eventually/polling pattern to avoid timing flakiness.
- Around line 98-101: The test for StartCleanup currently only sleeps and
doesn't verify cancellation; update it to observe a concrete effect from
StartCleanup (e.g., the cleanup goroutine exiting) by adding a test hook or
synchronization primitive: modify StartCleanup to accept (or use) an optional
done channel or call a package-level test-only callback (e.g., cleanupDone chan
struct{} or onCleanupExited func()) that is triggered when the goroutine ends,
then in the test (ratelimit_test.go) replace time.Sleep(10 * time.Millisecond)
with waiting on that channel/callback with a timeout and assert it fires before
the timeout so cancellation is actually validated. Ensure references to
StartCleanup and the chosen test hook (cleanupDone or onCleanupExited) are used
so the test deterministically observes goroutine exit.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cd837256-5b85-4760-97b2-b1d8dacc6952

📥 Commits

Reviewing files that changed from the base of the PR and between a22ed51 and b28a12c.

📒 Files selected for processing (8)

docs/api/encryption.md
pkg/api/middleware/ipfilter_test.go
pkg/api/middleware/ratelimit_test.go
pkg/api/pairing_test.go
pkg/database/userdb/clients_test.go
pkg/database/userdb/migrations/20260407062033_create_clients_table.sql
pkg/database/userdb/sql.go
pkg/database/userdb/userdb_integration_test.go

✅ Files skipped from review due to trivial changes (2)

pkg/database/userdb/migrations/20260407062033_create_clients_table.sql
pkg/database/userdb/userdb_integration_test.go

🚧 Files skipped from review as they are similar to previous changes (2)

pkg/database/userdb/clients_test.go
pkg/database/userdb/sql.go

… JS example Fix encryption.md: describe HKDF inputs as IKM/salt (not concatenation), replace "hand-rolled EC math" with Swift Crypto, and add ws.onopen guard to JS example so sendRPC doesn't throw before socket is open. Fix exhaustion test to check actual PIN before using hardcoded wrong PIN.

coderabbitai

♻️ Duplicate comments (2)

pkg/api/pairing_test.go (2)

252-257: ⚠️ Potential issue | 🟡 Minor

Guarantee the “wrong PIN” is actually wrong.

If StartPairing() returns "999999", this becomes a valid handshake and the test turns nondeterministic. Mirror the guard you already added in the exhaustion test.

Minimal fix

-	_, _, err := h.mgr.StartPairing()
+	pin, _, err := h.mgr.StartPairing()
 	require.NoError(t, err)

 	// Use a wrong PIN — different session key, HMAC will not match.
-	wrongPake, err := pake.InitCurve([]byte("999999"), 0, pairingCurve)
+	wrongPIN := "999999"
+	if pin == wrongPIN {
+		wrongPIN = "111111"
+	}
+	wrongPake, err := pake.InitCurve([]byte(wrongPIN), 0, pairingCurve)
 	require.NoError(t, err)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/api/pairing_test.go` around lines 252 - 257, The test currently
initializes wrongPake with a hardcoded PIN "999999" which could match the real
PIN returned by h.mgr.StartPairing(), making the test nondeterministic; update
the test around h.mgr.StartPairing() and pake.InitCurve so you assert the
"wrong" PIN differs from the actual pairing PIN (from StartPairing()), e.g.,
retrieve the real PIN value and if it equals "999999" choose a different PIN (or
loop/generate until different) before calling pake.InitCurve to ensure wrongPake
truly uses a different PIN.

606-610: ⚠️ Potential issue | 🟠 Major

Stop swapping the package-global logger in these tests.

Even without t.Parallel() here, the rest of the package does run in parallel, so redirecting log.Logger can still race and leak unrelated log lines into buf. Guard the swap behind a package-level test mutex or inject the logger into the code under test.

Also applies to: 680-684

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/api/pairing_test.go` around lines 606 - 610, The tests (e.g.,
TestHandlePairFinish_AuditLogsHMACMismatch and the similar test around lines
680-684) must not race by swapping the package-global log.Logger; either guard
the swap with a package-level test mutex (create a var testLogMu sync.Mutex in
the test package and Lock/Unlock around assigning log.Logger and restoring it)
or refactor the code under test to accept an injected zerolog.Logger so tests
can pass a local logger instance instead of mutating the global; update the
tests to use one of these approaches and ensure they restore the original logger
while holding the mutex if using the swap approach.

🧹 Nitpick comments (1)

pkg/api/pairing_test.go (1)
195-202: These expiry tests are using margins that are too tight for CI.

A 5ms TTL with a 15ms sleep is easy to trip over under scheduler or clock jitter, so these cases are likely to flake on slower runners. Prefer a fake clock, or widen the TTL/sleep gap substantially and centralize it in a helper.

Also applies to: 341-350, 353-368, 843-857, 865-876
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/pairing_test.go` around lines 195 - 202, The expiry tests use
too-tight real-time margins (e.g., WithPairingPINTTL(5*time.Millisecond) with
time.Sleep(15*time.Millisecond)) which flakes in CI; update the tests that use
newPairingHarness, WithPairingPINTTL, mgr.StartPairing and mgr.PendingPIN to
either use a controllable/fake clock in the pairing harness or substantially
widen the TTL/sleep gap and centralize the timing logic in a helper (e.g.,
create a waitUntilExpired helper or a harness option that advances a fake clock)
so tests deterministically expire PINs without relying on tiny real sleeps.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/api/pairing_test.go`:
- Around line 252-257: The test currently initializes wrongPake with a hardcoded
PIN "999999" which could match the real PIN returned by h.mgr.StartPairing(),
making the test nondeterministic; update the test around h.mgr.StartPairing()
and pake.InitCurve so you assert the "wrong" PIN differs from the actual pairing
PIN (from StartPairing()), e.g., retrieve the real PIN value and if it equals
"999999" choose a different PIN (or loop/generate until different) before
calling pake.InitCurve to ensure wrongPake truly uses a different PIN.
- Around line 606-610: The tests (e.g.,
TestHandlePairFinish_AuditLogsHMACMismatch and the similar test around lines
680-684) must not race by swapping the package-global log.Logger; either guard
the swap with a package-level test mutex (create a var testLogMu sync.Mutex in
the test package and Lock/Unlock around assigning log.Logger and restoring it)
or refactor the code under test to accept an injected zerolog.Logger so tests
can pass a local logger instance instead of mutating the global; update the
tests to use one of these approaches and ensure they restore the original logger
while holding the mutex if using the swap approach.

---

Nitpick comments:
In `@pkg/api/pairing_test.go`:
- Around line 195-202: The expiry tests use too-tight real-time margins (e.g.,
WithPairingPINTTL(5*time.Millisecond) with time.Sleep(15*time.Millisecond))
which flakes in CI; update the tests that use newPairingHarness,
WithPairingPINTTL, mgr.StartPairing and mgr.PendingPIN to either use a
controllable/fake clock in the pairing harness or substantially widen the
TTL/sleep gap and centralize the timing logic in a helper (e.g., create a
waitUntilExpired helper or a harness option that advances a fake clock) so tests
deterministically expire PINs without relying on tiny real sleeps.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 110df34d-29eb-493e-9912-34b06be2cde3

📥 Commits

Reviewing files that changed from the base of the PR and between b28a12c and b7d099c.

📒 Files selected for processing (2)

docs/api/encryption.md
pkg/api/pairing_test.go

coderabbitai bot reviewed Apr 9, 2026

View reviewed changes

Comment thread docs/api/encryption.md Outdated

Comment thread pkg/api/middleware/ipfilter_test.go

Comment thread pkg/database/userdb/clients_test.go

Comment thread pkg/database/userdb/migrations/20260407062033_create_clients_table.sql Outdated

wizzomafizzo added 2 commits April 9, 2026 17:56

Merge branch 'main' into feat/api-encryption

a22ed51

coderabbitai bot reviewed Apr 9, 2026

View reviewed changes

Comment thread docs/api/encryption.md Outdated

Comment thread docs/api/encryption.md Outdated

Comment thread docs/api/encryption.md

Comment thread pkg/api/pairing_test.go

Comment thread pkg/api/pairing_test.go

wizzomafizzo added 2 commits April 9, 2026 18:40

Merge branch 'main' into feat/api-encryption

9fbc331

coderabbitai bot reviewed Apr 9, 2026

View reviewed changes

wizzomafizzo merged commit 9feb46e into main Apr 9, 2026
13 checks passed

wizzomafizzo deleted the feat/api-encryption branch April 9, 2026 11:04

This was referenced Apr 11, 2026

feat(zapscript): harden execution pipeline #656

Merged

refactor(config): add LoadTOML, remove SetXxxForTesting methods #657

Merged

feat(test): add fuzz tests for attacker-reachable code paths and nightly CI #665

Merged

Uh oh!

Conversation

wizzomafizzo commented Apr 9, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Transport policy changes

New dependencies

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

sentry bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

wizzomafizzo commented Apr 9, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 9, 2026 •

edited

Loading

sentry bot commented Apr 9, 2026 •

edited

Loading