Conversation
…, and audit enhancements - JSONL TranscriptWriter with append-only per-session logs, fsync, and crash recovery - SQLite OperationalStore replacing JSON DeviceRegistry with WAL mode and migrations - EncryptedFileSecretStore (AES-256-GCM + scrypt) and KeychainSecretStore backends - Typed audit events with automatic field redaction for sensitive data - JSON-to-SQLite one-time device migration with idempotent .migrated marker - Wire OperationalStore and SecretStore into gateway startup, auth, and RPC handlers - Add secretsBackend, masterPassphrase, and fsyncWrites config fields - Persistence barrel exports from gateway package Closes #6
There was a problem hiding this comment.
Pull request overview
This PR introduces a persistence layer for the gateway, moving core state from JSON files into SQLite, adding an encrypted secrets vault, and standardizing audit/transcript logging with redaction and durability controls.
Changes:
- Add
OperationalStore(SQLite + WAL + migrations) and wire it into gateway startup/auth/RPC. - Add secrets storage (
EncryptedFileSecretStore,KeychainSecretStore) and new config fields (secretsBackend,masterPassphrase,fsyncWrites). - Add JSONL transcript writer and typed audit events with automatic sensitive-field redaction; expand unit/integration coverage.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/gateway/test/unit/transcript-writer.test.ts | Unit coverage for JSONL transcript append/recover, fsync, permissions, error callback. |
| packages/gateway/test/unit/rpc-router.test.ts | Switch RPC tests from DeviceRegistry to OperationalStore. |
| packages/gateway/test/unit/redact.test.ts | Unit coverage for secret masking and recursive field redaction. |
| packages/gateway/test/unit/operational-store.test.ts | Unit coverage for SQLite device/agent CRUD, transactions, WAL, permissions, migrations. |
| packages/gateway/test/unit/migrate-devices.test.ts | Unit coverage for JSON→SQLite device migration and marker behavior. |
| packages/gateway/test/unit/encrypted-file-secret-store.test.ts | Unit coverage for encrypted vault CRUD, persistence, permissions, wrong passphrase. |
| packages/gateway/test/unit/audit-events.test.ts | Unit coverage for typed audit event factories and fields. |
| packages/gateway/test/integration/persistence.integration.test.ts | Integration coverage spanning OperationalStore, transcripts, secrets vault, migration, audit redaction, and gateway auth. |
| packages/gateway/test/integration/connect-auth.integration.test.ts | Update revoked-device test to use OperationalStore. |
| packages/gateway/test/helpers/test-gateway.ts | Wire OperationalStore into test gateway harness and ensure proper cleanup. |
| packages/gateway/src/state/types.ts | Introduce AgentRecord; make AuditEvent typed with event union + outcome. |
| packages/gateway/src/server/register-websocket-routes.ts | Switch handshake auth lookup from DeviceRegistry to OperationalStore. |
| packages/gateway/src/server/create-gateway-server.ts | Instantiate OperationalStore + secrets store, run JSON migration, pass fsyncWrites to audit log, close resources on shutdown. |
| packages/gateway/src/rpc/method-handlers.ts | Replace DeviceRegistry dependency with OperationalStore; include audit outcome. |
| packages/gateway/src/persistence/transcript-writer.ts | Add append-only per-session JSONL transcript writer with optional sync + crash recovery. |
| packages/gateway/src/persistence/secret-store.ts | Add SecretStore interface/types for secrets backends. |
| packages/gateway/src/persistence/operational-store.ts | Add SQLite-backed operational persistence with migrations and prepared statements. |
| packages/gateway/src/persistence/migrations.ts | Define schema migrations for devices/agents/schema_version. |
| packages/gateway/src/persistence/migrate-devices.ts | Add one-time JSON devices migration with .migrated marker. |
| packages/gateway/src/persistence/keychain-secret-store.ts | Add keychain backend wrapper (currently delegates to encrypted file store). |
| packages/gateway/src/persistence/index.ts | Add persistence barrel exports. |
| packages/gateway/src/persistence/encrypted-file-secret-store.ts | Add encrypted vault implementation (AES-256-GCM + scrypt, atomic writes). |
| packages/gateway/src/index.ts | Export persistence module from gateway package root. |
| packages/gateway/src/config/parse-cli.ts | Tighten CLI arg parsing types around split("=", 2). |
| packages/gateway/src/config/gateway-config.ts | Add config fields for secrets backend selection, master passphrase, and fsync writes. |
| packages/gateway/src/audit/redact.ts | Add redaction helpers for sensitive audit fields. |
| packages/gateway/src/audit/index.ts | Export audit events, redact helpers, and AuditLogOptions. |
| packages/gateway/src/audit/audit-log.ts | Add optional datasync writes and automatic redaction; re-export legacy factory aliases. |
| packages/gateway/src/audit/audit-events.ts | Add typed audit event factory functions with outcomes. |
Comments suppressed due to low confidence (1)
packages/gateway/test/unit/rpc-router.test.ts:83
OperationalStoreis created inbeforeEachbut never closed inafterEach. Since it holds an open SQLite handle, this can leak resources and cause file locking issues/flaky tests. Store it in an outerletand calloperationalStore.close()during cleanup.
const operationalStore = new OperationalStore({
dbPath: join(dataDir, "homeagent.db"),
});
const connectionManager = new ConnectionManager();
const auditLog = new AuditLog(dataDir);
registerV1Handlers(registry, {
operationalStore,
connectionManager,
auditLog,
});
router = new RpcRouter(registry, store);
});
afterEach(() => {
store.close();
rmSync(tempDir, { recursive: true, force: true });
});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Only rename if we actually processed something (even with some errors) | ||
| if (errors.length === 0 || migrated > 0) { |
There was a problem hiding this comment.
migrateDevicesFromJson renames devices.json to .migrated even when errors.length > 0 as long as any record migrated. This can permanently drop devices that failed to migrate (e.g., duplicates/invalid records) because future runs will be skipped due to the marker file. Consider only renaming when there are zero errors, or writing a separate marker (e.g. .migrated-with-errors) while keeping the source file for manual remediation/retry.
| // Only rename if we actually processed something (even with some errors) | |
| if (errors.length === 0 || migrated > 0) { | |
| // Only rename if all records were migrated successfully (no errors) | |
| if (errors.length === 0) { |
| const device = record as Record<string, unknown>; | ||
| try { | ||
| const input: Parameters<typeof store.registerDevice>[0] = { | ||
| deviceId: device.deviceId as string, | ||
| sharedSecret: device.sharedSecret as string, | ||
| role: (device.role as "admin" | "node" | "client") ?? "client", | ||
| approved: (device.approved as boolean) ?? false, | ||
| }; |
There was a problem hiding this comment.
The migration code force-casts deviceId, sharedSecret, role, and approved from untyped JSON (as string / as boolean) and relies on SQLite errors for validation. This can lead to confusing partial migrations and makes it hard to report actionable errors (e.g., non-string deviceId, invalid role). It would be safer to explicitly validate types/allowed role values up front and push a clear error message per record before calling registerDevice.
| /** Whether to call fd.datasync() after each write for crash safety. Default: true */ | ||
| fsync?: boolean; | ||
| /** Optional callback for error reporting (e.g., to audit log) */ | ||
| onError?: (error: Error) => void; | ||
| /** Injectable clock for testing */ | ||
| now?: () => number; |
There was a problem hiding this comment.
The option docstring says this calls fd.datasync() after each write, but the implementation uses fsyncSync(). Either update the comment to match the behavior or switch to the intended fdatasync/datasync API so callers understand the durability guarantees.
| { | ||
| version: 3, | ||
| description: "Create schema_version table", | ||
| up: ` | ||
| CREATE TABLE IF NOT EXISTS schema_version ( | ||
| version INTEGER PRIMARY KEY, | ||
| applied_at INTEGER NOT NULL | ||
| ); | ||
| `, | ||
| }, |
There was a problem hiding this comment.
runMigrations() always creates the schema_version table up front, but MIGRATIONS also includes a dedicated migration (v3) that creates the same table. This duplication is harmless but makes the migration history harder to reason about. Consider removing the redundant migration or removing the pre-create step and relying on migrations exclusively.
| { | |
| version: 3, | |
| description: "Create schema_version table", | |
| up: ` | |
| CREATE TABLE IF NOT EXISTS schema_version ( | |
| version INTEGER PRIMARY KEY, | |
| applied_at INTEGER NOT NULL | |
| ); | |
| `, | |
| }, |
| public constructor(options: KeychainSecretStoreOptions) { | ||
| this.serviceName = options.serviceName ?? "homeagent"; | ||
| this.keychainAvailable = isKeychainAvailable(); | ||
|
|
||
| // For now, always delegate to EncryptedFileSecretStore. | ||
| // Keychain integration reads/writes individual secrets but the | ||
| // EncryptedFileSecretStore is the battle-tested primary path. | ||
| this.delegate = new EncryptedFileSecretStore(options.fallback); | ||
| } |
There was a problem hiding this comment.
Despite the name/config (secretsBackend: "keychain"), this implementation currently always delegates all reads/writes to EncryptedFileSecretStore and never stores anything in the OS keychain. That makes the backend selection misleading for operators and deviates from the implied behavior of a “KeychainSecretStore”. Consider either implementing actual keychain storage, renaming/re-scoping this class to reflect “keychain detection + encrypted-file store”, or updating the gateway config/docs so users don’t assume secrets are in the OS keychain.
|
@copilot open a new pull request to apply changes based on the comments in this thread |
|
@aaronolds I've opened a new pull request, #14, to work on those changes. Once the pull request is ready, I'll request review from you. |
Closes #6