Architecture: Move runtime state and replay onto a SQLite-backed state store #126

@techmore

Title: Architecture: Move runtime state and replay onto a SQLite-backed state store

Type:
refactor

Severity:
high

Area:
runtime state / realtime delivery / packaged app resilience

Description:
The current runtime relies on in-memory process state plus Socket.IO replay buffers for topology hydration, job replay, report availability, and reconnect behavior. That is workable for a single long-lived process, but it is fragile for packaged desktop usage, reconnects, second tabs, and future offline/history features.

Recent regressions have repeatedly clustered around the same boundary:

  • startup-discovered state exists but is not consistently rehydrated into new clients
  • reconnect/new-tab behavior depends on connect-time Socket.IO handler correctness
  • packaged app issues are hard to diagnose because runtime truth is spread across globals, registries, and replay buffers

A SQLite-backed state store would give the app a durable source of truth for:

  • latest network topology and public IP snapshot
  • current customer assignment
  • last scan target
  • active and recent jobs
  • report artifacts and availability
  • replayable event/log history

Socket.IO can remain the live transport, but it should stop being the primary holder of state.
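As a rough illustration of what a durable source of truth for the snapshot-style items above could look like, here is a minimal sketch using only the standard-library `sqlite3` module. The table name `runtime_state`, the helper names, and the key names are all hypothetical and not taken from the current codebase.

```python
import json
import sqlite3
import time

# Hypothetical schema: one row per named piece of runtime state.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runtime_state (
    key        TEXT PRIMARY KEY,
    value_json TEXT NOT NULL,
    updated_at REAL NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the state database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def put_state(conn, key, value):
    """Overwrite the stored value for `key` with a JSON-serializable value."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO runtime_state (key, value_json, updated_at) "
            "VALUES (?, ?, ?)",
            (key, json.dumps(value), time.time()),
        )


def get_state(conn, key, default=None):
    """Return the stored value for `key`, or `default` if nothing was saved."""
    row = conn.execute(
        "SELECT value_json FROM runtime_state WHERE key = ?", (key,)
    ).fetchone()
    return default if row is None else json.loads(row[0])


# Example: persist startup discovery, then read it back for a new client.
conn = open_store()
put_state(conn, "last_scan_target", "192.168.1.0/24")
put_state(conn, "topology", {"public_ip": "203.0.113.7", "hosts": ["192.168.1.1"]})
```

With something like this in place, a connect handler could call `get_state` for each key instead of depending on whichever globals happen to be populated in the current process.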

Evidence:

  • app.py and runtime service layers still coordinate globals, client registries, and replay buffers
  • nmapui/handlers/connections.py is a critical reconnect bottleneck for client hydration
  • nmapui/jobs.py and client-state abstractions currently keep live state in memory only
  • recent packaged regressions around target/topology hydration have shown how fragile this path is

Proposed Fix:
Introduce a SQLite-backed runtime state layer and migrate reconnect/hydration to read from SQLite first.

Suggested phases:

  1. Add SQLite tables for runtime snapshot, jobs, reports, and event log
  2. Persist startup network discovery and current assignment immediately at startup
  3. Persist job lifecycle transitions and report artifacts as they happen
  4. Change connect/new-tab hydration to read durable state from SQLite instead of relying on in-memory replay alone
  5. Narrow Socket.IO to live progress and invalidation events instead of full source-of-truth state
  6. Add browser and packaged-app regression tests around restart/reconnect behavior
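Phases 3 and 4 could be sketched roughly as follows: job transitions and report artifacts are written as they happen, and a new client is hydrated from a single read. The table layout, the function names, and the payload shape are assumptions made for illustration; the real `nmapui` job model may need different columns.

```python
import sqlite3
import time


def open_job_store(path=":memory:"):
    """Create hypothetical jobs/reports tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # allows dict(row) below
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS jobs (
            job_id     TEXT PRIMARY KEY,
            target     TEXT NOT NULL,
            status     TEXT NOT NULL,
            updated_at REAL NOT NULL
        );
        CREATE TABLE IF NOT EXISTS reports (
            report_id  INTEGER PRIMARY KEY AUTOINCREMENT,
            job_id     TEXT NOT NULL,  -- matches jobs.job_id
            path       TEXT NOT NULL,  -- the artifact itself stays on disk
            created_at REAL NOT NULL
        );
        """
    )
    return conn


def record_job_status(conn, job_id, target, status):
    """Persist a job lifecycle transition (queued, running, completed, ...)."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO jobs (job_id, target, status, updated_at) "
            "VALUES (?, ?, ?, ?)",
            (job_id, target, status, time.time()),
        )


def record_report(conn, job_id, path):
    """Index a report artifact that was written to the filesystem."""
    with conn:
        conn.execute(
            "INSERT INTO reports (job_id, path, created_at) VALUES (?, ?, ?)",
            (job_id, path, time.time()),
        )


def build_hydration_payload(conn, recent_limit=20):
    """Collect what a newly connected tab needs, read only from SQLite."""
    jobs = conn.execute(
        "SELECT job_id, target, status FROM jobs "
        "ORDER BY updated_at DESC, job_id LIMIT ?",
        (recent_limit,),
    ).fetchall()
    reports = conn.execute(
        "SELECT job_id, path FROM reports ORDER BY report_id"
    ).fetchall()
    return {"jobs": [dict(r) for r in jobs], "reports": [dict(r) for r in reports]}


# Example: one job runs to completion, then a second tab connects.
conn = open_job_store()
record_job_status(conn, "job-1", "192.168.1.0/24", "running")
record_job_status(conn, "job-1", "192.168.1.0/24", "completed")
record_report(conn, "job-1", "reports/job-1.xml")
payload = build_hydration_payload(conn)
```

Because the payload is built from the database, it would look the same for a first tab, a second tab, or a client that reconnects after the process restarts.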

Implementation Notes:

  • This should not start as a full ORM rewrite; a small typed persistence layer is sufficient
  • Keep filesystem scan/report assets on disk; store indexes, metadata, and runtime snapshots in SQLite
  • Consider an append-only event log table for replay/debugging and a current-state table for fast hydration
  • If transport simplification is desired later, this architecture would make SSE or polling viable for most read paths
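The append-only event log idea might look something like the sketch below. Each event gets a monotonically increasing `seq`, so a client that remembers the last `seq` it saw can ask for everything after it, which is also what would make SSE or polling workable. The table name, column names, and event kinds are hypothetical.

```python
import json
import sqlite3
import time


def open_event_log(path=":memory:"):
    """Create a hypothetical append-only event table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS event_log (
            seq        INTEGER PRIMARY KEY AUTOINCREMENT,
            kind       TEXT NOT NULL,
            payload    TEXT NOT NULL,
            created_at REAL NOT NULL
        )
        """
    )
    return conn


def append_event(conn, kind, payload):
    """Append one event and return its sequence number. Rows are never updated."""
    with conn:
        cur = conn.execute(
            "INSERT INTO event_log (kind, payload, created_at) VALUES (?, ?, ?)",
            (kind, json.dumps(payload), time.time()),
        )
    return cur.lastrowid


def events_since(conn, last_seq=0, limit=500):
    """Return events with seq greater than `last_seq`, oldest first."""
    rows = conn.execute(
        "SELECT seq, kind, payload FROM event_log WHERE seq > ? ORDER BY seq LIMIT ?",
        (last_seq, limit),
    ).fetchall()
    return [(seq, kind, json.loads(payload)) for seq, kind, payload in rows]


# Example: a client saw the first event, disconnected, and then reconnects.
conn = open_event_log()
first = append_event(conn, "job_started", {"job_id": "job-1"})
second = append_event(conn, "job_progress", {"job_id": "job-1", "percent": 50})
missed = events_since(conn, last_seq=first)
```

A separate current-state table would still serve fast hydration; the log is there for replay and for diagnosing packaged-app issues after the fact.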

Related Issues:
