Skip to content

v0.2.0 — production-readiness

Choose a tag to compare

@schnsrw schnsrw released this 25 May 20:55
· 111 commits to main since this release

[0.2.0] — 2026-05-26

The "production-readiness" release. Six engineering streams that
turn v0.1's "real persistence + self-host story" into a workload you
can put real users on. Co-edit divergence becomes recoverable; the
gateway gets per-IP throttling + a hard room cap; we have measured
baseline numbers + a sizing model; and the FUniver-boundary type
debt starts coming down.

Added — co-edit reliability

  • Bridge replay retry with backoff + dead-letter ring buffer
    (apps/web/src/collab/replay-retry.ts). Replay failures are
    classified transient (dynamic-import chunk-load failures —
    retry with 300/900/2700 ms backoff) or permanent (malformed
    params / unknown command id — dead-letter immediately). Final
    failures append to a capped (20) ring buffer exposed via
    BridgeHandle.getReplayDeadLetter() / subscribeReplayDeadLetter().
  • Click-to-expand replay-failure detail in the CollabIndicator
    pill. Shows the last 5 dead-letter entries with mutation id,
    classification chip, truncated error, and age. Closes on
    outside-click / Escape. Auto-clears when the dead-letter empties.

Added — backend hardening

  • Per-IP rate limit via @fastify/rate-limit. New env vars:
    • RATE_LIMIT_ENABLED (default true) — master switch.
    • RATE_LIMIT_PER_MIN (default 60) — applies to POST /api/rooms.
    • UPLOAD_RATE_LIMIT_PER_MIN (default 12) — applies to
      POST /api/rooms/:id/seed and POST /api/rooms/:id/snapshot.
      Returns standard 429 + retry-after + x-ratelimit-* headers
      on overflow. Read endpoints (GET /snapshot) are NOT rate-limited.
  • Hard cap on concurrent rooms via new MAX_ROOMS env
    (default 256). When create() would exceed the cap, LRU-evicts
    the oldest evictable room (no password / no seed / no
    snapshot). If every slot is non-evictable, returns
    503 capacity_full + retry-after: 60. Two-pass eviction
    policy: prefer idle-but-evictable, fall back to live-but-evictable
    by createdAt — prevents a "spam open rooms" pattern from
    permanently locking out new users.
  • Boot log of room registry + upload limits so operators can
    verify the configured caps at startup.

Added — measurement + capacity planning

  • In-tree HTTP load harness at apps/server/scripts/loadtest.ts
    (~190 lines, no new deps — uses Node's built-in fetch +
    perf_hooks + FormData + Blob). Drives the four bounded
    write-path endpoints with configurable VUs / duration / target;
    output is a grep-friendly numbers table. Run with
    pnpm --filter @sheet/server load.
  • v0.1 baseline numbers documented in docs/LOAD_TEST.md:
    ~1900 req/s sustained, p99 < 3 ms across all four write endpoints
    with rate-limit disabled. Rate-limit verification run shows the
    bucket clamps a single IP exactly at the configured 60/min +
    12/min envelopes.
  • Capacity model + sizing tiers in docs/CAPACITY_MODEL.md.
    Workload-anchored: per-doc RAM / CPU / network / storage cost
    derived from the baseline + Yjs / Hocuspocus fan-out math. Five
    deployment tiers (Solo / Small / Mid / Big single-process /
    Sharded) with concrete dollar costs ($5/mo → $300/mo → linear).
    Worked example for a 4 vCPU / 8 GB / 180 SSD DigitalOcean
    General-Purpose droplet at 1 user/doc: 5 000–8 000 concurrent
    single-process, ~10 000–15 000 with cluster mode + sticky routing.
  • Production-pipeline doc at docs/PRODUCTION_PIPELINE.md
    rolling roadmap of the post-v0.1 reliability + hardening +
    measurement + release streams.

Added — UX (toast + a11y + mobile + clarity)

  • Unified toast surface (apps/web/src/shell/toast/) — info
    / success / error kinds, optional action button, accessible
    role="status" / role="alert". Wired into:
    • File > Save / Export (success + error per format)
    • Autosave > Restore (success + error)
    • Insert Chart (Added Chart 3)
    • Sheet tab actions: rename ("Renamed to X"), duplicate, hide
      ("Hid X" with one-click Show action), delete ("Deleted X"
      with 8 s Undo action that calls Univer's command-stack undo).
    • Print Area set/clear (with Undo action that restores the
      previous range).
    • Paste Special apply (Pasted: Formats / Column widths /
      etc. — names the variant the user picked).
    • Flash Fill — outcome-aware (success carries the cell count;
      each failure mode gets a specific explanation rather than
      silently no-op'ing).
    • Save Version (success with Open history action; error catch).
    • Insert Sparkline (success names the type + anchor; error catch).
  • Peer count + queued-mutation count in the CollabIndicator:
    • "Live · 2" when co-editing with 2 peers.
    • "Reconnecting · 3" when 3 of your edits are queued locally.
  • Humanised open-file errors in the loading overlay — 8
    classifier branches (corrupt zip, encrypted, network, HTTP
    404 / 403 / 5xx, ods loader, memory) with the raw error
    collapsed under a <details>.
  • Insert Chart range error elevated to a banner above the
    input with role="alert" + aria-live + aria-invalid on
    the input.
  • Ribbon group landmarksrole="group" + aria-label on
    each ribbon group so screen readers announce boundaries.
  • Mobile fixes: side-panel back-out pill is now unmistakable on
    touch (40 × 40 px "← Back"); toolbar overflow chevrons pinned to
    viewport edges so they don't get hidden behind the device
    notch; desktop toolbar hides correctly at ≤ 480 px.

Changed — type-safety refactor (rolling)

  • Typed Univer facade at apps/web/src/univer-facade.ts
    (~210 lines). Centralises the as any casts at the
    FUniver → workbook → sheet → range boundary into one
    auditable module. Surface: sheetId, isHidden, maxRows,
    maxColumns, rangeAt, rangeBox, rangeFromA1,
    activateRange, dataRangeOrActive, setActiveSheet,
    findSheetById, saveWorkbook, activeSheet, activeRange,
    injector, viteEnv, viteEnvNumber, windowStringGlobal.
  • Converted 5 highest-traffic files (tab-actions, sheet-actions,
    flash-fill, MenuBar, CollabDriver) — 27 caller-side
    as-any sites eliminated
    , 23 centralised in the facade. The
    remaining ~21 unconverted files are mechanical follow-up
    tracked under the rolling B1 stream.

Fixed

  • Formula bar didn't trigger initial recalc on workbook mount + swap
    (back-ported in v0.1.1; recorded here for completeness).
  • Excel-style typed input ($1,234 · 15% · (500) · €99)
    parses as numbers instead of strings (v0.1.1 back-port).

Internal

  • 6 new unit tests for replay-retry.ts (classifier + retry
    scheduler + ring-buffer eviction).
  • 6 new unit tests for RoomRegistry cap + LRU eviction.
  • 10 new unit tests for toast normalisation + humanised errors
    (back-fill from v0.1.x pre-release).
  • Total: 139 / 139 unit tests pass (was 116 at start of cycle).