v0.2.0 — production-readiness
[0.2.0] — 2026-05-26
The "production-readiness" release. Six engineering streams that
turn v0.1's "real persistence + self-host story" into a workload you
can put real users on. Co-edit divergence becomes recoverable; the
gateway gets per-IP throttling + a hard room cap; we have measured
baseline numbers + a sizing model; and the FUniver-boundary type
debt starts coming down.
Added — co-edit reliability
- Bridge replay retry with backoff + dead-letter ring buffer
(apps/web/src/collab/replay-retry.ts). Replay failures are
classified transient (dynamic-import chunk-load failures: retry
with 300/900/2700 ms backoff) or permanent (malformed params /
unknown command id: dead-letter immediately). Final failures
append to a capped (20) ring buffer exposed via
BridgeHandle.getReplayDeadLetter() / subscribeReplayDeadLetter().
- Click-to-expand replay-failure detail in the CollabIndicator
pill. Shows the last 5 dead-letter entries with mutation id,
classification chip, truncated error, and age. Closes on
outside-click / Escape. Auto-clears when the dead-letter buffer empties.
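
A minimal sketch of the classify / retry / dead-letter flow described above. It is not the contents of replay-retry.ts: the names `classifyReplayError`, `DeadLetterRing`, and `replayWithRetry` are hypothetical, and the message patterns used to detect chunk-load failures are an assumption. Only the 300/900/2700 ms delays and the cap of 20 come from these notes.

```typescript
// Hypothetical sketch; not the real replay-retry.ts API.
type FailureClass = "transient" | "permanent";

interface DeadLetterEntry {
  mutationId: string;
  classification: FailureClass;
  error: string;
  at: number; // epoch ms, lets the UI render an "age"
}

const BACKOFF_MS = [300, 900, 2700]; // delays quoted in the release notes
const DEAD_LETTER_CAP = 20;

// Assumption: chunk-load failures can be recognised from the error message.
function classifyReplayError(err: unknown): FailureClass {
  const msg = err instanceof Error ? err.message : String(err);
  return /dynamically imported module|loading chunk/i.test(msg)
    ? "transient"
    : "permanent";
}

// Capped ring buffer: once full, pushing drops the oldest entry.
class DeadLetterRing {
  private entries: DeadLetterEntry[] = [];
  private cap: number;
  constructor(cap: number = DEAD_LETTER_CAP) {
    this.cap = cap;
  }
  push(entry: DeadLetterEntry): void {
    this.entries.push(entry);
    if (this.entries.length > this.cap) this.entries.shift();
  }
  snapshot(): DeadLetterEntry[] {
    return this.entries.slice();
  }
}

// Retries transient failures with backoff; dead-letters everything else.
async function replayWithRetry(
  mutationId: string,
  replay: () => Promise<void>,
  ring: DeadLetterRing,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let attempt = 0; ; attempt++) {
    try {
      await replay();
      return true;
    } catch (err) {
      const classification = classifyReplayError(err);
      const delay = BACKOFF_MS[attempt];
      if (classification === "transient" && delay !== undefined) {
        await sleep(delay);
        continue;
      }
      // Permanent failure, or transient with the backoff schedule exhausted.
      ring.push({ mutationId, classification, error: String(err), at: Date.now() });
      return false;
    }
  }
}
```

Injecting `sleep` keeps the scheduler testable without real timers, which is one plausible way to unit-test a retry scheduler like this.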
Added — backend hardening
- Per-IP rate limit via @fastify/rate-limit. New env vars:
  - RATE_LIMIT_ENABLED (default true): master switch.
  - RATE_LIMIT_PER_MIN (default 60): applies to POST /api/rooms.
  - UPLOAD_RATE_LIMIT_PER_MIN (default 12): applies to
    POST /api/rooms/:id/seed and POST /api/rooms/:id/snapshot.

  Returns standard 429 + retry-after + x-ratelimit-* headers
  on overflow. Read endpoints (GET /snapshot) are NOT rate-limited.
- Hard cap on concurrent rooms via new MAX_ROOMS env
(default 256). When create() would exceed the cap, LRU-evicts
the oldest evictable room (no password / no seed / no
snapshot). If every slot is non-evictable, returns
503 capacity_full + retry-after: 60. Two-pass eviction
policy: prefer idle-but-evictable, fall back to live-but-evictable
by createdAt. This prevents a "spam open rooms" pattern from
permanently locking out new users.
- Boot log of room registry + upload limits so operators can
verify the configured caps at startup.
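
The two-pass eviction policy can be sketched roughly as follows. This is an illustration, not the RoomRegistry code: the `Room` shape, `pickVictim`, and `createRoom` are hypothetical names, and treating "idle" as "zero live connections, ordered by last activity" is an assumption. The evictability rule, the createdAt fallback, and the 503 capacity_full + retry-after: 60 response come from the notes above.

```typescript
// Hypothetical sketch; field and function names are illustrative only.
interface Room {
  id: string;
  createdAt: number;
  lastActiveAt: number;
  connections: number; // assumption: 0 connections means "idle"
  hasPassword: boolean;
  hasSeed: boolean;
  hasSnapshot: boolean;
}

type CreateResult =
  | { ok: true; evicted?: string }
  | { ok: false; status: 503; code: "capacity_full"; retryAfterSec: 60 };

// A room is evictable only if it holds nothing a user would miss.
function isEvictable(r: Room): boolean {
  return !r.hasPassword && !r.hasSeed && !r.hasSnapshot;
}

function pickVictim(rooms: Room[]): Room | undefined {
  const evictable = rooms.filter(isEvictable);
  // Pass 1: idle rooms, least recently active first.
  const idle = evictable
    .filter((r) => r.connections === 0)
    .sort((a, b) => a.lastActiveAt - b.lastActiveAt);
  if (idle.length > 0) return idle[0];
  // Pass 2: live rooms, oldest by createdAt.
  return evictable.slice().sort((a, b) => a.createdAt - b.createdAt)[0];
}

function createRoom(
  rooms: Map<string, Room>,
  room: Room,
  maxRooms: number = 256,
): CreateResult {
  if (rooms.size < maxRooms) {
    rooms.set(room.id, room);
    return { ok: true };
  }
  const victim = pickVictim(Array.from(rooms.values()));
  if (!victim) {
    // Every slot is non-evictable: refuse rather than drop user data.
    return { ok: false, status: 503, code: "capacity_full", retryAfterSec: 60 };
  }
  rooms.delete(victim.id);
  rooms.set(room.id, room);
  return { ok: true, evicted: victim.id };
}
```

Falling back to live-but-evictable rooms in the second pass is what stops a client that holds many empty rooms open from blocking every new `create()`.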
Added — measurement + capacity planning
- In-tree HTTP load harness at apps/server/scripts/loadtest.ts
(~190 lines, no new deps: uses Node's built-in fetch +
perf_hooks + FormData + Blob). Drives the four bounded
write-path endpoints with configurable VUs / duration / target;
output is a grep-friendly numbers table. Run with
pnpm --filter @sheet/server load.
- v0.1 baseline numbers documented in docs/LOAD_TEST.md:
~1900 req/s sustained, p99 < 3 ms across all four write endpoints
with rate-limit disabled. Rate-limit verification run shows the
bucket clamps a single IP exactly at the configured 60/min +
12/min envelopes.
- Capacity model + sizing tiers in docs/CAPACITY_MODEL.md.
Workload-anchored: per-doc RAM / CPU / network / storage cost
derived from the baseline + Yjs / Hocuspocus fan-out math. Five
deployment tiers (Solo / Small / Mid / Big single-process /
Sharded) with concrete dollar costs ($5/mo → $300/mo → linear).
Worked example for a 4 vCPU / 8 GB / 180 SSD DigitalOcean
General-Purpose droplet at 1 user/doc: 5 000–8 000 concurrent
single-process, ~10 000–15 000 with cluster mode + sticky routing.
- Production-pipeline doc at docs/PRODUCTION_PIPELINE.md:
rolling roadmap of the post-v0.1 reliability + hardening +
measurement + release streams.
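
For readers unfamiliar with how figures such as "req/s sustained" and "p99" are typically derived, here is a small sketch of one virtual-user loop plus the summary math. It is not loadtest.ts: `runVu`, `percentile`, and `summarise` are hypothetical names, the nearest-rank percentile method is an assumption, and it uses the global `performance` and `fetch` (available in recent Node versions) rather than importing perf_hooks.

```typescript
// Hypothetical sketch; not the actual harness.

// Nearest-rank percentile over an ascending-sorted array of latencies.
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return NaN;
  const rank = Math.ceil((p * sortedMs.length) / 100);
  const index = Math.min(sortedMs.length, Math.max(1, rank)) - 1;
  return sortedMs[index] ?? NaN;
}

interface Summary {
  requests: number;
  reqPerSec: number;
  p50Ms: number;
  p99Ms: number;
}

function summarise(latenciesMs: number[], durationSec: number): Summary {
  const sorted = latenciesMs.slice().sort((a, b) => a - b);
  return {
    requests: sorted.length,
    reqPerSec: sorted.length / durationSec,
    p50Ms: percentile(sorted, 50),
    p99Ms: percentile(sorted, 99),
  };
}

// One virtual user: issue requests back to back until the deadline,
// recording the wall-clock latency of each into `out`.
async function runVu(target: string, deadlineMs: number, out: number[]): Promise<void> {
  while (performance.now() < deadlineMs) {
    const t0 = performance.now();
    const res = await fetch(target, { method: "POST" });
    await res.arrayBuffer(); // drain the body so timing covers the full response
    out.push(performance.now() - t0);
  }
}
```

Running several `runVu` calls concurrently against the same `out` array and passing it to `summarise` would yield one row of a numbers table.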
Added — UX (toast + a11y + mobile + clarity)
- Unified toast surface (apps/web/src/shell/toast/): info
/ success / error kinds, optional action button, accessible
role="status" / role="alert". Wired into:
  - File > Save / Export (success + error per format)
  - Autosave > Restore (success + error)
  - Insert Chart ("Added Chart 3")
  - Sheet tab actions: rename ("Renamed to X"), duplicate, hide
    ("Hid X" with one-click Show action), delete ("Deleted X"
    with 8 s Undo action that calls Univer's command-stack undo).
  - Print Area set/clear (with Undo action that restores the
    previous range).
  - Paste Special apply (Pasted: Formats / Column widths /
    etc., naming the variant the user picked).
  - Flash Fill: outcome-aware (success carries the cell count;
    each failure mode gets a specific explanation rather than
    silently no-op'ing).
  - Save Version (success with Open history action; error catch).
  - Insert Sparkline (success names the type + anchor; error catch).
- Peer count + queued-mutation count in the CollabIndicator:
  - "Live · 2" when co-editing with 2 peers.
  - "Reconnecting · 3" when 3 of your edits are queued locally.
- Humanised open-file errors in the loading overlay: 8
classifier branches (corrupt zip, encrypted, network, HTTP
404 / 403 / 5xx, ods loader, memory) with the raw error
collapsed under a <details>.
- Insert Chart range error elevated to a banner above the
input with role="alert" + aria-live + aria-invalid on
the input.
- Ribbon group landmarks: role="group" + aria-label on
each ribbon group so screen readers announce boundaries.
- Mobile fixes: side-panel back-out pill is now unmistakable on
touch (40 × 40 px "← Back"); toolbar overflow chevrons pinned to
viewport edges so they don't get hidden behind the device
notch; desktop toolbar hides correctly at ≤ 480 px.
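
As an illustration of the toast surface described in the first item, the sketch below normalises a toast request into a kind, an ARIA role, and an optional action. It is not the code under apps/web/src/shell/toast/: `normaliseToast` and the field names are hypothetical, mapping error to role="alert" and the other kinds to role="status" is an assumption based on common ARIA practice, and the 4 s default duration is invented for the example (only the 8 s Undo window appears in these notes).

```typescript
// Hypothetical sketch; not the real toast module.
type ToastKind = "info" | "success" | "error";

interface ToastAction {
  label: string;
  onClick: () => void;
}

interface ToastInput {
  message: string;
  kind?: ToastKind;
  action?: ToastAction;
  durationMs?: number;
}

interface Toast {
  message: string;
  kind: ToastKind;
  role: "status" | "alert";
  durationMs: number;
  action?: ToastAction;
}

// Accepts either a bare message or a full request and fills in defaults.
function normaliseToast(input: string | ToastInput): Toast {
  const t: ToastInput = typeof input === "string" ? { message: input } : input;
  const kind: ToastKind = t.kind ?? "info";
  return {
    message: t.message.trim(),
    kind,
    // Assumption: errors interrupt (alert); everything else is polite (status).
    role: kind === "error" ? "alert" : "status",
    // Assumption: actionable toasts stay up longer so the action is reachable.
    durationMs: t.durationMs ?? (t.action ? 8000 : 4000),
    ...(t.action ? { action: t.action } : {}),
  };
}

// Usage: a delete toast with an Undo action.
const deleted = normaliseToast({
  message: "Deleted Sheet2",
  kind: "success",
  action: { label: "Undo", onClick: () => { /* call the undo command here */ } },
});
```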
Changed — type-safety refactor (rolling)
- Typed Univer facade at apps/web/src/univer-facade.ts
(~210 lines). Centralises the as any casts at the
FUniver → workbook → sheet → range boundary into one
auditable module. Surface: sheetId, isHidden, maxRows,
maxColumns, rangeAt, rangeBox, rangeFromA1,
activateRange, dataRangeOrActive, setActiveSheet,
findSheetById, saveWorkbook, activeSheet, activeRange,
injector, viteEnv, viteEnvNumber, windowStringGlobal.
- Converted 5 highest-traffic files (tab-actions, sheet-actions,
flash-fill, MenuBar, CollabDriver): 27 caller-side
as-any sites eliminated, 23 centralised in the facade. The
remaining ~21 unconverted files are mechanical follow-up
tracked under the rolling B1 stream.
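
The general idea of centralising casts can be sketched as below. This is not univer-facade.ts: the function bodies are guesses, the parameter shapes are hypothetical, and the sketch assumes the underlying objects expose `getSheetId()` and `getSheets()`. Only the helper names `sheetId`, `findSheetById`, and `viteEnvNumber` are taken from the surface listed above.

```typescript
// Hypothetical sketch; callers pass opaque values and get typed results back,
// so the untyped access lives in exactly one module.

// Assumption: the sheet object has a getSheetId() method.
function sheetId(sheet: unknown): string {
  // The single place where the untyped boundary is crossed for this lookup.
  return String((sheet as any).getSheetId());
}

// Assumption: the workbook object has a getSheets() method returning an array.
function findSheetById(workbook: unknown, id: string): unknown | undefined {
  const sheets: unknown[] = (workbook as any).getSheets() ?? [];
  return sheets.find((s) => sheetId(s) === id);
}

// Reads a numeric setting from an env-like record, falling back when the
// value is missing or not a finite number. The real helper presumably reads
// Vite's env object; taking it as a parameter keeps this sketch standalone.
function viteEnvNumber(
  env: Record<string, unknown>,
  key: string,
  fallback: number,
): number {
  const raw = env[key];
  const n = typeof raw === "string" && raw.trim() !== "" ? Number(raw) : NaN;
  return Number.isFinite(n) ? n : fallback;
}
```

With helpers like these, a caller writes `findSheetById(workbook, id)` instead of repeating `(workbook as any).getSheets()` at every call site.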
Fixed
- Formula bar didn't trigger initial recalc on workbook mount + swap
(back-ported in v0.1.1; recorded here for completeness).
- Excel-style typed input ($1,234 · 15% · (500) · €99)
parses as numbers instead of strings (v0.1.1 back-port).
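
A rough sketch of what Excel-style typed-input parsing involves, covering the four examples above. It is not the shipped parser: `parseTypedNumber` is a hypothetical name, and the accepted currency symbols, the thousands-grouping rule, and reading (500) as the accounting-style negative -500 are assumptions.

```typescript
// Hypothetical sketch; returns null when the input should stay a string.
function parseTypedNumber(input: string): number | null {
  let s = input.trim();
  if (s === "") return null;

  let sign = 1;
  const paren = /^\((.*)\)$/.exec(s);
  if (paren) {
    // Accounting-style negative: "(500)" is read as -500.
    sign = -1;
    s = (paren[1] ?? "").trim();
  } else if (s.startsWith("-")) {
    sign = -1;
    s = s.slice(1).trim();
  }

  // Assumption: a single leading currency symbol from this set is allowed.
  s = s.replace(/^[$€£¥]\s*/, "");

  let divisor = 1;
  if (s.endsWith("%")) {
    // "15%" is stored as 0.15; dividing avoids the rounding drift of * 0.01.
    divisor = 100;
    s = s.slice(0, -1).trim();
  }

  // Plain digits, or thousands groups of exactly three, with optional decimals.
  if (!/^(\d{1,3}(,\d{3})+|\d+)(\.\d+)?$/.test(s)) return null;

  const n = Number(s.replace(/,/g, ""));
  return (sign * n) / divisor;
}

// Usage with the examples from the notes:
//   parseTypedNumber("$1,234")  -> 1234
//   parseTypedNumber("15%")     -> 0.15
//   parseTypedNumber("(500)")   -> -500
//   parseTypedNumber("€99")     -> 99
```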
Internal
- 6 new unit tests for replay-retry.ts (classifier + retry
scheduler + ring-buffer eviction).
- 6 new unit tests for RoomRegistry cap + LRU eviction.
- 10 new unit tests for toast normalisation + humanised errors
(back-fill from v0.1.x pre-release).
- Total: 139 / 139 unit tests pass (was 116 at start of cycle).
(back-fill from v0.1.x pre-release). - Total: 139 / 139 unit tests pass (was 116 at start of cycle).