fix: launch readiness — auth docs, healthcheck, metrics, info fields #310

Merged
kacy merged 1 commit into main from feat/launch-readiness-audit
Feb 26, 2026

kacy commented Feb 26, 2026

summary

closes the pre-preview gaps identified in the launch readiness audit. all changes are documentation corrections, missing instrumentation, or operational hygiene — no behavior changes on the hot path.

what was changed

SECURITY.md — replaced the stale "no authentication" line with the actual current capabilities: --requirepass, ACL per-user control, and TLS/mTLS. added a new section documenting that per-ip rate limiting is delegated to a reverse proxy/firewall.

--healthcheck flag: ember-server --healthcheck now works. it opens a TCP connection to the configured RESP port, sends PING, and exits 0 if it gets +PONG\r\n back, 1 otherwise. this makes the existing docker-compose healthcheck block functional.
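a minimal sketch of the probe described above, using only the standard library. the function names (`resp_ping_ok`, `exit_code`), the timeout parameter, and the choice to send PING as a one-element RESP array are assumptions for illustration; the actual implementation in this PR may differ.

```rust
use std::io::{Read, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Hypothetical helper: returns true if the server at `addr` answers
/// PING with the simple string `+PONG\r\n` within `timeout`.
pub fn resp_ping_ok<A: ToSocketAddrs>(addr: A, timeout: Duration) -> bool {
    // Resolve the address and use the first result.
    let Some(sock) = addr.to_socket_addrs().ok().and_then(|mut it| it.next()) else {
        return false;
    };
    let Ok(mut stream) = TcpStream::connect_timeout(&sock, timeout) else {
        return false;
    };
    // Bound both directions so a hung server cannot stall the probe.
    if stream.set_read_timeout(Some(timeout)).is_err()
        || stream.set_write_timeout(Some(timeout)).is_err()
    {
        return false;
    }
    // PING encoded as a RESP array containing one bulk string (assumed encoding).
    if stream.write_all(b"*1\r\n$4\r\nPING\r\n").is_err() {
        return false;
    }
    // The expected reply is exactly 7 bytes: "+PONG\r\n".
    let mut reply = [0u8; 7];
    stream.read_exact(&mut reply).is_ok() && &reply == b"+PONG\r\n"
}

/// Maps the probe result to a process exit code: 0 healthy, 1 unhealthy.
pub fn exit_code(healthy: bool) -> i32 {
    if healthy { 0 } else { 1 }
}
```

a caller would pass the result to `std::process::exit(exit_code(...))`. note that a server with --requirepass set may answer an unauthenticated PING with an error instead of +PONG; how the real flag handles that case is not stated here.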

keyspace hits/misses — added keyspace_hits and keyspace_misses u64 counters to KeyspaceStats, incremented on every get() call. exposed in:

  • INFO stats as keyspace_hits / keyspace_misses (redis-compatible field names)
  • prometheus as ember_keyspace_hits_total / ember_keyspace_misses_total
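a rough sketch of how the counters and their two output formats could fit together. only `KeyspaceStats`, the two field names, and the metric names come from this PR; the `Keyspace` type, its `HashMap` storage, `merge`, and the render helpers are hypothetical. it assumes redis semantics, where a get() that finds the key counts as a hit and any other get() counts as a miss.

```rust
use std::collections::HashMap;

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct KeyspaceStats {
    pub keyspace_hits: u64,
    pub keyspace_misses: u64,
}

impl KeyspaceStats {
    /// Adds another stats snapshot into this one (for example, per-shard totals).
    pub fn merge(&mut self, other: &KeyspaceStats) {
        self.keyspace_hits += other.keyspace_hits;
        self.keyspace_misses += other.keyspace_misses;
    }
}

/// Hypothetical stand-in for the real keyspace type.
#[derive(Default)]
pub struct Keyspace {
    entries: HashMap<String, Vec<u8>>,
    stats: KeyspaceStats,
}

impl Keyspace {
    pub fn set(&mut self, key: &str, value: &[u8]) {
        self.entries.insert(key.to_string(), value.to_vec());
    }

    /// Every lookup bumps exactly one of the two counters.
    pub fn get(&mut self, key: &str) -> Option<&[u8]> {
        match self.entries.get(key) {
            Some(v) => {
                self.stats.keyspace_hits += 1;
                Some(v.as_slice())
            }
            None => {
                self.stats.keyspace_misses += 1;
                None
            }
        }
    }

    pub fn stats(&self) -> KeyspaceStats {
        self.stats
    }
}

/// Renders the two fields in INFO's `name:value\r\n` form.
pub fn render_info_stats(s: &KeyspaceStats) -> String {
    format!(
        "keyspace_hits:{}\r\nkeyspace_misses:{}\r\n",
        s.keyspace_hits, s.keyspace_misses
    )
}

/// Renders the two samples in the Prometheus text exposition format.
pub fn render_prometheus(s: &KeyspaceStats) -> String {
    format!(
        "ember_keyspace_hits_total {}\nember_keyspace_misses_total {}\n",
        s.keyspace_hits, s.keyspace_misses
    )
}
```

because `KeyspaceStats` derives `Default`, a fresh value starts at zero, which matches the zero-initialization check listed under testing.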

INFO completeness — filled in fields expected by redis-compatible monitoring tools (datadog, redis_exporter, grafana dashboards):

  • server: tcp_port, hz, config_file
  • persistence: aof_last_bgrewrite_status, rdb_last_save_time

gRPC keepalive — both tonic server paths now set http2_keepalive_interval (30s), http2_keepalive_timeout (10s), and tcp_keepalive (60s) so idle connections are cleaned up and broken connections are detected promptly.
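the three values can be captured in one place, as in the std-only sketch below. the `GrpcKeepalive` struct and its methods are hypothetical; only the setting names and durations come from this PR. the detection bound is an approximation that assumes a dead peer is noticed once a keepalive ping goes unanswered for the full timeout.

```rust
use std::time::Duration;

/// Hypothetical bundle of the keepalive settings applied to both server paths.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GrpcKeepalive {
    pub http2_keepalive_interval: Duration,
    pub http2_keepalive_timeout: Duration,
    pub tcp_keepalive: Duration,
}

impl GrpcKeepalive {
    /// The values named in this PR: 30s interval, 10s timeout, 60s TCP keepalive.
    pub const fn launch_defaults() -> Self {
        Self {
            http2_keepalive_interval: Duration::from_secs(30),
            http2_keepalive_timeout: Duration::from_secs(10),
            tcp_keepalive: Duration::from_secs(60),
        }
    }

    /// Rough upper bound on how long a broken idle connection can linger:
    /// one full ping interval plus the time allowed for the ack.
    pub fn worst_case_detection(&self) -> Duration {
        self.http2_keepalive_interval + self.http2_keepalive_timeout
    }
}

// With tonic, these would be passed to the server builder roughly as:
//   Server::builder()
//       .http2_keepalive_interval(Some(k.http2_keepalive_interval))
//       .http2_keepalive_timeout(Some(k.http2_keepalive_timeout))
//       .tcp_keepalive(Some(k.tcp_keepalive))
```

with these defaults the bound works out to about 40 seconds, and keeping the timeout shorter than the interval avoids overlapping pings.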

what was tested

  • cargo check --workspace passes cleanly
  • manually verified KeyspaceStats zero-initialization and aggregation at all call sites
  • confirmed gRPC keepalive constants are within tonic's documented valid ranges

…pc keepalive

- update SECURITY.md to reflect current auth capabilities (requirepass,
  ACL, TLS/mTLS) and add a per-ip rate limiting delegation note
- add --healthcheck flag: sends a synchronous PING over TCP to the configured RESP port,
  exits 0/1 — makes docker-compose healthcheck work
- add keyspace_hits/keyspace_misses counters to KeyspaceStats; tracked
  on every get() call in keyspace::string
- expose hits/misses in INFO stats section and as prometheus gauges
  (ember_keyspace_hits_total, ember_keyspace_misses_total)
- fill in missing INFO fields: tcp_port, hz, config_file in server
  section; aof_last_bgrewrite_status and rdb_last_save_time in
  persistence section
- configure gRPC keepalive (30s interval, 10s timeout, 60s tcp) on
  both server paths to clean up idle connections
kacy merged commit 68da122 into main Feb 26, 2026
kacy deleted the feat/launch-readiness-audit branch February 26, 2026 15:45