backup: add Redis hash encoder #725
bootjp wants to merge 8 commits into feat/backup-phase0a-redis-simple from feat/backup-phase0a-redis-hash
Conversation
First of the Phase 0a wide-column encoders (after the redis_string.go strings/HLL/TTL work in PR #713). Hashes follow a meta+per-field key shape on the live store (`!hs|meta|<len><userKey>` → `[Len(8)]`, `!hs|fld|<len><userKey><field>` → field bytes; see store/hash_helpers.go), so the encoder buffers each user key's fields in memory and flushes one `hashes/<encoded>.json` file per hash at Finalize.

Output schema matches the design (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md:327):

```json
{ "format_version": 1, "fields": {"name": "alice"}, "expire_at_ms": null }
```

Cross-cutting decisions:
- TTL is folded INLINE into `expire_at_ms` (not a separate sidecar) so a restorer can replay the hash in one shot. HandleTTL routes redisKindHash records into the hash state rather than the strings_ttl.jsonl path.
- Field VALUES use the same UTF-8-or-base64 envelope introduced for the SQS message body in PR #714: valid UTF-8 → plain JSON string; non-UTF-8 → `{"base64":"..."}`. Field NAMES that are not valid UTF-8 are percent-encoded (EncodeSegment) for the JSON object key, since JSON object keys must be strings.
- Empty hashes (HLEN==0) still emit a file because their existence is observable to clients (HEXISTS, HLEN). A mismatched declared-vs-observed length surfaces as a `redis_hash_length_mismatch` warning.
- SHA-fallback user keys feed back through the existing recordIfFallback path so KEYMAP.jsonl still reverses non-reversible filenames.
- Output is sorted by user key + field name for deterministic, byte-identical dumps across runs (matches the round-9 sort pattern from #716 schemaToPublic).

Tests cover: round-trip, empty hash, inline TTL routing, length mismatch warning, binary value envelope, malformed meta, truncated field key, SHA-fallback keymap.
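The meta+per-field key shape described above can be sketched in Go. This is a minimal illustration, not the real store code: the prefix constants and helper names are hypothetical, and it assumes a 4-byte big-endian user-key length prefix, as the parser under review suggests.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Hypothetical constants mirroring the prefixes described above; the real
// values live in store/hash_helpers.go.
const (
	hashMetaPrefix  = "!hs|meta|"
	hashFieldPrefix = "!hs|fld|"
)

// encodeHashFieldKey builds !hs|fld|<len><userKey><field> with a 4-byte
// big-endian user-key length, matching the shape the encoder parses.
func encodeHashFieldKey(userKey, field []byte) []byte {
	out := []byte(hashFieldPrefix)
	var lenBuf [4]byte
	binary.BigEndian.PutUint32(lenBuf[:], uint32(len(userKey)))
	out = append(out, lenBuf[:]...)
	out = append(out, userKey...)
	return append(out, field...)
}

// decodeHashFieldKey reverses encodeHashFieldKey, comparing lengths in
// uint64 space so a hostile length prefix cannot wrap on 32-bit builds.
func decodeHashFieldKey(key []byte) (userKey, field []byte, ok bool) {
	rest := bytes.TrimPrefix(key, []byte(hashFieldPrefix))
	if len(rest) == len(key) || len(rest) < 4 {
		return nil, nil, false
	}
	ukLen := binary.BigEndian.Uint32(rest[:4])
	if uint64(len(rest)) < 4+uint64(ukLen) {
		return nil, nil, false
	}
	return rest[4 : 4+ukLen], rest[4+ukLen:], true
}

func main() {
	key := encodeHashFieldKey([]byte("user:1"), []byte("name"))
	uk, f, ok := decodeHashFieldKey(key)
	fmt.Println(ok, string(uk), string(f))
}
```

An empty field (the `HSET k "" v` case discussed later in this thread) simply leaves no trailing bytes after the user key, and decodes back to a zero-length field slice.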
Code Review
This pull request implements Redis hash backup support, including handlers for hash metadata and fields, and logic to flush accumulated state to JSON files. The implementation handles binary-safe field values and provides deterministic output. Review feedback identifies a potential panic on 32-bit architectures due to unsafe integer conversion and suggests refactoring the hashState helper to centralize key-kind tracking and reduce redundant string allocations.
```go
ukLen := int(binary.BigEndian.Uint32(b[:hashUserKeyLenSize])) //nolint:gosec // bounded by len(b)
if len(b) < hashUserKeyLenSize+ukLen {
	return nil, false
}
return b[hashUserKeyLenSize : hashUserKeyLenSize+ukLen], true
```
On 32-bit architectures, converting a uint32 to int can result in a negative value if the high bit is set (e.g., for lengths > 2GB). This would bypass the subsequent length check and cause a panic when slicing b. Using uint64 for the comparison ensures safety across all architectures.
Suggested change:

```go
ukLen := binary.BigEndian.Uint32(b[:hashUserKeyLenSize])
if uint64(len(b)) < uint64(hashUserKeyLenSize)+uint64(ukLen) {
	return nil, false
}
return b[hashUserKeyLenSize : hashUserKeyLenSize+int(ukLen)], true
```
```go
st := r.hashState(userKey)
st.declaredLen = int64(binary.BigEndian.Uint64(value)) //nolint:gosec // signed int64 by design
st.metaSeen = true
r.kindByKey[string(userKey)] = redisKindHash
return nil
```
The update to r.kindByKey can be centralized within the hashState helper to avoid redundancy and ensure consistency across all hash-related handlers.
Suggested change:

```go
st := r.hashState(userKey)
st.declaredLen = int64(binary.BigEndian.Uint64(value)) //nolint:gosec // signed int64 by design
st.metaSeen = true
return nil
```
References
- Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.
```go
st := r.hashState(userKey)
st.fields[string(fieldName)] = bytes.Clone(value)
r.kindByKey[string(userKey)] = redisKindHash
return nil
```
The update to r.kindByKey is redundant here if centralized in hashState.
Suggested change:

```go
st := r.hashState(userKey)
st.fields[string(fieldName)] = bytes.Clone(value)
return nil
```
References
- Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.
```go
func (r *RedisDB) hashState(userKey []byte) *redisHashState {
	if st, ok := r.hashes[string(userKey)]; ok {
		return st
	}
	st := &redisHashState{fields: make(map[string][]byte)}
	r.hashes[string(userKey)] = st
	return st
}
```
Centralizing the kindByKey update here improves maintainability and avoids repeated map writes in the handlers. Additionally, caching the string conversion of userKey avoids redundant allocations when the state is not found.
Suggested change:

```go
func (r *RedisDB) hashState(userKey []byte) *redisHashState {
	ukStr := string(userKey)
	if st, ok := r.hashes[ukStr]; ok {
		return st
	}
	st := &redisHashState{fields: make(map[string][]byte)}
	r.hashes[ukStr] = st
	r.kindByKey[ukStr] = redisKindHash
	return st
}
```
References
- Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72e2b75a61
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```go
jsonName := name
if !utf8.ValidString(name) {
	jsonName = EncodeSegment([]byte(name))
}
```
Disambiguate encoded and raw hash field names
This conversion only percent-encodes non-UTF8 field names while leaving UTF-8 names unchanged, so distinct Redis fields can collapse to the same JSON key (for example, a literal field name %FF and a binary field name 0xFF). In that case fields[jsonName] overwrites one value, silently dropping data from hashes that contain both forms; Redis hash field names are binary-safe, so this is a valid input shape.
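The collision Codex describes can be reproduced with a small sketch. `percentEncode` here is a hypothetical stand-in for `EncodeSegment` that escapes every byte as `%XX`; the routing logic mirrors the reviewed snippet.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// percentEncode is a hypothetical stand-in for EncodeSegment: it escapes
// every byte as %XX.
func percentEncode(b []byte) string {
	out := make([]byte, 0, len(b)*3)
	for _, c := range b {
		out = append(out, fmt.Sprintf("%%%02X", c)...)
	}
	return string(out)
}

// jsonKey mirrors the reviewed logic: UTF-8 names pass through unchanged,
// non-UTF-8 names are percent-encoded.
func jsonKey(name string) string {
	if utf8.ValidString(name) {
		return name
	}
	return percentEncode([]byte(name))
}

func main() {
	literal := "%FF"                       // a valid UTF-8 field name
	binaryName := string([]byte{0xFF})     // a single non-UTF-8 byte
	fmt.Println(jsonKey(literal) == jsonKey(binaryName)) // prints true: both map to "%FF"
}
```

Two distinct, binary-safe Redis field names land on the same JSON object key, so one value silently overwrites the other.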
```go
for _, name := range names {
	jsonName := name
	if !utf8.ValidString(name) {
		jsonName = EncodeSegment([]byte(name))
```
Avoid SHA-fallback when encoding hash field names
EncodeSegment is a filename encoder with a 240-byte limit and SHA fallback; applying it to non-UTF8 field names can emit an irreversible fallback token for long names. Because hash field names are stored only as JSON object keys and no KEYMAP is written for per-field names, restoring such hashes cannot reconstruct the original field-name bytes, causing silent key corruption for long binary field names.
…tion (PR #725, round 1)

Codex round 12 raised two P1 findings on commit 72e2b75; Gemini raised one high and three mediums alongside.

P1: Hash field name encoding via JSON object key was lossy.
- Distinct fields could collapse to the same JSON key: a UTF-8 literal `%FF` and a single binary byte `0xFF` both ended up as `%FF` after the EncodeSegment-on-non-UTF-8 fallback, silently dropping data on hashes with both forms.
- Long binary field names (>240 bytes) hit EncodeSegment's SHA fallback, which is irreversible at this layer (no per-field KEYMAP), so restore could not reconstruct the original bytes.

Fix: change `fields` to a JSON array of typed `{name, value}` records. Both name and value go through the same UTF-8-or-base64 envelope (marshalRedisBinaryValue, mirrors PR #714's SQS body), so arbitrary binary bytes round-trip without collisions and without SHA fallback.

High (Gemini): parseUserKeyLenPrefix used `int(uint32)`, which wraps to negative on 32-bit when the high bit is set, bypassing the bounds check and causing a slice panic. Compare in `uint64` space before converting.

Medium (Gemini, ×3): centralise `kindByKey` registration in the hashState helper so meta/field/TTL paths agree on the kind in one place. Audited callers — HandleHashMeta, HandleHashField, and HandleTTL's `case redisKindHash:` arm — all reach hashState via the same userKey identity, and the TTL caller already knows the kind, so the now-implicit kindByKey write is idempotent. No other callers per `grep -n "r\.hashState(" internal/backup/`.

Tests:
- TestRedisDB_HashRoundTripBasic updated to assert the array shape via new hashFieldArray / hashFieldByName helpers.
- TestRedisDB_HashEmptyHashStillEmitsFile and _BinaryFieldValue updated likewise.
- NEW TestRedisDB_HashBinaryFieldNameRoundTripsViaArray drives the exact Codex collision pair (UTF-8 `%FF` + binary 0xFF) and asserts both names survive distinctly with their expected types.
Verified: linux race tests, GOOS=windows, GOOS=js wasm, lint — all clean.
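The UTF-8-or-base64 envelope and the array-of-records shape described in the commit can be sketched as follows. `marshalBinary` and `hashField` are illustrative stand-ins for the real `marshalRedisBinaryValue` helper and record type, not the PR's actual code.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// marshalBinary sketches the UTF-8-or-base64 envelope: valid UTF-8 becomes
// a plain JSON string, anything else a {"base64": ...} object.
func marshalBinary(b []byte) (json.RawMessage, error) {
	if utf8.Valid(b) {
		return json.Marshal(string(b))
	}
	return json.Marshal(map[string]string{
		"base64": base64.StdEncoding.EncodeToString(b),
	})
}

// hashField mirrors the array-of-records shape that replaced the JSON
// object keyed by field name: both name and value carry the envelope.
type hashField struct {
	Name  json.RawMessage `json:"name"`
	Value json.RawMessage `json:"value"`
}

func main() {
	name, _ := marshalBinary([]byte{0xFF})     // non-UTF-8 name: base64 envelope
	value, _ := marshalBinary([]byte("alice")) // UTF-8 value: plain string
	rec, _ := json.Marshal(hashField{Name: name, Value: value})
	fmt.Println(string(rec)) // prints {"name":{"base64":"/w=="},"value":"alice"}
}
```

Because the name is a typed value rather than an object key, a literal `"%FF"` and a binary `0xFF` produce two distinct records, which is what eliminates the collision and the SHA fallback.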
@claude review

Round 1 fixes for Codex P1 round 12 + Gemini high/medium on commit 72e2b75:
- P1 — hash field name encoding was lossy: distinct Redis fields could collapse to the same JSON object key (UTF-8 literal `%FF` vs binary 0xFF).
- High (Gemini) — 32-bit overflow in length parser.
- Medium (Gemini ×3) — kindByKey centralisation.
Claude finished @bootjp's task in 2m 54s — Review: Round 1 fixes.
…' into feat/backup-phase0a-redis-hash
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4646b9f7f6
```go
if len(fieldName) == 0 {
	return cockroachdberr.Wrapf(ErrRedisInvalidHashKey,
		"empty field name in key %q", key)
```
HandleHashField currently treats a zero-length field name as malformed, but Redis hash field names are binary-safe and may legitimately be empty. In this case a valid key like !hs|fld|<len><userKey> (with no trailing field bytes) will cause backup decoding to error out instead of emitting the hash, so backups can fail on real data created via HSET with an empty field name.
…' into feat/backup-phase0a-redis-hash
Codex P1 round 13 (commit 4646b9f): HandleHashField was rejecting zero-length field names as ErrRedisInvalidHashKey, but Redis hash field names are binary-safe and `HSET k "" v` is a valid command. The live store emits a key shaped exactly `!hs|fld|<len><userKey>` with no trailing field bytes for that case, so backup decoding would error out on real data created via HSET with an empty field name.

Remove the explicit zero-length check; the array-of-records JSON shape introduced in round 1 already handles empty names without collision (an entry like `{"name":"", "value":"..."}` sorts and round-trips distinctly).

Caller audit per loop's audit-semantics rule (fail-closed → fail-open behavior change for empty fieldName):
- HandleHashField has only test-code callers; no production pipeline consumes it yet (per `grep -n "HandleHashField" internal/backup/`).
- The truncated-key test (TestRedisDB_HashRejectsTruncatedFieldKey) drives a malformed binary key whose parseHashFieldKey returns ok=false; that branch still surfaces ErrRedisInvalidHashKey, independent of the empty-name check we removed.

Test: TestRedisDB_HashAcceptsEmptyFieldName drives one empty-name field and one named field, and asserts both survive as distinct records in the output JSON.
@claude review

Round 2 fix for Codex P1 round 13 on commit 4646b9f.

Removed the explicit zero-length check; the array-of-records JSON shape introduced in round 1 already handles empty names without collision (an entry like `{"name":"", "value":"..."}` sorts and round-trips distinctly).

Caller audit per loop's audit-semantics rule (fail-closed → fail-open for empty fieldName): HandleHashField has only test-code callers.

Test: TestRedisDB_HashAcceptsEmptyFieldName.
Claude finished @bootjp's task in 1m 29s — Review: Round 2 fix.
…' into feat/backup-phase0a-redis-hash
…' into feat/backup-phase0a-redis-hash
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6abf62f547
```go
// parseHashMetaKey strips !hs|meta| and the 4-byte BE userKeyLen prefix.
// Returns (userKey, true) on success.
func parseHashMetaKey(key []byte) ([]byte, bool) {
	rest := bytes.TrimPrefix(key, []byte(RedisHashMetaPrefix))
```
Exclude hash meta-delta keys from meta parsing
parseHashMetaKey trims RedisHashMetaPrefix, which also matches !hs|meta|d|... delta keys, so those records are treated as base-meta candidates and then fail parsing as malformed keys. Delta keys are a normal part of live Redis hash state (the codebase even documents that delta prefixes share the meta prefix in adapter/redis_retry.go), so when snapshot dispatch sends hash-prefixed records to this encoder, valid snapshots containing unresolved deltas can abort with ErrRedisInvalidHashKey instead of being skipped or handled.
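The prefix-nesting hazard can be illustrated with a short sketch: because the delta prefix extends the base meta prefix, a router must test the longer prefix first. The prefix constants here are assumptions based on the key shapes quoted in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical prefixes matching the nesting described above: the delta
// prefix !hs|meta|d| shares the base meta prefix !hs|meta|.
const (
	hashMetaPrefix      = "!hs|meta|"
	hashMetaDeltaPrefix = "!hs|meta|d|"
)

// classify shows the ordering fix: check the delta prefix before the base
// meta prefix, so delta records are skipped rather than parsed as meta and
// rejected as malformed.
func classify(key string) string {
	switch {
	case strings.HasPrefix(key, hashMetaDeltaPrefix):
		return "delta (skip at backup time)"
	case strings.HasPrefix(key, hashMetaPrefix):
		return "meta"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify("!hs|meta|d|\x00\x00\x00\x01k")) // delta record
	fmt.Println(classify("!hs|meta|\x00\x00\x00\x01k"))   // base meta record
}
```

With the cases reversed, every delta key would match the meta branch and then fail `<len><userKey>` parsing, which is exactly the ErrRedisInvalidHashKey abort Codex flagged.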
Codex P1 round 14 (commit 6abf62f): The delta prefix `!hs|meta|d|` shares the base meta prefix `!hs|meta|` (the live store/hash_helpers.go documents the same nesting). A snapshot dispatcher that routes records by "starts with RedisHashMetaPrefix" lands delta records in HandleHashMeta, where parseHashMetaKey trimmed the meta prefix and tried to parse `d|<len>...` as `<len><userKey>` — failing with ErrRedisInvalidHashKey on real snapshots that contain unresolved deltas.

Phase 0a's hash output emits an array of OBSERVED fields (`!hs|fld|...` records), so the delta arithmetic the live system uses to recompute Len is not needed at backup time. Skip delta keys silently in HandleHashMeta.

Defense-in-depth: parseHashMetaKey also returns `!ok` for delta-prefixed keys so a future refactor that bypasses the HandleHashMeta short-circuit still surfaces a parse failure rather than silent state corruption.

Caller audit (per loop's audit-semantics rule for fail-closed → fail-open behavior change):
- HandleHashMeta has only test-code callers; no production pipeline consumes it yet.
- parseHashMetaKey is called only from HandleHashMeta. The !ok return is mapped to ErrRedisInvalidHashKey in HandleHashMeta's existing path; the delta short-circuit runs first now, so the parser path never sees a delta key in normal flow.

Test: TestRedisDB_HashHandleHashMetaSkipsDeltaKey synthesises a realistic delta key and asserts both that HandleHashMeta returns nil and that parseHashMetaKey returns !ok directly.
@claude review

Round 3 fix for Codex P1 round 14 on commit 6abf62f.

The delta prefix `!hs|meta|d|` shares the base meta prefix, so delta keys are now skipped silently in HandleHashMeta. Phase 0a's hash output emits an array of OBSERVED fields (`!hs|fld|...` records), so delta arithmetic is not needed at backup time.

Caller audit per loop's rule (fail-closed → fail-open behavior): HandleHashMeta has only test-code callers; parseHashMetaKey is called only from HandleHashMeta.

Test: TestRedisDB_HashHandleHashMetaSkipsDeltaKey.
Claude finished @bootjp's task in 1m 40s — Review: Round 3 fix.
Summary
- Folds `!hs|meta|` + `!hs|fld|` records on the live store into one `hashes/<encoded>.json` file per Redis hash, per the design at docs/design/2026_04_29_proposed_snapshot_logical_decoder.md:327.
- Reuses the existing `RedisDB`, the SHA-fallback KEYMAP path, and the `dirsCreated` cache.
- TTL is folded inline into `expire_at_ms` (not a sidecar) so a restorer can replay each hash in one shot; `HandleTTL` adds a `redisKindHash` route.
- Field names are percent-encoded (`EncodeSegment`) when not valid UTF-8 to keep JSON object keys as strings.
- Mismatches between `meta.Len` and observed `!hs|fld|` keys surface as `redis_hash_length_mismatch` warnings.

Self-review (per CLAUDE.md "five lenses")
- `RedisDB` is single-writer per the existing contract; the new `hashes` map follows the same per-key state model used by `kindByKey`/`inlineTTLEmitted`. `flushHashes` runs once at Finalize.
- Sort is O(n log n) over user keys + per-hash field names — acceptable at dump cadence.
- `kindByKey` registration on both meta and field arrival prevents TTL mis-routing across snapshot record-ordering scenarios.

Test plan
- `go test -race ./internal/backup/...` — clean.
- `GOOS=windows GOARCH=amd64 go build ./internal/backup/...` — clean.
- `GOOS=js GOARCH=wasm go build ./internal/backup/...` — clean.
- `golangci-lint run` — clean.
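The deterministic-output claim in the self-review rests on sorting before emission, since Go randomizes map iteration order. A minimal sketch of the pattern (not the actual flushHashes code):

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys collects and sorts a map's keys. Sorting user keys (and, per
// hash, field names) before writing makes the dump byte-identical across
// runs despite Go's randomized map iteration order.
func sortedKeys(m map[string][]byte) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	fields := map[string][]byte{"b": []byte("2"), "a": []byte("1")}
	for _, k := range sortedKeys(fields) {
		fmt.Printf("%s=%s\n", k, fields[k]) // always a=1 then b=2
	}
}
```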