adapter: decompose ZSet storage into per-member wide-column keys#483
Conversation
9c1b322 to 37d6099
Code Review
This pull request transitions Redis Sorted Set (ZSet) storage from a single legacy blob to a wide-column format, where members and their scores are stored in individual keys alongside a score-based index. The changes include new helper functions for managing these keys, logic to migrate existing data upon access, and updates to ZSet commands to support incremental, diff-based updates. Feedback highlights an issue where empty ZSets might incorrectly persist their metadata key instead of being deleted, and suggests consistent use of the store.PrefixEnd utility to avoid naming inconsistencies.
/gemini review
Code Review
This pull request migrates Redis Sorted Set (ZSet) storage from a legacy blob format to a wide-column layout using per-member keys and a score index. The changes include new key construction logic, sortable float encoding, and updated command implementations to support incremental updates and on-the-fly migration. Feedback focuses on optimizing storage operations by utilizing the new score index for ordered access and using prefix deletions to avoid inefficient full-set scans and potential memory exhaustion.
/gemini review
Code Review
This pull request introduces a wide-column storage format for Redis Sorted Sets, replacing the legacy blob-based storage with a more granular structure of metadata, member, and score-index keys. The changes include new key encoding logic, updated transaction state management for diff-based updates, and modified Redis command handlers to support the new format and facilitate on-the-fly migration. The review feedback highlights several areas for improvement, including optimizing bulk deletions via prefix-based operations, preventing potential OOM issues by bounding member scans, and refining migration logic to handle empty sets more efficiently.
/gemini review
Code Review
This pull request implements a wide-column storage format for Redis Sorted Sets, transitioning from a single-blob model to a structure using metadata, member keys, and a score index. The changes include migration support, updated transaction state, and revised command implementations. Feedback focuses on critical atomicity risks in prefix deletions, potential memory exhaustion during unbounded scans, and performance bottlenecks in the O(N) diffing and range-scanning logic.
/gemini review
Code Review
This pull request implements a wide-column storage format for Redis Sorted Sets (ZSets), utilizing metadata, member, and score-index keys to replace the legacy single-blob format. The implementation includes migration logic and updates to ZSet commands to support the new structure. Feedback identifies a bug where new prefixes were not registered for KEYS and SCAN commands. Additionally, reviewers raised concerns about potential out-of-memory (OOM) vulnerabilities and Raft limit issues due to operations that load entire sets into memory or perform unbounded scans during deletions and updates.
/gemini review
Code Review
This pull request introduces a wide-column storage format for Redis Sorted Sets (ZSets), moving away from the legacy single-blob approach to a per-member key structure with a dedicated score index. The changes include new key construction logic in store/zset_helpers.go, migration logic to handle legacy data, and updates to various Redis commands like ZADD, ZREM, and ZRANGE to support the new format. Feedback focuses on ensuring consistent score encoding for zero values and addressing potential performance and memory issues when deleting large sets by avoiding unbounded scans and excessive mutation batches.
func EncodeSortableFloat64(dst []byte, f float64) {
	if len(dst) < zsetScoreBinarySize {
		return
	}
	bits := math.Float64bits(f)
	if bits&(1<<float64SignBitPos) != 0 { // negative
		bits = ^bits
	} else {
		bits ^= 1 << float64SignBitPos
	}
	binary.BigEndian.PutUint64(dst, bits)
}
The current float64 encoding does not normalize -0.0 to +0.0. In IEEE 754, these have different bit patterns (0x8000000000000000 vs 0x0000000000000000), which results in different sortable byte sequences (0x7FFFFFFFFFFFFFFF vs 0x8000000000000000). Since Redis treats -0.0 and +0.0 as equal scores, they should have the same representation in the score index to ensure consistent range scans and ordering.
func EncodeSortableFloat64(dst []byte, f float64) {
if len(dst) < zsetScoreBinarySize {
return
}
if f == 0 {
f = 0.0 // normalize -0.0 to +0.0
}
bits := math.Float64bits(f)
if bits&(1<<float64SignBitPos) != 0 { // negative
bits = ^bits
} else {
bits ^= 1 << float64SignBitPos
}
binary.BigEndian.PutUint64(dst, bits)
}
adapter/redis_compat_helpers.go
		store.ZSetMemberScanPrefix(key),
		store.ZSetScoreScanPrefix(key),
	} {
		kvs, err := r.store.ScanAt(ctx, prefix, store.PrefixEnd(prefix), math.MaxInt, readTS)
Using math.MaxInt as a limit for ScanAt when collecting keys for deletion can lead to unbounded memory growth and potential OOM issues if a ZSet contains a very large number of members. This violates the general rule to apply fixed bounds to collections that can grow from external requests. Consider using kv.DelPrefix for the member and score prefixes if the storage engine supports it, or implementing a chunked deletion approach to keep memory usage under control.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests.
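If the storage engine allows it, a minimal sketch of the prefix-deletion variant could look like this (hedged: it assumes kv.DelPrefix, which appears in PR #486's flushDatabase, can be mixed with point deletes in one batch; this PR does not confirm that):

// Hedged sketch: replace the per-key scan with prefix deletes.
elems := []*kv.Elem[kv.OP]{
	{Op: kv.Del, Key: store.ZSetMetaKey(key)},
	{Op: kv.DelPrefix, Key: store.ZSetMemberScanPrefix(key)},
	{Op: kv.DelPrefix, Key: store.ZSetScoreScanPrefix(key)},
}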
func (r *RedisServer) deleteZSetWideColumnElems(ctx context.Context, key []byte, readTS uint64) ([]*kv.Elem[kv.OP], error) {
	_, metaExists, err := r.loadZSetMetaAt(ctx, key, readTS)
	if err != nil {
		return nil, err
	}
	if !metaExists {
		return nil, nil
	}

	elems := []*kv.Elem[kv.OP]{
		{Op: kv.Del, Key: store.ZSetMetaKey(key)},
	}

	// Scan member keys and score-index keys for individual deletion.
	for _, prefix := range [][]byte{
		store.ZSetMemberScanPrefix(key),
		store.ZSetScoreScanPrefix(key),
	} {
		kvs, err := r.store.ScanAt(ctx, prefix, store.PrefixEnd(prefix), math.MaxInt, readTS)
		if err != nil {
			return nil, errors.WithStack(err)
		}
		for _, kvp := range kvs {
			elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: bytes.Clone(kvp.Key)})
		}
	}

	return elems, nil
}
The deleteZSetWideColumnElems function generates individual Del operations for every member and score index key. For large ZSets, this results in a massive number of operations being batched into a single Raft request, which can bloat the Raft log, increase apply latency, and potentially exceed message size limits. If kv.DelPrefix cannot be mixed with other mutations, consider if the DEL command logic can be refactored to use prefix deletion for ZSet internal data namespaces.
/gemini review
Code Review
This pull request migrates Redis Sorted Set (ZSet) storage from a legacy single-blob format to a wide-column structure, utilizing separate keys for metadata, members, and score-based indexing. The changes include new helper functions for incremental updates, member scanning, and migration logic to ensure compatibility with existing data. Review feedback highlights a potential memory risk when scanning large sets with unbounded limits and suggests an optimization to avoid redundant metadata writes when the set's length remains unchanged.
adapter/redis_compat_helpers.go
		kvs, err := r.store.ScanAt(ctx, prefix, store.PrefixEnd(prefix), math.MaxInt, readTS)
		if err != nil {
			return nil, errors.WithStack(err)
		}
		for _, kvp := range kvs {
			elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: bytes.Clone(kvp.Key)})
		}
	}
The use of math.MaxInt for the scan limit can lead to unbounded memory growth and potential OOM issues when deleting a very large ZSet. To prevent this, apply a fixed bound to collections that can grow from external requests. Consider implementing a batched scan-and-delete approach to process keys in smaller chunks, which would mitigate the memory pressure.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests, such as pending configuration changes. Reject new requests when the bound is reached.
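A minimal sketch of the batched scan-and-delete idea (hedged: the helper name, the chunk size, and the applyBatch callback are illustrative and not part of this PR; it reuses only the ScanAt/PrefixEnd signatures shown above):

// Hedged sketch: scan a ZSet prefix in bounded chunks and hand each chunk of
// delete ops to the caller, so neither the scan result nor a single mutation
// batch grows with the size of the set.
func (r *RedisServer) deleteZSetPrefixChunked(ctx context.Context, prefix []byte, readTS uint64,
	applyBatch func([]*kv.Elem[kv.OP]) error) error {
	const chunkSize = 1024 // illustrative bound, not taken from this PR
	start, end := prefix, store.PrefixEnd(prefix)
	for {
		kvs, err := r.store.ScanAt(ctx, start, end, chunkSize, readTS)
		if err != nil {
			return errors.WithStack(err)
		}
		batch := make([]*kv.Elem[kv.OP], 0, len(kvs))
		for _, kvp := range kvs {
			batch = append(batch, &kv.Elem[kv.OP]{Op: kv.Del, Key: bytes.Clone(kvp.Key)})
		}
		if len(batch) > 0 {
			if err := applyBatch(batch); err != nil {
				return err
			}
		}
		if len(kvs) < chunkSize {
			return nil
		}
		// Resume immediately after the last key seen in this chunk.
		start = append(bytes.Clone(kvs[len(kvs)-1].Key), 0x00)
	}
}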
adapter/redis_compat_helpers.go
	} else if len(elems) > 0 {
		metaBytes, err := store.MarshalZSetMeta(store.ZSetMeta{Len: int64(len(newMembers))})
		if err != nil {
			return nil, errors.WithStack(err)
		}
		elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: store.ZSetMetaKey(key), Value: metaBytes})
	}
The meta key is currently rewritten on every diff that modifies any element, even if the total number of members in the ZSet does not change (e.g., when only a score is updated). Since the meta key only stores the length, this write is redundant. Consider updating the logic to only write the meta key when the length actually changes.
Suggested change (gate the meta write on a cardinality change rather than on any diff):

	} else if len(origMembers) != len(newMembers) {
		metaBytes, err := store.MarshalZSetMeta(store.ZSetMeta{Len: int64(len(newMembers))})
		if err != nil {
			return nil, errors.WithStack(err)
		}
		elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: store.ZSetMetaKey(key), Value: metaBytes})
	}
/gemini review
Code Review
This pull request implements a wide-column storage format for Redis Sorted Sets, transitioning from a single legacy blob to a more granular structure using metadata, member, and score-index keys. The changes include new key construction utilities, updated transaction states to support diff-based commits, and refactored Redis commands (e.g., ZADD, ZRANGE, BZPOPMIN) that handle both storage formats. Feedback was provided regarding the zrangeRead implementation, where a slice returned from the storage layer is reversed in-place; it is recommended to copy this slice first to prevent unintended side effects on shared data during the legacy migration phase.
	entries, err := r.loadZSetEntriesSorted(context.Background(), key, readTS)
	if err != nil {
		conn.WriteError(err.Error())
		return
	}
The reverseZSetEntries function modifies its slice argument in-place. The entries slice returned from loadZSetEntriesSorted might be backed by a shared array, especially when falling back to the legacy blob format which involves unmarshaling. Modifying it directly could lead to unexpected side effects. It's safer to create a copy before performing the in-place reversal, as was done in the previous implementation.
entries, err := r.loadZSetEntriesSorted(context.Background(), key, readTS)
if err != nil {
	conn.WriteError(err.Error())
	return
}
// Make a copy before potential in-place reversal.
entries = append([]redisZSetEntry(nil), entries...)

review - Address PR #485 gemini review (4094524421):
- Add stale claim key cleanup in PUSH write path
- Enforce contiguous-only Head/Tail advancement in compaction
- Add explicit RPOP claim flow (reverse scan from Tail-1)
- Add Tail-side GC for RPOP claim keys
- Extend delta metadata pattern to Hash, Set, and ZSet:
  - Hash: wide-column decomposition + delta
  - Set: wide-column decomposition + delta
  - ZSet: delta on existing wide-column (PR #483)
- Unified compactor, shared delta limits, transition plan per type
- Rename list_metadata_delta_design.md to collection_metadata_delta_design.md
/gemini review
Code Review
This pull request introduces a wide-column storage format for Redis Sorted Sets, replacing the legacy single-blob storage with granular keys for metadata, members, and a score-based index. The implementation includes migration logic and support for standard ZSet commands. Review feedback highlights significant scalability risks, noting that several operations still load entire sets into memory or generate excessively large transaction batches, which could lead to OOM errors or cluster instability with large datasets.
func (r *RedisServer) loadZSetMembersMap(ctx context.Context, key []byte, readTS uint64) (zsetLoadResult, error) {
	// Check wide-column meta first.
	meta, metaExists, err := r.loadZSetMetaAt(ctx, key, readTS)
	if err != nil {
		return zsetLoadResult{}, err
	}
	if metaExists {
		entries, err := r.scanZSetAllMembers(ctx, key, meta.Len, readTS)
		if err != nil {
			return zsetLoadResult{}, err
		}
		return zsetLoadResult{members: zsetEntriesToMap(entries), exists: true}, nil
	}
	// Fall back to legacy blob format.
	value, exists, err := r.loadZSetAt(ctx, key, readTS)
	if err != nil {
		return zsetLoadResult{}, err
	}
	if !exists {
		return zsetLoadResult{members: map[string]float64{}}, nil
	}
	return zsetLoadResult{members: zsetEntriesToMap(value.Entries), exists: true, fromLegacy: true}, nil
}
The loadZSetMembersMap function loads the entire sorted set into an in-memory map. For large sets, this can lead to excessive memory consumption and potential OOM (Out of Memory) errors on the adapter node. This monolithic access pattern defeats the scalability benefits of the new wide-column storage format. Consider refactoring commands like ZADD, ZREM, and ZSCORE to use point lookups or range scans instead of loading the full set.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests, such as pending configuration changes. Reject new requests when the bound is reached.
	}

-	value, _, err := c.server.loadZSetAt(context.Background(), key, c.startTS)
+	load, err := c.server.loadZSetMembersMap(context.Background(), key, c.startTS)
Similar to the adapter commands, Lua scripts also load the entire ZSet into memory via loadZSetMembersMap. This poses a significant OOM risk for large sets processed within scripts. Consider providing Lua APIs that can perform partial updates or range scans on ZSets.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests, such as pending configuration changes. Reject new requests when the bound is reached.
func (r *RedisServer) deleteZSetWideColumnElems(ctx context.Context, key []byte, readTS uint64) ([]*kv.Elem[kv.OP], error) {
	meta, metaExists, err := r.loadZSetMetaAt(ctx, key, readTS)
	if err != nil {
		return nil, err
	}
	if !metaExists {
		return nil, nil
	}

	elems := []*kv.Elem[kv.OP]{
		{Op: kv.Del, Key: store.ZSetMetaKey(key)},
	}

	// Use meta.Len to bound the scan. Each member has one member key and one
	// score-index key, so we expect exactly meta.Len entries per prefix.
	scanLimit := int(meta.Len)

	// Scan member keys and score-index keys for individual deletion.
	for _, prefix := range [][]byte{
		store.ZSetMemberScanPrefix(key),
		store.ZSetScoreScanPrefix(key),
	} {
		kvs, err := r.store.ScanAt(ctx, prefix, store.PrefixEnd(prefix), scanLimit, readTS)
		if err != nil {
			return nil, errors.WithStack(err)
		}
		for _, kvp := range kvs {
			elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: bytes.Clone(kvp.Key)})
		}
	}

	return elems, nil
}
deleteZSetWideColumnElems performs a full scan of all member and score index keys and generates individual kv.Del operations for each. For large ZSets, this creates a massive transaction that can exceed Raft message size limits and cause cluster instability. Additionally, the scan itself can lead to OOM if the number of keys is very large. Consider implementing a more scalable deletion strategy, such as asynchronous background cleanup or using a single prefix deletion if the system constraints allow.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests, such as pending configuration changes. Reject new requests when the bound is reached.
		return 0, wrongTypeError()
	}
-	value, _, err := r.loadZSetAt(context.Background(), key, readTS)
+	load, err := r.loadZSetMembersMap(context.Background(), key, readTS)
zaddTxn (and similarly zincrby at line 1399) loads the entire ZSet map to perform updates. This is inefficient for large sets. Point lookups using the new wide-column keys should be used to check for member existence and retrieve old scores, allowing for O(1) updates regardless of the set size.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests, such as pending configuration changes. Reject new requests when the bound is reached.
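A minimal sketch of what such a point lookup could look like, using only the scan API already shown in this PR (hedged: the helper name is hypothetical, and decoding the stored value into a score is left to the existing helpers in store/zset_helpers.go):

// Hedged sketch: read one member's stored value via its wide-column key
// instead of loading the whole set. zsetMemberValueAt is a hypothetical helper.
func (r *RedisServer) zsetMemberValueAt(ctx context.Context, key, member []byte, readTS uint64) ([]byte, bool, error) {
	mk := store.ZSetMemberKey(key, member)
	// A bounded scan of [mk, PrefixEnd(mk)) with limit 1 acts as a point lookup
	// when no dedicated point-get API is available.
	kvs, err := r.store.ScanAt(ctx, mk, store.PrefixEnd(mk), 1, readTS)
	if err != nil {
		return nil, false, errors.WithStack(err)
	}
	if len(kvs) == 0 || !bytes.Equal(kvs[0].Key, mk) {
		return nil, false, nil
	}
	return bytes.Clone(kvs[0].Value), true, nil
}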
[Test Coverage] Review results

PR #483: adapter: decompose ZSet storage into per-member wide-column keys
Changed files:

Issues

1. New files

| New functionality | Tests |
|---|---|
| All functions in store/zset_helpers.go | None |
| Score sort order produced by EncodeSortableFloat64 | None |
| ZSet wide-column diff-based commit | None |
| fromLegacy migration path | Partial (existing tests updated only) |

Because the correctness of the score index directly determines the correctness of ZRANGE/ZRANGEBYSCORE, boundary-value tests for EncodeSortableFloat64 are strongly recommended.
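A minimal sketch of such a boundary-value test (hedged: it assumes the encoder is exported from the store package, that the encoded width is 8 bytes, and that the -0.0 normalization suggested earlier in this thread is applied):

package store

import (
	"bytes"
	"math"
	"testing"
)

// TestEncodeSortableFloat64Order checks that byte order matches numeric order
// across sign, zero, and infinity boundaries.
func TestEncodeSortableFloat64Order(t *testing.T) {
	scores := []float64{
		math.Inf(-1), -math.MaxFloat64, -1.5, -math.SmallestNonzeroFloat64,
		0, math.SmallestNonzeroFloat64, 1.5, math.MaxFloat64, math.Inf(1),
	}
	prev, cur := make([]byte, 8), make([]byte, 8)
	EncodeSortableFloat64(prev, scores[0])
	for _, s := range scores[1:] {
		EncodeSortableFloat64(cur, s)
		if bytes.Compare(prev, cur) >= 0 {
			t.Fatalf("encoding is not strictly increasing at score %v", s)
		}
		copy(prev, cur)
	}
	// -0.0 and +0.0 must encode identically, since Redis treats them as equal.
	negZero, posZero := make([]byte, 8), make([]byte, 8)
	EncodeSortableFloat64(negZero, math.Copysign(0, -1))
	EncodeSortableFloat64(posZero, 0)
	if !bytes.Equal(negZero, posZero) {
		t.Fatalf("-0.0 and +0.0 encode differently: %x vs %x", negZero, posZero)
	}
}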
[Performance] Review results

This is a performance-focused review of PR #483. Decomposing ZSets into a wide-column format is the right direction for reducing I/O, but the following issues remain.

🔴 Critical: ZADD/ZINCRBY/ZREM still load all members with O(N) I/O even on the wide-column path

Relevant code:

// zaddTxn: full scan even when adding a single member
load, err := r.loadZSetMembersMap(context.Background(), key, readTS)
// → loadZSetMetaAt (1 I/O) + scanZSetAllMembers (N I/O) + zsetEntriesToMap (O(N) alloc)

The main benefit of the wide-column layout is making single-member operations O(1) I/O, but the current implementation still loads the whole set. Suggestion: ZADD/ZINCRBY can be implemented as "fetch the old score, then point-read only the member key (O(1) I/O)".

🟠 Important: two O(N) map copies on every write

Relevant code:

// loadZSetState: origMembers snapshot
origMembers := make(map[string]float64, len(load.members))
for mk, mv := range load.members {
	origMembers[mk] = mv
}
// Caller (e.g. zaddTxn): one more copy
members := make(map[string]float64, len(load.members))
for k, v := range load.members {
	members[k] = v
}

Even a single-member operation always allocates the full member map twice. With N=10,000 that is 16 bytes × 10,000 × 2 = 320KB of allocation on every write.

🟠 Important: the delete path scans the member and score prefixes separately

| Priority | Issue | Impact |
|---|---|---|
| 🔴 Critical | ZADD etc. load all members with O(N) I/O even on the wide-column path | Performance degradation for large ZSets |
| 🟠 Important | Two O(N) map copies per write | Memory pressure and higher GC load |
| 🟠 Important | Delete scans the member and score prefixes separately | 2× I/O |
| 🟡 Minor | buildZSetDiffElems uses cap=0 | Performance loss from reallocations |
| 🟡 Minor | 3 allocations per key in key construction | GC pressure on the hot path |
[Consistency] Review results

PR #483: adapter: decompose ZSet storage into per-member wide-column keys

Issue: when only a score changes, the meta key is not updated, so conflict detection does not fire (linearizability violation)

} else if len(origMembers) != len(newMembers) {
	// Only update meta if cardinality changed
	elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: store.ZSetMetaKey(key), Value: metaBytes})
}

When the member count does not change (score-only updates, i.e. ZADD/ZINCRBY changing an existing member's score), the meta key is never written. Meanwhile, reads track only the meta key:

t.trackReadKey(store.ZSetMetaKey(key))

The individual member keys and score-index keys are not added to the read set.

Problem scenario (lost update): because a score-only write never touches the meta key, a concurrent transaction that read the set (tracking only the meta key) detects no conflict and can overwrite that score-only update using stale data.

Result: one of the updates is silently lost.

Suggested fix: even when the member count does not change, always update the meta key whenever a score changes (write it even if the value is unchanged), or add the individual member and score-index keys to the read set.

For example, always writing the meta key:

// Drop the len(origMembers) != len(newMembers) condition and always update when len(newMembers) > 0
metaBytes, err := store.MarshalZSetMeta(store.ZSetMeta{Len: int64(len(newMembers))})
if err != nil {
	return nil, errors.WithStack(err)
}
elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: store.ZSetMetaKey(key), Value: metaBytes})

With this change, the meta-key write acts as the conflict-detection anchor even for ZINCRBY/ZADD score updates, preserving linearizability.
[Performance] Review results

P1: multiple allocations in the key-construction functions (hot path, called in loops)

store/zset_helpers.go:43–103

// Current: ZSetMemberKey as an example
func ZSetMemberKey(userKey, member []byte) []byte {
	prefix := []byte(ZSetMemberPrefix)    // alloc 1
	comp := zsetUserKeyComponent(userKey) // alloc 2 (internal make)
	buf := make(...)                      // alloc 3
	...
}

Suggested improvement:

func ZSetMemberKey(userKey, member []byte) []byte {
	buf := make([]byte, 0, len(ZSetMemberPrefix)+zsetUserKeyLenSize+len(userKey)+len(member))
	buf = append(buf, ZSetMemberPrefix...)
	buf = binary.BigEndian.AppendUint32(buf, uint32(len(userKey)))
	buf = append(buf, userKey...)
	buf = append(buf, member...)
	return buf
}

P2:
[Consistency] Review results

Target: adapter: decompose ZSet storage into per-member wide-column keys

Findings: 2 items

[Critical]
[Data Loss] Review results

One critical risk and one minor risk were identified.

[High] FLUSHALL/FLUSHDB does not delete the new `!zs|` prefixes

This PR stores wide-column ZSet data under the following prefixes:

- `!zs|meta|`: ZSet meta keys
- `!zs|mem|`: ZSet member keys
- `!zs|scr|`: ZSet score-index keys

Meanwhile, the FLUSHALL/FLUSHDB implementation changed in PR #486 only deletes the `!redis|` and `!lst|` prefixes; the `!zs|` prefixes are not covered.

// flushDatabase after PR #486
{Op: kv.DelPrefix, Key: []byte("!redis|")},
{Op: kv.DelPrefix, Key: []byte("!lst|")},
// !zs|meta|, !zs|mem|, !zs|scr| are not deleted

Once both PRs are merged, ZSet data will survive FLUSHALL/FLUSHDB. Users expect Redis FLUSHALL to wipe all data, so this is a serious behavioral mismatch.

Suggested fix: add {Op: kv.DelPrefix, Key: []byte("!zs|")} to flushDatabase in PR #486.

[Low] The scan limit in deleteZSetWideColumnElems depends on meta.Len

Location: deleteZSetWideColumnElems in adapter/redis_compat_helpers.go

scanLimit := int(meta.Len)
for _, prefix := range [][]byte{
	store.ZSetMemberScanPrefix(key),
	store.ZSetScoreScanPrefix(key),
} {
	kvs, err := r.store.ScanAt(ctx, prefix, store.PrefixEnd(prefix), scanLimit, readTS)
	// ← if more keys exist than meta.Len, the trailing keys are never deleted

Normally the member count and meta.Len stay in sync within each write transaction, so this is not a problem. But if meta.Len ever falls below the real key count (for example due to a future bug), orphaned keys remain in storage: the ZSet looks deleted to users, but stale keys linger internally (a storage leak).
[Concurrency] Review results

PR: adapter: decompose ZSet storage into per-member wide-column keys

Issues found (medium severity)

1. A mutable map is passed under two aliases (latent aliasing bug)

Code example (bzPopMinLegacy):

delete(load.members, popped.Member) // mutates load.members directly
if err := r.persistZSetMembersTxn(ctx, key, readTS, load, load.members); err != nil {

Problem: load.members is mutated in place and then passed again as the new-member map, so both arguments alias the same map.

Why it is currently safe: on the legacy path …

Latent risk: a future refactoring, or a data inconsistency, could …

Similar pattern: …

Recommended fix:

// bzPopMinLegacy: build newMembers as a copy of load.members, then delete from the copy
newMembers := make(map[string]float64, len(load.members))
for k, v := range load.members {
	newMembers[k] = v
}
delete(newMembers, popped.Member)
if err := r.persistZSetMembersTxn(ctx, key, readTS, load, newMembers); err != nil {

2. Use of context.Background() (an existing issue, but its scope grows)

Location: the externally supplied ctx is not propagated …

3. meta.Len used as the scan limit (minor)
[Performance] Review results

PR #483: adapter: decompose ZSet storage into per-member wide-column keys

[P1] Unnecessary O(N) map copy on read-only access

Location:

origMembers := make(map[string]float64, len(load.members))
for mk, mv := range load.members {
	origMembers[mk] = mv
}

Suggested improvement: …

[P2] Double map allocation in zaddTxn / zincrby

Location:

load, err := r.loadZSetMembersMap(...)                 // allocates a fresh map
members := make(map[string]float64, len(load.members)) // copies it again
for k, v := range load.members {
	members[k] = v
}

[P3] Two ScanAt calls in deleteZSetWideColumnElems

Location:

for _, prefix := range [][]byte{
	store.ZSetMemberScanPrefix(key),
	store.ZSetScoreScanPrefix(key),
} {
	kvs, err := r.store.ScanAt(ctx, prefix, store.PrefixEnd(prefix), scanLimit, readTS)
	...
}

If the two prefixes occupy a contiguous range, a single wider scan might replace them. If two scans are structurally unavoidable, this is acceptable as is. However, …
[Consistency] Review results

No issues.

Key points checked:
- Consistency of the wide-column ZSet meta/member/score keys
- Correctness of conflict detection
- Atomicity of the legacy blob → wide-column migration
- Consistency between TTL keys and the ZSet data itself
[Data Loss] Review results

Target PR: #483 adapter: decompose ZSet storage into per-member wide-column keys

Issue 1 (MEDIUM):
[Data Loss] Review results

Target PR: adapter: decompose ZSet storage into per-member wide-column keys

1. Atomicity of the legacy blob → wide-column migration: no issues
2. Maintenance of meta.Len consistency: no issues (with a structural caveat)
3. Coverage of the deletion paths: no issues
4. Correctness of the diff-based commit (buildZSetDiffElems): no issues
5. Wide-column path of zremrangebyrankInner: minor caveat (does not lead to data loss)

entries, err := r.scanZSetScoreEntries(ctx, key, e+1, readTS)
// ...
elems, err := buildZSetRemoveEntryElems(key, toRemove, meta.Len)

Overall assessment: no data-loss risk in normal operation. The legacy-to-wide-column migration is designed to be atomic, and across the write path as a whole …
[Concurrency] Review results

Target: #483

Issue 1 (medium risk):
[Consistency] Review results

Reviewed PR #483 ("decompose ZSet storage into per-member wide-column keys") from a data-consistency perspective.

Issue 1:
[Data Loss] Review results

1. [Attention] Missed TTL-key deletion: asymmetry on the MULTI/EXEC transaction path

Problem:

// legacy empty case (deletes the TTL key)
case len(st.members) == 0 && st.fromLegacy:
	return []*kv.Elem[kv.OP]{
		{Op: kv.Del, Key: redisZSetKey(keyBytes)},
		{Op: kv.Del, Key: redisTTLKey(keyBytes)}, // ← TTL deleted here
	}, nil
// wide-column empty case (does not delete the TTL key)
case len(st.members) == 0:
	return buildZSetDiffElems(keyBytes, st.origMembers, st.members) // ← no TTL deletion

Data loss scenario: …

Note: the non-transactional ZREM path …

2. [Minor]
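A minimal sketch of one way to restore the symmetry (hedged: the switch shape and the two-value return of buildZSetDiffElems are taken from the snippets above; whether the TTL delete belongs in this exact branch is an assumption):

// Hedged sketch: in the wide-column empty case, also delete the TTL key so the
// behavior matches the legacy empty case.
case len(st.members) == 0:
	elems, err := buildZSetDiffElems(keyBytes, st.origMembers, st.members)
	if err != nil {
		return nil, err
	}
	elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: redisTTLKey(keyBytes)})
	return elems, nil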
[Performance] Review results

Target PR: adapter: decompose ZSet storage into per-member wide-column keys

Overall, minimizing writes via the diff-based commit is a good improvement, but there are the following performance concerns.

[High] O(N) map copy on the hot path

members := make(map[string]float64, len(load.members))
for k, v := range load.members {
	members[k] = v
}

All members of the ZSet are fully copied before the diff is computed; even a single ZADD/ZREM/ZINCRBY copies all N members. Suggestion: operate on the loaded map directly instead of copying it.

[Medium] Multiple allocations in the key builders (noticeable when writing N members)

Each builder performs the following three allocations:

prefix := []byte(ZSetMetaPrefix)      // allocation for every string-constant-to-[]byte conversion
comp := zsetUserKeyComponent(userKey) // second allocation
buf := make([]byte, 0, ...)           // third allocation

Suggestion: …

[Medium]
| # | Location | Severity | Summary |
|---|---|---|---|
| 1 | zaddTxn/zincrby/zrem | High | The full member map is copied O(N) on every write |
| 2 | Key builders such as ZSetMemberKey | Medium | 6N+ allocations when writing N members |
| 3 | deleteZSetWideColumnElems | Medium | Unbounded scan makes DEL of a large ZSet an OOM risk |
| 4 | rawKeyTypeAt | Low | Double lookup when detecting the ZSet type |
[Test Coverage] Review results

Target PR: adapter: decompose ZSet storage into per-member wide-column keys (#483)

✅ Well-tested areas
[Concurrency] Review results

Issue 1 (Medium):
| Aspect | Assessment |
|---|---|
| Raft leader switchover | proxyToLeader is present in all ZSet write commands, unchanged ✓ |
| Write-conflict detection | Conflict detection via retryRedisWrite + MVCC readTS is appropriate ✓ |
| Conflict-tracking keys | store.ZSetMetaKey added to trackTypeReadKeys, so conflicts during migration are also detected ✓ |
| Transaction atomicity | The legacy → wide-column migration is applied atomically in a single Raft entry ✓ |
| Lock ordering | No changes to Mutex/RWMutex usage patterns ✓ |
| Goroutine leaks | No new goroutines introduced ✓ |
| Context cancellation | context.Background() usage follows existing code (no new regression) ✓ |
| Follower → leader proxying | proxyToLeader is called for all ZSet commands ✓ |
| TOCTOU (double check of meta vs legacy) | Protected by snapshot isolation at the same readTS ✓ |
[Consistency] Review results

Target: PR #483 adapter: decompose ZSet storage into per-member wide-column keys

Issues found (needs confirmation and follow-up)

[C1] Critical:
[Data Loss] Review results

This PR decomposes ZSet storage from the legacy blob format into a wide-column format (per-member keys).

1. [Medium]
[Performance] Review results

🔴 High priority

1.
[Test Coverage] Review results

Overview: test-coverage review of the ZSet storage decomposition (blob format → wide-column format) in PR #483.

✅ Adequately covered areas

❌ Missing tests (action needed)

1. (adapter/redis.go): no direct tests. Four branches exist and none of them is tested directly.
2. (adapter/redis_compat_helpers.go): no direct tests.
3. (adapter/redis_compat_helpers.go): no tests. The logic that deletes all of the meta, member, and score keys is untested, and the scan bounds for orphaned keys are unverified.
4. Wide-column path: no tests. Both the wide-column path (metaExists=true) and the legacy fallback are untested; boundary cases such as "the removal range covers the whole set" and "a single member" are needed in particular.
5. Interaction of TTL with wide-column ZSets: no tests. Setting a TTL on a wide-column ZSet and verifying that access after expiry reports non-existence is unverified.
6. ZSet meta-prefix scan: no tests.
7. Encoding of NaN scores: not covered.
8. Branches of the Lua context (): no tests.
[Consistency] Review results

Issue 1 [Critical]: missed conflict detection because ZSetMetaKey is not updated on score-only changes

Location:

t.trackReadKey(store.ZSetMetaKey(key))

Meanwhile:

} else if len(origMembers) != len(newMembers) {
	// ZSetMetaKey is not written if the count does not change
}

Suggested fix: always write ZSetMetaKey, even for score-only changes.

Issue 2 [Medium]: the TTL key is left behind when a ZSet is emptied in the new format

Location: there is an asymmetry in how an emptied ZSet is handled. After all members are removed with ZREM in the new format, …

Suggested fix: …

Additional note: on removing conflict tracking for legacy keys …
[Concurrency] Review results

Target PR: #483

Issue 1:
[Test Coverage] Review results

Overview: test-coverage review of the ZSet storage decomposition (blob format → wide-column format) in PR #483.

✅ Adequately covered areas

❌ Missing tests (action needed)

1.
[Data Loss] Review results

PR #483: ZSet wide-column storage decomposition

No issues that directly lead to clear data loss were found.

[Medium]
[Data Loss] Review results

Target PR: #483 adapter: decompose ZSet storage into per-member wide-column keys

Conclusion: potential data-visibility risk (needs confirmation)

Concerns
[Performance] Review results

PR #483: adapter: decompose ZSet storage into per-member wide-column keys

1. [High impact] Unbounded scan during deletion
[Data Loss] Review results

Target PR: adapter: decompose ZSet storage into per-member wide-column keys (#483)

Issue 1 (Critical):
[Consistency] Review results

✅ Correct implementation: atomic commit of the meta/member/score keys
[Data Loss] Review results

Target PR: #483 adapter: decompose ZSet storage into per-member wide-column keys

Needs confirmation: one potential data-loss path + one pre-existing bug

[Needs confirmation]
[Concurrency] Review results

PR #483:
[Performance] Review results
[Test Coverage] Review results

PR #483: adapter: decompose ZSet storage into per-member wide-column keys

Covered tests ✅

Missing tests ❌

1.
[Concurrency] Review results

PR #483: adapter: decompose ZSet storage into per-member wide-column keys

No issues. The following points were checked:
[Performance] Review results

Issues found

1. Even single-member operations load all members (critical)

Location: …

Example: a single-member operation against a ZSet with 10,000 members … Suggested improvement: keep the diff-based commit, but add a dedicated path that loads only the members touched by the diff.

2. Double map allocation of origMembers when loading a ZSet (medium)

Location: every time a ZSet is loaded, … Read-only ZSet access … Suggested improvement: …

3.
[Consistency] Review results

Issues found: TTL consistency bug (Critical)

1. [Critical] ZADD against an expired key does not clear the TTL key

Problem location:

case len(st.origMembers) == 0:
	// Brand new ZSet: full write
	return buildZSetWriteElems(keyBytes, st.members) // does not touch the TTL key

As a result: …

Contrast: the legacy empty case in the same file:

case len(st.members) == 0 && st.fromLegacy:
	return []*kv.Elem[kv.OP]{
		{Op: kv.Del, Key: redisZSetKey(keyBytes)},
		{Op: kv.Del, Key: redisTTLKey(keyBytes)}, // ← this branch deletes it correctly
	}, nil

Suggested fix: …

2. Key encoding consistency
3. Atomicity of the diff-based commit
4. Consistency between the meta count and the member count
5. Atomicity of the legacy migration

Summary
[Consistency] Review results

Target PR: #483 (adapter: decompose ZSet storage into per-member wide-column keys)

🔴 [Important]

| # | Category | Description |
|---|---|---|
| 1 | 🔴 Important | TTL-deletion asymmetry in the wide-column case: risk of orphaned TTL keys |
| 2 | | buildZSetDiffElems only deletes keys present in origMembers (consider using deleteZSetWideColumnElems to guard against orphaned keys) |
[Performance] Review results

[High]
[Performance] Review results

[High]
No description provided.