feat(storage): byte-level token-bucket write backpressure for MemTable #125

Merged
NeverENG merged 1 commit into main from feat/memtable-backpressure on Jun 6, 2026

Conversation

NeverENG (Owner) commented Jun 6, 2026

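Follow-up to the M1 finding: MemTable writes have no backpressure, so memory grows without bound under load.

Changes

  • New pkg/credit: a reusable byte-credit pool (token-bucket-style backpressure) with unit tests (block/unblock, admitting a single oversize entry, budget disabled, reconciliation back to zero).
  • MemTable integration: Put/Delete reserve byte credit before writing; if credit is short, they first trigger a flush and then block. Credit is returned after a successful Flush. SkipList maintains byteSize (overwrites are counted by delta, and reconciliation prevents credit leaks).
  • Flush robustness: on failure, dirty is kept and retried, fixing a latent bug in the original implementation where a failed flush was overwritten by the next flush and data was lost.
  • New config MemTableMaxInflightBytes (default 64MiB; <=0 disables backpressure).
  • bench: a -budget switch and inflight sampling.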
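
The pool described in the first bullet can be sketched roughly as below. This is a minimal sketch, not the PR's implementation: the `fits` rule and the `New` constructor are assumptions inferred from the test cases listed above (oversize entry admitted, budget disabled, clamping), and the field names are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool is a byte-credit pool; the zero value is not usable, call New.
type Pool struct {
	mu     sync.Mutex
	cond   *sync.Cond
	budget int64 // <= 0 disables backpressure
	used   int64
}

// New returns a Pool whose cond is bound to its own mutex.
func New(budget int64) *Pool {
	p := &Pool{budget: budget}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// fits is an ASSUMED admission rule: admit when the budget is disabled,
// when the pool is empty (so a single oversize entry can never block
// forever), or when the request stays within the budget.
func (p *Pool) fits(n int64) bool {
	return p.budget <= 0 || p.used == 0 || p.used+n <= p.budget
}

// TryAcquire takes n bytes without blocking and reports whether it succeeded.
func (p *Pool) TryAcquire(n int64) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.fits(n) {
		p.used += n
		return true
	}
	return false
}

// Acquire takes n bytes, blocking until a Release makes room.
func (p *Pool) Acquire(n int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for !p.fits(n) {
		p.cond.Wait()
	}
	p.used += n
}

// Release returns n bytes (clamped at zero) and wakes all waiters.
func (p *Pool) Release(n int64) {
	p.mu.Lock()
	p.used -= n
	if p.used < 0 {
		p.used = 0
	}
	p.mu.Unlock()
	p.cond.Broadcast()
}

// Used reports the bytes currently held.
func (p *Pool) Used() int64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.used
}

func main() {
	p := New(10)
	fmt.Println(p.TryAcquire(8)) // true: pool was empty
	fmt.Println(p.TryAcquire(8)) // false: 8+8 exceeds the budget of 10

	done := make(chan struct{})
	go func() { p.Acquire(8); close(done) }() // waits until room appears
	p.Release(8)
	<-done
	fmt.Println(p.Used()) // 8
}
```

Admitting any request when the pool is empty is one way to get the "oversize single entry" behavior; without it, an entry larger than the whole budget would wait forever.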

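Verification (with MemTable made the bottleneck)

| Configuration | Throughput | Inflight peak | Heap peak |
| --- | --- | --- | --- |
| Backpressure off | 1.77M w/s | 784MiB (unbounded) | 2014MiB |
| Backpressure 16MiB | 1.23M w/s | 16.0MiB (capped exactly at the budget) | 173MiB |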
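
Read as ratios, the two rows give the trade-off directly. This is plain arithmetic on the reported numbers, not an additional measurement:

```latex
\begin{align*}
\text{throughput cost} &= 1 - \frac{1.23}{1.77} \approx 0.305 \quad (\text{about } 30.5\%) \\
\text{heap peak reduction} &= \frac{2014\ \text{MiB}}{173\ \text{MiB}} \approx 11.6\times \\
\text{inflight peak reduction} &= \frac{784\ \text{MiB}}{16.0\ \text{MiB}} = 49\times
\end{align*}
```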

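Honest limitations

Under the default small-value configuration, the count-based flush triggers before the byte budget is reached, so backpressure is normally not the binding constraint. It is a hard memory cap and a safety net; it does not address the growth caused by SSTable metadata accumulation under M1's small-value, high-frequency workload (a separate track, candidate for M3). See docs-step/M2-backpressure-result.md for details.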

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added -budget flag to enable/disable memory backpressure for inflight data (default: 64MiB)
    • Implemented byte-level token-bucket backpressure to prevent unbounded in-memory growth
    • Extended benchmark results to track and report inflight peak memory usage
    • Updated memory reporting output format for improved clarity
  • Documentation

    • Added design documentation for byte-level backpressure mechanism and experimental validation results

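- Add pkg/credit: a reusable byte-credit pool (Acquire/Release/TryAcquire) with unit tests
- MemTable Put/Delete reserve byte credit before writing; if credit is short, trigger a flush and block; flush returns the credit
- SkipList maintains byteSize (overwrites counted by delta); credit reconciliation prevents leaks
- On Flush failure, keep dirty and retry, fixing the overwrite-and-lose-data hazard
- Config MemTableMaxInflightBytes (default 64MiB; <=0 disables)
- bench gains a -budget switch and inflight sampling: measured unflushed bytes are capped exactly at the budget (16MiB); heap drops from 2GiB to 173MiB

For the honest limitations, see docs-step/M2-backpressure-result.md: backpressure is a hard memory cap and a safety net;
it does not address the growth caused by SSTable metadata accumulation under M1's small-value, high-frequency workload (a separate track).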

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jun 6, 2026

📝 Walkthrough

This PR adds byte-level token-bucket backpressure to MemTable writes using a new reusable credit pool. The feature prevents unbounded in-memory growth by blocking writes when unflushed bytes exceed a configured budget (64MiB default), reconciles byte accounting on overwrites, and releases credits after successful flush. Benchmark observability now tracks inflight peak bytes.

Changes

Byte-Level Backpressure for MemTable

| Layer / File(s) | Summary |
| --- | --- |
| Credit Pool Foundation (pkg/credit/credit.go, pkg/credit/credit_test.go) | Introduces Pool struct implementing token-bucket backpressure with Acquire (blocking), TryAcquire (non-blocking), Release (credit return), and Used (current occupancy) methods using mutex and condition variable. Tests validate sequential accounting, boundary conditions, async blocking/unblocking, oversize allowance when empty, zero-budget bypass, and clamping. |
| Configuration and CLI Integration (config/global.go, benchmark/ingest/main.go) | Adds MemTableMaxInflightBytes config field (64MiB default) to GlobalConfig; CLI flag -budget sets the value in the benchmark; startup banner prints the configured inflight budget. |
| MemTable Write-Path Backpressure Integration (storage/zstorage/memtable.go) | Integrates credit pool: SkipList gains byteSize tracking cumulative key+value bytes; insert now returns byte delta for overwrite reconciliation; MemTable.Put pre-acquires credits before write, reconciles via delta after insert, and triggers flush on credit exhaustion; MemTable.Delete acquires credits for tombstone write; Flush retries on prior failure (keeping dirty non-nil), releases credits on success, and exposes InflightBytes() method. |
| Inflight Metrics and Benchmark Results (benchmark/ingest/main.go) | Result struct gains InflightPeak field; memSampler refactored to sample MemTable.InflightBytes() alongside heap metrics; both saturation and open-loop runs populate inflight peak; per-result and summary table output updated to show inflight_peak alongside heap_peak and sys_peak. |
| Design and Verification Documentation (docs-step/M2-backpressure-result.md) | Documents credit pool design, MemTable integration (byte tracking, acquire/release semantics, flush retry logic), configuration, and experimental results demonstrating backpressure caps unflushed bytes to budget while identifying remaining memory sources from SSTable metadata accumulation. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NeverENG/BanDB#24: Directly modifies MemTable write and flush mechanics; integration of credit pool backpressure into this PR's put/delete/flush logic depends on the active/dirty buffer design from that PR.

Poem

🐰 A credit pool hops through the write path
Blocking bytes that'd overflow the bath
SkipLists count their weight with care
Flush returns what's worn and fair
Backpressure keeps the memory fair

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: introducing byte-level token-bucket write backpressure for MemTable to solve unbounded memory growth. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-actions Bot commented Jun 6, 2026

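🐯 BanGD database kernel review

Overall risk: 🟡 Medium

Change summary

This PR introduces byte-level token-bucket backpressure on the MemTable write path (the storage layer's in-memory table), addressing the M1 finding that writes had no backpressure and memory grew without bound under load. The core changes fall into four areas:

  1. New pkg/credit: a general-purpose blocking byte-credit pool implemented with sync.Mutex + sync.Cond. Acquire blocks when credit is insufficient; Release wakes waiters. A budget ≤0 disables backpressure.
  2. Backpressure on the MemTable write path:
    • SkipList gains a byteSize field; on overwrite it is updated by the difference between the new and old value sizes, and it serves as the basis for credit reconciliation.
    • Put/Delete call acquireCredit(full) before writing, where full is the sum of key and value bytes; if TryAcquire fails, they trigger StartFlush and then block in Acquire.
    • After the write, the over-reserved credit (full - delta) is returned based on the actual delta reported by insert, preventing leaks.
    • After a successful Flush, Release(dirty.byteSize) returns the credit and wakes writers.
  3. Flush robustness: fixes the hazard in the original Flush where, after a failure, dirty was overwritten by the next flush. On failure, dirty is now kept and retried.
  4. Configuration and observability: config.G.MemTableMaxInflightBytes (default 64MiB); the bench tool integrates inflight sampling.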
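
The accounting in item 2 can be sketched as follows. This is a toy model, not the PR's code: a map stands in for the SkipList, the credit pool is reduced to a plain counter with no blocking, and the names `table`, `memTable`, `newMemTable`, `put`, and `flush` are hypothetical.

```go
package main

import "fmt"

// table is a toy stand-in for the SkipList: a map plus a running byteSize.
type table struct {
	data     map[string][]byte
	byteSize int64
}

// insert stores the pair and returns the byte delta it actually added:
// key+value for a new key, new value size minus old value size on overwrite.
func (t *table) insert(key string, val []byte) int64 {
	delta := int64(len(key) + len(val))
	if old, ok := t.data[key]; ok {
		delta = int64(len(val) - len(old))
	}
	t.data[key] = val
	t.byteSize += delta
	return delta
}

// memTable holds credit as a plain counter; blocking is left out on purpose.
type memTable struct {
	active *table
	used   int64
}

func newMemTable() *memTable {
	return &memTable{active: &table{data: map[string][]byte{}}}
}

// put reserves the worst case up front, then returns the over-reservation.
func (m *memTable) put(key string, val []byte) {
	full := int64(len(key) + len(val))
	m.used += full                     // stands in for acquireCredit(full)
	delta := m.active.insert(key, val) // actual growth, may be negative
	m.used -= full - delta             // stands in for Release(full - delta)
}

// flush models a successful flush: the table's bytes are returned as credit.
func (m *memTable) flush() {
	m.used -= m.active.byteSize // stands in for Release(dirty.byteSize)
	m.active = &table{data: map[string][]byte{}}
}

func main() {
	m := newMemTable()
	m.put("k", []byte("hello"))
	fmt.Println(m.used, m.active.byteSize) // 6 6
	m.put("k", []byte("hi"))               // overwrite with a shorter value
	fmt.Println(m.used, m.active.byteSize) // 3 3
	m.flush()
	fmt.Println(m.used) // 0
}
```

The invariant the sketch keeps is that held credit equals byteSize after every write, which is what lets a flush return exactly the right amount.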

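The PR ships with detailed verification data (with a 16MiB budget, inflight is capped exactly at 16.0MiB and heap drops from 2GiB to 173MiB), and it candidly states the limits of backpressure: it "does not solve SSTable metadata accumulation under small-value, high-frequency writes".

This review does not block merging; architecture-level suggestions are tracked as Issues, and ordinary issues are listed inline below.

Architectural issues (4 total)

Ordinary issues (1 total)

💡 [Suggestion · Compatibility] pkg/credit/credit.go: the zero-value Pool is unusable, but the documentation warning is not prominent enough

  • Pool's comment states "the zero value is unusable; construct with New", but it is New that allocates the sync.Cond (bound to mu). A zero-value Pool has a nil cond, so calling Acquire/Release directly causes a nil pointer panic.
  • Suggestion: add a nil-check guard at the top of Acquire/Release/TryAcquire/Used: if p.cond == nil { return }, or panic with a clear message; alternatively, make the zero value usable (for example by lazily initializing cond).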
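
One way to make the zero value usable, the last option in the suggestion, is to create the cond lazily. A minimal sketch under a simplified admission rule; the `lazyInit` helper and the `sync.Once` field are illustrative and not part of the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool is usable as a zero value: budget 0 means backpressure is disabled,
// and the cond is created on first use instead of in a constructor.
type Pool struct {
	mu     sync.Mutex
	once   sync.Once
	cond   *sync.Cond
	budget int64
	used   int64
}

// lazyInit binds the cond to the pool's own mutex exactly once.
func (p *Pool) lazyInit() {
	p.once.Do(func() { p.cond = sync.NewCond(&p.mu) })
}

// Acquire blocks while the budget is enabled and the request does not fit.
func (p *Pool) Acquire(n int64) {
	p.lazyInit()
	p.mu.Lock()
	defer p.mu.Unlock()
	for p.budget > 0 && p.used+n > p.budget {
		p.cond.Wait()
	}
	p.used += n
}

// Release returns n bytes (clamped at zero) and wakes all waiters.
func (p *Pool) Release(n int64) {
	p.lazyInit()
	p.mu.Lock()
	p.used -= n
	if p.used < 0 {
		p.used = 0
	}
	p.mu.Unlock()
	p.cond.Broadcast()
}

// Used reports the bytes currently held; it never touches the cond.
func (p *Pool) Used() int64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.used
}

func main() {
	var p Pool // zero value, no constructor call
	p.Acquire(5)
	fmt.Println(p.Used()) // 5
	p.Release(5)
	fmt.Println(p.Used()) // 0
}
```

Compared with a silent `return` guard, lazy initialization keeps the accounting correct for a zero-value Pool instead of dropping the call.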

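Token usage for this review: 287836 tokens in total (input 248289, output 11387, cache hits 28160, cache writes 0) | dimensions [concurrency, memory, lock, storage, performance] | additional surrounding files read [storage/engine.go, config/config.json] | adversarial re-check with 3 votes per finding, 1 suspected false positive filtered out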

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmark/ingest/main.go`:
- Around line 34-37: The code always overwrites
config.G.MemTableMaxInflightBytes with a hard-coded default (64<<20) when the
-budget flag is not provided; change the flag default to the current config
value so omission preserves configured settings. Specifically, when creating the
flag (flag.Int64/flag.Int64Var for variable budget) use
config.G.MemTableMaxInflightBytes as the default instead of 64<<20, then after
flag.Parse() assign config.G.MemTableMaxInflightBytes = *budget only if the flag
was actually parsed (or simply rely on the flag default matching the config so
the assignment is safe). Reference symbols: budget, flag.Int64/flag.Int64Var,
flag.Parse, config.G.MemTableMaxInflightBytes.

In `@pkg/credit/credit.go`:
- Around line 29-58: Ensure the exported API rejects negative amounts: in
TryAcquire(n int64) return false immediately for n <= 0 (treat 0 as no-op), and
in Acquire(n int64) and Release(n int64) validate n >= 0 at the top and panic on
n < 0 (treat n == 0 as a no-op) so callers cannot corrupt Pool.used; add these
guards to the start of Pool.TryAcquire, Pool.Acquire and Pool.Release (keep
existing locking/cond logic unchanged and use Pool.fits as before).

In `@storage/zstorage/memtable.go`:
- Around line 189-196: The blocking path in MemTable.acquireCredit can deadlock
writers during shutdown; change acquireCredit so the blocking wait is
shutdown-aware: instead of calling m.credits.Acquire(n) unconditionally, perform
a cancellable wait that selects on credit availability and m.stopCh (or use a
context-aware Acquire if available), and return an error when stopCh is closed;
update callers (Put/Delete call sites referenced around the blocks at the
Put/Delete flows) to handle the new error return (propagate or abort the
operation) and ensure StartFlush() is still invoked before entering the
cancellable wait.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6ff30fa5-c309-4882-b3aa-969642207fb9

📥 Commits

Reviewing files that changed from the base of the PR and between 2a042c6 and b726ff6.

📒 Files selected for processing (6)
  • benchmark/ingest/main.go
  • config/global.go
  • docs-step/M2-backpressure-result.md
  • pkg/credit/credit.go
  • pkg/credit/credit_test.go
  • storage/zstorage/memtable.go

Comment thread benchmark/ingest/main.go
Comment on lines +34 to +37
	budget := flag.Int64("budget", 64<<20, "MemTableMaxInflightBytes byte budget (token-bucket backpressure); 0 disables backpressure")
	flag.Parse()

	config.G.MemTableMaxInflightBytes = *budget // overridden from the command line to ease before/after comparison

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don't clobber the configured inflight budget when -budget is omitted.

This flag defaults to a hard-coded 64<<20, and Line 37 writes it back unconditionally, so any MemTableMaxInflightBytes value loaded from config is lost unless the caller passes -budget explicitly.

Suggested fix
-	budget := flag.Int64("budget", 64<<20, "MemTableMaxInflightBytes byte budget (token-bucket backpressure); 0 disables backpressure")
+	budget := flag.Int64("budget", config.G.MemTableMaxInflightBytes, "MemTableMaxInflightBytes byte budget (token-bucket backpressure); 0 disables backpressure")
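
If the flag should override the configuration only when it is actually passed, the standard library's `flag.Visit`, which walks only the flags that were set, can detect that. A minimal sketch; `resolveBudget` and the FlagSet name are hypothetical, and `configured` stands in for config.G.MemTableMaxInflightBytes.

```go
package main

import (
	"flag"
	"fmt"
)

// resolveBudget returns the -budget value if the flag was passed explicitly,
// otherwise the value that was already loaded from configuration.
func resolveBudget(args []string, configured int64) (int64, error) {
	fs := flag.NewFlagSet("ingest", flag.ContinueOnError)
	budget := fs.Int64("budget", configured, "inflight byte budget; 0 disables backpressure")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}

	explicit := false
	fs.Visit(func(f *flag.Flag) { // Visit skips flags left at their default
		if f.Name == "budget" {
			explicit = true
		}
	})
	if !explicit {
		return configured, nil
	}
	return *budget, nil
}

func main() {
	v, _ := resolveBudget(nil, 32<<20)
	fmt.Println(v) // 33554432: flag omitted, configured value kept

	v, _ = resolveBudget([]string{"-budget=0"}, 32<<20)
	fmt.Println(v) // 0: explicit override, backpressure disabled
}
```

Using the configured value as the flag default, as in the suggested fix, already gives the same result; the explicit check only matters if the default and the configuration can diverge.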

Comment thread pkg/credit/credit.go
Comment on lines +29 to +58
// TryAcquire attempts to take n bytes of credit without blocking; it returns true on success.
func (p *Pool) TryAcquire(n int64) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.fits(n) {
		p.used += n
		return true
	}
	return false
}

// Acquire takes n bytes of credit; when credit is insufficient it blocks until another caller's Release frees space.
func (p *Pool) Acquire(n int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for !p.fits(n) {
		p.cond.Wait()
	}
	p.used += n
}

// Release returns n bytes of credit and wakes all waiters.
func (p *Pool) Release(n int64) {
	p.mu.Lock()
	p.used -= n
	if p.used < 0 {
		p.used = 0
	}
	p.mu.Unlock()
	p.cond.Broadcast()
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject negative credit amounts at the API boundary.

TryAcquire(-n) / Acquire(-n) currently reduce used, and Release(-n) increases it, so any caller can corrupt the pool’s accounting through the exported API.

Suggested fix
+func requireNonNegative(n int64) {
+	if n < 0 {
+		panic("credit: negative amount")
+	}
+}
+
 // TryAcquire attempts to take n bytes of credit without blocking; it returns true on success.
 func (p *Pool) TryAcquire(n int64) bool {
+	requireNonNegative(n)
 	p.mu.Lock()
 	defer p.mu.Unlock()
 	if p.fits(n) {
@@
 // Acquire takes n bytes of credit; when credit is insufficient it blocks until another caller's Release frees space.
 func (p *Pool) Acquire(n int64) {
+	requireNonNegative(n)
 	p.mu.Lock()
 	defer p.mu.Unlock()
 	for !p.fits(n) {
@@
 // Release returns n bytes of credit and wakes all waiters.
 func (p *Pool) Release(n int64) {
+	requireNonNegative(n)
 	p.mu.Lock()
 	p.used -= n
📝 Committable suggestion

Suggested change
func requireNonNegative(n int64) {
	if n < 0 {
		panic("credit: negative amount")
	}
}

// TryAcquire attempts to take n bytes of credit without blocking; it returns true on success.
func (p *Pool) TryAcquire(n int64) bool {
	requireNonNegative(n)
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.fits(n) {
		p.used += n
		return true
	}
	return false
}

// Acquire takes n bytes of credit; when credit is insufficient it blocks until another caller's Release frees space.
func (p *Pool) Acquire(n int64) {
	requireNonNegative(n)
	p.mu.Lock()
	defer p.mu.Unlock()
	for !p.fits(n) {
		p.cond.Wait()
	}
	p.used += n
}

// Release returns n bytes of credit and wakes all waiters.
func (p *Pool) Release(n int64) {
	requireNonNegative(n)
	p.mu.Lock()
	p.used -= n
	if p.used < 0 {
		p.used = 0
	}
	p.mu.Unlock()
	p.cond.Broadcast()
}

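Comment thread storage/zstorage/memtable.go
Comment on lines +189 to +196
// acquireCredit reserves n bytes of credit for this write; if credit is short, it first triggers a flush so credit gets returned, then blocks.
func (m *MemTable) acquireCredit(n int64) {
	if m.credits.TryAcquire(n) {
		return
	}
	m.StartFlush() // make sure a flush is on its way to return credit, so the wait cannot last forever
	m.credits.Acquire(n)
}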

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make blocked credit waits shutdown-aware.

With the new blocking Acquire() path, Close() can strand writers forever: once stopCh is closed, FlushWorker may exit before servicing a pending flush, and any Put/Delete already waiting in acquireCredit() has no way to unblock.

Also applies to: 309-311, 361-368
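
A sync.Cond cannot be used in a select, so one way to get a shutdown-aware wait is a closed flag guarded by the same mutex plus a Broadcast on close. A minimal sketch under that assumption; `ErrClosed`, `Close`, and the error-returning `Acquire` are hypothetical, and the admission rule is simplified.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrClosed is returned to writers that were waiting when the pool shut down.
var ErrClosed = errors.New("credit: pool closed")

type Pool struct {
	mu     sync.Mutex
	cond   *sync.Cond
	budget int64
	used   int64
	closed bool
}

func New(budget int64) *Pool {
	p := &Pool{budget: budget}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// Acquire blocks until n bytes fit or the pool is closed, whichever is first.
func (p *Pool) Acquire(n int64) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	for !p.closed && p.budget > 0 && p.used+n > p.budget {
		p.cond.Wait()
	}
	if p.closed {
		return ErrClosed
	}
	p.used += n
	return nil
}

// Release returns n bytes (clamped at zero) and wakes all waiters.
func (p *Pool) Release(n int64) {
	p.mu.Lock()
	p.used -= n
	if p.used < 0 {
		p.used = 0
	}
	p.mu.Unlock()
	p.cond.Broadcast()
}

// Close marks the pool closed and wakes every waiter so none is stranded.
func (p *Pool) Close() {
	p.mu.Lock()
	p.closed = true
	p.mu.Unlock()
	p.cond.Broadcast()
}

func main() {
	p := New(10)
	_ = p.Acquire(10) // pool is now full

	errc := make(chan error, 1)
	go func() { errc <- p.Acquire(1) }() // cannot fit until Release or Close
	p.Close()
	fmt.Println(<-errc) // credit: pool closed
}
```

With this shape, Put/Delete would have to propagate the error from the credit wait, and Close would need to call the pool's Close before or alongside closing stopCh.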


NeverENG merged commit f71e8a0 into main on Jun 6, 2026
4 checks passed