agg: lock-free Begin()#20462
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors db/state snapshot visibility handling to reduce contention by publishing a single immutable “visible bundle” via atomic.Pointer, enabling Aggregator.BeginFilesRo() to become lock-free while keeping cross-entity reads consistent (Domain + History + InvertedIndex).
Changes:
- Introduces
aggregatorVisibleand publishes it viaAggregator.visible(atomic pointer), replacingvisibleFilesLock/visibleFilesMinimaxTxNum. - Splits “recalc visible” into pure
calcVisibleFileshelpers and keeps mutatingreCalcVisibleFilespaths for standalone tests/debug usage. - Updates tests/benchmarks and integration debug tooling to use new internal/test-only begin helpers.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| db/state/aggregator.go | Publishes visibility via atomic bundle and makes BeginFilesRo() lock-free; computes minimax from bundle |
| db/state/domain.go | Adds pure calcVisibleFiles, adjusts Begin to accept bundle snapshot; introduces test-only begin |
| db/state/history.go | Adds internal begin that accepts snapshot inputs; adds BeginFilesRoForDebug for integration tools |
| db/state/inverted_index.go | Adds pure calcVisibleFiles and internal begin that accepts a visible snapshot |
| db/state/dirty_files.go | Adds visibleFiles.bumpRefcount() helper to pin non-frozen files |
| db/state/cache.go | Changes newDomainVisible / newIIVisible signatures to take visibleFiles |
| db/state/domain_test.go | Switches tests to beginWithRecalcForTests() |
| db/state/history_test.go | Switches tests/benchmarks to beginWithRecalcForTests() |
| db/state/history_key_txnum_range_test.go | Switches tests to beginWithRecalcForTests() |
| db/state/inverted_index_test.go | Switches tests to beginWithRecalcForTests() |
| db/state/merge_test.go | Switches tests to beginWithRecalcForTests() |
| db/state/merge_bench_test.go | Switches benchmark to beginWithRecalcForTests() |
| db/state/gc_test.go | Switches GC tests to beginWithRecalcForTests() |
| cmd/integration/commands/state_history.go | Uses History.BeginFilesRoForDebug() instead of removed/privatized begin |
Comments suppressed due to low confidence (2)
db/state/cache.go:63
newDomainVisiblenow takesvisibleFiles, but several call sites still pass[]visibleFile(e.g.calcVisibleFiles(...)returns[]visibleFile). As-is this won’t compile. Either (a) keep the parameter as[]visibleFileand callbumpRefcountvia a helper, or (b) changecalcVisibleFilesto returnvisibleFilesand/or explicitly convert at call sites (including literals like[]visibleFile{}).
func newDomainVisible(name kv.Domain, files visibleFiles) *domainVisible {
d := &domainVisible{
name: name,
files: files,
}
db/state/cache.go:132
newIIVisiblenow expectsvisibleFiles, but callers still pass[]visibleFile(calcVisibleFiles(...)returns[]visibleFile, andNewInvertedIndexuses[]visibleFile{}). This is a compile-time type mismatch. Consider switchingcalcVisibleFilesto returnvisibleFiles(and updating related fields) or add explicit conversions at every call site.
func newIIVisible(name string, files visibleFiles) *iiVisible {
if iiGetFromFileCacheLimit == 0 {
iiGetFromFileCacheEnabled = false
}
ii := &iiVisible{
name: name,
files: files,
caches: &sync.Pool{New: func() any { return NewIISeekInFilesCache() }},
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
sudeepdino008
left a comment
There was a problem hiding this comment.
i was thinking about it yesterday - for recalcvisiblefiles, we take visible file lock, but we don't need to, we only need to lock when setting.
This is even better, LGTM.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
like #20462 - but for rosnapshots
Goal: reduce mutex-contention and make txs cheaper to create (increase throughput)
In PR:
agg.visibleatomic fieldvisibleFilesMinimaxTxNum) moved insideagg.visibleobject. So, they re-calcuclated together with other data (consistency).Aggregator.BeginFilesRo()is now lock-free.domain.BeginFilesRo()is now private method and acceptvisibleobject as a parameter (domain._visiblefield removed)