fix(skipmap): lost writes when Store races with Delete #37

Open
jmasters-git wants to merge 1 commit into bytedance:main from jmasters-git:issue/skipmap-lost-write-fix

@jmasters-git

Resolves #36

Problem

Two linearizability bugs in skipmap's concurrent operations:

  1. Store writes to a dead node. Between checking !marked and calling storeVal, a concurrent Delete/LoadAndDelete can mark and unlink the node. The value is written to a node that is no longer reachable.

  2. LoadOrStore reads from a half-linked node. LoadOrStore and LoadOrStoreLazy only checked !marked before returning a loaded value, but Load also requires fullyLinked. A "loser" LoadOrStore can report a key as present while a concurrent Load returns nil for the same key.

Both were found using porcupine linearizability testing.
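
The first bug can be replayed deterministically on a toy structure. The sketch below is not the skipmap code: `node`, `toyMap`, and `replayLostWrite` are hypothetical names, and the interleaving is executed step by step on one goroutine rather than produced by a real race. It assumes the racing operation is LoadAndDelete, since its return value is what makes the lost write observable.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for a skip list node.
// The real implementation uses atomic flags and a per-node lock.
type node struct {
	val    string
	marked bool // set once a delete has claimed the node
}

// toyMap keeps at most one reachable node per key; "unlinking" a node
// simply removes it from the index.
type toyMap struct {
	index map[string]*node
}

func (m *toyMap) find(key string) *node { return m.index[key] }

func (m *toyMap) load(key string) (string, bool) {
	n := m.find(key)
	if n == nil || n.marked {
		return "", false
	}
	return n.val, true
}

// loadAndDelete marks and unlinks the node and returns the value it held.
func (m *toyMap) loadAndDelete(key string) (string, bool) {
	n := m.find(key)
	if n == nil || n.marked {
		return "", false
	}
	n.marked = true
	delete(m.index, key)
	return n.val, true
}

// replayLostWrite replays the problematic interleaving and returns what
// LoadAndDelete observed plus what a later Load sees.
func replayLostWrite() (deleted, value string, found bool) {
	m := &toyMap{index: map[string]*node{"k": {val: "old"}}}

	// Store, step 1: find the node and observe that it is not marked.
	n := m.find("k")
	okToWrite := !n.marked

	// A concurrent LoadAndDelete runs to completion in the gap.
	deleted, _ = m.loadAndDelete("k")

	// Store, step 2: write based on the now stale check.
	if okToWrite {
		n.val = "new"
	}

	value, found = m.load("k")
	return deleted, value, found
}

func main() {
	deleted, value, found := replayLostWrite()
	fmt.Printf("LoadAndDelete returned %q; Load afterwards: value=%q found=%v\n",
		deleted, value, found)
}
```

LoadAndDelete returning "old" means it must be ordered before the Store, so a later Load should see "new". In the replay the key is absent instead, which no sequential order of the two calls can explain.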

Fix

  • Store: require fullyLinked && !marked before touching the node, then lock and recheck marked before writing.
  • LoadOrStore / LoadOrStoreLazy: require fullyLinked && !marked (via MGet) before returning the loaded value, matching what Load already does.
  • Regenerated types.go.

LoadAndDelete, Delete, Load, and Range already had the correct checks.
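
A minimal sketch of the check-lock-recheck pattern described above. The `node` type and its methods are hypothetical stand-ins, not the library's API, and the sketch assumes the delete path sets the marked flag while holding the same per-node lock.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// node is a hypothetical stand-in for a skipmap node; field and method
// names are illustrative only.
type node struct {
	mu          sync.Mutex
	fullyLinked uint32 // 1 once the node is linked at every level
	marked      uint32 // 1 once a delete has claimed the node
	val         atomic.Value
}

func (n *node) isFullyLinked() bool { return atomic.LoadUint32(&n.fullyLinked) == 1 }
func (n *node) isMarked() bool      { return atomic.LoadUint32(&n.marked) == 1 }

// visible is the predicate every reader should agree on:
// linked at all levels and not deleted.
func (n *node) visible() bool { return n.isFullyLinked() && !n.isMarked() }

// tryStore writes v only if the node is still live while its lock is held.
// A false result means the caller should retry its search.
func (n *node) tryStore(v string) bool {
	if !n.visible() {
		return false
	}
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.isMarked() { // recheck: a delete may have won since the first check
		return false
	}
	n.val.Store(v)
	return true
}

// markDeleted is the delete side (assumed): it marks under the same lock,
// so marking and writing cannot interleave.
func (n *node) markDeleted() bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.isMarked() {
		return false
	}
	atomic.StoreUint32(&n.marked, 1)
	return true
}

func main() {
	live := &node{fullyLinked: 1}
	fmt.Println("store on live node:", live.tryStore("new"))

	halfLinked := &node{} // found by a search, but not yet fully linked
	fmt.Println("half-linked node visible:", halfLinked.visible())

	live.markDeleted()
	fmt.Println("store after delete:", live.tryStore("newer"))
}
```

Using the same `visible` predicate in the store and load-or-store paths is what keeps a "loser" from reporting a key that a concurrent Load cannot yet see.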

Tests

  • TestStoreLoadAndDeleteRace: races Store vs LoadAndDelete on a pre-populated key, asserts the post-condition matches a valid linearization.
  • TestLoadOrStoreLoadRace: races 8 LoadOrStore goroutines on the same absent key and asserts that a Load issued by each goroutine immediately afterward succeeds.

Both tests fail reliably on the old code and pass on the fix.

…rent delete

Store could write to a node being concurrently deleted, losing the value.
LoadOrStore and LoadOrStoreLazy could read from a half-linked node not yet
visible to Load. Require fullyLinked before operating on found nodes, and
hold the node lock across the marked check and value write in Store.
@CLAassistant

CLAassistant commented Mar 26, 2026

CLA assistant check
All committers have signed the CLA.

@XQ-Gang XQ-Gang requested a review from SilverRainZ March 26, 2026 06:06
@codecov

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.27%. Comparing base (964b2cc) to head (d364084).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| collection/skipmap/gen_func.go | 40.00% | 5 Missing and 4 partials ⚠️ |
| collection/skipmap/gen_ordereddesc.go | 53.33% | 4 Missing and 3 partials ⚠️ |
| collection/skipmap/gen_ordered.go | 86.66% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
- Coverage   96.42%   96.27%   -0.15%     
==========================================
  Files          34       34              
  Lines        3828     3843      +15     
==========================================
+ Hits         3691     3700       +9     
+ Misses         95       94       -1     
- Partials       42       49       +7     

| Flag | Coverage Δ |
|---|---|
| go-1.18.x | 95.06% <55.55%> (-0.38%) ⬇️ |
| go-1.19.x | 95.38% <55.55%> (-0.12%) ⬇️ |
| go-1.20.x | 95.88% <55.55%> (-0.12%) ⬇️ |
| go-1.21.x | 95.86% <55.55%> (-0.15%) ⬇️ |
| go-1.22.x | 95.78% <55.55%> (-0.20%) ⬇️ |
| go-1.23.x | 95.70% <60.00%> (-0.25%) ⬇️ |
| go-1.24.x | 95.78% <55.55%> (-0.28%) ⬇️ |
| unittests | 96.27% <60.00%> (-0.15%) ⬇️ |

Development

Successfully merging this pull request may close these issues.

[Bug]: skipmap: Store value silently lost under concurrent LoadAndDelete
