Convert RocksDB kvstores to coroutines #13186

Open
tclinkenbeard-oai wants to merge 4 commits into apple:main from tclinkenbeard-oai:dev/tclinkenbeard/kvstore-rocksdb-coroutines
Conversation

@tclinkenbeard-oai
Collaborator

Summary

Convert the RocksDB and sharded RocksDB key-value store implementations from Flow actor syntax to standard coroutines.

As part of the migration, extend coroutine race() support to handle ThreadFutureStream directly and add focused coverage for that path.

Details

The bulk of this change is a mechanical migration of:

  • KeyValueStoreRocksDB.actor.cpp → KeyValueStoreRocksDB.cpp
  • KeyValueStoreShardedRocksDB.actor.cpp → KeyValueStoreShardedRocksDB.cpp

During validation, the initial coroutine rewrite exposed an important behavioral difference in refreshReadIteratorPool(). The actor version used choose { ... } over a timer and a ThreadFutureStream. The first coroutine version wrapped the stream in a helper Future so it could participate in race(). Whenever the timer won, however, the next loop iteration could register a second waiter on the same ThreadFutureStream before the losing waiter had been torn down. That violates the stream’s single-waiter invariant and reproduced as:

SingleCallback<T>::next == this

Rather than preserve the helper workaround, this change teaches the coroutine machinery to understand ThreadFutureStream as a first-class stream input to race(). refreshReadIteratorPool() can then race the stream directly, which keeps loser cleanup inside the shared race implementation and matches the actor-era ownership model more closely.
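To make the failure mode concrete, here is a minimal toy model of the single-waiter invariant. It does not use Flow at all: `SingleWaiterStream`, `Waiter`, `timerWinsIteration`, and `runTwoIterations` are hypothetical names invented for this sketch, and the `cleanupLoser` flag merely stands in for whether the race tears down the losing stream waiter before the loop comes around again.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a callback registered on a stream.
struct Waiter {
    bool registered = false;
};

// Toy stream that allows at most one registered waiter at a time.
// This is NOT Flow's ThreadFutureStream; it only models the invariant.
class SingleWaiterStream {
public:
    void addWaiter(Waiter& w) {
        if (waiter_ != nullptr) {
            throw std::logic_error("second waiter registered on single-waiter stream");
        }
        waiter_ = &w;
        w.registered = true;
    }
    void removeWaiter(Waiter& w) {
        assert(waiter_ == &w); // only the registered waiter may be removed
        waiter_ = nullptr;
        w.registered = false;
    }

private:
    Waiter* waiter_ = nullptr;
};

// One loop iteration in which the timer wins the race.
// cleanupLoser models whether the losing stream waiter is removed
// before the iteration returns.
void timerWinsIteration(SingleWaiterStream& s, Waiter& w, bool cleanupLoser) {
    s.addWaiter(w);
    // ... the timer fires first, so the stream branch loses ...
    if (cleanupLoser) {
        s.removeWaiter(w);
    }
}

// Returns true if two consecutive timer-wins iterations keep the invariant.
bool runTwoIterations(bool cleanupLoser) {
    SingleWaiterStream s;
    Waiter first, second;
    try {
        timerWinsIteration(s, first, cleanupLoser);
        timerWinsIteration(s, second, cleanupLoser);
    } catch (const std::logic_error&) {
        return false; // the second registration hit a stale waiter
    }
    return true;
}
```

In this model, `runTwoIterations(true)` succeeds, while `runTwoIterations(false)` trips the invariant on the second iteration. That is the shape of the problem described above, not a reproduction of the actual Flow code path.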

This also adds focused coroutine tests for ThreadFutureStream participation in race().

Validation

  • Built the correctness package successfully with package_tests_u
  • Ran focused unit coverage:
    • /flow/coro/raceThreadFutureStreamReady
    • /flow/coro/raceThreadFutureStreamSuccess
  • Replayed the exact prior failing simulation successfully:
    • tests/rare/FailoverWithSSLag.toml
    • seed 1160450224
    • Joshua seed 2634315109526344483
  • Submitted a broader Joshua validation ensemble:
    • 20260509-204651-joshua-proxy-35ee7283f8e63538
    • clean at 2289 passed / 0 failed when last checked

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 697f176
  • Duration 0:22:41
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 697f176
  • Duration 0:24:32
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 697f176
  • Duration 0:24:31
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 697f176
  • Duration 0:44:24
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 697f176
  • Duration 1:01:02
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 697f176
  • Duration 1:15:45
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 697f176
  • Duration 1:17:19
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 8b947a8
  • Duration 0:23:00
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 8b947a8
  • Duration 0:38:44
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 8b947a8
  • Duration 0:44:41
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 8b947a8
  • Duration 0:53:55
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 8b947a8
  • Duration 1:03:32
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 8b947a8
  • Duration 1:08:36
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 8b947a8
  • Duration 1:50:49
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@gxglass gxglass requested a review from neethuhaneesha May 11, 2026 04:11
@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 5281eef
  • Duration 0:24:54
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 5281eef
  • Duration 0:45:01
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 5281eef
  • Duration 0:48:23
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 5281eef
  • Duration 0:54:48
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 5281eef
  • Duration 0:58:45
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 5281eef
  • Duration 0:59:42
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 5281eef
  • Duration 1:09:23
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@tclinkenbeard-oai tclinkenbeard-oai marked this pull request as ready for review May 13, 2026 07:05
@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 9c0ce7f
  • Duration 0:25:10
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 9c0ce7f
  • Duration 0:35:48
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 9c0ce7f
  • Duration 0:44:52
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 9c0ce7f
  • Duration 0:47:05
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 9c0ce7f
  • Duration 1:03:01
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 9c0ce7f
  • Duration 1:03:20
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 9c0ce7f
  • Duration 1:06:57
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

Collaborator Author

@tclinkenbeard-oai tclinkenbeard-oai left a comment

Generated by Codex.

What is it trying to do?

This PR converts the RocksDB and sharded RocksDB key-value store implementations from Flow actor syntax to standard coroutines, then extends race() so ThreadFutureStream can participate directly. The latter is used to preserve the old choose { delay(...) / waitNext(...) } behavior in refreshReadIteratorPool() without reintroducing the single-waiter violation described in the PR. It also adds focused coroutine coverage for ready, successful, and loser-cleanup ThreadFutureStream race paths.

Is it correct?

I think so. I inspected the coroutine rewrites in both RocksDB kvstores, the ThreadFutureStream additions in the coroutine race plumbing, and the new tests. The race() change is consistent with the existing FutureStream handling: ThreadFutureStream is classified as a stream type, uses the same single-callback registration path, and therefore gets the same callback removal behavior on losing branches. That is exactly the property refreshReadIteratorPool() needs.
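As an illustration of what "classified as a stream type" can look like, the following is a rough sketch of trait-based dispatch, assuming C++17. The empty `Future`, `FutureStream`, and `ThreadFutureStream` templates are placeholders rather than the real Flow types, and `IsStreamInput` and `registrationPath` are hypothetical names; the actual trait and registration code in the coroutine race plumbing may be organized differently.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Placeholder types for illustration only; the real Flow types are far richer.
template <class T> struct Future {};
template <class T> struct FutureStream {};
template <class T> struct ThreadFutureStream {};

// Hypothetical trait marking which race() inputs are treated as streams.
template <class T> struct IsStreamInput : std::false_type {};
template <class T> struct IsStreamInput<FutureStream<T>> : std::true_type {};
// The kind of classification the review describes: ThreadFutureStream
// is recognized as a stream input alongside FutureStream.
template <class T> struct IsStreamInput<ThreadFutureStream<T>> : std::true_type {};

static_assert(IsStreamInput<ThreadFutureStream<int>>::value,
              "ThreadFutureStream should take the stream path");
static_assert(!IsStreamInput<Future<int>>::value,
              "Future should not take the stream path");

// Picks a registration path at compile time based on the trait.
// The returned strings are labels for this sketch, not real Flow concepts.
template <class Input>
std::string registrationPath(const Input&) {
    if constexpr (IsStreamInput<Input>::value) {
        return "single-callback stream path";
    } else {
        return "future path";
    }
}
```

Under these assumptions, any input routed through the stream path shares one registration and removal mechanism, which is why a losing ThreadFutureStream branch would be cleaned up the same way as a losing FutureStream branch.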

I also checked the more failure-prone migration areas: close/dispose flows remain intentionally fire-and-forget via Uncancellable, semaphore-backed read helpers still retain their releasers across awaits, commit paths preserve post/wait ordering, and the follow-up checkpoint/test changes eliminate the visible awaited brace-init temporary lifetime hazards introduced in the first version.

I did not run builds or tests myself. The latest visible PR status rollup is green across the FoundationDB PR builder, clang, clang-arm, clang-ide, macOS, macOS M1, and cluster-test checks.

Are there bugs?

I did not find any correctness bugs.

Are there omissions?

None that I think block this.

A direct race(ThreadFutureStream, ...) error-path test would tighten the new specialization a bit further, but the current coverage already exercises the important new behavior: ready consumption, successful async wakeup, and loser cleanup/reuse of the same stream after another race wins.

Are there better ways of doing things?

The direct race() support for ThreadFutureStream looks like the better design than keeping a local adapter future in refreshReadIteratorPool(). Centralizing the ownership and loser-cleanup behavior in the shared race implementation is simpler and more robust than preserving a one-off wrapper at the call site.

Should this CL be LGTMd?

Yes, LGTM.

I reviewed the coroutine race implementation, the ThreadFutureStream integration and tests, and the RocksDB/sharded RocksDB coroutine conversions around close, commit, read, iterator refresh, and checkpoint-related awaits. The main residual risk is the usual one for a broad actor-to-coroutine migration: a subtle lifetime or semantic mismatch hiding in a mechanically converted path. I did not find one here, and the targeted follow-up fixes/tests address the sharpest surfaced edge case.

@neethuhaneesha
Contributor

Could you please test performance with mako, as you did for other critical changes (like the storage server)?

@tclinkenbeard-oai
Collaborator Author

@neethuhaneesha I did detect some regression; I'm working on fixing that.

@tclinkenbeard-oai
Collaborator Author

It looks like the regression was actually an artifact of accidentally comparing against a different configuration running on main; further testing shows no regression.
mako-trace-comparison-918b68e-9c0ce7f


3 participants