refactor: SVS shared thread pool with rental model (MOD-9881)#925
Merged
refactor: SVS shared thread pool with rental model (MOD-9881)#925
Conversation
…l terminology** Rename `setNumThreads`/`getNumThreads` → `setParallelism`/`getParallelism` and `getThreadPoolCapacity` → `getPoolSize` across VectorSimilarity. Public info API fields (`numThreads`, `lastReservedThreads`, `NUM_THREADS`, `LAST_RESERVED_NUM_THREADS`) remain unchanged. No behavioral changes.
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #925 +/- ##
==========================================
- Coverage 96.98% 96.91% -0.08%
==========================================
Files 129 129
Lines 7574 7651 +77
==========================================
+ Hits 7346 7415 +69
- Misses 228 236 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
thpool tests fix fp16 tests
align bm
dor-forer
reviewed
Apr 9, 2026
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit df969a1. Configure here.
dor-forer
reviewed
Apr 15, 2026
| @@ -436,24 +438,44 @@ class VecSimSVSThreadPoolImpl { | |||
| // on shrink. new_size is total parallelism including the calling thread (minimum 1). | |||
| // Occupied threads (held by renters) survive shrink via shared_ptr — their OS thread | |||
Collaborator
There was a problem hiding this comment.
Not so important but I think it's not a shared_ptr anymore
dor-forer
approved these changes
Apr 15, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Overview
Replaces per-index SVS thread pools with a single shared thread pool that can be physically resized at runtime, and introduces a rental-based execution model for
parallel_for. This eliminates over-provisioning (previously K indexes × N threads = K×N OS threads) and enables the SVS pool to actually grow/shrink when the external pool capacity changes — something the old design could not do.Also adds a new C API
VecSim_UpdateThreadPoolSize()that combines write-mode toggling with pool resizing, and deprecatesSVSParams::num_threads.Key Guarantees and Assumptions
The fundamental scheduling invariant
The shared SVS thread pool has the same size as the RediSearch worker thread pool (
search-workers). Every SVS multi-threaded operation (update, GC) is dispatched viaSVSMultiThreadJob, which submits N reserve jobs to the RediSearch worker pool — one per worker thread. Only the threads that actually check in (are not busy with other work) participate in the SVS operation. This means:rent()requests can never exceed the pool capacity, because each rented SVS thread corresponds to a RediSearch worker that checked in via a reserve job. If a worker is busy, it doesn't check in, so it's never counted inavailableThreadsand never requested from the SVS pool.setParallelism(n)is always called withn = min(availableThreads, problemSize), whereavailableThreads ≤ poolSize(). This is why the assertionn <= pool->size()is safe.ThreadSlots are held viashared_ptr; shrinking drops the pool's reference but the renter keeps the slot (and its OS thread) alive until~RentedThreadsreleases it. Resize never blocks on in-flight operations.Other invariants
VecSimSVSThreadPoolImpl::instance()). There is nonullptrstate; write-in-place mode uses a pool of size 1 (0 worker threads, calling thread only).parallelism_ >= 1— the calling thread always participates, so parallelism can never be 0.pending_jobs_) prevents the pool from shrinking while scheduled jobs are in flight. Since all multi-threaded SVS operations run within scheduled jobs, rentedThreadSlotraw pointers are guaranteed to remain valid for the lifetime ofRentedThreads.shared_ptr<atomic<size_t>> parallelism_;setParallelism()on one index never affects another.Relationship with RediSearch
VecSim_UpdateThreadPoolSize(N)when the worker count changes, so VecSim can create/destroy the matching number of OS threads.Changes from Old Implementation
VecSimSVSThreadPoolImplVecSimSVSThreadPoolImplshared across all indexes viashared_ptrresize()clamped a logical counterresize()spawns/destroys OS threadsparallel_forthreadingatomic<bool>setNumThreads(n)/resize(n)— mutated the poolsetParallelism(n)— sets a per-indexatomic<size_t>, pool unchangedcapacity()(max pre-allocated)poolSize()(current shared pool size)manage_exception_during_run()threw unconditionallymanage_workers_after_run()only throws if an error occurredAPI Changes
Renamed methods
setNumThreads(n)setParallelism(n)SVSIndex,SVSIndexBasevirtual interfacegetNumThreads()getParallelism()SVSIndex,SVSIndexBasevirtual interfacegetThreadPoolCapacity()getPoolSize()SVSIndex,SVSIndexBasevirtual interfaceRemoved methods
VecSimSVSThreadPool::capacity()poolSize()for shared pool sizeVecSimSVSThreadPool::resize()setParallelism()for per-index control,VecSimSVSThreadPool::resize(size_t)(static) for global poolNew C API
New internal APIs
VecSimSVSThreadPoolImpl::instance()shared_ptr<VecSimSVSThreadPoolImpl>by valueVecSimSVSThreadPool(void* log_ctx)VecSimSVSThreadPool::resize(size_t)VecSimSVSThreadPool::poolSize()VecSimSVSThreadPoolImpl::rent(count, log_ctx)RentedThreads)Deprecated
VecSim_SetWriteMode()VecSim_UpdateThreadPoolSize()VecSim_UpdateThreadPoolSize) and unit tests. Not called by RediSearch — RediSearch will use onlyVecSim_UpdateThreadPoolSizeSVSParams::num_threadsVecSim_UpdateThreadPoolSize()New types
ThreadSlot— wrapssvs::threads::Thread+atomic<bool> occupied; non-copyable, non-movableRentedThreads— RAII guard; move-only; holds rawThreadSlot*pointers (noshared_ptroverhead); releases slots via lock-free atomic stores in destructorbool isScheduledflag onSVSMultiThreadJob; its destructor callsendScheduledJob()on the shared pool (no separateScheduledJobTokenclass)Behavioral Changes
setParallelism(0)resize(0)silently clamped to 1setParallelism(n > poolSize)search-workersn <= pool->size()— scheduling bugn > parallelisminparallel_forThreadingExceptionifn > size_parallelism_initial valuelogCallback, asserts in debug, usesrented.count()for graceful degradation in releaseparallel_formanage_exception_during_run()always threwmanage_workers_after_run()only throws if an error actually occurredupdateSVSIndexWrapperlabels_to_moveis emptySVSParams::num_threadsset to non-zeroPublic info API fields — unchanged
The debug info field names
numThreadsandlastReservedThreads(and their string keysNUM_THREADS,LAST_RESERVED_NUM_THREADS) are unchanged. Their semantics shift slightly:numThreads→ now reportsgetPoolSize()(shared pool size, not per-index capacity)lastReservedThreads→ now reportsgetParallelism()(per-index parallelism for last operation)What Does NOT Change
deps/ScalableVectorSearch/) — no modificationsSVSMultiThreadJobpattern — still submits N reserve jobs to the external worker poolThreadstate machine — used as-is; lifecycle managed viashared_ptrFiles Changed
src/VecSim/algorithms/svs/svs_utils.hThreadSlot,RentedThreads, rental-basedVecSimSVSThreadPoolImpl, per-indexVecSimSVSThreadPoolwrappersrc/VecSim/algorithms/svs/svs.hnum_threadsdeprecation warningsrc/VecSim/algorithms/svs/svs_tiered.hupdateSVSIndexWrapper,SVSMultiThreadJobusesbool isScheduled+ destructor-basedendScheduledJob()instead ofScheduledJobToken,createJobsImplmerged intocreateJobssrc/VecSim/vec_sim.h/vec_sim.cppVecSim_UpdateThreadPoolSize()C APIsrc/VecSim/vec_sim_common.hSVSParams::num_threadsmarked as deprecated in commentstests/unit/test_svs_tiered.cppVecSim_UpdateThreadPoolSizein setup, removednum_threadsfrom params, updated assertionstests/unit/test_svs_fp16.cpptests/unit/test_svs.cppnumThreadsinfo field expects default; newNumThreadsParamIgnoredtesttests/benchmark/bm_utils.h/bm_vecsim_svs.hTesting
testThreadPoolrewritten to cover: default parallelism,setParallelismboundary assertions (ASSERT_DEATH),parallel_forwithn < parallelism, write-in-place mode, exception handlingTestDebugInfoThreadCount/TestDebugInfoThreadCountWriteInPlaceupdated:lastReservedThreadsnow starts at 1 (notnum_threads)ThreadsReservationtest usestieredIndexMock(num_threads)to properly size the shared poolNumThreadsParamIgnored— verifies that settingSVSParams::num_threadsemits a deprecation warning, does not affect the shared pool size, and that omitting it produces no warningnum_threads = 1inSVSParamsnow usetieredIndexMock(1)to resize the shared pool insteadMark if applicable
Note
High Risk
High risk due to substantial changes to SVS concurrency, thread scheduling, and resize semantics that affect async update/GC execution and can introduce deadlocks or race conditions if invariants are violated.
Overview
Refactors SVS threading from per-index pools to a shared singleton thread pool that can physically grow/shrink and rent worker threads per
parallel_forcall, including a deferred-shrink mechanism to avoid invalidating scheduled jobs.Renames the SVS per-operation knob from
get/setNumThreadstoget/setParallelism, updates tiered SVS update/GC scheduling to create scheduled jobs that snapshot pool size, and adjusts debug info to report shared pool size (numThreads) and last reserved parallelism.Adds
VecSim_UpdateThreadPoolSize()to atomically toggle write mode (async vs in-place) and resize the shared SVS pool, marksSVSParams::num_threadsas deprecated/ignored (with warning log), and updates benchmarks/unit tests including new coverage for resize/rental/deferred-shrink behavior.Reviewed by Cursor Bugbot for commit f5d65fd. Bugbot is set up for automated code reviews on this repo. Configure here.