(improvement): LWT routing plan cache#769

Closed
mykaul wants to merge 4 commits into scylladb:master from mykaul:lwt_routing_plan_cache

Conversation

@mykaul mykaul commented Mar 12, 2026

Summary

  • Add a zero-allocation LWT pick path (pickLWTReplicas) that avoids heap allocations for the used-host map and replica clone, with deterministic local-before-remote replica ordering
  • Cache immutable routingPlan structs per prepared statement in a bounded LRU (routingPlanLRU), enabling lock-free reads for isLWT(), getPartitioner(), Keyspace(), and Table() via atomic.Pointer
  • Unify the duplicate createRoutingKey / createRoutingKeyFromPlan into a single function, eliminating ~30 lines of duplicated marshaling logic
  • Guard against nil host from GetHostForToken on empty token ring (pre-existing bug, independent fix)

Commits

| # | SHA | Description | Cherry-pickable |
|---|-----|-------------|-----------------|
| 1 | 91f2b1b | Zero-alloc LWT pick path with deterministic replica ordering | Yes (standalone) |
| 2 | 1227ce9 | routingPlan cache with bounded LRU to eliminate per-query mutex overhead | Yes (standalone) |
| 3 | 3ccddad | Unify createRoutingKey and createRoutingKeyFromPlan | Requires commit 2 |
| 4 | 23b6806 | Guard against nil host from GetHostForToken on empty token ring | Yes (standalone, pre-existing bug fix) |

Design

Commit 1: pickLWTReplicas

  • 3-phase state machine: phase 0 (local replicas), phase 1 (remote replicas by ascending tier), phase 2 (fallback with deduplication)
  • Uses a fixed [9]*HostInfo array for dedup tracking; lazily allocates a map[*HostInfo]bool only if replicas exceed 9
  • isLWT check hoisted before replica lookup
  • policyHostTier helper deduplicates tier computation between LWT and non-LWT paths
  • Documents the immutability contract on hostTokens.hosts (LWT path relies on this to skip cloning)
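The ordering and dedup tracking described above can be sketched roughly as follows. This is a simplified illustration, not the PR's pickLWTReplicas: the HostInfo struct, hostSet, lwtOrder, and drain are hypothetical names, the tier is stored on the host instead of being computed by policyHostTier, and the closure form does not reproduce the allocation profile of the real code.

```go
package main

import "fmt"

// HostInfo is a simplified stand-in for the driver's host type (hypothetical).
type HostInfo struct {
	ID   string
	Tier int // 0 = local; larger values are progressively more remote
}

const inlineHosts = 9

// hostSet remembers hosts that were already returned. The first inlineHosts
// entries live in a fixed array; a map is allocated only on overflow.
type hostSet struct {
	inline   [inlineHosts]*HostInfo
	n        int
	overflow map[*HostInfo]bool
}

func (s *hostSet) add(h *HostInfo) {
	if s.n < inlineHosts {
		s.inline[s.n] = h
		s.n++
		return
	}
	if s.overflow == nil {
		s.overflow = make(map[*HostInfo]bool)
	}
	s.overflow[h] = true
}

func (s *hostSet) has(h *HostInfo) bool {
	for _, x := range s.inline[:s.n] {
		if x == h {
			return true
		}
	}
	return s.overflow[h] // reading a nil map is safe and yields false
}

// lwtOrder returns an iterator that yields local replicas first, then remote
// replicas by ascending tier, then fallback hosts that were not yet returned.
func lwtOrder(replicas, fallback []*HostInfo, maxTier int) func() *HostInfo {
	var used hostSet
	tier, i := 0, 0 // tier 0 is phase 0 (local); tiers 1..maxTier are phase 1
	return func() *HostInfo {
		for ; tier <= maxTier; tier, i = tier+1, 0 {
			for i < len(replicas) {
				h := replicas[i]
				i++
				if h.Tier == tier {
					used.add(h)
					return h
				}
			}
		}
		// Phase 2: fallback hosts, skipping anything already returned.
		for i < len(fallback) {
			h := fallback[i]
			i++
			if !used.has(h) {
				used.add(h)
				return h
			}
		}
		return nil
	}
}

// drain collects host IDs until the iterator is exhausted.
func drain(next func() *HostInfo) []string {
	var ids []string
	for h := next(); h != nil; h = next() {
		ids = append(ids, h.ID)
	}
	return ids
}

func main() {
	l1, l2 := &HostInfo{"l1", 0}, &HostInfo{"l2", 0}
	r1, r2 := &HostInfo{"r1", 1}, &HostInfo{"r2", 2}
	x := &HostInfo{"x", 0}
	replicas := []*HostInfo{r2, l1, r1, l2}
	fallback := []*HostInfo{l1, x, r1}
	fmt.Println(drain(lwtOrder(replicas, fallback, 2))) // [l1 l2 r1 r2 x]
}
```

Because the order depends only on the replica slice and the tiers, repeated calls for the same token yield the same sequence, which is the deterministic property the LWT path is after.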

Commit 2: routingPlan cache

  • routingPlan is an immutable struct holding keyspace, table, partitioner, lwt, indexes, types
  • Shared across all queries using the same prepared statement via Session.routingPlanCache (routingPlanLRU)
  • routingPlanLRU wraps internal/lru.Cache with Get/Put/Remove methods; Put implements get-or-store semantics under a single mutex lock to handle concurrent races
  • Bounded by MaxRoutingKeyInfo (default 1000) to prevent unbounded memory growth
  • queryRoutingInfo.plan is an atomic.Pointer[routingPlan] — when set, hot-path readers bypass the RWMutex entirely
  • conn.go *RequestErrUnprepared handlers invalidate both routingPlanCache and routingKeyInfoCache entries, plus the per-query/batch plan pointer, so recursive re-execution does not see stale plan data after schema changes
  • Mutex-protected fields remain as fallback for the conn.go write path
  • Behavioral change: Batch.Keyspace() now consults the cached routingPlan before falling back to session default, enabling tablet routing for batches (previously a no-op since Batch.Table() returned "" without a plan)
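The get-or-store behavior of Put can be illustrated with a minimal bounded LRU. This sketch assumes a container/list-based cache instead of the driver's internal/lru.Cache; planLRU, newPlanLRU, the string key, and the reduced routingPlan fields are hypothetical simplifications.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// routingPlan is reduced to a few fields for illustration (hypothetical).
type routingPlan struct {
	keyspace, table string
	lwt             bool
}

type entry struct {
	key  string
	plan *routingPlan
}

// planLRU is a bounded cache; the front of order is the most recently used.
type planLRU struct {
	mu    sync.Mutex
	max   int // assumed to be >= 1
	order *list.List
	items map[string]*list.Element
}

func newPlanLRU(max int) *planLRU {
	return &planLRU{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

func (c *planLRU) Get(key string) (*routingPlan, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).plan, true
}

// Put stores plan only if key is absent and returns whichever plan is cached
// afterwards, so two goroutines racing on the same key end up sharing one plan.
func (c *planLRU) Put(key string, plan *routingPlan) *routingPlan {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).plan
	}
	c.items[key] = c.order.PushFront(&entry{key: key, plan: plan})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	return plan
}

// Remove drops a key, for example after an UNPREPARED response.
func (c *planLRU) Remove(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.Remove(el)
		delete(c.items, key)
	}
}

func main() {
	cache := newPlanLRU(2)
	first := &routingPlan{keyspace: "ks", table: "t", lwt: true}
	second := &routingPlan{keyspace: "ks", table: "t", lwt: true}
	fmt.Println(cache.Put("stmt-1", first) == first)  // true: stored
	fmt.Println(cache.Put("stmt-1", second) == first) // true: existing plan wins
}
```

Returning the winning plan from Put lets every caller adopt the same immutable pointer, which is what makes the later lock-free reads safe to share.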

Commit 3: createRoutingKey unification

  • Single createRoutingKey(indexes []int, types []TypeInfo, values []interface{}) replaces both old functions
  • Query.GetRoutingKey() and Batch.GetRoutingKey() extract indexes/types from cached plan
  • Bounds validation: returns descriptive errors instead of panicking on mismatched inputs
  • New unit tests covering bounds validation error paths (fewer types than indexes, out-of-range index)
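A rough sketch of the bounds validation follows. It is not the PR's createRoutingKey: buildRoutingKey is a hypothetical name, types are plain strings, values are assumed to be already marshaled to bytes, and returning nil for empty indexes is an assumption. The composite layout (2-byte big-endian length, value bytes, then a zero byte per component) is the usual CQL composite partition key encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// buildRoutingKey validates its inputs and returns an error instead of
// panicking when indexes, types, and values do not line up.
func buildRoutingKey(indexes []int, types []string, values [][]byte) ([]byte, error) {
	if len(types) < len(indexes) {
		return nil, fmt.Errorf("routing key: %d types for %d indexes", len(types), len(indexes))
	}
	for _, idx := range indexes {
		if idx < 0 || idx >= len(values) {
			return nil, fmt.Errorf("routing key: index %d out of range for %d values", idx, len(values))
		}
	}
	switch len(indexes) {
	case 0:
		return nil, nil // assumption: no partition key columns means no routing key
	case 1:
		return values[indexes[0]], nil // single key: the raw marshaled value
	}
	var buf []byte
	for _, idx := range indexes {
		v := values[idx]
		buf = binary.BigEndian.AppendUint16(buf, uint16(len(v)))
		buf = append(buf, v...)
		buf = append(buf, 0x00)
	}
	return buf, nil
}

func main() {
	values := [][]byte{{0x01}, {0x02, 0x03}}
	key, err := buildRoutingKey([]int{1, 0}, []string{"text", "int"}, values)
	fmt.Printf("%x %v\n", key, err)

	_, err = buildRoutingKey([]int{5}, []string{"int"}, values)
	fmt.Println(err)
}
```

Validating every index before writing any bytes keeps the function from returning a partially built key alongside an error.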

Commit 4: nil host guard (pre-existing bug fix)

  • When tokenRing.tokens is empty, GetHostForToken returns nil. The old code unconditionally wrapped this in a []*HostInfo{nil} slice, which propagated to policyHostTier(nil, ...) → fallback.IsLocal(nil) → nil.DataCenter() → panic
  • Fix: only add host to replicas if non-nil
  • Affects both LWT and non-LWT paths — this is a pre-existing bug, not a regression from this PR
  • Independent commit for cherry-picking to other branches
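The guard can be illustrated with a small stand-in ring. The tokenRing layout, the lookup logic, and replicasFor below are hypothetical simplifications; only the idea of appending the host when it is non-nil comes from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// HostInfo is a simplified stand-in for the driver's host type (hypothetical).
type HostInfo struct{ dc string }

// DataCenter dereferences the receiver, so calling it on a nil host panics.
func (h *HostInfo) DataCenter() string { return h.dc }

// tokenRing maps sorted tokens to hosts; hosts[i] owns tokens[i].
type tokenRing struct {
	tokens []int64
	hosts  []*HostInfo
}

// GetHostForToken returns nil when the ring has no tokens.
func (r *tokenRing) GetHostForToken(tok int64) *HostInfo {
	if len(r.tokens) == 0 {
		return nil
	}
	i := sort.Search(len(r.tokens), func(i int) bool { return r.tokens[i] >= tok })
	if i == len(r.tokens) {
		i = 0 // wrap around to the first token
	}
	return r.hosts[i]
}

// replicasFor only adds the host when it is non-nil, so an empty ring yields
// an empty slice rather than []*HostInfo{nil}.
func replicasFor(r *tokenRing, tok int64) []*HostInfo {
	var replicas []*HostInfo
	if h := r.GetHostForToken(tok); h != nil {
		replicas = append(replicas, h)
	}
	return replicas
}

func main() {
	empty := &tokenRing{}
	fmt.Println(len(replicasFor(empty, 42))) // 0

	ring := &tokenRing{tokens: []int64{10, 20}, hosts: []*HostInfo{{dc: "dc1"}, {dc: "dc2"}}}
	for _, h := range replicasFor(ring, 15) {
		fmt.Println(h.DataCenter()) // dc2
	}
}
```

With an empty replica slice, the caller can move straight to the fallback policy without ever touching a nil host.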

Tests

34 new unit tests covering:

  • LWT pick (commit 1): deterministic ordering, local-before-remote, all-local-down fallback, non-LWT shuffle preservation, rack-aware 3-tier, high-RF map fallback
  • routingPlan cache (commit 2): isLWT/getPartitioner with/without plan, Keyspace/Table with plan and fallback, query reset, pool reset, LRU cache behavior (eviction, race semantics), concurrent access, batch Keyspace/Table with plan, batch plan-empty fallback, batch tablet routing enabled/disabled, batch plan invalidation
  • createRoutingKey (commit 3): single and composite keys, empty indexes, bounds validation errors (fewer types than indexes, out-of-range positive/negative index, composite second index out of range)
  • nil host guard (commit 4): empty token ring with LWT=true and LWT=false, verifies no panic and fallback yields hosts

All tests pass with -race.

Benchmark Results

Benchmarks run on 12th Gen Intel Core i7-1270P, go test -tags unit -bench=... -benchtime=1s -count=5. All values below are medians of 5 runs.

Commit 1: LWT Pick Path — First Host Selection

The baseline for LWT queries is NonLWT (the standard pick path), since before this PR all queries — including LWT — used the same code path.

| Metric | Baseline (NonLWT) | This PR (LWT) | Speedup |
|--------|-------------------|---------------|---------|
| ns/op | 611 | 375 | 1.6x faster (−39%) |
| B/op | 570 | 458 | −20% |
| allocs/op | 13 | 11 | −15% |

Commit 1: LWT Pick Path — Full Drain (all 12 hosts)

| Metric | Baseline (NonLWT) | This PR (LWT) | Speedup |
|--------|-------------------|---------------|---------|
| ns/op | 1816 | 1458 | 1.2x faster (−20%) |
| B/op | 1472 | 1658 | +13% (fixed array vs map trade-off) |
| allocs/op | 38 | 35 | −8% |

Note: B/op is slightly higher in the LWT full-drain path because the fixed [9]*HostInfo array occupies more stack space than the NonLWT map-based dedup for small replica counts. This is a deliberate trade-off: the array avoids heap allocations entirely in the common first-pick case, which is the hot path for LWT queries.

Commit 2: routingPlan Cache — isLWT() Hot Path

The baseline is WithoutPlan (the existing RWMutex.RLock path). WithPlan uses the new atomic.Pointer lock-free read.

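| Metric | Baseline (WithoutPlan) | This PR (WithPlan) | Speedup |
|--------|------------------------|--------------------|---------|
| ns/op | 11.3 | 1.0 | 11x faster (−91%) |
| B/op | 0 | 0 | |
| allocs/op | 0 | 0 | |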
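The two read paths being compared can be sketched as below. This is a reduced model, assuming a queryRoutingInfo with only the lwt field; setLWT is a hypothetical helper standing in for the conn.go write path.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// routingPlan is immutable once published (reduced to one field here).
type routingPlan struct{ lwt bool }

// queryRoutingInfo keeps the mutex-protected field as a fallback and adds an
// atomically published plan pointer for the fast path.
type queryRoutingInfo struct {
	plan atomic.Pointer[routingPlan]

	mu  sync.RWMutex
	lwt bool
}

// isLWT reads the plan with a single atomic load when one is set; otherwise
// it falls back to the RWMutex-protected field.
func (r *queryRoutingInfo) isLWT() bool {
	if p := r.plan.Load(); p != nil {
		return p.lwt
	}
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.lwt
}

// setLWT is a hypothetical writer for the mutex-protected fallback field.
func (r *queryRoutingInfo) setLWT(v bool) {
	r.mu.Lock()
	r.lwt = v
	r.mu.Unlock()
}

func main() {
	info := &queryRoutingInfo{}
	info.setLWT(false)
	fmt.Println(info.isLWT()) // false: mutex path

	info.plan.Store(&routingPlan{lwt: true})
	fmt.Println(info.isLWT()) // true: lock-free path

	info.plan.Store(nil) // invalidation, e.g. after UNPREPARED
	fmt.Println(info.isLWT()) // false: back to the mutex path
}
```

The fast path is safe only because a published routingPlan is never mutated; changing routing data means storing a new pointer or clearing it.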

Commit 2: routingPlan Cache — getPartitioner() Hot Path

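| Metric | Baseline (WithoutPlan) | This PR (WithPlan) | Speedup |
|--------|------------------------|--------------------|---------|
| ns/op | 12.8 | 1.5 | 8.3x faster (−88%) |
| B/op | 0 | 0 | |
| allocs/op | 0 | 0 | |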

Commit 2: routingPlan Cache — LRU Lookup

| Benchmark | ns/op | B/op | allocs/op |
|-----------|-------|------|-----------|
| CacheHit | 20.7 | 0 | 0 |
| CacheMiss_Store | 179.3 | 80 | 2 |

Commit 3: createRoutingKey (Unified)

No regression from unification — identical performance to the previous two-function implementation:

| Benchmark | ns/op | B/op | allocs/op |
|-----------|-------|------|-----------|
| Single | 20.4 | 4 | 1 |
| Composite | 156.6 | 264 | 3 |

Pre-existing Issues (not addressed)

  • TestInitialRetryPolicy fails on master due to peers_v2 error message mismatch — unrelated
  • TestShardAwarePortMockedNoReconnections flaky — unrelated
  • Keyspace() and Table() reading mutex-protected fields without lock is pre-existing, not a regression

@mykaul mykaul marked this pull request as draft March 12, 2026 21:29
@mykaul mykaul requested a review from Copilot March 13, 2026 08:05

Copilot AI left a comment

Pull request overview

This PR improves the hot-path performance of token-aware routing for LWT queries by introducing a zero-allocation LWT pick path and caching immutable per-statement routing metadata (routingPlan) for lock-free reads across queries. It also removes duplicated routing-key marshaling logic by unifying routing key creation into a single helper.

Changes:

  • Add pickLWTReplicas for deterministic, allocation-minimized LWT host ordering (local-first, then remote tiers, then fallback).
  • Cache per-statement routingPlan in Session.routingPlans (sync.Map) and expose it via queryRoutingInfo.plan (atomic.Pointer) for lock-free reads.
  • Unify routing key construction into createRoutingKey(indexes, types, values) and update Query/Batch to use it.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
|------|-------------|
| policies.go | Adds LWT-specific pick iterator and avoids replica cloning for LWT queries. |
| session.go | Introduces routingPlan + session-level cache, atomically stored plan pointer, and unified routing key creation. |
| conn.go | Skips redundant routingInfo writes when plan exists; invalidates plan cache on UNPREPARED; uses qry.Keyspace()/Table() for tablet hint decoding. |
| policies_test.go | Adds unit tests + benchmarks covering LWT ordering, fallback behavior, rack-aware tiers, and high-RF behavior. |
| session_unit_test.go | Adds unit tests + benchmarks for routingPlan fast-path reads and unified createRoutingKey. |

Comment thread session.go
Comment thread session.go
Comment thread session_unit_test.go Outdated
Comment thread policies_test.go
Comment thread policies_test.go Outdated
Comment thread policies_test.go Outdated

Copilot AI left a comment

Pull request overview

This PR optimizes query routing for lightweight transactions (LWT) in the GoCQL driver by introducing a dedicated low-allocation LWT replica pick path and caching immutable per-statement routing metadata to reduce hot-path locking and repeated metadata work.

Changes:

  • Add a dedicated LWT replica selection path in tokenAwareHostPolicy.Pick() with deterministic local-before-remote ordering and reduced allocations.
  • Introduce a per-Session sync.Map routing plan cache and a per-query atomic.Pointer to enable lock-free reads for routing metadata (isLWT, partitioner, keyspace/table).
  • Unify routing key construction into a single createRoutingKey(indexes, types, values) implementation.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
|------|-------------|
| policies.go | Adds pickLWTReplicas and routes LWT queries through it to reduce allocations and enforce deterministic ordering. |
| session.go | Adds routingPlans cache + atomic plan pointer on queries/batches and unifies routing-key creation. |
| conn.go | Avoids redundant routingInfo writes when a plan exists; invalidates cached plan(s) on RequestErrUnprepared. |
| policies_test.go | Adds unit tests + benchmarks covering LWT pick ordering/dedup and plan-based LWT flag behavior. |
| session_unit_test.go | Adds unit tests + benchmarks for routing plan reads and unified createRoutingKey. |
Comments suppressed due to low confidence (1)

session.go:1464

  • When a routing plan is present but its keyspace is the empty string, this code skips the existing routingInfo.keyspace fallback and goes straight to session.cfg.Keyspace. That can change behavior vs the pre-plan path (and differs from Batch.Keyspace(), which still checks routingInfo.keyspace when plan.keyspace is empty). Consider falling back to routingInfo.keyspace when plan.keyspace is empty to preserve prior semantics and keep Query/Batch consistent.
func (q *Query) Keyspace() string {
	if q.getKeyspace != nil {
		return q.getKeyspace()
	}
	if p := q.routingInfo.plan.Load(); p != nil {
		if p.keyspace != "" {
			return p.keyspace
		}
	} else if q.routingInfo.keyspace != "" {
		return q.routingInfo.keyspace
	}

	if q.session == nil {
		return ""
	}
	// TODO(chbannis): this should be parsed from the query or we should let
	// this be set by users.
	return q.session.cfg.Keyspace
}
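The fallback order suggested in this comment can be sketched as a standalone function. The names resolveKeyspace, routingInfoKeyspace, and sessionDefault are hypothetical; the sketch only shows the proposed lookup order, not the actual Query.Keyspace() method.

```go
package main

import "fmt"

// routingPlan is reduced to the keyspace field for illustration.
type routingPlan struct{ keyspace string }

// resolveKeyspace prefers the plan's keyspace, then the per-query
// routingInfo keyspace, and only then the session default. The middle step
// is the one the review comment says is skipped when a plan is present.
func resolveKeyspace(plan *routingPlan, routingInfoKeyspace, sessionDefault string) string {
	if plan != nil && plan.keyspace != "" {
		return plan.keyspace
	}
	if routingInfoKeyspace != "" {
		return routingInfoKeyspace
	}
	return sessionDefault
}

func main() {
	emptyPlan := &routingPlan{}
	fmt.Println(resolveKeyspace(emptyPlan, "from_routing_info", "session_ks"))
	fmt.Println(resolveKeyspace(&routingPlan{keyspace: "plan_ks"}, "from_routing_info", "session_ks"))
	fmt.Println(resolveKeyspace(nil, "", "session_ks"))
}
```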

Comment thread session.go Outdated
Comment thread session.go Outdated
Comment thread conn.go
Comment thread conn.go
@mykaul mykaul force-pushed the lwt_routing_plan_cache branch 2 times, most recently from 25bf253 to 7c4f3a1 Compare March 16, 2026 23:34
@mykaul mykaul requested a review from Copilot March 17, 2026 17:32

Copilot AI left a comment

Pull request overview

This PR optimizes token-aware host selection and routing-key computation, focusing on LWT (lightweight transaction) queries and prepared-statement routing metadata reuse across queries in the GoCQL driver.

Changes:

  • Add a dedicated LWT replica-pick path (pickLWTReplicas) to reduce allocations and enforce deterministic local-before-remote ordering.
  • Introduce a bounded per-session routingPlan cache (routingPlanLRU) and per-query atomic.Pointer to enable lock-free reads of LWT/partitioner/keyspace/table on hot paths.
  • Unify routing-key construction logic into a single createRoutingKey(indexes, types, values) implementation with bounds validation, and invalidate cached routing metadata on UNPREPARED.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| topology.go | Documents immutability requirement for replica slices used by the LWT pick fast-path. |
| session.go | Adds routingPlan + LRU cache + atomic plan pointer usage; unifies routing-key creation and adds validation. |
| conn.go | Avoids redundant routingInfo writes when plan is present; invalidates routingPlan/routingKeyInfo caches on UNPREPARED; uses qry.Keyspace()/Table() for tablet hints. |
| policies.go | Adds LWT-specific pick path, avoids replica slice clone for LWT, and guards against nil host from empty token ring. |
| session_unit_test.go | Adds unit tests and benchmarks for routingPlan cache behavior, plan-based accessors, and createRoutingKey validation. |
| policies_test.go | Adds extensive tests/benchmarks for LWT pick ordering, fallback behavior, and the nil-host guard. |

Comment thread policies.go
mykaul added a commit to mykaul/gocql that referenced this pull request Mar 19, 2026
Replace the per-query heap-allocated map[*HostInfo]bool (used for
deduplication in the Pick() closure) with an inline hostSet backed by a
fixed [9]*HostInfo array — sized for NTS with 3 DCs at RF=3 each
(9 replicas). Overflow is handled gracefully (silently skipping
tracking) with no correctness impact.

Replace shuffleHosts() (which copies into a new slice and uses the
global math/rand mutex) with shuffleHostsInPlace() using math/rand/v2
for lock-free operation.

Replace the healthyReplicas/unhealthyReplicas make+append pattern with
partitionHealthy(), which performs an in-place stable partition using a
small stack buffer.

Add conditional cloning so replicas are only copied when mutation
(shuffle or slow-replica avoidance) is actually needed, otherwise the
ring's slice is referenced directly.

Combined savings: ~250-500 bytes per token-aware query.

Note: This change targets the non-LWT path. PR scylladb#769 adds a dedicated
pickLWTReplicas() function for LWT queries. Both changes are
compatible and complementary — this PR handles the non-LWT hot path.
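The in-place shuffle and stable partition described in this commit message could look roughly like the following. It is a sketch under assumptions: HostInfo with a Healthy flag is a hypothetical stand-in, the 8-entry buffer size is arbitrary, and whether the buffer actually stays on the stack depends on the compiler's escape analysis.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// HostInfo is a simplified stand-in for the driver's host type (hypothetical).
type HostInfo struct {
	ID      string
	Healthy bool
}

// shuffleHostsInPlace permutes hosts without copying them into a new slice.
func shuffleHostsInPlace(hosts []*HostInfo) {
	rand.Shuffle(len(hosts), func(i, j int) {
		hosts[i], hosts[j] = hosts[j], hosts[i]
	})
}

// partitionHealthy moves healthy hosts to the front and unhealthy hosts to
// the back, preserving relative order in both groups. It returns the number
// of healthy hosts.
func partitionHealthy(hosts []*HostInfo) int {
	var buf [8]*HostInfo // small scratch buffer; append spills to the heap if it overflows
	unhealthy := buf[:0]
	n := 0
	for _, h := range hosts {
		if h.Healthy {
			hosts[n] = h // n never exceeds the current index, so nothing unread is overwritten
			n++
		} else {
			unhealthy = append(unhealthy, h)
		}
	}
	copy(hosts[n:], unhealthy)
	return n
}

func main() {
	hosts := []*HostInfo{
		{"a", false}, {"b", true}, {"c", false}, {"d", true},
	}
	n := partitionHealthy(hosts)
	for _, h := range hosts {
		fmt.Print(h.ID, " ")
	}
	fmt.Println(n) // b d a c 2

	shuffleHostsInPlace(hosts[:n]) // shuffle only the healthy prefix
}
```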
mykaul added a commit to mykaul/gocql that referenced this pull request Mar 20, 2026
Replace the per-query heap-allocated map[*HostInfo]bool (used for
deduplication in the Pick() closure) with an inline hostSet backed by a
fixed [4]*HostInfo array — sized for RF=3-4, covering most production
deployments. Overflow is handled gracefully (silently skipping tracking)
with no correctness impact.

Replace shuffleHosts() (which copies into a new slice and uses the
global math/rand mutex) with shuffleHostsInPlace() using math/rand/v2
for lock-free operation.

Replace the healthyReplicas/unhealthyReplicas make+append pattern with
partitionHealthy(), which performs an in-place stable partition using a
small stack buffer.

Add conditional cloning so replicas are only copied when mutation
(shuffle or slow-replica avoidance) is actually needed, otherwise the
ring's slice is referenced directly.

Combined savings: ~250-500 bytes per token-aware query.

Note: This change targets the non-LWT path. PR scylladb#769 adds a dedicated
pickLWTReplicas() function for LWT queries. Both changes are
compatible and complementary — this PR handles the non-LWT hot path.
@mykaul mykaul requested a review from Copilot March 20, 2026 21:32

Copilot AI left a comment

Pull request overview

This PR optimizes token-aware host selection for LWT queries and reduces routing-metadata overhead by introducing a cached, immutable per-statement routing plan and a zero-allocation LWT replica selection path, plus a small safety fix for empty token rings.

Changes:

  • Add a dedicated pickLWTReplicas path for deterministic, allocation-reduced LWT host iteration (and guard nil host on empty token ring).
  • Cache immutable routingPlan objects per statement in a bounded LRU and expose lock-free reads via atomic.Pointer in queryRoutingInfo.
  • Unify routing-key construction into a single createRoutingKey(indexes, types, values) helper and update Query/Batch routing-key lookup to use the cached plan.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
|------|-------------|
| topology.go | Documents the immutability contract for hostTokens.hosts relied on by the new LWT pick path. |
| policies.go | Introduces pickLWTReplicas, avoids cloning replicas for LWT, and guards against nil host from an empty token ring. |
| conn.go | Avoids redundant routingInfo writes when a plan is present; invalidates routing caches on RequestErrUnprepared; uses qry.Keyspace()/Table() for tablet hint parsing. |
| session.go | Adds routingPlanLRU + Session.getRoutingPlan, atomic plan pointer in queryRoutingInfo, and unifies routing key creation with added bounds checks. |
| session_unit_test.go | Adds unit tests and benchmarks for routing-plan reads, cache behavior, and routing-key creation validation. |
| policies_test.go | Adds extensive unit tests and benchmarks for the LWT pick behavior, determinism, tier ordering, and nil-host guard. |

Comment thread policies.go
Comment thread session_unit_test.go Outdated
mykaul added a commit to mykaul/gocql that referenced this pull request Mar 23, 2026
@mykaul mykaul force-pushed the lwt_routing_plan_cache branch from 7c4f3a1 to 9c9482b Compare March 31, 2026 19:15
@mykaul mykaul requested a review from Copilot March 31, 2026 19:15

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment thread session.go Outdated
Comment thread session.go
Comment thread session.go Outdated
mykaul added a commit to mykaul/gocql that referenced this pull request Apr 3, 2026
mykaul added a commit to mykaul/gocql that referenced this pull request Apr 5, 2026
mykaul added a commit to mykaul/gocql that referenced this pull request Apr 5, 2026
mykaul added a commit to mykaul/gocql that referenced this pull request Apr 5, 2026
mykaul added a commit to mykaul/gocql that referenced this pull request Apr 5, 2026
@mykaul mykaul force-pushed the lwt_routing_plan_cache branch 2 times, most recently from a8ca47b to f1f4746 Compare April 5, 2026 20:34
mykaul added a commit that referenced this pull request Apr 6, 2026
mykaul added a commit to mykaul/gocql that referenced this pull request Apr 6, 2026
dkropachev pushed a commit that referenced this pull request Apr 6, 2026
@mykaul mykaul force-pushed the lwt_routing_plan_cache branch from f1f4746 to 8088d6a Compare April 7, 2026 18:55
@dkropachev
Collaborator

It is a bad idea to persist the plan; it may change on node events or when the driver gets new tablets.

@mykaul
Author

mykaul commented Apr 10, 2026

It is a bad idea to persist the plan; it may change on node events or when the driver gets new tablets.

It may be sub-optimal. Node changes - such as failures? or additions? The latter is immaterial, the former I can look into. New tablets - already handled by the server.
But anyway, I can look into it more.

Note that this PR has multiple independent items. Should I extract them separately anyway? (need to resolve multiple conflicts anyway here!)

@mykaul
Author

mykaul commented Apr 13, 2026

Closing - will return to this some day.

@mykaul mykaul closed this Apr 13, 2026

3 participants