Migrate to proper circuit breakers.#4432
Open
markrmiller wants to merge 1 commit into
Open
Conversation
215ec54 to
792b19b
Compare
Move circuit-breaker admission control out of SearchHandler and ContentStreamHandlerBase and into a new SolrQoSFilter that suspends requests asynchronously when a breaker is tripped, dispatching them once the breaker clears (modeled after Jetty's QoSFilter/QoSHandler). Async queueing is enabled by default. Suspended requests are sorted into three priority lanes (HIGH for admin/probe, MEDIUM for queries, LOW for updates) and each lane has its own admission cap, so a flood of LOW-priority work cannot reject HIGH-priority probes. Clients can opt into a specific lane with the Solr-Request-Priority header. The drain budget per lane is scaled by priority and auto-scales with maxSuspendedRequests so a saturated queue drains within the suspension timeout. Internal intra-cluster shard requests bypass suspension to avoid distributed deadlock. When QoS queueing is disabled, a tripped breaker still fails fast synchronously, preserving the prior handler-enforced behavior. Add GcOverheadCircuitBreaker. Rewrite MemoryCircuitBreaker to read post-GC live bytes from the old/tenured pool instead of a moving average of raw heap usage. TTL-cache CPU and load-average samples so high-QPS admission control does not repoll expensive OS/JVM signals per request.
792b19b to
b0928d7
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Experimental Branch Not for contribution review.
Move circuit breaker enforcement into an async SolrQoSFilter
Motivation
The previous design enforced circuit breakers synchronously inside
SearchHandler.checkCircuitBreakersandContentStreamHandlerBase.checkCircuitBreakers. When a breaker tripped, requests were immediately rejected with a 429. During transient stress events — a 1-2 second GC pause, a burst of expensive faceted queries, brief CPU saturation — every in-flight client got an error even though the underlying condition would clear in moments. Operators saw error spikes that bore no proportional relationship to actual cluster trouble.This PR replaces that with a Jetty-
QoSFilter-style admission control layer: when a breaker is tripped, requests are suspended viaAsyncContextand held in a small priority queue until the breaker clears, at which point they're dispatched as normal. Transient pressure is just some added latency instead of a wall of 429s.Architectural shift
SearchHandler/ContentStreamHandlerBaseafter dispatchSolrQoSFilterbefore dispatchWhat's in this PR
SolrQoSFilter(new)CircuitBreakerRegistry.checkTrippedAcrossCores(cc, type)on each request.req.startAsync()and parks theAsyncContextin a priority+type queue.drainAll(default every 100 ms) re-checks the breaker and dispatches queued requests when it clears, walking lanes in priority order (HIGH → MEDIUM → LOW) and types within each lane.AsyncListenerso timeouts and client disconnects are accounted correctly (counters decremented, response written, slot freed).Solr-Request-Context: SERVERheader or theisShard=true/distrib.from=query params — bypasses both the breaker check and the suspension queue to avoid the obvious distributed-deadlock failure mode (parent request waits for sub-shard, sub-shard is queued waiting for parent's breaker to clear).Priority lanes
Requests are sorted into three lanes:
/admin/ping,/admin/infoClients can override via the new
Solr-Request-Priorityrequest header (HIGH/MEDIUM/LOW, case-insensitive). Unknown values fall through to the path/type heuristic. SolrJ already populatesSolr-Request-Type; the priority hint is opt-in for code paths that want explicit control without parsing the request body.Per-priority admission caps
The big bug class the lanes are meant to insulate against — a flood of LOW-priority bulk updates filling the suspension queue and locking out HIGH-priority admin/health probes — would have remained open with a single global counter. Each lane therefore has its own admission cap, partitioned out of
maxSuspendedRequestsvia configurable shares:At the default
maxSuspendedRequests=1024: HIGH=102, MEDIUM=614, LOW=308. A saturated LOW lane refuses further LOW requests but leaves HIGH and MEDIUM headroom intact.Priority-scaled drain budget
drainBudget(the number of requests dispatched per lane per drain tick) is scaled by priority:So HIGH clears in essentially one tick, MEDIUM clears at the baseline rate, and LOW yields capacity back to MEDIUM after a breaker recovery.
Auto-scaled
drainBudgetThe base
drainBudgetdefaults tomax(200, 2 × maxSuspendedRequests × checkIntervalMs / suspendTimeoutMs). The formula ensures a fully-saturated queue can drain twice within the suspension timeout window, so requests don't expire before they're resumed. Floors at the static default so small/default deployments are unchanged; large deployments (maxSuspendedRequests=100000) get the budget they actually need without manual tuning. Explicitqos.drainBudgetoverrides win.Cached breaker scan
SolrQoSFiltercachesCircuitBreakerRegistry.checkTrippedAcrossCoresper request type forevaluationIntervalMs(default 200ms). Concurrent admission-control callers share one underlying breaker scan rather than each re-walking the registry and re-polling OS/JVM metrics.MemoryCircuitBreakerrewriteThe old signal was a 30-second moving average of
MemoryMXBean.getHeapMemoryUsage(). Under a generational collector, raw heap usage climbs steadily towardmaxbetween collections — that's the normal shape, not a problem. The moving average inherited that climb and tripped on healthy heaps.The new signal reads
MemoryPoolMXBean.getCollectionUsage()on the old/tenured pool, which reports the bytes resident immediately after the most recent collection that affected that pool. That's the only point at which "how full is the heap really?" has a defined answer. For non-generational collectors (non-generational ZGC, Shenandoah) the breaker sumsgetCollectionUsage()across every HEAP-typed pool. Threshold semantics are unchanged (percentage of max heap); the underlying signal is now meaningful.GcOverheadCircuitBreaker(new)Trips when the JVM is spending more than a configured percentage of wall-clock time in garbage collection over a sliding window. Complementary to
MemoryCircuitBreaker:MemoryCircuitBreakerfires when post-GC live data is exhausting the heap.GcOverheadCircuitBreakerfires when GC is keeping up (live data may be small) but consuming so much CPU that the application is starving.Both conditions usually precede an OOM, but each catches the other's blind spot. Configurable via
solr.circuitbreaker.{update,query}.gcoverhead=<percent>andsolr.circuitbreaker.gcoverhead.windowSeconds(default 30).TtlSampledMetric(new)Tiny utility wrapping
AtomicReference<Sample>for time-bounded caching of expensive metric reads. Used byCPUCircuitBreaker,LoadAverageCircuitBreaker,MemoryCircuitBreaker(post-GC live-bytes lookup), andGcOverheadCircuitBreaker(ratio computation). Configurable globally viasolr.circuitbreaker.sampleTtlMs(default 1000ms). Stops high-QPS admission control from hammeringOperatingSystemMXBean, Prometheus metric scans, andMemoryPoolMXBeanwalks on every request.CircuitBreakerRegistryadditionscheckTrippedGlobal(SolrRequestType)— static; consults only the process-wide global map. Used by filter-tier callers that have no per-core context. Warn-only breakers excluded.checkTrippedLocal(SolrRequestType)— per-instance; consults only this registry's per-core breakers. Warn-only excluded.checkTrippedAcrossCores(CoreContainer, SolrRequestType)— combines global + every per-core registry. Used bySolrQoSFilter. Iterates all cores; if any core's breaker for the type is tripped, the request is treated as tripped cluster-wide (intentionally conservative — the filter doesn't yet know which core a request will resolve to).gcoverheadbreaker type inparseCircuitBreakersFromProperties.Handler changes
SearchHandler.checkCircuitBreakersremoved.ContentStreamHandlerBase.checkCircuitBreakersremoved.SolrExceptionon tripped breakers now assert against the registry directly.web.xmlSolrQoSFiltermapping.<async-supported>true</async-supported>propagated to all filters in the chain ahead ofSolrServlet(RequiredSolrRequestFilter,TracingFilter,AuthenticationFilter,RateLimitFilter) — required by the Servlet spec forstartAsync()to work.Configuration reference
All new system properties:
New request header:
New metrics (counter unless noted):
Behavior changes
qos.enableddefaults totrue. New deployments will async-suspend tripped breakers instead of fail-fast. Operators who want the legacy synchronous 429 can setsolr.circuitbreaker.qos.enabled=false.MemoryCircuitBreakersemantics changed. Threshold is now compared against post-GC live data rather than the raw heap usage moving average. The breaker should fire less often on healthy generational-GC clusters; thresholds previously tuned around the noisy signal may now feel conservative and can be lowered. Tuning review recommended at upgrade.SolrServletare now declaredasync-supported. Downstream Solr filters/servlets are unchanged, but custom third-party filters injected into the chain must also declareasync-supportedorstartAsync()will throw.