perf: cache WithTags handlers in tallyMetricsHandler to reduce allocations (replaced with https://github.com/temporalio/temporal/pull/9620 )#10049
Closed
mykaul wants to merge 1 commit into
Conversation
…tions Add sync.Map-based caching of child handlers in WithTags(). On cache hit, zero allocations — skips tagsToMap(), scope.Tagged(), and handler struct allocation entirely. Allocation reduction (pprof alloc_space, 5min ScyllaDB workload): WithTags cumulative: 1,930 MB -> 316 MB (-83.6%) Total server allocs: 18,030 MB -> 16,481 MB (-8.6%) Benchmark (omes throughput_stress, mc150, 5 min, host networking, i7-1270P 4 cores/component, inter-run data resets): Cassandra: 294 iterations (+5.0% vs 280 baseline, +13.5% vs prev) ScyllaDB: 296 iterations (+2.1% vs 290 baseline, -1.3% vs prev)
Contributor
Author
|
#9620 is a more complete PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Cache
WithTags()child handlers intallyMetricsHandlerviasync.Mapto eliminate repeatedtagsToMap(),scope.Tagged(), and handler struct allocations on the hot path. On cache hit: zero allocations.Design
childCache(sync.Map): maps normalized tag cache keys to*tallyMetricsHandlerchildren. UsesLoadOrStorefor safe concurrent access.normalizeTagsForCaching: normalizes excluded tags before cache key computation so high-cardinality excluded values (e.g.activityTypewith thousands of distinct activity names) collapse to a single cache entry, preventing unbounded cache growth. Zero-alloc fast path when no tags need substitution.tagsCacheKey: builds compact\x00-separated string keys from tag slices, with a single-tag fast path that avoids byte buffer allocation.WithTags()calls short-circuit and returnself.Allocation Reduction (pprof alloc_space, 5min ScyllaDB workload)
Benchmark (omes throughput_stress, mc150, 5 min)
Host networking, i7-1270P 4 cores/component, inter-run data resets:
Note: Throughput variance at mc150 is ~5-10%. The allocation reduction is confirmed by pprof but throughput gains are within noise at this concurrency level.
Testing
require.Same)-race)tagsCacheKeyboundary ambiguity preventionnormalizeTagsForCachingunit tests (zero-alloc fast path, excluded, allowed, mixed, empty)