Improve Slice pool hit rate #1998

shanson7 · 2021-08-27T17:40:42Z

These changes aim to improve the SlicePool hit rate. Our query nodes currently see a disappointing hit rate of ~80%.

I believe the reason is that most queries handle raw series shorter than 2000 datapoints, yet those slices are still added to the pool. Most functions then ask for 2000 points (the default, because they don't specify what they need), so the short slices don't fit. Because unfit slices are put back in the pool, we could (unverified) also be grabbing the same low-capacity slice over and over again.

So these changes:

Call GetMin everywhere we have a good idea of how many points we need
Make the min/default slice size configurable
Unbias PointSlicePool metrics: the existing metrics lumped large slices together with default-sized ones, making it hard to tell how often the default size was too small.
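
To make the diagnosis and the first change concrete, here is a minimal sketch in Go. It is not metrictank's PointSlicePool: `naivePool`, `Point`, and the `hits`/`allocs` fields are hypothetical, and a plain LIFO free list replaces the real pool so the behaviour is deterministic. It only illustrates why a default-sized request cannot use a short pooled slice, while a `GetMin` call with a known size can.

```go
package main

import "fmt"

// Point is a hypothetical stand-in for a datapoint type.
type Point struct {
	Val float64
	Ts  uint32
}

// naivePool is an illustrative pool, not the real PointSlicePool.
type naivePool struct {
	defaultSize  int
	free         [][]Point // LIFO free list (a simplification)
	hits, allocs int
}

// Get requests the default capacity, as callers do when they
// don't say how many points they need.
func (p *naivePool) Get() []Point { return p.GetMin(p.defaultSize) }

// GetMin inspects one candidate. A candidate that is too small is
// left in the pool ("unfit") and a fresh slice is allocated instead.
func (p *naivePool) GetMin(minCap int) []Point {
	if n := len(p.free); n > 0 && cap(p.free[n-1]) >= minCap {
		s := p.free[n-1]
		p.free = p.free[:n-1]
		p.hits++
		return s[:0]
	}
	p.allocs++
	return make([]Point, 0, minCap)
}

// Put returns a slice to the pool regardless of its capacity.
func (p *naivePool) Put(s []Point) { p.free = append(p.free, s[:0]) }

func main() {
	p := &naivePool{defaultSize: 2000}
	p.Put(make([]Point, 0, 500)) // a short raw series goes back to the pool

	_ = p.Get()       // wants 2000: the 500-cap slice is unfit, so allocate
	_ = p.GetMin(300) // caller knows it needs 300: the pooled slice fits

	fmt.Println(p.hits, p.allocs) // prints: 1 1
}
```

In this toy model, every default-sized `Get` while the 500-capacity slice sits on top of the list would allocate again, which is the repeated-unfit effect the description speculates about.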

Dieterbe

LGTM. pinging @robert-milan / @colega / @jesusvazquez as they have an interest in the expr library now too.

Dieterbe · 2021-08-30T10:33:44Z

api/config.go

@@ -57,6 +58,7 @@ func ConfigSetup() {
 	apiCfg.BoolVar(&optimizations.PreNormalization, "pre-normalization", true, "enable pre-normalization optimization")
 	apiCfg.BoolVar(&optimizations.MDP, "mdp-optimization", false, "enable MaxDataPoints optimization (experimental)")
 	apiCfg.BoolVar(&middleware.LogHeaders, "log-headers", false, "output query headers in logs")
+	apiCfg.UintVar(&minSliceSize, "min-slice-pool-size", 2000, "Minimum (and default) length of slice to allocate from pool")


please add this to metrictank-sample.ini and then run scripts/dev/sync-configs.sh

Dieterbe · 2021-08-30T10:34:50Z

api/init.go

@@ -9,7 +9,7 @@ import (
 var pointSlicePool *pointslicepool.PointSlicePool

 func init() {
-	pointSlicePool = pointslicepool.New(pointslicepool.DefaultPointSliceSize)
+	pointSlicePool = pointslicepool.New(int(minSliceSize))


this means we can now delete DefaultPointSliceSize from the code
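
One thing worth checking with this change is ordering: Go runs package `init()` functions before `main`, so a flag-backed variable read there holds its zero value unless the flag has already been registered. The later commit titled "Fix default size on init" may relate to this, but that is a guess. The sketch below uses only the standard `flag` package. The flag set name `"http"` and the `observe` helper are assumptions for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// minSliceSize mirrors the variable in the diff above.
var minSliceSize uint

// observe reports the variable's value at three moments: before the flag
// is registered, after registration, and after parsing the arguments.
func observe(args []string) (before, registered, parsed uint, err error) {
	minSliceSize = 0
	before = minSliceSize // what code running before registration would see

	// "http" is an assumed flag set name.
	fs := flag.NewFlagSet("http", flag.ContinueOnError)
	fs.UintVar(&minSliceSize, "min-slice-pool-size", 2000,
		"Minimum (and default) length of slice to allocate from pool")
	registered = minSliceSize // UintVar stores the default immediately

	err = fs.Parse(args)
	parsed = minSliceSize // the user-supplied value, if any
	return before, registered, parsed, err
}

func main() {
	before, registered, parsed, err := observe([]string{"-min-slice-pool-size=500"})
	fmt.Println(before, registered, parsed, err) // prints: 0 2000 500 <nil>
}
```

A pool sized from the `before` value would get a default size of 0, so constructing it after configuration has been parsed avoids that.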

Dieterbe · 2021-08-30T10:47:14Z

pointslicepool/pointslicepool.go

 		p.putLarge.Inc()
-	} else {
+	} else if cap(s) < p.defaultSize {


So we no longer track puts and get-makes when they equal defaultSize.
I can see how that helps with troubleshooting the kind of efficiency issues you're looking into. OTOH, only having metrics for a subset of the puts and get-makes is a bit weird, especially because the sum no longer relates to the get-candidate metrics. Should we introduce get-make-default and put-default (or get-make-min and put-min) to track these? Or are all these metrics getting out of hand? (It seems we rarely use them, but then again, that goes for most of the metrics, I guess.)

shanson7 · 2021-08-31

Yeah, I debated this back and forth with myself. I don't want to over-instrument the code, and these metrics aren't generally interesting except when trying to determine the efficacy of the slice pool in reducing allocations (which is exactly what I was just doing).

Ultimately, it would be nice to know:

Allocations/bytes saved using the pool
Allocations/bytes required through the pool (and maybe additional bytes due to default size)
Reason the allocation is required (miss or unfit)

I'm not sure I find the "put" metrics that interesting, except maybe in determining that we are "losing" slices that should be returned.

Dieterbe · 2021-08-31

I lean towards adding get-make-min and put-min, to make the set of metrics coherent.
If we decide the metrics are not (or no longer) useful, we can take them out all at once.
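
As a rough illustration of a coherent set of put metrics, the sketch below splits puts three ways so that the counters sum to the total number of puts. Only `putLarge` appears in the diff above. `putDefault`, `putSmall`, `putStats`, and `recordPut` are hypothetical names, and the merged code may differ.

```go
package main

import "fmt"

// counter is a stand-in for a real stats counter; only Inc matters here.
type counter struct{ n int }

func (c *counter) Inc() { c.n++ }

// putStats holds one counter per size class (names partly hypothetical).
type putStats struct {
	defaultSize                    int
	putLarge, putDefault, putSmall counter
}

// recordPut classifies a returned slice by its capacity relative to
// defaultSize, so every put lands in exactly one counter.
func (s *putStats) recordPut(capacity int) {
	switch {
	case capacity > s.defaultSize:
		s.putLarge.Inc()
	case capacity < s.defaultSize:
		s.putSmall.Inc()
	default:
		s.putDefault.Inc()
	}
}

func main() {
	s := &putStats{defaultSize: 2000}
	for _, c := range []int{500, 2000, 2000, 4096} {
		s.recordPut(c)
	}
	fmt.Println(s.putSmall.n, s.putDefault.n, s.putLarge.n) // prints: 1 2 1
}
```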

Dieterbe

LGTM.
just want to keep this open for a while to give @robert-milan / @colega / @jesusvazquez a chance to check this out

colega · 2021-09-01T09:34:44Z

Thanks for the heads up @Dieterbe, reviewed and makes sense to me.

Offtopic: we're importing this in a codebase that is instrumented with Prometheus so I'd like to make the instrumentation configurable in a separate PR (should be straightforward since both Prometheus and Graphite counters are interface { Inc() })
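
A minimal sketch of what that configurable instrumentation could look like, assuming the pool only ever calls `Inc()` on its counters. The `Counter` interface comes from the comment above. `intCounter`, `floatCounter`, and `pool` are hypothetical stand-ins: no Prometheus or Graphite library is imported here.

```go
package main

import "fmt"

// Counter is the one-method interface mentioned in the comment.
type Counter interface{ Inc() }

// intCounter and floatCounter are toy stand-ins for two unrelated
// counter implementations that both happen to satisfy Counter.
type intCounter struct{ n uint32 }

func (c *intCounter) Inc() { c.n++ }

type floatCounter struct{ v float64 }

func (c *floatCounter) Inc() { c.v++ }

// pool receives its counters from the caller instead of creating them.
type pool struct {
	hit, miss Counter
	free      [][]float64
}

func newPool(hit, miss Counter) *pool { return &pool{hit: hit, miss: miss} }

// get reuses the most recently returned slice if it is large enough.
func (p *pool) get(minCap int) []float64 {
	if n := len(p.free); n > 0 && cap(p.free[n-1]) >= minCap {
		s := p.free[n-1]
		p.free = p.free[:n-1]
		p.hit.Inc()
		return s[:0]
	}
	p.miss.Inc()
	return make([]float64, 0, minCap)
}

func (p *pool) put(s []float64) { p.free = append(p.free, s) }

func main() {
	hit, miss := &intCounter{}, &floatCounter{}
	p := newPool(hit, miss)
	p.put(p.get(10)) // empty pool: counted as a miss, then returned
	_ = p.get(5)     // the returned slice fits: counted as a hit
	fmt.Println(hit.n, miss.v) // prints: 1 1
}
```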

shanson7 added 3 commits August 27, 2021 17:25

Update funcs to use GetMin

c4a1f80

Configurable point slice pool min size

87d6e36

Unbias pointSlicePool metrics

1f67005

Dieterbe reviewed Aug 30, 2021


shanson7 added 5 commits August 31, 2021 11:54

Update configs with new param

fe7a0af

Remove unused DefaultPointSliceSize

f2e863f

Add metrics for default size

15c22e2

Fix default size on init

2417ab9

Fix metric docs

f9694ea

Dieterbe approved these changes Aug 31, 2021


colega approved these changes Sep 1, 2021


Dieterbe merged commit 2833b60 into grafana:master Sep 1, 2021
