fix: Use a pool of LabelPairs to reduce memory allocs #2838
We see pretty heavy memory usage when `querier.v1.QueryService/Series` calls are made on blocks with a large `index.tsdb`. For a real-world example, see here. In this flamegraph, a ~7.7MB `index.tsdb` file is being used, resulting in ~680MB of memory usage.

The culprit is this loop in `getUniqueLabelsSets`:

pyroscope/pkg/phlaredb/block_querier.go, lines 1992 to 2006 in e1a13ca
The high memory usage comes from allocating a slice on every iteration and, even more egregiously, allocating a new `typesv1.LabelPair` on the heap every time a label name/value pair matches.

This PR changes `getUniqueLabelsSets` to use a slice pool strategy: we allocate a pool of the exact size needed and fill it with pointers to heap memory. We then reuse this pool instead of allocating new memory on each iteration of `postings.Next()`. When we need the contents of the pool, we explicitly copy them out. This substantially reduces memory usage.

After the change, the new flamegraph shows `getUniqueLabelsSets` using ~56MB of memory. And some benchmarks using the same 7.7MB `index.tsdb`:

Of course, we now have the problem of `index.(*Reader).LabelValueFor` using a lot of memory, but this change is a significant and simple enough step that I think it is justifiable to merge on its own.
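For reviewers unfamiliar with the pattern, here is a minimal, self-contained sketch of the pooling idea described above. It is not the actual `getUniqueLabelsSets` code: `LabelPair` is a local stand-in for `typesv1.LabelPair`, and the input is modeled as simple maps rather than an index reader with postings. The point it illustrates is the same, though: one pool slice is overwritten on every iteration, and a fresh allocation happens only when a previously unseen label set must be retained.

```go
package main

import "fmt"

// LabelPair is a local stand-in for typesv1.LabelPair, used here only to
// keep the sketch self-contained.
type LabelPair struct {
	Name  string
	Value string
}

// uniqueLabelSets deduplicates the label sets of a list of series. Instead
// of allocating a new slice and new LabelPair values per series, it refills
// a single reusable pool on each iteration and copies out of the pool only
// when a previously unseen set is found.
func uniqueLabelSets(series []map[string]string, names []string) [][]LabelPair {
	pool := make([]LabelPair, len(names)) // reused across all iterations
	seen := make(map[string]struct{})
	var out [][]LabelPair

	for _, s := range series {
		n := 0
		key := ""
		for _, name := range names {
			v, ok := s[name]
			if !ok {
				continue
			}
			pool[n] = LabelPair{Name: name, Value: v} // overwrite, no alloc
			key += name + "\x00" + v + "\x00"
			n++
		}
		if _, dup := seen[key]; dup {
			continue // duplicate set: nothing allocated this iteration
		}
		seen[key] = struct{}{}
		// Copy out of the pool only for unique sets, since the pool's
		// contents will be overwritten on the next iteration.
		set := make([]LabelPair, n)
		copy(set, pool[:n])
		out = append(out, set)
	}
	return out
}

func main() {
	series := []map[string]string{
		{"service": "api", "region": "us"},
		{"service": "api", "region": "us"}, // duplicate set
		{"service": "db", "region": "eu"},
	}
	sets := uniqueLabelSets(series, []string{"service", "region"})
	fmt.Println(len(sets)) // only the two unique sets survive
	for _, set := range sets {
		fmt.Println(set)
	}
}
```

The key design point mirrors the PR: the per-iteration work mutates preallocated memory, so the allocator is only invoked for the (comparatively rare) unique results that must outlive the loop.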