pkg/symbolizer: Improved caching in symbolization #2311

brancz · 2022-12-16T14:39:38Z

Instead of caching liners indefinitely or via a TTL, we take a few different approaches that ultimately lower both memory and CPU usage.

1) The first time we see a piece of debuginfo we record whether it has
   DWARF, .gopclntab, .symtab and .dynsym. When the same piece of
   debuginfo is seen again, the cached information is used. For
   introspection, we also save this as part of the debuginfo metadata.
2) The first time we see a piece of debuginfo we record its valid PC
   ranges. Using these, we can filter out locations that cannot possibly
   be symbolized, based on the known total range of program counters in
   the debuginfo.
3) Liners are closed in subsequent symbolization rounds if their cached
   value is unused. Liners that are constantly used are therefore kept
   alive and their caches stay warm, while liners that are only used once
   in a while are opened and then closed again so they do not occupy
   unnecessary memory.
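
The PC-range filter and the round-based liner eviction described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `pcRange`, `linerCache`, `closer`, and `fakeLiner` are hypothetical names, and the real liner and cache types in `pkg/symbolizer` may look different.

```go
package main

import "fmt"

// pcRange is a hypothetical [start, end) range of program counters that a
// piece of debuginfo is known to cover.
type pcRange struct{ start, end uint64 }

// contains reports whether addr could possibly be symbolized by this debuginfo.
func (r pcRange) contains(addr uint64) bool { return addr >= r.start && addr < r.end }

// closer stands in for a liner; only Close matters for this sketch.
type closer interface{ Close() error }

// linerCache keeps liners between rounds and remembers which ones were used.
type linerCache struct {
	liners map[string]closer
	used   map[string]bool
}

func newLinerCache() *linerCache {
	return &linerCache{liners: map[string]closer{}, used: map[string]bool{}}
}

// get returns the cached liner for buildID, opening one on a miss, and marks
// it as used in the current round.
func (c *linerCache) get(buildID string, open func() closer) closer {
	l, ok := c.liners[buildID]
	if !ok {
		l = open()
		c.liners[buildID] = l
	}
	c.used[buildID] = true
	return l
}

// endRound closes and evicts every liner that was not used since the previous
// call, then resets the usage marks. It returns the evicted build IDs.
func (c *linerCache) endRound() []string {
	var evicted []string
	for id, l := range c.liners {
		if !c.used[id] {
			_ = l.Close()
			delete(c.liners, id) // deleting while ranging over a map is safe in Go
			evicted = append(evicted, id)
		}
	}
	c.used = map[string]bool{}
	return evicted
}

// fakeLiner records whether it has been closed.
type fakeLiner struct{ closed bool }

func (f *fakeLiner) Close() error { f.closed = true; return nil }

func main() {
	r := pcRange{start: 0x1000, end: 0x2000}
	fmt.Println(r.contains(0x1800), r.contains(0x3000))

	c := newLinerCache()
	open := func() closer { return &fakeLiner{} }
	c.get("a", open)
	c.get("b", open)
	c.endRound() // both liners were used in this round, so nothing is evicted
	c.get("a", open)
	fmt.Println(c.endRound()) // "b" was idle for a whole round and is closed
}
```

The design choice illustrated here is that eviction is driven by usage per round rather than by wall-clock time, so a liner that shows up in every round is never reopened.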

vercel · 2022-12-16T14:48:30Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated
parca-ui	🔄 Building (Inspect)		Dec 16, 2022 at 2:48PM (UTC)

kakkoyun

Amazing PR! I love how you simplified the flow.

I have some suggestions. Feel free to ignore, they aren't blockers.

kakkoyun · 2022-12-19T09:41:11Z

pkg/symbolizer/symbolizer.go

-
-	// Fetch the debug info for the build ID.
-	rc, err := s.debuginfo.FetchDebuginfo(ctx, m.BuildId)
+func (s *Symbolizer) symbolizeLocationsForMapping(ctx context.Context, m *pb.Mapping, locations []*pb.Location) ([][]profile.LocationLine, liner, error) {


Is it possible to split this method into two? One that returns a liner, and another that consumes it and returns location lines?
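
A rough sketch of what such a split could look like, using simplified stand-in types. `line`, `liner`, `mapLiner`, `linerForBuildID`, and `symbolizeWithLiner` are all hypothetical names chosen for illustration; they are not the actual types or signatures in `pkg/symbolizer`.

```go
package main

import (
	"errors"
	"fmt"
)

// line is a simplified stand-in for a symbolized location line.
type line struct {
	Function string
	Line     int
}

// liner is an assumed minimal interface: resolve one address to its lines.
type liner interface {
	PCToLines(pc uint64) ([]line, error)
}

// mapLiner is a toy liner backed by a fixed address table.
type mapLiner map[uint64][]line

func (m mapLiner) PCToLines(pc uint64) ([]line, error) {
	ls, ok := m[pc]
	if !ok {
		return nil, errors.New("address not found")
	}
	return ls, nil
}

// linerForBuildID is step one: obtain a liner for a build ID. Here it is a
// lookup in an in-memory registry instead of fetching real debuginfo.
func linerForBuildID(registry map[string]liner, buildID string) (liner, error) {
	l, ok := registry[buildID]
	if !ok {
		return nil, fmt.Errorf("no debuginfo for build ID %q", buildID)
	}
	return l, nil
}

// symbolizeWithLiner is step two: consume the liner and return one slice of
// lines per address. Addresses that fail to resolve yield a nil entry.
func symbolizeWithLiner(l liner, addrs []uint64) [][]line {
	out := make([][]line, 0, len(addrs))
	for _, addr := range addrs {
		ls, err := l.PCToLines(addr)
		if err != nil {
			ls = nil
		}
		out = append(out, ls)
	}
	return out
}

func main() {
	registry := map[string]liner{
		"abc123": mapLiner{0x10: {{Function: "main.run", Line: 42}}},
	}
	l, err := linerForBuildID(registry, "abc123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(symbolizeWithLiner(l, []uint64{0x10, 0x20}))
}
```

With this shape, the caller owns the liner and decides whether to cache or close it, while the second function stays a pure mapping from addresses to lines.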

kakkoyun · 2022-12-19T09:44:17Z

pkg/symbolizer/symbolizer.go

+		delete(s.linerCache, k)
+	}
+	for _, liner := range s.linerCache {
+		// These are liners that didn't show up in the latest iteration.


How do you feel about adding a metric to track the number of times we see the same build ID? It could help us fine-tune the lifecycle of a cached liner. I guess the catch is to correlate it with cache hits/misses and symbolization rounds.
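
To make the idea concrete, here is a small sketch of the kind of bookkeeping this would involve. `roundStats` and its methods are hypothetical and use plain in-memory counters; in the real service these would more likely be exported through a metrics library, and a per-build-ID label would need care because of cardinality.

```go
package main

import "fmt"

// roundStats is a hypothetical tally of symbolization rounds, cache hits and
// misses, and how many rounds each build ID appeared in.
type roundStats struct {
	rounds int
	hits   int
	misses int
	seen   map[string]int // build ID -> number of rounds it appeared in
}

func newRoundStats() *roundStats { return &roundStats{seen: map[string]int{}} }

// observeRound records one round. buildIDs is assumed to be deduplicated, and
// cached reports whether a liner for the build ID was already in the cache.
func (s *roundStats) observeRound(buildIDs []string, cached func(string) bool) {
	s.rounds++
	for _, id := range buildIDs {
		s.seen[id]++
		if cached(id) {
			s.hits++
		} else {
			s.misses++
		}
	}
}

// reuseRatio is the fraction of observed rounds in which the build ID appeared.
func (s *roundStats) reuseRatio(id string) float64 {
	if s.rounds == 0 {
		return 0
	}
	return float64(s.seen[id]) / float64(s.rounds)
}

func main() {
	stats := newRoundStats()
	cache := map[string]bool{} // simplified cache that never evicts
	cached := func(id string) bool { return cache[id] }

	for _, round := range [][]string{{"a", "b"}, {"a"}, {"a"}, {"a", "b"}} {
		stats.observeRound(round, cached)
		for _, id := range round {
			cache[id] = true
		}
	}
	fmt.Printf("rounds=%d hits=%d misses=%d\n", stats.rounds, stats.hits, stats.misses)
	fmt.Printf("reuse a=%.2f b=%.2f\n", stats.reuseRatio("a"), stats.reuseRatio("b"))
}
```

A build ID with a low reuse ratio but frequent misses would suggest that liners are being closed too eagerly; a high ratio with mostly hits would suggest the current lifecycle is working.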

v-thakkar

LGTM. 🎉 Glad to have .dynsym back 😄

Not a blocker at all, but I'm curious to see whether some of the correctness problems we were seeing with ELF symbolization before #1930 are also handled by this while keeping .dynsym. I'll do some manual testing later this week.

brancz · 2022-12-19T12:08:55Z

> Glad to have .dynsym back 😄

For the moment it's just a check to see if it exists, using it for symbolization is not implemented yet.

Thanks for the reviews! I'm going to go ahead and merge this as is and deploy it on the demo cluster to see whether it has the effect we want to achieve. If so, I'll follow up by addressing the refactoring and adding more metrics.

brancz requested review from a team as code owners December 16, 2022 14:39

vercel bot deployed to Preview – parca-ui December 16, 2022 14:52

brancz force-pushed the sym-imp branch from 93cf4b0 to abd07a4 Compare December 16, 2022 15:27

vercel bot deployed to Preview – parca-ui December 16, 2022 15:30

[pre-commit.ci lite] apply automatic fixes

3985666

vercel bot deployed to Preview – parca-ui December 16, 2022 15:37

kakkoyun reviewed Dec 19, 2022

v-thakkar approved these changes Dec 19, 2022

brancz merged commit 4060ebb into main Dec 19, 2022

brancz deleted the sym-imp branch December 19, 2022 12:09
