Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/symbolizer: Improved caching in symbolization #2311

Merged
merged 2 commits into from
Dec 19, 2022
Merged

pkg/symbolizer: Improved caching in symbolization #2311

merged 2 commits into from
Dec 19, 2022

Conversation

brancz
Copy link
Member

@brancz brancz commented Dec 16, 2022

Instead of indefinitely or caching liners via TTL, we take a few different approaches, that ultimately lower both the used memory as well as CPU.

  1. The first time we see a piece of debuginfo we record whether it has
    DWARF, .gopclntab, .symtab and .dynsym. When the same piece
    of debuginfo is seen again the cached information is used. For
    introspection, we also save this as part of the debuginfo metadata.
  2. The first time we see a piece of debuginfo we record its valid PC
    ranges. Using this we can filter out whether a location can even be
    possible to be symbolized based on the known total range of program
    counters in the debuginfo.
  3. Liners are closed in subsequent symbolization rounds if their cached
    value is unused. Therefore liners that are constantly used are also
    kept alive and caches stay warm, but liners that are only used once
    in a while are opened and then closed again to not occupy unnecessary
    amounts of memory.

@brancz brancz requested review from a team as code owners December 16, 2022 14:39
@vercel
Copy link

vercel bot commented Dec 16, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
parca-ui 🔄 Building (Inspect) Dec 16, 2022 at 2:48PM (UTC)

Instead of indefinitely or caching liners via TTL, we take a few
different approaches, that ultimately lower both the used memory as well
as CPU.

1) The first time we see a piece of debuginfo we record whether it has
   `DWARF`, `.gopclntab`, `.symtab` and `.dynsym`. When the same piece
   of debuginfo is seen again the cached information is used. For
   introspection, we also save this as part of the debuginfo metadata.
2) The first time we see a piece of debuginfo we record its valid PC
   ranges. Using this we can filter out whether a location can even be
   possible to be symbolized based on the known total range of program
   counters in the debuginfo.
3) Liners are closed in subsequent symbolization rounds if their cached
   value is unused. Therefore liners that are constantly used are also
   kept alive and caches stay warm, but liners that are only used once
   in a while are opened and then closed again to not occupy unnecessary
   amounts of memory.
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing PR! I love how you simplified the flow.

I have some suggestions. Feel free to ignore, they aren't blockers.


// Fetch the debug info for the build ID.
rc, err := s.debuginfo.FetchDebuginfo(ctx, m.BuildId)
func (s *Symbolizer) symbolizeLocationsForMapping(ctx context.Context, m *pb.Mapping, locations []*pb.Location) ([][]profile.LocationLine, liner, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to split this method into two? One that returns a liner and another one consumes that an returns location lines?

delete(s.linerCache, k)
}
for _, liner := range s.linerCache {
// These are liners that didn't show up in the latest iteration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about adding a metric to track the number of times we see the same build id? It could help us to fine-tune the lifecycle of a cached liner. I guess the cache is to correlate it with cache hits/misses and symbolization rounds.

Copy link

@v-thakkar v-thakkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 🎉 Glad to have .dynsym back 😄

Not a blocker at all but I'll be curious to see if some of the correctness problems we were seeing with elf symbolization before #1930 are also handled by this while keeping .dynsym. Will do some manual testing later this week.

@brancz
Copy link
Member Author

brancz commented Dec 19, 2022

Glad to have .dynsym back 😄

For the moment it's just a check to see if it exists, using it for symbolization is not implemented yet.

Thanks for the reviews! I'm going to go ahead and merge this as is and deploy it on the demo cluster to see if it has the effect that we want to achieve. If so I'll follow up by addressing the refactoring and adding more metrics.

@brancz brancz merged commit 4060ebb into main Dec 19, 2022
@brancz brancz deleted the sym-imp branch December 19, 2022 12:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants