Introduce symdb #767

kolesnikovae · 2023-06-14T11:13:08Z

For context, please refer to the proposal.

TODO:

cyriltovena

LGTM

cyriltovena · 2023-06-26T09:19:52Z

I'll merge this in a few hours!

luisgerhorst · 2023-06-28T13:04:26Z

Unfortunately, this introduced a dependency on https://github.com/dgryski/go-groupvarint for which no license has been specified. I have created an issue there: dgryski/go-groupvarint#3

Could you update the dependency once the issue has been resolved? This will ensure automatic license checkers also recognize this is fine.

dgryski · 2023-06-30T16:05:04Z

License updated for groupvarint.

* Increase parquet writer PageBufferSize
* reduce by 2 page buffer size
* Introduce symdb
* Add chunk format description
* Add chunk format description
* Improve naming
* Implement stack trace appender
* Limit chunk by number of nodes
* Stacktrace ID is uint32
* Add in-memory stacktrace resolver
* Add writer
* Add writer
* Fix stacktrace resolver
* Single pass write
* Index file refactoring
* Fixes, improvements, notes
* Ignore empty stacktraces
* Fix chunk boundary check
* Fix tests
* Store chunk headers sorted
* Make chunk index explicit
* Add file reader
* Use group varint encoding
* Refine stacktrace tree
* Stacktrace tree race condition elimination
* Remove unused stacktracesResolve.do
* Better nil coalescence in stack trace appender
* Format imports
* Use the new symDB package (grafana/phlare#770)
* Ingest stacktraces in the new symdb
* Setup read in memory read path
* Fix up a comment placement
* Start setting up the read path
* Update to uint32
* Introduce stacktrace partition (grafana/phlare#775)
* Introduce stacktrace partition

  This determines the partition of a particular profile, by looking first at its metadata:

  * If there is a `Filename` on the main mapping use its filepath.Base(Filename)
  * Failing that take the externally supplied `service_name`
  * Fallback to `unknown`

  Take the underlying string value and hash.
* After a chat with Cyril we decided to no longer mod and use the hash straight away. We didn't want to risk the collisions of two very big stacktrace applications.
* Remove reconstructMeta from singleBlockQuerier
* support multiple versions of stacktraces resolver
* Integrate v2 reader for stacktraces in block reader
* Fixes tests
* Rewrite locations Ids
* Rewrite test for counting uniq stacktraces
* lint and fmt
* Fixes more tests
* Fixes leftover from todo

---------

Co-authored-by: Christian Simon <simon@swine.de>

* Use prefixed bucket for symbols
* Initialize locationsIdsByStacktraceID
* Initialize locationsIdsByStacktraceID for pprof as well
* Fix chunk headers sort
* Inline node alloc
* Mapping filename extraction
* Tidy go.mod
* Fix TestHeadIngestStacktraces
* Use symdb.DefaultDirName
* Sort mappings on write
* Make column iterator to respect the context
* Fix unexpected EOF on stacktrace chunk unmarshal
* Fix symbols upload
* Fix symbols upload
* Release fetched data
* 3MB Page Buffer Size
* Sort stacktraces IDs as expected by the resolver

---------

Co-authored-by: Cyril Tovena <cyril.tovena@gmail.com>
Co-authored-by: Christian Simon <simon@swine.de>
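
For readers skimming the squashed commit message above, the stacktrace partition rule it describes (main mapping `Filename` first, then `service_name`, then `unknown`, hashed without a modulo) can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: the `profileMeta` type, the `partitionKey` and `stacktracePartition` names, and the choice of FNV-1a as the hash are all assumptions made for the example, since the commit message does not name the hash function or the types involved.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"path/filepath"
)

// profileMeta is a hypothetical stand-in for the profile metadata
// the partitioning rule looks at; it is not a type from the PR.
type profileMeta struct {
	MainMappingFilename string // Filename of the main mapping, if any
	ServiceName         string // externally supplied service_name, if any
}

// partitionKey picks the string to hash, in the order described
// in the commit message: mapping file name, service name, "unknown".
func partitionKey(m profileMeta) string {
	if m.MainMappingFilename != "" {
		return filepath.Base(m.MainMappingFilename)
	}
	if m.ServiceName != "" {
		return m.ServiceName
	}
	return "unknown"
}

// stacktracePartition hashes the key and uses the full 64-bit value
// as the partition, with no modulo applied. FNV-1a is an assumption
// here; the hash used by the PR is not stated in the message.
func stacktracePartition(m profileMeta) uint64 {
	h := fnv.New64a()
	h.Write([]byte(partitionKey(m)))
	return h.Sum64()
}

func main() {
	m := profileMeta{MainMappingFilename: "/usr/bin/app", ServiceName: "svc"}
	fmt.Println(partitionKey(m), stacktracePartition(m))
}
```

Using the hash value directly, rather than reducing it modulo a fixed partition count, keeps two unrelated large applications from being forced into the same partition, which matches the concern stated in the commit message.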

cyriltovena and others added 4 commits June 2, 2023 10:29

Increase parquet writer PageBufferSize

813ac6a

reduce by 2 page buffer size

97eb8c5

Introduce symdb

5b33929

Add chunk format description

046e825

simonswine assigned simonswine and kolesnikovae Jun 14, 2023

kolesnikovae added 2 commits June 14, 2023 16:46

Add chunk format description

bacbe04

Improve naming

1c96cd5

kolesnikovae mentioned this pull request Jun 14, 2023

WIP: Shard stacktraces table by service name #749

Closed

kolesnikovae added 21 commits June 14, 2023 21:50

Implement stack trace appender

a967b21

Limit chunk by number of nodes

b380330

Stacktrace ID is uint32

b108bac

Add in-memory stacktrace resolver

401c00f

Add writer

25bce59

Add writer

6bba758

Fix stacktrace resolver

ddab48e

Single pass write

71a9ee1

Index file refactoring

b48d915

Fixes, improvements, notes

42c607d

Ignore empty stacktraces

9ce0a86

Fix chunk boundary check

38d1e0b

Fix tests

0240f17

Store chunk headers sorted

c7892a6

Make chunk index explicit

58cbafc

Add file reader

da00cc0

Use group varint encoding

44fa701

Refine stacktrace tree

845d559

Stacktrace tree race condition elimination

2d5abc0

Remove unused stacktracesResolve.do

1271212

Better nil coalescence in stack trace appender

8fb9a64

kolesnikovae and others added 11 commits June 21, 2023 19:28

Tidy go.mod

902ec0c

Fix TestHeadIngestStacktraces

6757ee1

Use symdb.DefaultDirName

0acb4de

Sort mappings on write

2f5753b

Make column iterator to respect the context

10d1dbf

Fix unexpected EOF on stacktrace chunk unmarshal

825235c

Fix symbols upload

c31f93d

Fix symbols upload

20a815e

Release fetched data

9241faa

Merge branch 'experiment-page-size' into feat/symdb

03dc721

3MB Page Buffer Size

4d7eb65

simonswine mentioned this pull request Jun 26, 2023

Experimenting with storing stacktraces differently #757

Closed

cyriltovena marked this pull request as ready for review June 26, 2023 08:17

cyriltovena changed the title ~~WIP: Introduce symdb~~ Introduce symdb Jun 26, 2023

cyriltovena mentioned this pull request Jun 26, 2023

StacktraceResolver returns a non-requested stacktraceID #794

Closed

Sort stacktraces IDs as expected by the resolver

819f6e9

cyriltovena approved these changes Jun 26, 2023

cyriltovena merged commit ebc3e04 into main Jun 26, 2023
17 checks passed

cyriltovena deleted the feat/symdb branch June 26, 2023 12:20

simonswine mentioned this pull request Jun 27, 2023

Fix backwards compatibility with block version 1 #802

Merged

simonswine mentioned this pull request Jun 29, 2023

Add MIT license dgryski/go-groupvarint#4

Merged

This was referenced Jul 19, 2023

SymDB maintainability and support grafana/pyroscope#2036

Open

Stack trace symbols resolution is slow #690

Closed

kolesnikovae added kind/enhancement kind/performance area/database labels Jul 13, 2023
