perf(profiling): store string data in an arena allocator by morrisonlevi · Pull Request #227 · DataDog/libdatadog

morrisonlevi · 2023-08-22T18:35:33Z

What does this PR do?

This uses `bumpalo::Bump` as an arena allocator to store the string data contiguously with fewer calls to the system allocator.

Since the `StringTable` owns the arena and stores references to that data inside other members of the `StringTable`, it is a self-referencing data structure, which is generally unsafe. This uses `ouroboros::self_referencing` to have a safe abstraction for this pattern.
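As a rough, std-only illustration of the underlying idea (not this PR's implementation, which uses `bumpalo` and `ouroboros`), the sketch below packs strings into a few large chunks and refers to them by position instead of by reference, which sidesteps the self-referencing problem at the cost of an extra lookup. The names `ChunkedStringTable`, `Span`, `intern`, and `get` are hypothetical, and `StringId` here is only a stand-in for the real type.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the real `StringId`; ids are handed out in insertion order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StringId(pub u32);

/// Where one string lives: which chunk, and which byte range inside it.
#[derive(Clone, Copy)]
struct Span {
    chunk: usize,
    start: usize,
    len: usize,
}

/// Hypothetical table that stores string bytes contiguously in large chunks.
pub struct ChunkedStringTable {
    chunks: Vec<String>,                  // one system allocation per chunk
    spans: Vec<Span>,                     // indexed by StringId
    by_hash: HashMap<u64, Vec<StringId>>, // dedup index without owning copies
    chunk_capacity: usize,
}

fn hash_str(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

impl ChunkedStringTable {
    pub fn with_chunk_capacity(chunk_capacity: usize) -> Self {
        Self {
            chunks: Vec::new(),
            spans: Vec::new(),
            by_hash: HashMap::new(),
            chunk_capacity,
        }
    }

    /// Returns the id of `s`, copying it into the current chunk if it is new.
    pub fn intern(&mut self, s: &str) -> StringId {
        let hash = hash_str(s);
        if let Some(ids) = self.by_hash.get(&hash) {
            for &id in ids {
                if self.get(id) == Some(s) {
                    return id;
                }
            }
        }
        // Start a new chunk when the current one cannot hold `s` without
        // reallocating; oversized strings get a chunk of their own size.
        let needs_chunk = match self.chunks.last() {
            Some(chunk) => chunk.capacity() - chunk.len() < s.len(),
            None => true,
        };
        if needs_chunk {
            let capacity = self.chunk_capacity.max(s.len());
            self.chunks.push(String::with_capacity(capacity));
        }
        let chunk = self.chunks.len() - 1;
        let start = self.chunks[chunk].len();
        self.chunks[chunk].push_str(s);
        let id = StringId(self.spans.len() as u32);
        self.spans.push(Span { chunk, start, len: s.len() });
        self.by_hash.entry(hash).or_default().push(id);
        id
    }

    /// Looks a string up by id; `None` if the id was never handed out.
    pub fn get(&self, id: StringId) -> Option<&str> {
        let span = self.spans.get(id.0 as usize)?;
        Some(&self.chunks[span.chunk][span.start..span.start + span.len])
    }

    pub fn len(&self) -> usize {
        self.spans.len()
    }
}
```

Storing real `&str` references, as the PR does, avoids the per-lookup indexing shown here, which is what makes the self-referencing abstraction necessary in the first place.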

Motivation

This produces a better memory layout with fewer system allocations. It likely reduces the total memory used, and it may improve the amount of memory returned to the OS when arena chunks are dropped (see below).

Additional Notes

The initial size of the arena is 16384 bytes (16 KiB), a few bytes of which it reserves for itself. When a chunk is full, the arena creates another chunk that seems to be at least as big as the previous one. Compared with string-by-string allocation, this relieves pressure on the system allocator and reduces fragmentation.
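To get a feel for how much allocator traffic chunking can save, here is a small simulation. It assumes a simplified growth policy (each new chunk is as big as the previous one, or as big as the string that did not fit) and ignores the few bytes the arena reserves for itself; the real `bumpalo` policy may differ. The function name `count_chunk_allocations` is hypothetical.

```rust
/// Counts the chunk allocations a simple bump-style arena would make while
/// storing strings of the given byte lengths, in order.
///
/// Assumed policy: a new chunk is at least as big as the previous chunk and
/// at least as big as the string that triggered it. Arena bookkeeping
/// overhead is ignored.
pub fn count_chunk_allocations(string_lens: &[usize], initial_chunk: usize) -> usize {
    let mut allocations = 0;
    let mut chunk_size = initial_chunk;
    let mut remaining = 0; // free bytes left in the current chunk

    for &len in string_lens {
        if len > remaining {
            // The current chunk is full: ask the system allocator for a new one.
            chunk_size = chunk_size.max(len);
            remaining = chunk_size;
            allocations += 1;
        }
        remaining -= len;
    }
    allocations
}
```

Under these assumptions, 1000 strings of 32 bytes each need 2 chunk allocations with a 16 KiB chunk, compared with 1000 allocations when every string is allocated on its own.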

This StringTable is copied and simplified from the PHP profiler. I intend for the PHP profiler to use this version of the StringTable going forward.

How to test the change?

This is an internal change only. Nothing special needs to be done for testing.

danielsn

LGTM.

ivoanjo · 2023-08-23T09:52:03Z

I've re-run some of the experiments I mentioned in the memory usage RFC doc:

`examples/ffi/profiles.c`:

| libdatadog version | RSS before drop (task manager) | RSS after drop (task manager) | Maximum RSS |
|---|---|---|---|
| v3.0.0 | 2.2 GB | 1.1 GB | 2181076 KB |
| main@62e329fd4ddb7a34 (2023-08-23) | 1.7 GB | 640.2 MB | 1747392 KB |
| levi/string-table | 1.8 GB | 640.3 MB | 1748420 KB |

`ruby_overhead_experiment.rb`:

(config: {:threads=>16, :depth=>50, :seconds=>60, :timeline_enabled=>true})

| libdatadog version | RSS before serialization | RSS after serialization | RSS after GC ('shrinking', in the script) |
|---|---|---|---|
| main@62e329fd4ddb7a34 (2023-08-23) | 78484k | 134068k | 134068k |
| levi/string-table | 78732k | 134352k | 85504k |

I think overall the profiles.c benchmark doesn't gain a lot since it doesn't contain that many strings, so most of the overhead is elsewhere.

The Ruby benchmark does get a reduced RSS after GC (85504k, versus 134068k on main), which seems to match our expectation that more memory can be returned to the OS. There's still a lot hanging around, though: at 85504k we're using more RSS than the 78732k measured before serialization, and if we were releasing all libdatadog memory we would get below that number.

ivoanjo · 2023-08-25T09:57:32Z

After yesterday's discussion of RSS impact, I got curious about what we'd see for examples/ffi/profiles.c with tcmalloc and jemalloc, so here are the results:

| libdatadog version | RSS before drop (task manager) | RSS after drop (task manager) | Maximum RSS |
|---|---|---|---|
| levi/string-table glibc | 1.8 GB | 640.3 MB | 1748420 KB |
| levi/string-table tcmalloc | 1.7 GB | 1.7 GB | 1656732 KB |
| levi/string-table jemalloc | 1.3 GB | 165.9 MB | 1299504 KB |

ivoanjo · 2023-09-27T09:22:03Z

I re-ran the Ruby test script with the latest version of this branch (6489a16):

`ruby_overhead_experiment.rb`:

(config: {:threads=>16, :depth=>50, :seconds=>60, :timeline_enabled=>true})

| libdatadog version | RSS before serialization | RSS after serialization | RSS after GC ('shrinking', in the script) | Maximum RSS |
|---|---|---|---|---|
| main@8712cad31ce4d3 (2023-09-26) | 78740k | 77004k | 77004k | 79540k |
| levi/string-table | 78568k | 76964k | 76964k | 79528k |

Unfortunately, I think the current test scripts are not particularly suited to measuring the difference in this PR, since both examples/ffi/profiles.c and the Ruby example repeat the same stacks again and again, which means very few unique strings.

Maybe the replayer, run on a realistic, complex profile with lots of unique strings, would be better here?

danielsn

Mostly looks good, one worry about possible use-after-free

danielsn · 2023-09-27T15:39:22Z

+    /// don't currently have metrics on this.
+    ///
+    /// So... for now, the selected number of pages is arbitrarily chosen.
+    pub const GOOD_INITIAL_CAPACITY: usize = 8 * Self::PAGE_SIZE - Self::BUMP_OVERHEAD;


Should we make this a tunable parameter to enable easier experimentation? How much does it matter?
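One way the tunable could look, sketched under assumptions: the struct `ArenaConfig`, its `initial_pages` field, and the concrete values used for `PAGE_SIZE` and `BUMP_OVERHEAD` are all hypothetical, since the excerpt above shows only the formula and not the constants behind it.

```rust
/// Assumed values for illustration only; the diff excerpt does not show them.
const PAGE_SIZE: usize = 4096;
const BUMP_OVERHEAD: usize = 64;

/// Hypothetical knob for experimenting with the arena's initial size.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ArenaConfig {
    /// Number of pages requested for the first chunk.
    pub initial_pages: usize,
}

impl Default for ArenaConfig {
    /// Mirrors the `8 * PAGE_SIZE - BUMP_OVERHEAD` formula from the diff.
    fn default() -> Self {
        Self { initial_pages: 8 }
    }
}

impl ArenaConfig {
    /// Usable bytes to request so that the chunk, including the arena's own
    /// bookkeeping, fits in a whole number of pages. Saturating arithmetic
    /// keeps a zero-page or huge configuration from panicking.
    pub fn initial_capacity(&self) -> usize {
        self.initial_pages
            .saturating_mul(PAGE_SIZE)
            .saturating_sub(BUMP_OVERHEAD)
    }
}
```

A caller could then pass `ArenaConfig { initial_pages: 4 }.initial_capacity()` to whatever constructor creates the arena, which would make it cheap to compare a few sizes in the benchmarks above.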

morrisonlevi · 2023-09-27T21:05:41Z

On 50 runs of the replayer using the PHP timeline symfony-demo pprof, the average memory delta was:

main: 800.72
levi/string-table: 792.14

These seem to be within noise of each other.

danielsn

One ⛏️ about adding a comment explaining why we iterate over strings differently than everything else.

danielsn · 2023-09-28T15:05:08Z

        }

-        for item in self.strings.into_iter() {
+        for item in self.strings.iter() {


This is the one case where we don't use `into_iter`. It might be worth a comment explaining why.

danielsn · 2023-09-28T15:13:08Z

+
+    /// Returns an iterator over the strings in the table. The items are
+    /// returned in the order they were inserted, matching the [StringId]s.
+    pub fn iter(&self) -> impl Iterator<Item = &str> {


Not necessary, and might be tricky with lifetimes but this got me wondering if we can have an into_iter implementation that drops the underlying storage when the iteration is done.

It's not worth it, IMO, because we'd have to make another self-referencing struct, and it doesn't actually buy us anything (end result is the same).
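The lifetime issue in this thread can be seen in a std-only sketch. It is not the PR's code: `FlatStringTable` and `push` are hypothetical, and a single `String` stands in for the arena. A borrowing `iter` is straightforward because every `&str` it yields borrows from `&self`; a consuming iterator would have to own the storage and hand out references into it at the same time, which is the self-referencing pattern again.

```rust
/// Hypothetical table: all string bytes live in one buffer, and each entry
/// is a (start, len) range recorded in insertion order.
pub struct FlatStringTable {
    storage: String,
    spans: Vec<(usize, usize)>,
}

impl FlatStringTable {
    pub fn new() -> Self {
        Self {
            storage: String::new(),
            spans: Vec::new(),
        }
    }

    /// Appends a string and returns its index (no deduplication here).
    pub fn push(&mut self, s: &str) -> usize {
        let id = self.spans.len();
        self.spans.push((self.storage.len(), s.len()));
        self.storage.push_str(s);
        id
    }

    /// Yields the strings in insertion order. The items borrow from `self`,
    /// so the table must stay alive (and unmodified) while they are in use.
    pub fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        self.spans
            .iter()
            .map(move |&(start, len)| &self.storage[start..start + len])
    }
}
```

Because `iter` only borrows, the storage is freed when the table itself is dropped after the loop, so the end result matches what a storage-dropping `into_iter` would achieve.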

ivoanjo · 2024-05-01T16:10:01Z

This is superseded by #404, closing! :)

github-actions Bot added the profiling Relates to the profiling* modules. label Aug 22, 2023

morrisonlevi marked this pull request as ready for review August 22, 2023 18:36

morrisonlevi requested review from a team as code owners August 22, 2023 18:36

danielsn previously approved these changes Aug 22, 2023

Comment thread profiling/src/profile/internal/string_table.rs Outdated

morrisonlevi dismissed danielsn’s stale review via 9eb2edc August 22, 2023 23:50

morrisonlevi force-pushed the levi/string-table branch from f82e26a to 55abfc6 Compare August 29, 2023 02:54

morrisonlevi changed the base branch from main to levi/collections August 29, 2023 02:55

morrisonlevi force-pushed the levi/collections branch from e489935 to 08cdeb7 Compare August 29, 2023 03:26

morrisonlevi force-pushed the levi/string-table branch from 55abfc6 to b4bbaeb Compare August 29, 2023 03:29

Base automatically changed from levi/collections to main August 31, 2023 17:09

morrisonlevi marked this pull request as draft August 31, 2023 17:10

morrisonlevi force-pushed the levi/string-table branch 2 times, most recently from 0170833 to 9cfe7d6 Compare September 26, 2023 03:27

morrisonlevi marked this pull request as ready for review September 27, 2023 14:10

danielsn reviewed Sep 27, 2023

morrisonlevi force-pushed the levi/string-table branch from 16ee890 to eb11362 Compare September 27, 2023 17:37

morrisonlevi added 3 commits September 27, 2023 11:40

avoid magic number

f022015

tinker with initial string table size

83a10df

PROF-8285: adjust docs

9894bdd

danielsn approved these changes Sep 28, 2023

danielsn reviewed Sep 28, 2023

ivoanjo closed this May 1, 2024

morrisonlevi deleted the levi/string-table branch October 17, 2024 15:01

bm1549 mentioned this pull request May 22, 2026

feat(sampling): accept Remote Config list-shape tags natively #2033

Open
