Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiling improvements #1767

Merged
merged 5 commits into from
May 22, 2024
Merged

Profiling improvements #1767

merged 5 commits into from
May 22, 2024

Conversation

ryzhyk
Copy link
Contributor

@ryzhyk ryzhyk commented May 18, 2024

  • Fixed a bug in the implementation of SizeOf for vectors
  • Added a heap profiler

Is this a user-visible change (yes/no): yes

A trivial typo in the implementation of LeanVec resulted in completely
incorrect reporting of heap usage in operator profiles.

Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
@ryzhyk ryzhyk requested a review from blp May 18, 2024 22:47
@ryzhyk
Copy link
Contributor Author

ryzhyk commented May 18, 2024

@blp , one potential use of this is that we should be able to see where memory is still used in the nexmark benchmark running with storage.

This commit switches to using jemalloc as our default allocator,
enables its continuous profiling feature, and adds an HTTP endpoint to
extract the heap profile at runtime.  It's based on this blog:

https://www.polarsignals.com/blog/posts/2023/12/20/rust-memory-profiling

In more detail:

The jemalloc allocator has a built-in profiler that can be used to
continuously track heap usage of the program with low overhead,
to the extent that it can be kept always on (at least that's the theory).
It outputs the profile in a format compatible with the `pprof` tool
from Google.  I tried it with a release build of a pipeline and it
generates perfectly legible profiles.

I switched to using jemalloc as the global allocator with profiling
enabled unconditionally for all pipelines.  This way we will be able
to get the profile of any pipeline.

Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
Leonid Ryzhyk added 2 commits May 18, 2024 16:05
Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
The test applied integrate_trace_retain_keys to unspilled batches,
so it didn't have the expected effect, but the test did not properly
check this.

Signed-off-by: Leonid Ryzhyk <leonid@feldera.com>
@@ -1455,8 +1455,8 @@ impl<T: SizeOf> SizeOf for LeanVec<T> {
fn size_of_children(&self, context: &mut Context) {
self.vec.size_of_children(context, &mut |x, ctx| {
let x = unsafe { &*(x as *const T) };
x.size_of_children(ctx);
})
T::size_of_children(x, ctx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we have properly implemented this trait everywhere. We should make an audit

//println!("retain_vals: {}bytes", trace.size_of().total_bytes());
assert!(trace.size_of().total_bytes() < 50000);
// println!("retain_vals: {}bytes", trace.size_of().total_bytes());
assert!(trace.size_of().total_bytes() < 70000);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changes for to the bug you fixed?

@ryzhyk ryzhyk merged commit 54eca45 into main May 22, 2024
5 checks passed
@ryzhyk ryzhyk deleted the profiling_fixes branch May 22, 2024 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants