
[hprof] Process heap profiler #6639

Merged · 1 commit merged into erlang:master on Sep 6, 2023

Conversation

@max-au (Contributor) commented Jan 8, 2023

Similar to eprof for time profiling, hprof enables heap profiling for Erlang processes. It simplifies selection of processes to trace and allows programmatic analysis.
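
For context, here is a minimal sketch of the call_memory tracing primitive (added by #6351) that a heap profiler like this builds on. The module and function names below are made up for illustration; this is not the hprof API or implementation:

    %% Illustrative only: module/function names are assumptions, and this
    %% requires an OTP build that includes the call_memory trace from #6351.
    -module(call_memory_sketch).
    -export([measure/2]).

    %% Run Fun with call_memory tracing enabled for MFA in the calling
    %% process, then read back {Pid, Calls, WordsAllocated} samples.
    measure(MFA, Fun) ->
        erlang:trace_pattern(MFA, true, [call_memory]),
        erlang:trace(self(), true, [call]),
        Result = Fun(),
        {call_memory, Samples} = erlang:trace_info(MFA, call_memory),
        erlang:trace(self(), false, [call]),
        erlang:trace_pattern(MFA, false, [call_memory]),
        {Result, Samples}.

Under these assumptions, call_memory_sketch:measure({lists, seq, 2}, fun() -> lists:seq(1, 100) end) would return the computed list together with a [{Pid, Calls, Words}] sample for the calling process; hprof automates this kind of pattern management and aggregation.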

@github-actions bot commented Jan 8, 2023

CT Test Results

2 files, 22 suites, 8m 42s ⏱️
214 tests: 211 passed ✔️, 3 skipped 💤, 0 failed
234 runs: 231 passed ✔️, 3 skipped 💤, 0 failed

Results for commit ce53959.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.


// Erlang/OTP Github Action Bot

@max-au (Contributor, Author) commented Jan 8, 2023

This PR requires #6351 to be merged first, and the preloaded files must be updated (at least erlang.beam) to ensure Dialyzer passes. Tests in hprof_SUITE are expected to fail until #6351 is merged.

Several decisions I made working on hprof:

  1. It is indeed very similar to eprof, and at first I considered just updating eprof to accept an argument selecting between time and heap profiling. But the existing eprof APIs are somewhat old-fashioned, the code is outdated, and its tests aren't useful for hprof.
  2. The hprof server does not keep track of traced processes; it only manages the modules/functions/arities to remove trace_pattern from. The reason is the set_on_spawn flag: it is not easy for hprof to track spawned processes (see the sketch at the end of this comment).
  3. The documentation may not be comprehensive enough (although eprof's is even harder to read/understand).

One more thing I'd love to have is eprof exporting the same APIs that hprof has. It is really convenient to have "all children" traced, and to have two profiling modes. But that means an almost complete rewrite of eprof, likely making it incompatible with the existing implementation. I can attempt that as a separate PR if it is seen as useful, but honestly, with the JIT and Linux perf, there is not much value left in eprof for me.
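
A rough illustration of the set_on_spawn point in item 2 (the helper name below is made up; this is not hprof code): with set_on_spawn, descendants of a traced process are traced automatically, so the profiler is never handed an explicit list of traced pids.

    %% Hypothetical helper, for illustration only. Every process spawned
    %% by Pid (and its descendants) inherits the call-trace flag; the
    %% profiler is not notified about the newly traced processes.
    trace_with_descendants(Pid) ->
        erlang:trace(Pid, true, [call, set_on_spawn]).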

@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Jan 9, 2023
@max-au (Contributor, Author) commented Jan 27, 2023

@sverker and @frazze-jobb, is there any chance to facilitate the review and testing?

I'd really love to include this in OTP-26RC1 so we could start testing too.

@max-au (Contributor, Author) commented Feb 7, 2023

Rebased to the latest master, and updated with changes from #6351 (megawords removed).

@max-au (Contributor, Author) commented Feb 21, 2023

  • rebased to trigger the included tests in CI

@josevalim (Contributor) commented:

This is pretty neat!

I have a question: my understanding is that cprof, eprof, and hprof are all based on the same structure. They set up trace patterns and then use trace_info to get information about MFAs. Shouldn't we then be trying to provide a unified API for all three?

Otherwise, if I want to measure both time and heap allocation, I will need to measure twice. Not only that: because they have different APIs, we would need to change tools like Erlang's compiler to support both eprof and hprof. Wouldn't it be better to introduce a prof (or tprof, for tracing profiler) that by default measures only call counts, and lets us opt in to call_time and/or call_memory measurements? This could eventually allow eprof/cprof to be deprecated, so we end up with two profiling modules (fprof + this) rather than four.
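
Purely as an illustration of that proposal (the module name and options below are hypothetical, not an existing API), such a unified call could look roughly like this:

    %% Hypothetical API shape: 'tprof', the 'type' option, and
    %% my_app:do_work/0 are made-up names. The default would count calls;
    %% call_time and call_memory would be opt-in measurements.
    tprof:profile(fun() -> my_app:do_work() end, #{type => call_memory}).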

@max-au (Contributor, Author) commented Feb 24, 2023

Shouldn't we then be trying to provide a unified API for all three?

There are several reasons why I haven't done this:

  1. Running call tracing affects timings, and heap profiling affects them even more. It's probably worth documenting, or changing the "Note" I have in the documentation into a "Warning": heap profiling affects call time profiling and is better used separately. In fact, it does not make much sense to combine heap profiles with call time profiling (and I've never seen the two done at once).
  2. cprof has very limited use, and I'd rather deprecate it completely, as eprof also does call counting (and so does hprof). I don't have usage telemetry, but something hints to me that there isn't much use for it. Its original purpose was to find "hot functions" (that should supposedly be JIT-ted to native code).
  3. eprof lost quite a lot of value when the JIT was introduced. I would also recommend against eprof as a CPU/time profiler, in favour of "+JPperf" and the Linux perf tool. I would defer the decision to the OTP team, but I'm fine with keeping it as-is, deprecating it, or updating its APIs to match hprof. Or, indeed, merging eprof + hprof + cprof.

However, I would note that the hprof APIs are more convenient for production and development usage. I'd even say they're more modern than eprof's: able to dynamically change trace patterns, combine profiles, and trace supervision hierarchies (probably the most important feature for us).
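
A minimal sketch of what "trace a supervision hierarchy" means, using only plain erlang:trace/3 and supervisor:which_children/1 (the function below is illustrative, not the hprof implementation):

    %% Illustration only: enable call tracing (with set_on_spawn) on every
    %% process currently running under Sup, recursing into child supervisors.
    trace_tree(Sup) ->
        lists:foreach(
            fun({_Id, Pid, supervisor, _Mods}) when is_pid(Pid) ->
                    erlang:trace(Pid, true, [call, set_on_spawn]),
                    trace_tree(Pid);
               ({_Id, Pid, worker, _Mods}) when is_pid(Pid) ->
                    erlang:trace(Pid, true, [call, set_on_spawn]);
               (_NotRunning) ->
                    ok
            end,
            supervisor:which_children(Sup)).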

@josevalim (Contributor) commented:

Thank you @max-au, that clarifies a lot.

I would say that, even if you cannot run two modes at once, it would still be valuable to have a single module for profiling, where you choose one mode at a time. We could either support all three modes (count, time, memory) or just choose between time and memory. The benefit is that tools like the compiler would not have to integrate two different profilers.

I understand that perf is better, and it keeps being improved, but having a solution that works on all OSes is also very valuable. Afaik macOS doesn't support perf either. :(

@max-au (Contributor, Author) commented Feb 25, 2023

macOS has sample, which, IIRC, can yield results similar to perf and eventually build flamegraphs. I haven't used it, though, and in fact have never had to profile an Erlang program on macOS.

I agree that a single prof accepting some "mode" (heap | time | calls) would be more convenient, and I can certainly turn hprof into that tool. But it would mean deprecating cprof/eprof. I'd like to hear the OTP team's opinion on that. If we can come to an agreement, I can work on the actual solution.

@max-au (Contributor, Author) commented Mar 30, 2023

  • rebased to fix a merge conflict.
  • added handling for unexpectedly disappearing traces (e.g. when hot code reload removes the trace flag), and updated the documentation to warn against code reload while tracing. Thanks @mkuratczyk for reporting the issue.

@jhogberg jhogberg modified the milestones: OTP-26.0-rc1, OTP-26.0 Apr 17, 2023
@max-au (Contributor, Author) commented Apr 28, 2023

@jhogberg should this be headed to OTP-26 as experimental, or does it need to wait until 26.1?

@jhogberg (Contributor) commented:

Sorry, I missed this message :-(

26.1 sounds fine, it's a whole new tool so it won't break anyone's code unless they've squatted the hprof module name. :D

@jhogberg jhogberg modified the milestones: OTP-26.0, OTP-26.1 May 16, 2023
@frazze-jobb frazze-jobb merged commit 640377c into erlang:master Sep 6, 2023
15 checks passed
@josevalim (Contributor) commented:

Apologies folks, but before this module is officially released, are there any thoughts on calling this tprof (or similar) and allowing the mode to be an option? Otherwise we will have 4 profilers with 3 different APIs. Making this generic means we can fold eprof/cprof/hprof into a single module and API.

@frazze-jobb (Contributor) commented:

I was a little bit trigger-happy yesterday evening and merged this to master (which means it will make it into 27). If we manage to implement multiple trace sessions for 27, then we will probably want to change the APIs as well before making this official.

@sverker (Contributor) commented Sep 7, 2023

I like the idea of a generic API to rule them all.
However, we are looking at introducing the concept of "trace sessions" in OTP 27, as a way to prevent different trace users (such as profilers) from interfering with each other's trace settings. We haven't come very far; mostly talk and some early prototyping.

With trace sessions in place, or at least an API for them, a new generic profiling API could perhaps reflect that with some sort of profiling session concept.

@josevalim josevalim mentioned this pull request Oct 16, 2023