Elixir Support #2392

Open
dogukanzorlu opened this issue Jan 10, 2023 · 6 comments

Comments


dogukanzorlu commented Jan 10, 2023

Hi Parca people,

To provide Elixir support for Parca, I developed a library that imitates Go's net/http/pprof and generates meaningful pprof profile data from Erlang's built-in :fprof library.

Most of the explanations you need are in the documentation, but I will summarize them here along with some additions.

Aim
The observability and profiling culture of Erlang and Elixir differs from that of other languages. Elixir/Erlang code runs on a virtual machine (a.k.a. the BEAM), and each unit of work runs as a process. Erlang ships with many built-in tools for monitoring these processes and the functions they invoke. This library uses those built-in Erlang profiling tools to produce pprof output that is as meaningful as possible, and thereby tries to bring pprof-based ad-hoc profiling to the Erlang/Elixir observability ecosystem.

A few important things

  • This library is experimental and still under development; be careful when using it in production.

  • The accuracy of the outputs has been verified by cross-checking with other tools, but this alpha version does not offer a full pprof service.

  • fprof significantly slows down the application it runs on. It traces all processes and collects their tracing data, so even a 10-second collection on a large-scale application can produce GBs of data. It is recommended to keep the scrape duration at 5 seconds or less so you don't get lost in the abundance of data. More information: :fprof

  • The :suspend pseudo function has an OWN time of zero. This prevents a process's total OWN time from including time spent in suspension. Whether suspend time is really ACCUMULATED or OWN time is more of a philosophical question. More information: analysis-format
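The scrape-duration advice can be sketched as a one-off fetch against the library's pprof endpoint (the host, port, and output filename here are illustrative; only the path and `type=fprof` parameter come from the usage docs below):

```shell
# Keep the profile window at 5 seconds or less to bound fprof's
# data volume and overhead on the running application.
PROFILE_SECONDS=5
URL="http://localhost:8080/debug/pprof/profile?seconds=${PROFILE_SECONDS}&type=fprof"
echo "$URL"
# curl -s "$URL" -o fprof_profile.pb.gz   # uncomment to actually fetch
```

Quoting the URL matters: an unquoted `&` would make the shell background the command.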

Installation and Usage

Installation

  def deps do
    [
      {:pprof, "~> 0.1.0"}
    ]
  end

After:

  $ mix deps.get

Usage

Add the pprof config to your config file like this:

  config :pprof, :port, 8080

You can use it with go pprof (quote the URL so the shell does not treat & as a background operator):

  $ go tool pprof "http://localhost:8080/debug/pprof/profile?seconds=5&type=fprof"
    

To use it with Parca, add this configuration in parca.yaml:

     # params:
     #   type: [ 'fprof' ]
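The fragment above only shows the params stanza. A fuller, hypothetical scrape entry might look like the following; the field names follow Parca's Prometheus-style scrape configuration, and the job name and target address (matching the :pprof port configured earlier) are assumptions to adjust for your deployment:

```yaml
scrape_configs:
  - job_name: "elixir-pprof"        # hypothetical job name
    scrape_interval: "10s"
    static_configs:
      - targets: ["localhost:8080"] # the port set via config :pprof, :port
    params:
      type: [ 'fprof' ]
```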

[Screenshot from 2023-01-08 23-17-48]

[Screenshot from 2023-01-10 12-20-38]

source: github
Thanks!
@kakkoyun @brancz

@kakkoyun
Member

xref parca-dev/parca-agent#145

@brancz
Member

brancz commented Jan 10, 2023

Very cool! I'm curious: what do the numbers before the closing curly brackets mean? Are they parameters? For a 5-second profile, 1.1k samples seems like a lot; what's the sampling frequency?

@dogukanzorlu
Author

{ModuleName, Function, Arity}, so it means, for example, Erlang.garbage_collect(arg1, arg2) -> {:erlang, :garbage_collect, 2}. Actually, fprof is not simple call-stack sampling; it traces every process on the BEAM and the functions those processes invoke, which is why the sample counts are so large. Unfortunately, I couldn't find anything about frequency in fprof's documentation or source code. @brancz

@brancz
Member

brancz commented Jan 10, 2023

{ModuleName, Function, Arity}, so it means, for example, Erlang.garbage_collect(arg1, arg2) -> {:erlang, :garbage_collect, 2}.

Got it! Thanks for the explanation!

Actually, fprof is not simple call-stack sampling; it traces every process on the BEAM and the functions those processes invoke, which is why the sample counts are so large. Unfortunately, I couldn't find anything about frequency in fprof's documentation or source code.

Judging by the docs, I would be careful with using this continuously; it seems like fprof and eprof are meant as one-off tools (even the docs say they add significant overhead). Only cprof seems to be sampling-based with <10% impact (still rather large for continuous-profiling purposes).

I would suggest doing the CPU profiling with the Parca Agent and using the ERL_FLAGS="+S 1 +JPperf true" flag on the Erlang/Elixir application. This should give great visibility at very low overhead.
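The suggested flags can be sketched as the following invocation. The flags themselves come from the comment above (`+S 1` limits the runtime to one scheduler; `+JPperf true` makes the JIT emit perf maps); the `mix phx.server` entrypoint is a hypothetical placeholder for however you start your app:

```shell
# Export the suggested flags so the BEAM picks them up at startup.
export ERL_FLAGS="+S 1 +JPperf true"
echo "$ERL_FLAGS"
# mix phx.server   # hypothetical entrypoint; any mix/release start works
```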

Anything memory-related, especially heap profiling, would make a ton of sense in a library like the one you created! In any case, nice work; looking forward to seeing more!

@dogukanzorlu
Author

You are exactly right about continuous use. But considering that there was no pprof in Elixir or Erlang before, it's not bad for a start. I will also produce pprof profiles for memory and processes, and I am working on VM-related perf maps. In any case, thank you for your comments. @brancz

@brancz
Member

brancz commented Jan 10, 2023

Thank you so much for this work, the community needs this type of engagement!
