Elixir/Erlang VM support #145

Open
brancz opened this issue Nov 29, 2021 · 15 comments
Labels
area/debuginfo: Something to do with handling debuginfos
area/eBPF: Something involving eBPF
area/jit: Something to do with Just-In-Time compilation
enhancement: New feature or request
feature/language-support: This feature describes support for a new language/runtime.

Comments

@brancz
Member

brancz commented Nov 29, 2021

The Erlang VM supports perf maps via the ERL_FLAGS="+S 1 +JPperf true" flags. However, even with those flags set, profiling an Erlang process does not work consistently: only occasional individual addresses are symbolized.

Working theory: Erlang has a multi-process model, which could be a problem here if a process is short-lived, i.e. it is created and exits within a single 10-second profiling loop.

Ultimately, even once the perf-map support works, it would be great for Erlang users not to have to change anything about their deployment to reap the benefits, but it's a good intermediate step.

@brancz added the enhancement, feature/language-support, area/eBPF, and area/debuginfo labels Nov 29, 2021
@mkuratczyk
Contributor

Here are my observations:

There is certainly a file format mismatch, though I'm not sure whether it is the reason symbols cannot be resolved or just a minor issue. Erlang produces two files in the /tmp folder that perf successfully uses to resolve the symbols:

  1. /tmp/perf-XYZ.map which is an ASCII mapping file that looks like this:
0x7fc25ff306f0 88 $rabbit_mgmt_db:message_stats/1

so the address, a column I don't recognize and then the module:function/arity

  2. /tmp/jit-XYZ.dump, which is a file that follows JITDUMP specification version 2

However, parca-agent tries to open the latter using the debug/elf package which fails because that's not an ELF file. The following error is logged:

level=warn ts=2022-03-17T14:35:23.219959908Z caller=maps.go:102 msg="failed to read object build ID" object=/proc/8484/root/tmp/jit-12.dump err="failed to open elf: bad magic number '[68 84 105 74]' in record at byte 0x0"
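The four bytes in that error are informative on their own. As a minimal sketch (the function name `classify_magic` is hypothetical and not part of parca-agent), decoding them in Python shows that they spell the JITDUMP magic value 0x4A695444 as written by a little-endian machine, rather than the ELF magic `\x7fELF`:

```python
ELF_MAGIC = b"\x7fELF"
# Magic value from the jitdump specification; its bytes read "JiTD" big-endian.
JITDUMP_MAGIC = 0x4A695444


def classify_magic(header: bytes) -> str:
    """Guess the file type from the first four bytes of a file (illustrative only)."""
    head = header[:4]
    if head == ELF_MAGIC:
        return "elf"
    if len(head) == 4 and JITDUMP_MAGIC in (
        int.from_bytes(head, "little"),
        int.from_bytes(head, "big"),
    ):
        # The writer's byte order decides which of the two layouts appears on disk.
        return "jitdump"
    return "unknown"


# The bytes reported in the log line: [68 84 105 74]
logged = bytes([68, 84, 105, 74])
print(logged, classify_magic(logged))
```

A check like this before handing the file to an ELF reader would be one way to skip or special-case jitdump files; whether the agent should do so is a design decision for the maintainers.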

I think those occasional symbols are simply non-JIT symbols - they are functions that can be found directly in the beam.smp binary (Erlang VM itself).

If you have any pointers for further debugging or if you can sketch out what would need to be done to support Erlang JIT, I'd be happy to take a stab at this.

Thanks,

@brancz
Member Author

brancz commented Mar 17, 2022

That would be super awesome! I didn't even know about the JITDUMP format. If the Erlang stacks always require unwinding, then unfortunately that's not something we can do today (@kakkoyun is actively working on all things stack unwinding). We already attempt to symbolize stacks using perf-maps, so maybe this would be a good place to start.
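For readers unfamiliar with perf-map symbolization, here is a rough Python sketch of the idea; it is not the agent's implementation, and the names `Symbol`, `parse_perf_map_line` and `lookup` are made up for illustration. In the perf map format as described in the Linux perf JIT interface documentation, each line holds a start address, a code size, and a symbol name, with both numbers in hexadecimal, so the `88` in the sample line quoted earlier in this thread would be a size of 0x88 bytes.

```python
import bisect
from typing import List, NamedTuple, Optional


class Symbol(NamedTuple):
    start: int  # first address covered by the symbol
    size: int   # length of the code region in bytes
    name: str   # e.g. "$rabbit_mgmt_db:message_stats/1"


def parse_perf_map_line(line: str) -> Symbol:
    # Both numeric columns are hexadecimal; int(..., 16) also accepts a "0x" prefix.
    start, size, name = line.strip().split(maxsplit=2)
    return Symbol(int(start, 16), int(size, 16), name)


def lookup(symbols: List[Symbol], addr: int) -> Optional[Symbol]:
    """Find the symbol whose [start, start + size) range contains addr.

    `symbols` must be sorted by start address.
    """
    starts = [s.start for s in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and addr < symbols[i].start + symbols[i].size:
        return symbols[i]
    return None


sample = "0x7fc25ff306f0 88 $rabbit_mgmt_db:message_stats/1"
symbols = sorted([parse_perf_map_line(sample)])
hit = lookup(symbols, 0x7FC25FF306F0 + 0x10)
print(hit.name if hit else "unresolved")
```

An address outside every range stays unresolved, which is one plausible way for frames from other binaries to show up without names.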

@kakkoyun kakkoyun self-assigned this Mar 18, 2022
@mkuratczyk
Contributor

mkuratczyk commented Apr 25, 2022

It took me some time to get to this but it seems like Erlang support works just fine now. :)

All that's needed is +JPperf map in the Erlang flags. The documentation suggests setting it to true, which also works. The difference is that true makes Erlang generate the mappings in both formats I mentioned above, while map generates only the perf map format, which is sufficient.
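To make the difference concrete, the following Python sketch lists which files one would expect under /tmp for each setting, based only on what is described in this thread. The helper `expected_jit_files` is hypothetical, and it assumes the XYZ placeholder in the file names is a process identifier supplied by the caller.

```python
from typing import List


def expected_jit_files(jpperf: str, pid: str) -> List[str]:
    """Files the Erlang VM is expected to write for a given +JPperf value.

    Only the two values discussed in this thread are handled; `pid` stands
    in for the XYZ placeholder (assumed to be a process identifier).
    """
    perf_map = f"/tmp/perf-{pid}.map"
    jit_dump = f"/tmp/jit-{pid}.dump"
    if jpperf == "map":
        return [perf_map]
    if jpperf == "true":
        return [perf_map, jit_dump]
    raise ValueError(f"+JPperf value not covered by this sketch: {jpperf!r}")


for mode in ("map", "true"):
    print(mode, expected_jit_files(mode, "12"))
```

With map there is no jitdump file for the agent to trip over, which fits the observation that map alone is enough for symbolization.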

I still see some unresolved symbols here and there, but these probably come from some libraries in our Docker image. The vast majority of symbols in this screenshot certainly come from the JIT:

[Image: erlang-symbols]

Tested using ghcr.io/parca-dev/parca-agent:main-d2e701c2 and ghcr.io/parca-dev/parca:main-9b229a69

@kakkoyun
Member

@mkuratczyk Thanks for taking the time and testing ❤️

@brancz
Member Author

brancz commented Apr 28, 2022

That is amazing!!!

@tsloughter

Hi, I'm the OpenTelemetry Erlang/Elixir maintainer and I'm interested in toying with this example and parca in sort of preparation for what the Otel Profiling SIG comes up with.

It isn't clear from the example how to actually use this with parca. I've played with using parca UI, have it installed locally via snap, and I've tried perf with Erlang, but how this demo and the parca agent are meant to work together I'm not seeing :)

@kakkoyun
Member

Hey @tsloughter, maybe this repo and its examples will help more: https://github.com/parca-dev/parca-demo/tree/main/erlang

The idea behind the agent is to profile the whole system using eBPF. You need to drop the agent on the host and it will do the rest. If you configure it to send data to a Parca server you can see the collected profiles (you can see all this in the example above). For now, you need to set some flags for BEAM so that the agent can symbolize the collected stack traces.

I hope this helps. Let me know if it's not clear enough.

@tsloughter

@kakkoyun oh, you mean run the agent locally on my machine which will collect from the Erlang process running in docker?

I expected a sidecar or something (so a docker-compose instead of a single dockerfile) :)

@kakkoyun
Member

@tsloughter Yes, that approach should work. As long as the process is not in a VM on your local machine, the agent can discover it. You might want to use the relabelling support to drop the profiles of every process except the Erlang one. PTAL:
https://www.parca.dev/docs/parca-agent-labelling#configuration

If you're looking for a pre-baked solution, the example runs it in a k8s cluster, and you pick which demo you'd like to run.

@mkuratczyk
Contributor

I believe we can close this issue. With the latest versions (agent 0.13 + server 0.15/0.16), I have no problem resolving Erlang JIT symbols. The issues I still have are related to symbols from binary files, but nothing Erlang-specific.

[Image: Screenshot 2023-03-03 at 09 08 15]

@brancz
Member Author

brancz commented Mar 3, 2023

Our ambition is to not even have to enable the +JPperf true flag. It's really awesome that we can support it if people put a small amount of work into it, but ultimately we want people to have to do absolutely nothing to their deployments. Zero-instrumentation all the way.

That said, I agree it's important to recognize this milestone, and on a support matrix we could probably already say we support Erlang.

@mkuratczyk
Contributor

mkuratczyk commented Mar 3, 2023

I didn't know that was even possible :)

@brancz
Member Author

brancz commented Mar 3, 2023

It'll probably be a while until we take these measures, but essentially it means reading the same data that +JPperf true exposes via the jitdump/perf-map interface, directly from the Erlang VM. The information is there, we just need to find it. We'll first do that for languages where we have no other option though, such as Python and Ruby.

@tsloughter

That'd be awesome.

Side note: I'm curious whether you have had any involvement with OpenTelemetry profiling?

@brancz
Member Author

brancz commented Mar 3, 2023

At least one person from the Parca project attends every OpenTelemetry Profiling working group call. If you have any more specific questions I think it would be best to continue that conversation on Parca Discord, so we can keep this issue on the topic of zero-instrumentation Erlang support.

@kakkoyun added the area/jit label May 17, 2023
@kakkoyun removed their assignment May 23, 2023

4 participants