Support multiple events at the same time #150

Closed
t1mch0w opened this issue Aug 13, 2018 · 27 comments

Comments

@t1mch0w

t1mch0w commented Aug 13, 2018

Is it possible that async-profiler supports sampling on multiple events at the same time?

I have checked the code. I plan to change the function Error PerfEvents::start(const char* event, long interval) in async-profiler/src/perfEvents_linux.cpp, make it first parse the event string into a list (maybe separated by ","), then run the current procedure for each event in the list.
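The parsing step proposed above could look roughly like the sketch below. `splitEvents` is a hypothetical helper, not an existing async-profiler function, and the choice to skip empty items is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of async-profiler): split a comma-separated
// event string such as "a,b,c" into individual event names.
// Empty items (from "a,,b" or a trailing comma) are skipped.
std::vector<std::string> splitEvents(const char* events) {
    std::vector<std::string> result;
    if (events == nullptr) return result;

    std::string current;
    for (const char* p = events; ; ++p) {
        if (*p == ',' || *p == '\0') {
            if (!current.empty()) result.push_back(current);
            current.clear();
            if (*p == '\0') break;  // end of input reached
        } else {
            current.push_back(*p);
        }
    }
    return result;
}
```

Each returned name would then go through whatever per-event setup the engine already performs for a single event.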

Is it Ok? Any suggestions?

@apangin
Collaborator

apangin commented Aug 13, 2018

It's doable, but not that simple.

Currently PerfEvents engine creates one event descriptor per thread and maintains thread-to-event mapping. Multiple events support would require a major redesign of this part. Note that perf event management does not happen only on start/stop of profiling, but also every time a thread is created or destroyed.

I wonder, what kind of events would you like to profile together?

@t1mch0w
Author

t1mch0w commented Aug 13, 2018

I want to profile sched:sched_switch, sched:sched_stat_sleep, and something else to generate off-cpu flame graph. If it's impossible at this moment, can I start two or more async-profilers together? Is the overhead much heavier?

@t1mch0w
Author

t1mch0w commented Aug 13, 2018

So, the mapping is not from an event to a thread, but a thread to an event?
BTW, can you show me where starting/stopping profiling for new threads is managed? Or is it a future plan?
Thank you.

@apangin
Collaborator

apangin commented Aug 13, 2018

The mapping is bi-directional. Each perf event must be processed by one particular thread only. When a signal arrives, the related event and its corresponding ring buffer are discovered via the thread id.

Thread start/end is handled in PerfEvents::ThreadStart and PerfEvents::ThreadEnd functions.
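To make the redesign cost concrete, here is a minimal sketch of a registry that keeps several event handles per thread and updates them on thread start and end. All names (`EventHandle`, `ThreadEventRegistry`, the fake descriptor numbers) are hypothetical and do not mirror the actual PerfEvents code. The sketch also ignores the fact that a lookup performed in a signal handler would need an async-signal-safe structure rather than `std::map`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a perf event descriptor; a real one would
// also own a ring buffer.
struct EventHandle {
    int eventIndex;  // which of the requested events this handle belongs to
    int fd;          // placeholder for the event's file descriptor
};

class ThreadEventRegistry {
public:
    explicit ThreadEventRegistry(int eventCount)
        : _eventCount(eventCount), _nextFd(100) {}

    // Called when a thread starts: create one handle per requested event.
    void onThreadStart(int tid) {
        std::vector<EventHandle>& handles = _byThread[tid];
        handles.clear();
        for (int i = 0; i < _eventCount; i++) {
            handles.push_back(EventHandle{i, _nextFd++});
        }
    }

    // Called when a thread ends: drop every handle owned by that thread.
    void onThreadEnd(int tid) { _byThread.erase(tid); }

    // Lookup by thread id and descriptor, as a signal handler would need.
    const EventHandle* find(int tid, int fd) const {
        auto it = _byThread.find(tid);
        if (it == _byThread.end()) return nullptr;
        for (const EventHandle& h : it->second) {
            if (h.fd == fd) return &h;
        }
        return nullptr;
    }

    std::size_t threadCount() const { return _byThread.size(); }

private:
    int _eventCount;
    int _nextFd;  // fake descriptor numbers, for illustration only
    std::map<int, std::vector<EventHandle>> _byThread;
};
```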

@apangin
Collaborator

apangin commented Aug 13, 2018

Another problem with multiple events is that different events usually need to be handled differently. In your case, for example, if we simply mix sched_switch and sched_stat_sleep on the same graph, the results will be meaningless.

If you are interested in off-cpu profiling, check the recent "wall-clock profiling" feature available in the wall-clock branch.

@t1mch0w
Author

t1mch0w commented Aug 13, 2018

Thank you so much. I will look at the wall-clock branch.

I understand that simply mixing these two events on one graph is meaningless. However, for the final off-cpu graph to make sense, we would have to use some kind of perf-inject tool to match the two events and then generate the corresponding off-cpu graph.
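The matching step can be illustrated with a simplified model: pair each "switched out" record with the next "switched in" record of the same thread and charge the elapsed time to the stack captured at switch-out. The record layout below is an assumption made for illustration. It does not reproduce the real sched tracepoint payloads or the behaviour of any perf tool, and it assumes the input is already sorted by time.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class Kind { SwitchOut, SwitchIn };

// Simplified, hypothetical event record.
struct SchedEvent {
    std::uint64_t timeNs;
    int tid;
    Kind kind;
    std::string stack;  // collapsed stack captured at the switch-out point
};

// Pair each SwitchOut with the next SwitchIn of the same thread and sum
// the off-CPU time per stack. Unmatched events are ignored.
std::map<std::string, std::uint64_t> offCpuByStack(const std::vector<SchedEvent>& events) {
    std::map<int, const SchedEvent*> pending;  // tid -> last unmatched SwitchOut
    std::map<std::string, std::uint64_t> total;

    for (const SchedEvent& e : events) {
        if (e.kind == Kind::SwitchOut) {
            pending[e.tid] = &e;
        } else {
            auto it = pending.find(e.tid);
            if (it != pending.end()) {
                total[it->second->stack] += e.timeNs - it->second->timeNs;
                pending.erase(it);
            }
        }
    }
    return total;
}
```

The resulting map of stack to nanoseconds is the kind of input an off-cpu flame graph needs, which is why plotting the raw events without pairing them would not be meaningful.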

I have tried that I can run two async-profilers at the same time. For me, it's enough. I'll test the overhead later. I really like the tool you guys developed. If I have time, I'd like to try to implement support for multiple events.

Thank you, again.

@apangin
Collaborator

apangin commented Aug 13, 2018

> I have tried that I can run two async-profilers at the same time.

I'm afraid this won't work. async-profiler works in the context of the target process and it has some shared state (static variables).

@t1mch0w
Author

t1mch0w commented Aug 13, 2018

But the results seem reasonable. How can I verify the results?

@apangin
Collaborator

apangin commented Aug 13, 2018

When you run the profiler twice (without the -f option), it will print "Profiler already started".
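The effect of process-wide shared state can be shown with a small model: because the state lives in static variables, a second start request in the same process is rejected instead of creating an independent session. `ProfilerState` is a hypothetical class written for this illustration, not the real implementation. Only the message text is taken from the comment above.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of process-wide profiler state held in statics.
class ProfilerState {
public:
    // Returns an empty string on success, or an error message.
    static std::string start(const std::string& event) {
        if (_running) return "Profiler already started";
        _running = true;
        _event = event;
        return "";
    }

    static void stop() { _running = false; }

    static const std::string& event() { return _event; }

private:
    static bool _running;       // shared by every caller in the process
    static std::string _event;  // event of the session that is running
};

bool ProfilerState::_running = false;
std::string ProfilerState::_event;
```

In this model the second call leaves the first session's event untouched, so output that looks reasonable may simply be the result of the first session.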

@oehme

oehme commented Dec 21, 2018

+1 from me. In many cases the overhead of sampling multiple event types would probably be acceptable and worth the convenience of not having to run the code under test multiple times. E.g. I'd like to be able to get CPU and allocation recordings from the same run and write them to different files. Or to keep it simpler, async-profiler could write one standardized output format containing all collected data and then have post-processors that transform it into the format you want. That way you could also remove a lot of the formatting-specific options from the agent and move them to a different program. That way I don't need to know which format I want when I collect the data, only what data I want.

My main use case is offline profiling. At Gradle we have a large performance test suite and each test already takes a significant amount of time. Running each of them twice or more (depending on what events we want) would not be an option. With JFR (which we currently use) we can get both CPU and allocation data at the same time. The main drawback is that JFR is terrible for anything that does a lot of small IO operations. Your new wall profiling mode would be an ideal replacement.

@apangin
Collaborator

apangin commented Dec 24, 2018

@oehme I see. This is exactly what I plan to do in future versions of async-profiler. But this is a large task and will take time.

@huonw

huonw commented Jan 3, 2019

I'd love to see this feature as well.

> I'd like to be able to get CPU and allocation recordings from the same run and write them to different files

This is my main use-case too: running a CPU profiler and profiling allocations at the same time. I think these theoretically use separate mechanisms, and thus could run independently without having to balance perf requirements? Additionally, it would be nice if profiling allocations could collect both count and size data (i.e. both collapsed=samples and collapsed=total) at the same time.

For my use-case, I'm emitting all data as the FlameGraph collapsed stack traces, and then post-processing them, and so could manage with multiple traces being mixed into a single file, but would prefer separate output files for each profiling mode.
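For readers unfamiliar with that kind of post-processing, here is a minimal sketch of reading collapsed stack lines (semicolon-separated frames, a space, then a number) and merging duplicate stacks. The function names are hypothetical. Treating the text after the last space as the value is an assumption that lets frames containing spaces pass through.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Parse one collapsed line of the form "frame1;frame2;frame3 123".
// Returns false for lines that do not end in " <digits>".
bool parseCollapsedLine(const std::string& line, std::string& stack, std::uint64_t& value) {
    std::size_t pos = line.rfind(' ');
    if (pos == std::string::npos || pos == 0 || pos + 1 >= line.size()) return false;

    const std::string number = line.substr(pos + 1);
    for (char c : number) {
        if (c < '0' || c > '9') return false;
    }
    stack = line.substr(0, pos);
    value = std::stoull(number);
    return true;
}

// Merge lines that share the same stack by summing their values.
std::map<std::string, std::uint64_t> mergeCollapsed(const std::vector<std::string>& lines) {
    std::map<std::string, std::uint64_t> merged;
    for (const std::string& line : lines) {
        std::string stack;
        std::uint64_t value = 0;
        if (parseCollapsedLine(line, stack, value)) {
            merged[stack] += value;
        }
    }
    return merged;
}
```

If traces from several profiling modes ended up in one file, a post-processor like this would also need some marker to tell them apart, which is one reason separate output files are easier to work with.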

Other than the above, async-profiler is working really well for us. Thanks!

@apangin
Collaborator

apangin commented Jan 4, 2019

OK, got it. I've put this task on my roadmap. Not in the nearest release though.

@huonw The allocation profiler already collects both count and size simultaneously. Both are printed in the flat and traces output formats. The collapsed and svg formats support only one metric, but after the profiling session you may use the stop command to dump the results in different formats as many times as you wish.
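The idea of collecting both metrics once and choosing one at dump time can be sketched as follows. `AllocProfile` and `Metric` are hypothetical names for this illustration and do not describe the profiler's internal data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class Metric { Samples, Bytes };

struct AllocTotals {
    std::uint64_t samples = 0;
    std::uint64_t bytes = 0;
};

// Hypothetical aggregate that records both metrics for every stack.
class AllocProfile {
public:
    void record(const std::string& stack, std::uint64_t bytes) {
        AllocTotals& t = _totals[stack];
        t.samples += 1;
        t.bytes += bytes;
    }

    // Emit collapsed lines for the requested metric. The recorded data is
    // left untouched, so this can be called repeatedly with either metric.
    std::vector<std::string> dumpCollapsed(Metric metric) const {
        std::vector<std::string> lines;
        for (const auto& entry : _totals) {
            std::uint64_t v = metric == Metric::Samples ? entry.second.samples
                                                        : entry.second.bytes;
            lines.push_back(entry.first + " " + std::to_string(v));
        }
        return lines;
    }

private:
    std::map<std::string, AllocTotals> _totals;
};
```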

@huonw

huonw commented Jan 7, 2019

Ah, thanks for the tip. Async-profiler is better and better the more I understand about it!

@alblue

alblue commented Oct 28, 2020

At the moment, it's only possible to capture the invocations of a single Java method with Async Profiler. Will the ability to capture multiple event types allow for the capturing of several Java methods in the same profile?

@apangin
Collaborator

apangin commented Oct 29, 2020

@alblue Wildcard (*) can be used to profile multiple methods of the same class. However, it's not currently allowed to profile methods of different classes simultaneously. I plan to make this possible.

@PaulBGD

PaulBGD commented Jan 26, 2021

Is there a way to pitch in to get this completed? Would be a game changer for me. Currently just alternating between taking wall and allocation profiles.

@apangin
Collaborator

apangin commented Jan 26, 2021

@PaulBGD It's already available in v2.0 Early Access. Please, try out and provide your feedback.

@PaulBGD

PaulBGD commented Jan 26, 2021

Oh wow, I didn't see! I'll give any feedback if I run into anything interesting :)

@Biunovich

Hi everyone! Could somebody explain: I am collecting multiple events (cpu, alloc, lock) in jfr format with async-profiler 2. Then I convert the jfr file to a flame graph using the converter (jfr2flame). As far as I can see, the flame graph contains only the cpu results. How can I get flame graphs for alloc and lock?

@felixbarny
Contributor

Currently, the JfrParser only processes stack trace samples for CPU and wallclock events. See

https://github.com/jvm-profiling-tools/async-profiler/blob/3ff315ea8ff3ed8fb55d931f67cac634d8c51b99/src/converter/one/jfr/JfrReader.java#L308-L310

First, the JfrParser needs to be extended. After that, the jfr2flame program probably needs a flag that lets you specify which events should be converted to a flame graph. Mixing all events into a single flame graph wouldn't make much sense, I suppose.
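The selection step suggested here can be sketched independently of the JFR format: keep only the samples of one event type, then fold them by stack. The `Sample` record and `foldByType` function are hypothetical and written in C++ for illustration. They are not related to the Java converter's actual classes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class EventType { Cpu, Alloc, Lock };

// Hypothetical decoded sample: event type, collapsed stack, and a value
// (for example a sample count, bytes, or nanoseconds).
struct Sample {
    EventType type;
    std::string stack;
    std::uint64_t value;
};

// Fold only the samples of the requested type, so that one flame graph
// never mixes values with different meanings.
std::map<std::string, std::uint64_t> foldByType(const std::vector<Sample>& samples,
                                                EventType wanted) {
    std::map<std::string, std::uint64_t> folded;
    for (const Sample& s : samples) {
        if (s.type == wanted) {
            folded[s.stack] += s.value;
        }
    }
    return folded;
}
```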

@Biunovich

@felixbarny thank you for the answer. About extending the JfrParser: is this feature going to be implemented soon, or could I help implement it (if nobody is already doing this)?

@felixbarny
Contributor

Not sure, best to wait for @apangin to chime in.

@apangin
Collaborator

apangin commented Mar 22, 2021

Right - jfr2flame currently produces only a CPU graph.
I'll likely improve the tool in the next release to support alloc/lock graphs as well.
Meanwhile it's possible to open async-profiler's .jfr with JMC and analyze allocations and locks there.

@apangin
Collaborator

apangin commented Apr 11, 2021

Added the following options to the jfr2flame script:

--alloc    Allocation Flame Graph
--lock     Lock contention Flame Graph
--threads  Split profile by threads
--total    Accumulate the total value (time, bytes, etc.)

@Muniyasamy

> @PaulBGD It's already available in v2.0 Early Access. Please, try out and provide your feedback.

Hi @apangin and @PaulBGD , I'm currently using async profiler version 1.7.1. I'm attempting to record invocations of multiple Java methods from both the same and different classes within a single profiling session. However, it seems to only profile the first methods encountered. Could you provide me with some sample commands or guidance on how to achieve this effectively?

@apangin
Collaborator

apangin commented Mar 8, 2024

@Muniyasamy Async-profiler does not currently support instrumentation of methods from different classes.
