JFR CPU profiler does not respect the configured sampling period #9013

roberttoyonaga · 2024-05-31T16:28:59Z

Describe the issue
When running with JFR enabled, the CPU sampler produces more samples than what's specified in the configuration (.jfc) file.

The sampling rate is N times the rate specified in the configuration file (.jfc), where N is the number of active threads. This adds redundant samples and overhead that scale with the number of threads active at any given time.
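To make the scaling concrete, here is a small sketch of the arithmetic. The helper name and the thread count are made up for illustration; only the "N times the configured rate" relationship comes from the observation above.

```python
def expected_samples_per_second(period_ms, active_threads, per_thread_timers):
    """Samples per second for a configured period (hypothetical helper)."""
    configured_rate = 1000.0 / period_ms
    if per_thread_timers:
        # One timer per thread: every timer expiry yields a sample,
        # so the rate is multiplied by the number of active threads.
        return configured_rate * active_threads
    # A single shared timer yields exactly the configured rate.
    return configured_rate


# Configured period of 1000 ms (1 sample/s) with an assumed 8 active threads.
print(expected_samples_per_second(1000, 8, per_thread_timers=False))  # 1.0
print(expected_samples_per_second(1000, 8, per_thread_timers=True))   # 8.0
```

So with the 1/s setting used in the reproducer, a process with 8 active threads would be expected to record about 8 samples per second instead of 1.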

This is a recent change that appears to have been introduced with #8517. It seems the intent of that PR was to give each thread its own timer (on Linux). In the implementation, each timer sends SIGPROF when it expires, and any thread can be stopped to handle the SIGPROF from any other thread's timer expiry. I am unsure why each thread must have its own dedicated timer.

Why does this PR (#8517) add extra sampling? Was the intention to connect each timer's expiry signal to its respective thread? If not, why not just increase the frequency of the old itimer implementation? That would allow for a more even sample distribution. Currently, all the timers expire at roughly the same time every period, so the sampling is bursty.
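The difference between the two approaches can be sketched with a toy model of timer expiry timestamps. This is a simplification, not the actual sampler: it assumes all per-thread timers are armed at the same instant and fire exactly on schedule, and the function names are hypothetical.

```python
def per_thread_timer_expiries(period_ms, threads, duration_ms):
    """Expiry timestamps (ms) when each thread has its own timer armed at t=0."""
    return sorted(
        t
        for _ in range(threads)
        for t in range(period_ms, duration_ms + 1, period_ms)
    )


def single_timer_expiries(period_ms, threads, duration_ms):
    """Expiry timestamps (ms) for one shared timer running `threads` times faster."""
    # Assumes period_ms is divisible by threads, to keep the model simple.
    step = period_ms // threads
    return list(range(step, duration_ms + 1, step))


def largest_gap(timestamps):
    """Longest stretch (ms) between two consecutive expiries."""
    return max(b - a for a, b in zip(timestamps, timestamps[1:]))


bursty = per_thread_timer_expiries(1000, 4, 2000)
even = single_timer_expiries(1000, 4, 2000)
print(bursty)  # [1000, 1000, 1000, 1000, 2000, 2000, 2000, 2000]
print(even)    # [250, 500, 750, 1000, 1250, 1500, 1750, 2000]
print(largest_gap(bursty), largest_gap(even))  # 1000 250
```

Both variants produce the same number of expiries, but the per-thread model clusters them at each period boundary, while the single faster timer spreads them evenly.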

Steps to reproduce the issue
Please include both build steps as well as run steps
Download the latest EA build: https://github.com/graalvm/oracle-graalvm-ea-builds/releases

git clone https://github.com/roberttoyonaga/graalvm-sampling-bug.git
javac Reproducer.java
native-image --enable-monitoring=jfr --install-exit-handlers Reproducer
./reproducer -XX:StartFlightRecording=settings=settings.jfc,filename=rec23.jfr
Inspect recording in JMC

The recording contains more samples per period than the configuration specifies (1 per second).

Describe GraalVM and your environment:

OS: [linux]
Architecture: [AMD64]

roberttoyonaga · 2024-05-31T16:30:37Z

@jovanstevanovic Is this behavior intentional?

jovanstevanovic · 2024-06-03T08:12:26Z

Hey @roberttoyonaga, valid point. I'll make sure to fix that. Also, this only happens on Linux.

jovanstevanovic · 2024-06-07T05:58:26Z

The fix is on master. The issue is tracked internally as GR-54471.

roberttoyonaga added the bug and native-image labels May 31, 2024

roberttoyonaga added the redhat-interest label May 31, 2024

selhagani assigned jovanstevanovic Jun 3, 2024

jovanstevanovic closed this as completed Jun 7, 2024
