JFR CPU profiler does not respect the configured sampling period #9013

Closed
roberttoyonaga opened this issue May 31, 2024 · 3 comments

@roberttoyonaga
Collaborator

roberttoyonaga commented May 31, 2024

Describe the issue
When running with JFR enabled, the CPU sampler produces more samples than the configuration (.jfc) file specifies.

The effective sampling rate is N times the rate specified in the configuration file (.jfc), where N is the number of active threads. This creates potential redundancy and overhead that scale with the number of threads active at any given time.
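To make the arithmetic concrete, here is a minimal sketch of the expected sample counts. It is a model of the observation only, not the Substrate VM implementation; the class name, method names, and the 10 s duration are hypothetical, and the 1000 ms period assumes the configured rate of one sample per second.

```java
public final class ExpectedSampleCount {
    /** Samples expected when one process-wide timer fires once per period. */
    public static long configured(long durationMillis, long periodMillis) {
        return durationMillis / periodMillis;
    }

    /** Samples expected when every active thread owns a timer that fires once per period. */
    public static long observed(long durationMillis, long periodMillis, int activeThreads) {
        return activeThreads * configured(durationMillis, periodMillis);
    }

    public static void main(String[] args) {
        long durationMillis = 10_000; // hypothetical 10 s run
        long periodMillis = 1_000;    // assumed period: one sample per second
        for (int threads : new int[] {1, 4, 16}) {
            System.out.printf("threads=%d configured=%d observed=%d%n",
                    threads,
                    configured(durationMillis, periodMillis),
                    observed(durationMillis, periodMillis, threads));
        }
    }
}
```

Under this model the overhead grows linearly with the thread count even though the configured period never changes.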

This is a recent change that appears to have been introduced with #8517. The intent of that PR seems to have been to give each thread its own timer (on Linux). In the implementation, every timer expiry sends SIGPROF, and any thread can be stopped to handle the SIGPROF raised by any other thread's timer. I am unsure why each thread must have its own dedicated timer.

Why does this PR (#8517) add extra sampling? Was the intention to deliver each timer's expiry signal to its respective thread? If not, why not just increase the frequency of the old itimer implementation? That would allow for a more even sample distribution. Currently, all the timers expire at roughly the same time every period, so the sampling is bursty.
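The difference between the two approaches can be sketched with a small simulation of sample timestamps. This is an idealized model, not the actual signal-handling code: it assumes all per-thread timers are armed at time 0 and expire at exactly the same instant, and that the period is evenly divisible by the thread count. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public final class SampleDistribution {
    /** N per-thread timers armed together: all N expire at each period boundary. */
    public static List<Long> perThreadTimers(int threads, long periodMillis, long durationMillis) {
        List<Long> times = new ArrayList<>();
        for (long t = periodMillis; t <= durationMillis; t += periodMillis) {
            for (int i = 0; i < threads; i++) {
                times.add(t); // one sample per timer, all at the same instant
            }
        }
        return times;
    }

    /** One shared timer with the period divided by N: same sample count, evenly spaced. */
    public static List<Long> sharedTimer(int threads, long periodMillis, long durationMillis) {
        long step = periodMillis / threads;
        if (step <= 0) {
            throw new IllegalArgumentException("period must be at least one ms per thread");
        }
        List<Long> times = new ArrayList<>();
        for (long t = step; t <= durationMillis; t += step) {
            times.add(t);
        }
        return times;
    }

    /** Largest gap between consecutive samples; smaller means more even coverage. */
    public static long maxGap(List<Long> times) {
        long max = 0;
        for (int i = 1; i < times.size(); i++) {
            max = Math.max(max, times.get(i) - times.get(i - 1));
        }
        return max;
    }

    public static void main(String[] args) {
        List<Long> bursty = perThreadTimers(4, 1_000, 2_000);
        List<Long> even = sharedTimer(4, 1_000, 2_000);
        System.out.println("per-thread timers: " + bursty + " maxGap=" + maxGap(bursty));
        System.out.println("shared timer:      " + even + " maxGap=" + maxGap(even));
    }
}
```

Both schemes yield the same number of samples in this model, but the per-thread variant leaves a full period between bursts, while the shared timer spreads the samples across the period.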

Steps to reproduce the issue
Please include both build steps as well as run steps
Download the latest EA build: https://github.com/graalvm/oracle-graalvm-ea-builds/releases

  1. git clone https://github.com/roberttoyonaga/graalvm-sampling-bug.git
  2. javac Reproducer.java
  3. native-image --enable-monitoring=jfr --install-exit-handlers Reproducer
  4. ./reproducer -XX:StartFlightRecording=settings=settings.jfc,filename=rec23.jfr
  5. Inspect recording in JMC
    The recording contains more samples per period than specified in the configuration (1/s).
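As an alternative to inspecting the recording visually in JMC, the samples can be counted per second with the JDK's `jdk.jfr.consumer` API. The sketch below is an assumption-laden helper, not part of the reproducer repository: the class name is hypothetical, it assumes the sampler emits events named `jdk.ExecutionSample`, and it defaults to the `rec23.jfr` file name used in step 4.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public final class SamplesPerSecond {
    /** Groups event start times into one-second buckets keyed by epoch second. */
    public static Map<Long, Integer> bucketBySecond(List<Instant> startTimes) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (Instant t : startTimes) {
            counts.merge(t.getEpochSecond(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Recording path: first argument, or the file name from the repro steps.
        Path recording = Path.of(args.length > 0 ? args[0] : "rec23.jfr");
        List<Instant> starts = new ArrayList<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(recording)) {
            // Assumed event name for CPU samples.
            if ("jdk.ExecutionSample".equals(event.getEventType().getName())) {
                starts.add(event.getStartTime());
            }
        }
        bucketBySecond(starts).forEach((second, count) ->
                System.out.println(second + " -> " + count + " samples"));
    }
}
```

With a configured rate of one sample per second, any bucket with a count well above 1 would be consistent with the behavior described in this issue.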

Describe GraalVM and your environment:

  • OS: Linux
  • Architecture: AMD64
@roberttoyonaga
Collaborator Author

@jovanstevanovic Is this behavior intentional?

@jovanstevanovic
Member

Hey @roberttoyonaga, valid point. I'll make sure to fix that. Also, this only happens on Linux.

@jovanstevanovic
Member

The fix is on master. The issue is tracked internally as GR-54471.
