Groundwork to demonstrate FlyWeight transient memory consumption #245
Conversation
This measures allocation rates in interpreted code, right? What if C2 escape analysis decides to scalarise half those allocations once the code gets hot? I would prefer to use JMH.
I added these annotations to the benchmark:

```java
@Threads(8) // provoke any possible contention on auxiliary data structures
@Measurement(time = 30) // this should be long enough for accumulated garbage to make a difference
@Fork(value = 1, jvmArgsPrepend = {"-XX:-TieredCompilation", "-XX:+UseG1GC"}) // skip C1, use G1
```

The object graph in the benchmark isn't complex enough to make garbage collection difficult, and you observe no benefit from flyweights in this benchmark. Here are the garbage collection statistics (measured only with hot compiled code), showing that a lot of garbage is generated by the standard iterator and virtually none by the flyweight.
The allocation rate is clearly higher with standard iterators! But this is to be expected, and the question is whether it is actually harmful. It would be interesting to generate an object graph that is difficult to collect, run some background work to mutate it during the benchmark, and then run the benchmark itself. For the record, I don't have anything against flyweights.
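For concreteness, the GC statistics above come from JMH's GC profiler. A hedged sketch of the invocation (the uber-jar path and benchmark pattern are assumptions, not from this repository):

```shell
# Assumption: the project builds a JMH uber-jar at target/benchmarks.jar.
# -prof gc attaches JMH's GC profiler, which reports gc.alloc.rate and
# gc.alloc.rate.norm alongside the throughput scores; -f sets forks, -t threads.
java -jar target/benchmarks.jar IteratorsBenchmark32 -prof gc -f 1 -t 8
```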
Ok guys... do we merge this?
@lemire @blacelle I guess my point was that we already have this info available; I produced the data above with OpenJDK, confident that it doesn't relate to interpreted code. I also ran the same test with an early-access OpenJDK Shenandoah build, which was only possible because I was using JMH.
@richardstartin I am trying to understand where this is going. My view is that @blacelle is establishing that flyweight iterations generate fewer allocations. That might be "self-evident", but we know not to take anything for granted. Then Richard points out that, in JITed code, the result could be different. Ok. That seems like an excellent question. Benoit, what is your take on this? Assuming that you agree that you are not measuring the result of JITed code, is there any easy fix for this? Another point that Richard raises is whether the extra allocation is harmful. I think we agree that it is a different question, but maybe we want to go one step at a time. I think that Richard's implicit point is that it is a myth that GC pressure is a performance bottleneck. The problem is that it is hard to prove a negative. That is, how can you prove that flyweight iterators never help? The counterpart is more interesting: can we come up with an example where flyweight iterators help? Ok... but we have to start somewhere, and this PR is a start.
@lemire on the contrary: the JMH profiling demonstrates clearly that flyweights allocate a lot less memory. I'm pointing out we can already use JMH's `-prof gc` to check that.
Yes, I understand that we have established that flyweight iterators produce far fewer allocations even with JITed code. But that's not where I am at. I am facing a screen. There is a button "squash and merge". I am trying to decide whether to push it. That is all. Let me simplify the question. We have a choice.
Given this choice, what are you saying?
Better information is already available via JMH.
Leaving aside the question of how com.sun.management.ThreadMXBean computes memory allocation, especially with respect to escape analysis, I feel this PR is still useful for writing unit tests. JMH excels at producing benchmarks, and @richardstartin provided valuable insights regarding GC behavior with small JMH tweaks: this is excellent. Still, this PR can be pushed further in a future commit to ensure FlyWeight iterators do not allocate more memory. The output of the proposed JUnit tests is very simple to check and read; simpler, in my view, than analyzing JMH output. They also run much more quickly. In short, this PR enables automated checks related to memory allocation, even if the figures are less precise than those from JMH benchmarks.
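A minimal sketch of the kind of check such a unit test can make, assuming a HotSpot-style JVM that exposes com.sun.management.ThreadMXBean (the class and method names below are illustrative, not the PR's TestIteratorMemory):

```java
// Hedged sketch: comparing per-thread allocation of an allocating loop
// against an allocation-free one, the way a JUnit assertion on iterator
// allocation could be phrased. Works on HotSpot-based JVMs only.
import java.lang.management.ManagementFactory;

public class AllocationCheckSketch {
    static final com.sun.management.ThreadMXBean BEAN =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
    static volatile long sink; // defeats dead-code elimination

    /** Bytes allocated by the current thread while running the task. */
    static long measure(Runnable task) {
        long tid = Thread.currentThread().getId();
        long before = BEAN.getThreadAllocatedBytes(tid);
        task.run();
        return BEAN.getThreadAllocatedBytes(tid) - before;
    }

    public static void main(String[] args) {
        long boxing = measure(() -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) sum += Integer.valueOf(i); // boxes beyond the Integer cache
            sink = sum;
        });
        long primitive = measure(() -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) sum += i; // no per-element allocation
            sink = sum;
        });
        System.out.println("boxing allocates more: " + (boxing > primitive));
    }
}
```

The appeal of this style is exactly what the comment above describes: the check reduces to a single boolean a test can assert on, rather than a JMH report to interpret.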
Merging. We can always remove it later. |
Automated checks relating to allocation in interpreted code, which are essentially meaningless. The test produces a number, but the wrong number. |
A few weeks ago I played with Google's allocation instrumenter. As this code is ready, I allow myself to open a PR to at least compare results.
…ringBitmap#245)

* Groundwork to demonstrate FlyWeight transient memory consumption
* Fix code style
This follows the work done in #243 and the discussion in #244.
This first commit introduces a test class (TestIteratorMemory), based on IteratorsBenchmark32 but aimed at providing some transient memory allocation measurements. It typically gives:
It provides figures only under the Oracle/HotSpot JVM (as it relies on com.sun.management.ThreadMXBean), but the tests should remain green (not throw) on other JVMs.
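One way to keep such a test green on JVMs without the com.sun.management extensions is an explicit capability check before asserting on any figures. A minimal sketch, with illustrative names (this is not the PR's actual guard):

```java
// Hedged sketch of graceful degradation: return a sentinel when the JVM
// cannot report per-thread allocation, so callers can skip assertions.
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class JvmSupportGuard {
    /** Bytes allocated by the current thread, or -1 when the JVM cannot tell us. */
    static long threadAllocatedBytesOrMinusOne() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (bean instanceof com.sun.management.ThreadMXBean) {
            com.sun.management.ThreadMXBean sunBean = (com.sun.management.ThreadMXBean) bean;
            if (sunBean.isThreadAllocatedMemorySupported()
                    && sunBean.isThreadAllocatedMemoryEnabled()) {
                return sunBean.getThreadAllocatedBytes(Thread.currentThread().getId());
            }
        }
        return -1; // e.g. a JVM without the com.sun.management extensions
    }

    public static void main(String[] args) {
        long bytes = threadAllocatedBytesOrMinusOne();
        // A test built on this helper asserts on figures when bytes >= 0,
        // and silently stays green when it gets -1.
        System.out.println(bytes >= -1 ? "ok" : "impossible");
    }
}
```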