The Sparse Matrix Multiply kernel of the SciMark benchmark performs significantly worse on OpenJ9 than on HotSpot.
Results on a 32-core Xeon(R) CPU E7-8867:
OpenJ9 615 Mflops vs HotSpot 1755 Mflops
Part of the problem is that the test is short-running, so most of the run is spent in a profiling compile. Wrapping the test in a harness that runs multiple iterations raises the OpenJ9 throughput to 1104 Mflops.
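To illustrate the kind of harness meant here, the following is a minimal sketch, not the harness actually used for the numbers above. The class name `SparseHarness`, the method names, and the round and call counts are hypothetical; the kernel is a plain compressed-row sparse matrix-vector product written in the style of the SciMark kernel. The idea is only that repeating the measurement gives the JIT time to move past the profiling compile before the reported round.

```java
// Hypothetical harness sketch: names and sizes are assumptions, not taken from SciMark or OpenJ9.
final class SparseHarness {

    /** Compressed-row sparse matrix-vector product: y = A * x. */
    static void matmult(double[] y, double[] val, int[] row, int[] col, double[] x) {
        for (int r = 0; r < y.length; r++) {
            double sum = 0.0;
            // Non-zeros of row r live in val[row[r] .. row[r + 1] - 1].
            for (int i = row[r]; i < row[r + 1]; i++) {
                sum += x[col[i]] * val[i];
            }
            y[r] = sum;
        }
    }

    /**
     * Times the kernel for several rounds and returns the Mflops of the last round,
     * so that earlier rounds act as warm-up for the JIT.
     */
    static double measure(int rounds, int callsPerRound,
                          double[] y, double[] val, int[] row, int[] col, double[] x) {
        double mflops = 0.0;
        for (int round = 0; round < rounds; round++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerRound; c++) {
                matmult(y, val, row, col, x);
            }
            long elapsedNanos = Math.max(1L, System.nanoTime() - start);
            // One multiply and one add per stored non-zero.
            double flops = 2.0 * val.length * callsPerRound;
            // Flops per microsecond equals Mflops.
            mflops = flops / (elapsedNanos / 1e3);
        }
        return mflops;
    }

    public static void main(String[] args) {
        // Tiny placeholder matrix [[1, 2], [0, 3]]; a real run would use benchmark-sized data.
        double[] val = {1.0, 2.0, 3.0};
        int[] row = {0, 2, 3};
        int[] col = {0, 1, 1};
        double[] x = {1.0, 1.0};
        double[] y = new double[2];
        System.out.println("Mflops (last round): " + measure(10, 100_000, y, val, row, col, x));
    }
}
```

Reporting only the last round is one possible design choice; reporting every round instead would show how throughput changes as compilation progresses.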
The compilation log suggests an opportunity to exploit x86 fused multiply-add (FMA) instructions; this is being investigated.
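For context, the inner loop of a sparse matrix multiply is a chain of multiply-then-add steps, which is the pattern a fused multiply-add covers. The sketch below is an illustration only and does not describe what the OpenJ9 investigation changes in the JIT. It uses `Math.fma` (available since Java 9) to request the fused form explicitly at the source level; the class name `FmaDot` and its method names are hypothetical. Because `Math.fma` rounds once rather than twice, its result can differ from the plain version in the last bits for general inputs.

```java
// Hypothetical illustration: names are assumptions, not OpenJ9 or SciMark code.
final class FmaDot {

    /** Dot product of one sparse row with x, using a separate multiply and add. */
    static double rowDotPlain(double[] val, int[] col, double[] x, int begin, int end) {
        double sum = 0.0;
        for (int i = begin; i < end; i++) {
            sum += x[col[i]] * val[i];   // rounded after the multiply and again after the add
        }
        return sum;
    }

    /** Same dot product, using Math.fma so each step is rounded only once. */
    static double rowDotFma(double[] val, int[] col, double[] x, int begin, int end) {
        double sum = 0.0;
        for (int i = begin; i < end; i++) {
            sum = Math.fma(x[col[i]], val[i], sum);   // x[col[i]] * val[i] + sum, single rounding
        }
        return sum;
    }

    public static void main(String[] args) {
        // Row 0 of the placeholder matrix [[1, 2], [0, 3]] in compressed-row form.
        double[] val = {1.0, 2.0, 3.0};
        int[] col = {0, 1, 1};
        double[] x = {1.0, 1.0};
        System.out.println("plain: " + rowDotPlain(val, col, x, 0, 2));
        System.out.println("fma:   " + rowDotFma(val, col, x, 0, 2));
    }
}
```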
See issue #199 for a discussion of profiling and of the work underway to improve the performance of profiling code, which should help here at least to some extent. The fused multiply-add opportunity is interesting and worth further study.