Performance deficit on Sparse Matrix Multiply #201

Open
JamesKingdon opened this issue Oct 2, 2017 · 1 comment
Comments

@JamesKingdon
Contributor

JamesKingdon commented Oct 2, 2017

Performance of the Sparse Matrix Multiply kernel of the SciMark benchmark is significantly lower on OpenJ9 than on HotSpot.

Results on a 32-core Xeon(R) CPU E7-8867:

OpenJ9 615 Mflops vs HotSpot 1755 Mflops

Part of the issue is that the test is short-running and we spend most of the run in a profiling compile. Wrapping the test in a harness for multiple iterations raises the throughput to 1104 Mflops.
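A minimal sketch of such a harness, for illustration only: it is not the harness used for the numbers above. The class name `SparseHarness`, the problem size, and the repetition counts are assumptions, and the kernel is a plain compressed-row sparse matrix-vector multiply in the style of the SciMark kernel. Each round reports its own rate, so the early rounds that run in profiling code can be told apart from the later, fully compiled ones.

```java
import java.util.Random;

class SparseHarness {
    // y = A * x, with A in compressed-row storage: val holds the nonzeros,
    // col their column indices, and row[r]..row[r+1] the span of row r.
    static void matmult(double[] y, double[] val, int[] row, int[] col, double[] x) {
        for (int r = 0; r < row.length - 1; r++) {
            double sum = 0.0;
            for (int i = row[r]; i < row[r + 1]; i++) {
                sum += x[col[i]] * val[i];
            }
            y[r] = sum;
        }
    }

    // Times `reps` back-to-back kernel calls and returns the rate in Mflops
    // (two floating-point operations per stored nonzero).
    static double measureMflops(double[] y, double[] val, int[] row, int[] col,
                                double[] x, int reps) {
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            matmult(y, val, row, col, x);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return reps * 2.0 * val.length / seconds / 1e6;
    }

    public static void main(String[] args) {
        int n = 1000, nzPerRow = 5; // hypothetical problem size
        Random rnd = new Random(42);
        double[] x = new double[n], y = new double[n], val = new double[n * nzPerRow];
        int[] col = new int[n * nzPerRow], row = new int[n + 1];
        for (int i = 0; i < n; i++) {
            x[i] = rnd.nextDouble();
        }
        for (int r = 0; r < n; r++) {
            row[r + 1] = row[r] + nzPerRow;
            for (int k = 0; k < nzPerRow; k++) {
                int idx = row[r] + k;
                col[idx] = rnd.nextInt(n);
                val[idx] = rnd.nextDouble();
            }
        }
        // Repeat the measurement so later rounds run in fully compiled code.
        for (int round = 1; round <= 10; round++) {
            System.out.printf("round %d: %.1f Mflops%n", round,
                    measureMflops(y, val, row, col, x, 20000));
        }
    }
}
```

Comparing the first round against the last few gives a rough idea of how much of a short run is spent before the kernel reaches its final compiled form.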

Studying the compilation log suggests an opportunity to exploit x86 fused multiply-add (FMA) instructions; this is being investigated.
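To make the shape of that opportunity concrete, here is a sketch of the kernel's inner loop written with an explicit fused multiply-add. It assumes Java 9 or later, where `Math.fma(a, b, c)` computes `a * b + c` with a single rounding; the class and method names are hypothetical, and this is not the change under investigation in the JIT. Because of the single rounding, results can differ in the last bits from the separate multiply and add.

```java
class FmaRow {
    // Dot product of one compressed row with x, accumulating via Math.fma
    // (Java 9+), which computes a * b + c with a single rounding.
    static double rowDot(double[] val, int[] col, double[] x, int from, int to) {
        double sum = 0.0;
        for (int i = from; i < to; i++) {
            sum = Math.fma(x[col[i]], val[i], sum);
        }
        return sum;
    }

    public static void main(String[] args) {
        // 2x2 matrix [[1, 2], [0, 3]] in compressed-row form.
        double[] val = {1.0, 2.0, 3.0};
        int[] col = {0, 1, 1};
        double[] x = {1.0, 1.0};
        System.out.println(rowDot(val, col, x, 0, 2)); // row 0
        System.out.println(rowDot(val, col, x, 2, 3)); // row 1
    }
}
```

Whether this is faster depends on the JIT mapping `Math.fma` to a hardware instruction on the target CPU; where it does not, the call may be considerably slower than the plain expression.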

@andrewcraik
Contributor

See issue #199 for a discussion of profiling and some of the work underway to improve the performance of profiling code, which should help here at least to some extent. The opportunity for fused multiply-add is interesting and worth further study.

@DanHeidinga DanHeidinga added this to User Raised issues in Issue tracking Feb 14, 2018