Adding SIMD / New Vector API that's in incubation #142

Open
eix128 opened this issue Jun 1, 2021 · 7 comments

@eix128

eix128 commented Jun 1, 2021

Hi,
Java 16 ships the new Vector API as an incubator module.

You can look at the links for details:
https://metebalci.com/blog/what-is-new-in-java-16/
https://openjdk.java.net/jeps/338

Java 16's SIMD API is backed by JIT intrinsics.
That should make it much faster than going through JNI, because the JIT compiles these method calls directly to ARM NEON, AVX-512, etc. instructions.

It would be good to adapt EJML to the new Vector API in Java 16.

Also check out TornadoVM for running very large matrix workloads on FPGAs.

@lessthanoptimal
Owner

I've been looking into this and it's definitely in the "plan", and early benchmarks look good. I'll need to do some redesigning so that you can swap out algorithms easily. EJML will always be stuck on ancient JDKs, so this will need to go into a separate module that has a different build path.
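
As a rough illustration of what "swapping out algorithms" could look like, here is a minimal sketch in plain Java. Every name in it (`AddKernel`, `ScalarAddKernel`, `Kernels`, `KernelDemo`) is hypothetical and not part of EJML's API. The assumption is that the core library keeps a scalar implementation that runs on old JDKs, while a separately built module could install a Vector API implementation at startup.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical names throughout: this is NOT EJML's actual API.
interface AddKernel {
    /** Computes c[i] = a[i] + b[i] for all i. All arrays must have equal length. */
    void add(double[] a, double[] b, double[] c);
}

/** Plain-loop implementation that compiles and runs on any JDK. */
final class ScalarAddKernel implements AddKernel {
    @Override
    public void add(double[] a, double[] b, double[] c) {
        if (a.length != b.length || a.length != c.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
    }
}

/** Holds the active kernel. An optional module could install a faster one. */
final class Kernels {
    private static volatile AddKernel active = new ScalarAddKernel();

    private Kernels() {}

    static AddKernel add() {
        return active;
    }

    static void install(AddKernel kernel) {
        active = Objects.requireNonNull(kernel);
    }
}

class KernelDemo {
    public static void main(String[] args) {
        double[] a = {1, 2, 3};
        double[] b = {10, 20, 30};
        double[] c = new double[3];
        // Callers only see the interface, so the implementation can change.
        Kernels.add().add(a, b, c);
        System.out.println(Arrays.toString(c)); // prints [11.0, 22.0, 33.0]
    }
}
```

With this shape, a SIMD module built against a newer JDK would only need to call `Kernels.install(...)` with its own `AddKernel`, and the core module would never reference `jdk.incubator.vector` directly.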

lessthanoptimal changed the title from "JDK 16 SIMD" to "Adding SIMD / New Vector API that's in incubation" on Jan 5, 2023
@lessthanoptimal
Owner

Posting an update, but not much of one. This is still very much something I would like to add, but I can't prioritize it at the moment. If anyone wants to give it a shot, start here and we can work out integration details:

https://github.com/lessthanoptimal/VectorPerformance

@ennerf
Contributor

ennerf commented Jan 5, 2023

@lessthanoptimal I recently did some tests with Aparapi and think that could be useful for large matrices as well. It converts bytecode to OpenCL and runs algorithms on the GPU. It's fairly easy to work with and backwards compatible with old versions.

There are some limitations, like only being able to use static methods and primitive/array types, but EJML is set up that way anyway. Here is a small sample I was working with: Mandelbrot GPU.

@lessthanoptimal
Owner

@ennerf How much of a speedup were you seeing?

@ennerf
Contributor

ennerf commented Jan 6, 2023

I think my GTX 2060 was about 20-50% faster than the parallel CPU version on 12/24 threads, but I didn't do any real benchmarks and the dataset wasn't very large. The benchmarks in this blog post suggest that it scales well for larger problems.

@lessthanoptimal
Owner

@ennerf Those articles are interesting. I also had no idea there was an active community writing renderers for JavaFX. When I last tried it years ago, JavaFX's 3D performance was really bad and I didn't see any way to get a custom solution running in that framework.

@ennerf
Contributor

ennerf commented Jan 15, 2023

@lessthanoptimal Going a bit off-topic here: I think that the JavaFX 3D performance is actually pretty good as long as you stay within the supported parts (e.g. dynamic CubeWorld or rendering robots). It's also cool that the same code can run on Android and iOS.

Where things get tricky is when you need to render long lines or large dynamic objects like pointclouds.
