Support for SIMD WebAssembly #137

Open
tomByrer opened this issue Jul 14, 2017 · 6 comments

@tomByrer

Interesting project you have here!

Are there any speed comparisons against WebAssembly anywhere? E.g., rewrite this for gpu.js:
http://kripken.github.io/Massive/beta/

I used to hand-code SSE ASM for DSP back in the day, so I'm always looking to save a few cycles ;)

@robertleeplummerjr
Member

I would love to! How would one go about doing this?

@robertleeplummerjr
Member

Currently, what we'd like to do is implement an accelerator for CPU mode. Right now CPU mode has overhead in the form of loops and a callback for each item in the arrays. Our current goal is to unroll these loops where possible and place the kernel function body there directly, rather than calling a callback. At the very least this would avoid the looping and callback/closure cost, but there is a limit to the size of these functions, and at this scale it can escalate quickly: a "small" 512*512 matrix, for example, has 262,144 kernel calls.
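As a rough illustration of the callback cost described above, here is a minimal sketch in plain JavaScript. The names `runWithCallback` and `compileInlined` are hypothetical and are not part of the gpu.js API. The sketch only inlines the kernel body into a generated loop; it does not fully unroll the loop, since unrolling all 262,144 iterations would run into the function-size limit mentioned above.

```javascript
// Hypothetical helpers for illustration only; not gpu.js internals.

// Callback style: one function call per output element.
function runWithCallback(kernel, width, height) {
  const out = new Float32Array(width * height);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      out[y * width + x] = kernel(x, y);
    }
  }
  return out;
}

// Inlined style: paste the kernel expression into a generated loop body,
// so the hot loop contains no per-element callback.
// kernelExpr is assumed to be a JavaScript expression string using `x` and `y`.
function compileInlined(kernelExpr) {
  return new Function('width', 'height', `
    const out = new Float32Array(width * height);
    for (let y = 0; y < height; y++) {
      for (let x = 0; x < width; x++) {
        out[y * width + x] = ${kernelExpr};
      }
    }
    return out;
  `);
}

const viaCallback = runWithCallback((x, y) => x * 2 + y, 512, 512);
const viaInlined = compileInlined('x * 2 + y')(512, 512);
console.log(viaCallback.length); // 262144
```

Both versions produce the same output array; whether the inlined version is actually faster depends on the JavaScript engine and would need to be measured.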

How does WebAssembly deal with this type of problem? Is this the right question to be asking?

@PicoCreator
Contributor

@robertleeplummerjr, @tomByrer: Fuzz and I were discussing doing this after v1, the SIMD aspect to be exact. Though we would probably run it as a separate mode (not CPU mode).

Mainly because it would make for a hilarious tagline: GPU.JS, now transpiling from CPU to CPU!

@robertleeplummerjr
Member

I, for one, would be in favor of the "CPU to CPU" tagline, it'd at first be funny, then they'd see the numbers. Their reaction: "Hahaha, what a funny joke {clicks link}... oooOOOooo!"

(But I'll do whatever you leaders feel is important 😛 )

@fuzzie360
Member

Will leave this here so you guys can salivate at the CPU performance gains of SIMD:

image

Also a working CPU SIMD demo here:

http://peterjensen.github.io/idf2014-simd/idf2014-simd.html

Let's also not forget that we are technically close to SIMD on the GPU at the moment:

  1. The beginning: 1 gpu thread, 1 output value
  2. Float textures: 1 gpu thread, 4 output values <- we are currently here
  3. Branch-less optimizer: squash if branches, e.g.:

     if (x > 0) {
         y += 5;
     }

     // becomes
     z = x > 0;
     y += 5 * z;

  4. SIMD optimizer:

     result.r = a[0] + b[0];
     result.g = a[1] + b[1];
     result.b = a[2] + b[2];
     result.a = a[3] + b[3];

     // becomes
     result = a + b;
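The two rewrites in the list above can be checked in plain JavaScript. This is only a sketch: the function names are hypothetical, `Number(...)` is used to make the boolean-to-number conversion explicit, and because JavaScript has no vec4 type, the four-lane add is written as a loop over the lanes that a single `result = a + b` would cover in a shader.

```javascript
// Original branching form: if (x > 0) { y += 5; }
function addFiveBranchy(x, y) {
  if (x > 0) {
    y += 5;
  }
  return y;
}

// Branch-less form: turn the condition into a 0/1 mask and multiply.
function addFiveBranchless(x, y) {
  const z = Number(x > 0); // 1 when the condition holds, otherwise 0
  return y + 5 * z;
}

// Four-lane add: one logical operation covering all four channels.
// Hypothetical helper; a shader would express this as `result = a + b`.
function addVec4(a, b) {
  const result = new Float32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    result[lane] = a[lane] + b[lane];
  }
  return result;
}

console.log(addFiveBranchy(3, 1), addFiveBranchless(3, 1));   // 6 6
console.log(addFiveBranchy(-3, 1), addFiveBranchless(-3, 1)); // 1 1
console.log(Array.from(addVec4([1, 2, 3, 4], [10, 20, 30, 40]))); // [ 11, 22, 33, 44 ]
```

The branch-less version always performs the multiply and add, which is the trade-off: it does a little extra arithmetic in exchange for every thread following the same instruction path.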

@PicoCreator changed the title from "perf test examples against WebAssembly" to "Support for SIMD WebAssembly" on Jul 16, 2017
@ohenepee

ohenepee commented Aug 5, 2018

Any speed comparisons against WebAssembly?
