New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Benchmark on Math.sqrt and the same implementation with "Fast Inverse Square Root" #1120
Comments
Here I will show you the source code: extern "C" {
double EMSCRIPTEN_KEEPALIVE f64_sqrt (double number) {
long i;
float x, y;
const float f = 1.5F;
x = number * 0.5F;
y = number;
i = *(long *) &y;
i = 0x5f3759df - (i >> 1);
y = *(float *) &i;
y = y * (f - (x * y * y));
return number * y;
}
} |
It's hard to tell without your benchmark, but it looks like you may actually be measuring the cost of the function call into wasm (which is currently relatively expensive in v8). Here's an alternate benchmark that sums square roots from 1 to N. It tests using three different sqrt functions:
function module(bytes) {
let buffer = new ArrayBuffer(bytes.length);
let view = new Uint8Array(buffer);
for (let i = 0; i < bytes.length; ++i) {
view[i] = bytes.charCodeAt(i);
}
return new WebAssembly.Module(buffer);
}
function instance(bytes) {
return new WebAssembly.Instance(module(bytes), {});
}
let inst = instance("\x00\x61\x73\x6d\x01\x00\x00\x00\x01\x0b\x02\x60\x01\x7c\x01\x7c\x60\x01\x7f\x01\x7c\x03\x05\x04\x00\x00\x01\x01\x07\x29\x04\x05\x73\x71\x72\x74\x31\x00\x00\x05\x73\x71\x72\x74\x32\x00\x01\x09\x73\x75\x6d\x5f\x73\x71\x72\x74\x31\x00\x02\x09\x73\x75\x6d\x5f\x73\x71\x72\x74\x32\x00\x03\x0a\x7a\x04\x33\x01\x01\x7d\x41\xdf\xb3\xdd\xf9\x05\x20\x00\xb6\xbc\x41\x01\x75\x6b\xbe\x22\x01\x43\x00\x00\xc0\x3f\x20\x01\x20\x00\x44\x00\x00\x00\x00\x00\x00\xe0\x3f\xa2\xb6\x20\x01\x94\x94\x93\x94\xbb\x20\x00\xa2\x0b\x05\x00\x20\x00\x9f\x0b\x1f\x01\x01\x7c\x03\x40\x20\x00\xb8\x10\x00\x20\x01\xa0\x21\x01\x20\x00\x41\x01\x6b\x22\x00\x41\x00\x4e\x0d\x00\x0b\x20\x01\x0b\x1e\x01\x01\x7c\x03\x40\x20\x00\xb8\x9f\x20\x01\xa0\x21\x01\x20\x00\x41\x01\x6b\x22\x00\x41\x00\x4e\x0d\x00\x0b\x20\x01\x0b");
let value = 100 * 1000 * 1000;
function measure(name, f) {
let start = performance.now();
let x = f(value);
let end = performance.now();
let elapsed = end - start;
print(name, elapsed.toFixed(3) + 'ms', x);
}
function sum_sqrt(n) {
let sum = 0;
for (; n >= 0; n--) {
sum += Math.sqrt(n);
}
return sum;
}
measure('fast inv sqrt', inst.exports.sum_sqrt1);
measure('wasm sqrt', inst.exports.sum_sqrt2);
measure('Math.sqrt', sum_sqrt); And here are the results on my machine:
|
@binji Really make sense!! Thanks! |
So, here is a conclusion: |
I don't think that's necessary. If I compile the following C file:
with emscripten using this command:
I get a module that uses the wasm instructions directly:
|
Please try this one: function measure (name, f) {
let value = 100 * 1000 * 1000
let start = performance.now()
let sum = 0
for (; value >= 0; value--) {
sum += f(value)
}
let end = performance.now()
let elapsed = end - start
console.log(name, elapsed.toFixed(3) + 'ms')
}
measure('i64_sqrt', eufa.Math.i64_sqrt)
measure('Math.sqrt', Math.sqrt) Result(Average):
i64_sqrt(generated by emscripten, and translated into .wat by binaryen): (func $17 (type $1) (param $var$0 i32) (result i32)
(i32.trunc_u/f64
(f64.sqrt
(f64.convert_u/i32
(get_local $var$0)
)
)
)
) |
It looks like that will have the same issue as above, you're likely seeing the (slight) overhead when calling from JavaScript into WebAssembly and back. Ultimately this is the sort of thing that a JavaScript engine will be able to optimize pretty effectively, so I wouldn't expect the results to differ that much. For example, when I ran this same benchmark on my desktop, I get these results:
|
Yes, webassembly seems does not have much advantages at those single basic-level functions (basically like: plus, minus, sqrt, abs and etc), I will try digging into more details on how to implement wasm in a high efficient condition, thanks!!! |
This was raised and discussed at Emscripten bug tracker also at emscripten-core/emscripten#5439. |
All those benchmark tests were implemented with Emscripten tool chain, and the generated webassembly binary was optimized with flag "-O3".
Generally speaking, the same "sqrt" function which implemented with "Fast Inverse Square Root" should be faster than "Math.sqrt" which is a native javascript function.
But the benchmark result is really different and weird.
The question is "I am not sure if this is a problem of emscripten or it should be like this?"
BTW, the test result is from "Version 62.0.3178.0 (Official Build) canary (64-bit)"
The text was updated successfully, but these errors were encountered: