Improve Math routines + reimplement Mathf.random #297

MaxGraey · 2018-10-05T21:28:47Z

improve Math.round
reimplement Mathf.random based on xoroshiro64**
switch to copysign where possible
~~avoid select<f64/f32>, because it produces complex branchless machine code, unlike the integer version (which uses cmov instructions)~~
better instruction-level parallelism
fix randomSeed setup

Closes #59
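
For readers unfamiliar with the generator named in the description, here is a minimal TypeScript sketch of xoroshiro64** as published by its authors (two 32-bit state words, multiplier 0x9E3779BB, rotations 5, 26, 13 and a shift of 9). The class name, the seed handling and the conversion to a float in [0, 1) are assumptions made for illustration; they are not taken from this PR, which uses its own seeding and casting.

```typescript
// Rotate a 32-bit value left by k bits (0 < k < 32); result is unsigned.
function rotl32(x: number, k: number): number {
  return ((x << k) | (x >>> (32 - k))) >>> 0;
}

// Hypothetical wrapper class; only the update step follows the published algorithm.
class Xoroshiro64StarStar {
  private s0: number;
  private s1: number;

  constructor(seed0: number, seed1: number) {
    this.s0 = seed0 >>> 0;
    this.s1 = seed1 >>> 0;
    // The all-zero state is a fixed point and must be avoided.
    if ((this.s0 | this.s1) === 0) throw new Error("state must not be all zero");
  }

  // Returns the next unsigned 32-bit output and advances the state.
  nextU32(): number {
    const s0 = this.s0;
    let s1 = this.s1;
    const result = Math.imul(rotl32(Math.imul(s0, 0x9e3779bb), 5), 5) >>> 0;
    s1 = (s1 ^ s0) >>> 0;
    this.s0 = (rotl32(s0, 26) ^ s1 ^ (s1 << 9)) >>> 0;
    this.s1 = rotl32(s1, 13);
    return result;
  }

  // Assumed conversion: keep the top 24 bits so the value fits an f32 mantissa.
  nextFloat(): number {
    return (this.nextU32() >>> 8) / 16777216;
  }
}
```

The seeds below are arbitrary: `new Xoroshiro64StarStar(1, 2).nextFloat()` yields a value in [0, 1), and two instances built from the same seeds produce the same sequence.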

dcodeIO · 2018-10-05T21:38:26Z

As far as I understand, select has an advantage where the condition is random and branch prediction doesn't perform well. Isn't that the case here?

MaxGraey · 2018-10-05T21:52:12Z

select makes sense for simple branch expressions on the ALU side (integer arithmetic), where it compiles to a single cmov instruction or one of its variants. SSE/FPU has no similar instruction, so the compiler emits a much more complicated instruction sequence, although that does let us stay in the SSE context (switching between the ALU and SSE/FPU contexts isn't cheap either). So I try to use tricks that neither force ALU condition routines nor leave the SSE/FPU context.

EDIT: SSE/AVX can simulate cmov by masking and blending both arguments, but as I mentioned before, that is not cheap.
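
To make the "masking and blend" idea concrete, here is a scalar sketch on 32-bit integers in TypeScript. It only illustrates the principle (build an all-ones or all-zeros mask from the condition, then combine both arguments); it is not the code a compiler emits for SSE/AVX, and the function name is hypothetical.

```typescript
// Branchless select on 32-bit integers via a mask (illustrative only).
function blendI32(ifTrue: number, ifFalse: number, cond: boolean): number {
  const mask = -Number(cond) | 0; // -1 (all ones) when cond is true, else 0
  return (ifTrue & mask) | (ifFalse & ~mask);
}
```

Both arguments are always evaluated and both take part in the result, which is why this costs more than a single conditional move.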

MaxGraey · 2018-10-05T22:57:11Z

std/assembly/math.ts

      let z_h = cp_h * p_h;
-      let dp_l = select<f64>(dp_l1, 0.0, k);
+      // let dp_l = select<f64>(dp_l1, 0.0, k);
+      let dp_l: f64 = dp_l1 * <f64><bool>k;


@dcodeIO Hmm, I'm not sure dp_l1 * <f64><bool>k is the right way. Maybe

dp_l1 * <f64>(k != 0)

is better?

I think the : f64 annotation isn't necessary because 1.0 and dp_l1 are already f64. <bool>k will compile to i32.and(k, 1) if I'm not mistaken, while (k != 0) is an i32.ne(k, 0), hmm. Maybe (k != 0) is easier to understand.
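
The two points in this exchange can be sketched in TypeScript. The helper names are hypothetical, and `lowBit` models the `i32.and(k, 1)` lowering only as it is described in the comment above, which itself is hedged. Note that multiplying by a 0/1 flag matches a select only for finite values: Infinity or NaN times 0 gives NaN rather than 0.

```typescript
// Select-style: pick one of two values depending on an integer flag k.
function selectF64(ifTrue: number, ifFalse: number, k: number): number {
  return k !== 0 ? ifTrue : ifFalse;
}

// Multiply-style: scale the value by 1 when k is non-zero, by 0 otherwise.
function mulByFlag(value: number, k: number): number {
  return value * Number(k !== 0);
}

// The two candidate flag computations discussed above.
function lowBit(k: number): number {
  return k & 1; // keeps only the lowest bit
}
function nonZero(k: number): number {
  return Number(k !== 0); // 1 for any non-zero k
}
```

For k restricted to 0 or 1 the two flag computations agree; for other values such as 2 they do not, which is one more reason the explicit `k != 0` form is easier to reason about.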

dcodeIO · 2018-10-05T23:47:10Z

std/assembly/math.ts

-    return y;
+  export function round(x: f64): f64 {
+    if (!isFinite(x) || x == 0) return x;
+    if (-0.5 <= x && x < 0) return -0.0;


Maybe copysign(0, x), and remove the x == 0 check? Overall, this looks like it has 4 branches.
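
One way to act on the copysign suggestion is to apply it to the whole result, which removes the special cases for zero, negative fractions and non-finite inputs. The TypeScript sketch below is an assumption about how that could look, not necessarily the version that was merged; `copysign` is written by hand here because JavaScript has no built-in equivalent.

```typescript
// Returns |mag| with the sign of sgn (treats -0 as negative, NaN as positive).
function copysign(mag: number, sgn: number): number {
  const negative = sgn < 0 || (sgn === 0 && 1 / sgn < 0);
  const abs = Math.abs(mag);
  return negative ? -abs : abs;
}

// Branch-free rounding sketch: round half up, then restore the sign of x
// so that inputs in [-0.5, -0] yield -0 as Math.round does.
function roundSketch(x: number): number {
  return copysign(Math.floor(x + 0.5), x);
}
```

A known caveat of the `floor(x + 0.5)` approach is that `x + 0.5` itself rounds, so inputs just below 0.5 (such as 0.49999999999999994) can come out as 1 instead of 0.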

dcodeIO · 2018-10-06T14:20:20Z

std/assembly/math.ts

+      let z = builtin_sqrt<f64>(yy + 1);
+           if (e >= 0x3FF + 1)  y = log(2 * y + 1 / (z + y));
+      else if (e >= 0x3FF - 26) y = log1p(y + yy / (z + 1));
+    }


This refactor doesn't seem to be worth it, because it now calculates yy and z even if neither of the if conditions is true.

Yeah, you are right

dcodeIO · 2018-10-20T10:36:43Z

std/assembly/math.ts

    var twopk = reinterpret<f64>(u);
    var y: f64;
-    if (k < 0 || k > 56) {
+    if (<i32>(k < 0) | <i32>(k > 56)) {


I think we should make the compiler smarter about logical ORs instead of convoluting the source like this. What do you think?

Definitely yes! I have even opened a proposal for that: #277

So, can we keep the || here and at the other places, if any, for now?
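
For context, the two spellings under discussion compute the same truth value when both operands are cheap, side-effect-free comparisons; the difference is only that `||` short-circuits (a branch) while `|` evaluates both sides. A small TypeScript sketch with hypothetical function names:

```typescript
// Logical OR: the second comparison is skipped when the first is true.
function outOfRangeLogical(k: number): boolean {
  return k < 0 || k > 56;
}

// Bitwise OR on 0/1 flags: both comparisons are always evaluated.
function outOfRangeBitwise(k: number): boolean {
  return (Number(k < 0) | Number(k > 56)) !== 0;
}
```

Because the results are identical here, an optimizer is free to pick either form, which is the argument for keeping the readable `||` in the source.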

dcodeIO · 2018-10-25T21:31:44Z

std/assembly/math.ts

    if (ey == 0x7FF) return y;
    x = reinterpret<f64>(ux);
-    if (ex == 0x7FF || uy == 0) return x;
+    if (<i32>(ex == 0x7FF) | <i32>(uy == 0)) return x;


dcodeIO · 2018-10-25T21:31:54Z

std/assembly/math.ts

    var hx = <u32>(u >> 32);
    var k = 0;
-    if (hx < 0x00100000 || <bool>(hx >> 31)) {
+    if (<u32>(hx < 0x00100000) | (hx >> 31)) {
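
As a reading aid for conditions of this shape: `hx` is the high 32-bit word of an f64, `hx < 0x00100000` is true for zero and subnormal inputs (exponent field 0), and `hx >> 31` on an unsigned value extracts the sign bit. The TypeScript sketch below reproduces the test with hypothetical helper names.

```typescript
// Returns the upper 32 bits of the IEEE 754 double representation of x.
function highWord(x: number): number {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // big-endian by default, so bytes 0..3 are the high word
  return view.getUint32(0);
}

// True when x is zero, subnormal, or has its sign bit set.
function needsSpecialCase(x: number): boolean {
  const hx = highWord(x);
  return (Number(hx < 0x00100000) | (hx >>> 31)) !== 0;
}
```

The unsigned shift `>>>` is used because JavaScript's `>>` is arithmetic; on a `u32` in AssemblyScript, `>>` is already a logical shift.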


dcodeIO · 2018-10-25T21:32:02Z

std/assembly/math.ts

    var hx = <u32>(u >> 32);
    var k = 0;
-    if (hx < 0x00100000 || <bool>(hx >> 31)) {
+    if (<u32>(hx < 0x00100000) | (hx >> 31)) {


dcodeIO · 2018-10-25T21:32:15Z

std/assembly/math.ts

    var k = 1;
    var c = 0.0, f = 0.0;
-    if (hx < 0x3FDA827A || <bool>(hx >> 31)) {
+    if (<u32>(hx < 0x3FDA827A) | (hx >> 31)) {


dcodeIO · 2018-10-25T21:32:21Z

std/assembly/math.ts

    var hx = <u32>(u >> 32);
    var k = 0;
-    if (hx < 0x00100000 || <bool>(hx >> 31)) {
+    if (<u32>(hx < 0x00100000) | hx >> 31) {


dcodeIO · 2018-10-25T21:36:02Z

std/assembly/math.ts

        k = <i32>(invln2 * x + builtin_copysign<f32>(0.5, x));
      } else {
-        k = 1 - sign_ - sign_;
+        k = 1 - (sign_ << 1);


Does this improve something?

I think yes, because sign_ - sign_ needs two reads from locals, while sign_ << 1 needs only one read operation.
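
A quick TypeScript check that the two forms agree, assuming `sign_` only ever holds 0 or 1 (a sign bit); the function names are hypothetical.

```typescript
// Original form: reads the sign flag twice.
function kBySubtraction(sign: number): number {
  return 1 - sign - sign;
}

// Rewritten form: reads the sign flag once and doubles it with a shift.
function kByShift(sign: number): number {
  return 1 - (sign << 1);
}
```

Both map 0 to 1 and 1 to -1; whether the shift is actually faster depends on what the engine does with the locals.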

dcodeIO · 2018-10-25T21:36:10Z

std/assembly/math.ts

    var twopk = reinterpret<f32>(u);
    var y: f32;
-    if (k < 0 || k > 56) {
+    if (<i32>(k < 0) | <i32>(k > 56)) {


dcodeIO · 2018-10-25T21:36:21Z

std/assembly/math.ts

    var u = reinterpret<u32>(x);
    var k = 0;
-    if (u < 0x00800000 || <bool>(u >> 31)) {
+    if (<u32>(u < 0x00800000) | (u >> 31)) {


dcodeIO · 2018-10-25T21:36:25Z

std/assembly/math.ts

    var ix = reinterpret<u32>(x);
    var k = 0;
-    if (ix < 0x00800000 || <bool>(ix >> 31)) {
+    if (<u32>(ix < 0x00800000) | (ix >> 31)) {


dcodeIO · 2018-10-25T21:36:32Z

std/assembly/math.ts

    var c: f32 = 0, f: f32 = 0;
    var k: i32 = 1;
-    if (ix < 0x3ED413D0 || <bool>(ix >> 31)) {
+    if (<u32>(ix < 0x3ED413D0) | (ix >> 31)) {


dcodeIO · 2018-10-25T21:36:36Z

std/assembly/math.ts

    var ix = reinterpret<u32>(x);
    var k: i32 = 0;
-    if (ix < 0x00800000 || <bool>(ix >> 31)) {
+    if (<u32>(ix < 0x00800000) | (ix >> 31)) {


dcodeIO · 2018-10-25T21:36:41Z

std/assembly/math.ts

    if (iy == 0) return 1.0; // x**0 = 1, even if x is NaN
    // if (hx == 0x3F800000) return 1.0; // C: 1**y = 1, even if y is NaN, JS: NaN
-    if (ix > 0x7F800000 || iy > 0x7F800000) return x + y; // NaN if either arg is NaN
+    if (<i32>(ix > 0x7F800000) | <i32>(iy > 0x7F800000)) return x + y; // NaN if either arg is NaN


dcodeIO · 2018-10-25T21:38:13Z

std/assembly/math.ts

-      y *= Ox1p_126f;
-      n += 126;
+      y *= Ox1p_126f * Ox1p24f;
+      n += 126 - 24;


What's this doing?

See original code:
https://git.musl-libc.org/cgit/musl/tree/src/math/scalbnf.c

It seems this fix appeared later in the original source code. The ported "musl" code doesn't reflect these changes.

Oh, updates, nice :)
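
To show where the changed lines sit, here is a TypeScript sketch of the general shape of a `scalbnf`-style routine with the new scaling step, emulating f32 arithmetic with `Math.fround`. The rationale in the comments is my reading of the upstream change rather than something stated in this thread: scaling by 2^-126 alone can push an intermediate into the subnormal range, where it is rounded once before the final multiply rounds again, while keeping 24 bits of headroom avoids that. The overall structure is an assumption modelled on the linked musl file.

```typescript
// Sketch: compute x * 2^n in f32 precision using staged power-of-two scaling.
function scalbnf(x: number, n: number): number {
  const f = Math.fround;
  const up = Math.pow(2, 127);                     // 0x1p127f
  const down = Math.pow(2, -126) * Math.pow(2, 24); // 0x1p-126f * 0x1p24f = 2^-102
  let y = f(x);
  if (n > 127) {
    y = f(y * up);
    n -= 127;
    if (n > 127) {
      y = f(y * up);
      n -= 127;
      if (n > 127) n = 127;
    }
  } else if (n < -126) {
    // Leave 24 bits of headroom so the intermediate stays normal where possible.
    y = f(y * down);
    n += 126 - 24;
    if (n < -126) {
      y = f(y * down);
      n += 126 - 24;
      if (n < -126) n = -126;
    }
  }
  // n is now within [-126, 127], so 2^n is a normal f32 value.
  return f(y * Math.pow(2, n));
}
```

For example, `scalbnf(3, -149)` lands exactly on the subnormal value 3 * 2^-149, and `scalbnf(1, 10)` is 1024.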

dcodeIO · 2018-10-25T21:40:14Z

std/assembly/math.ts

    if (ux << 1 == 0) return x;
    if (!ex) {
-      for (i = uxi << 9; i >> 31 == 0; ex--, i <<= 1) {}
+      ex -= builtin_clz<u32>(uxi << 9);


Nice find :)
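
The replacement works because the loop counts how many left shifts are needed before bit 31 is set, which is exactly the number of leading zeros. A TypeScript sketch comparing both forms, using `Math.clz32` in place of the AssemblyScript builtin; the function names are hypothetical, and the loop version assumes a non-zero input (the earlier `ux << 1 == 0` check in the diff rules out zero).

```typescript
// Loop form: shift left until the top bit is set, decrementing ex each time.
// Precondition: (uxi << 9) must be non-zero, otherwise this never terminates.
function normalizeWithLoop(uxi: number, ex: number): number {
  for (let i = (uxi << 9) >>> 0; i >>> 31 === 0; ex--, i = (i << 1) >>> 0) {}
  return ex;
}

// clz form: subtract the number of leading zeros in one step.
function normalizeWithClz(uxi: number, ex: number): number {
  return ex - Math.clz32(uxi << 9);
}
```

With `uxi = 1` and `ex = 0`, both return -22, since `1 << 9` has 22 leading zeros.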

dcodeIO · 2018-10-25T21:51:51Z

std/assembly/math.ts

      if (u < 0x3F800000 - (12 << 23)) return 1;
      let t = expm1(x);
-      return 1 + t * t / (2 * (1 + t));
+      return 1 + t * t / (2 + 2 * t);


Maybe also add a comment here
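
A comment here could note where the expression comes from. With t = expm1(x), so that e^x = 1 + t, cosh(x) = ((1 + t) + 1 / (1 + t)) / 2 = (2 + 2t + t^2) / (2 + 2t) = 1 + t^2 / (2 + 2t). The TypeScript sketch below checks the rewritten form against `Math.cosh`; the function name is hypothetical and it uses f64 rather than the f32 of the diff.

```typescript
// cosh for small |x| via expm1, using the rewritten denominator 2 + 2 * t.
function coshViaExpm1(x: number): number {
  const t = Math.expm1(Math.abs(x)); // t = e^|x| - 1
  return 1 + (t * t) / (2 + 2 * t);
}
```

`2 * (1 + t)` and `2 + 2 * t` are algebraically identical; in floating point they can differ in the last bit, so the result is compared with a tolerance rather than exactly.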

dcodeIO · 2018-10-25T21:51:55Z

std/assembly/math.ts

    if (u < 0x42B17217) {
      let t = exp(x);
-      return 0.5 * (t + 1 / t);
+      return 0.5 * t + 0.5 / t;


Maybe also add a comment here

dcodeIO · 2018-10-25T22:10:30Z

Great, thanks! :)

Improve Math routines

f6ad306

more "select" replacements

4ba5dbe

MaxGraey commented Oct 5, 2018

MaxGraey added 3 commits October 6, 2018 02:13

refactoring <bool>k to k != 0. Avoid unnecessary annotations

004c11d

more tweaks

084bf28

simplificate Math.round

3fb8ae8

dcodeIO reviewed Oct 5, 2018

MaxGraey added 8 commits October 6, 2018 03:10

improve Math.round

a2ff04b

remove isFinite check for Math.round

c42d061

finalize Math.round

5796d2b

optimize expm1

4f25f40

more branch optimizations for expm1

4b8b724

fix seed random for second state

e333a40

minor improve asinh

174952b

add copysign for Mathf.atan

35545eb

dcodeIO reviewed Oct 6, 2018

MaxGraey added 12 commits October 6, 2018 17:39

revert back branch stuff for asinh and remove redundancy

00d5bd5

minor refactorings

693f2c6

optimize sinh

e5f8e0d

inline some internals

75e4d8c

revert Rf inlining

8a3cfca

minor improvements

3b00b76

use reinterpreted select for float constants

29addf2

add cast_select internal helper

577f56d

fixes

c9b69dc

remove redundant casting for casted_select

59bcf0f

implement random for float math based on xoroshiro64starstar

11eed0e

use faster casting to float for Math.random

ad7af21

MaxGraey added 2 commits October 20, 2018 01:22

use arithmetic OR instead of logical OR for simple expressions

1b8cbf7

slightly optimize mod

87a4652

dcodeIO reviewed Oct 20, 2018

MaxGraey added 3 commits October 21, 2018 21:38

optimal balance for Math.sign

20b4d42

fix Math.sign

34db62b

add comments with original musl's code

8762829

dcodeIO reviewed Oct 25, 2018

revert back logical expressions optimizations

b82ee2f

dcodeIO reviewed Oct 25, 2018

MaxGraey added 4 commits October 26, 2018 00:53

add more comments

72aee90

finalize revert logical ops

9beb49f

more reverts

c3bb8ae

last one

3899a4f

dcodeIO merged commit 376afd4 into AssemblyScript:master Oct 25, 2018

MaxGraey deleted the improve-math branch October 26, 2018 02:41
