Conversation

MaxGraey
Member

@MaxGraey MaxGraey commented Dec 6, 2019

Status

  • new Mathf.pow
  • new Mathf.exp
  • new Mathf.log2
  • new Mathf.log
  • new Math.pow
  • new Math.exp
  • new Math.log2
  • new Math.log
  • new Math.exp2 (non-standard; added for symmetry with Math.log2 and for future optimizations)

Benchmark for Math.pow [f64]

Results

  • Firefox 71
old as pow: 190ms
NEW as pow: 31ms
js pow:     52ms
  • Chrome 79.0.3945.79
old as pow: 235.450927734375ms
NEW as pow: 91.623046875ms
js pow:     146.000732421875ms

Benchmark for Mathf.pow [f32]

  • Firefox 71
old as pow: 106ms
NEW as pow: 16ms
js pow:     26ms
  • Chrome 79.0.3945.79
old as pow: 136.93701171875ms
NEW as pow: 39.447265625ms
js pow:     51.557861328125ms

UPDATE

Benchmark for Math.pow [f64] using internal loops in AssemblyScript

  • Chrome 79.0.3945.88
old as pow: 153.734130859375ms
NEW as pow: 21.9619140625ms
js pow: 137.114990234375ms

@MaxGraey MaxGraey marked this pull request as ready for review December 11, 2019 20:33
@MaxGraey MaxGraey requested a review from dcodeIO December 11, 2019 20:33
@MaxGraey
Member Author

MaxGraey commented Dec 12, 2019

I decided to add new non-standard Math functions: Math.exp2 / Mathf.exp2 and Math.exp10 / Mathf.exp10. They don't need to be called explicitly, but they will be used for special lowering during Math.pow optimization when the first argument is known at compile time, like:

2 ** y   ->  exp2(y)

10 ** y  ->  exp10(y)   (dropped: it turned out this has no benefit compared to pow(10, y))

e ** y   ->  exp(y)

// where y is f32 or f64 and result also f32 or f64
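The lowering rules above can be sketched roughly as follows. This is only an illustration in plain TypeScript, not the compiler's actual code: `chooseLowering`, `evalLowered`, and the local `exp2` helper are hypothetical names introduced here, and `exp2` is simply emulated with `Math.pow`, since standard JavaScript has no `Math.exp2`.

```typescript
// Hypothetical sketch of the constant-base lowering described above.
// None of these names come from the AssemblyScript compiler.
type Lowering = "exp2" | "exp" | "pow";

// Decide which routine to call when the base of `base ** y` is known
// at compile time. `null` stands for "base is not a constant".
function chooseLowering(constBase: number | null): Lowering {
  if (constBase === 2) return "exp2";
  if (constBase === Math.E) return "exp";
  // Base 10 deliberately falls through: exp10 showed no benefit over pow(10, y).
  return "pow";
}

// Stand-in for the non-standard Math.exp2 (assumption: emulated via Math.pow).
function exp2(y: number): number {
  return Math.pow(2, y);
}

// Evaluate `base ** y` through the chosen routine.
function evalLowered(base: number, y: number): number {
  switch (chooseLowering(base)) {
    case "exp2":
      return exp2(y);
    case "exp":
      return Math.exp(y);
    default:
      return Math.pow(base, y);
  }
}
```

For example, `evalLowered(2, 10)` goes through `exp2`, while `evalLowered(10, 3)` stays on the generic `pow` path.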

@dcodeIO
Member

dcodeIO commented Dec 12, 2019

Can you explain a bit what's the theory behind these improvements? For instance, what did the old implementation do that made it slow, and what does the new implementation do that makes it fast? What are the algorithms used here and where are they from? Stuff like that :)

@MaxGraey
Member Author

MaxGraey commented Dec 12, 2019

Basically it's an adoption of the new ARM math library: https://github.com/ARM-software/optimized-routines/tree/master/math (MIT). This link is also present in musl's implementation. The new routines use lookup tables and handle special cases more cleverly, simplifying the path with twofold fast arithmetic where possible; some of the LUTs are needed to speed up FMA emulation. All of this significantly speeds up pow/log/exp, but sometimes increases code size, so I use these routines mostly for ASC_SHRINK_LEVEL == 0, except for Mathf.pow, which incidentally decreases in size.
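To give a rough idea of what "use lookup tables" means here, the following is a minimal sketch of a table-driven `exp2` in TypeScript. It is not the code from the ARM library or from this PR: the table size, the plain Taylor polynomial, the fallback threshold, and the name `exp2Table` are all assumptions chosen for readability. Production routines typically use precomputed tables, tuned polynomial coefficients, and direct bit manipulation for the scaling step.

```typescript
// Illustrative table-driven exp2 (assumption: not the actual ARM/AssemblyScript code).
const TABLE_BITS = 5;
const N = 1 << TABLE_BITS; // 32 table entries

// TABLE[i] = 2^(i/N), built once up front.
const TABLE: number[] = [];
for (let i = 0; i < N; i++) TABLE.push(Math.pow(2, i / N));

function exp2Table(x: number): number {
  // NaN, infinities, and extreme inputs take the generic path in this sketch.
  if (!(Math.abs(x) < 1000)) return Math.pow(2, x);

  // Split x = k + i/N + r with integer k, 0 <= i < N, |r| <= 1/(2N).
  const kn = Math.round(x * N);
  const r = x - kn / N;
  const i = ((kn % N) + N) % N;
  const k = (kn - i) / N;

  // 2^r = e^(r * ln 2), approximated by a degree-5 Taylor polynomial;
  // r is tiny, so a short polynomial is enough.
  const z = r * Math.LN2;
  const p = 1 + z * (1 + z * (1 / 2 + z * (1 / 6 + z * (1 / 24 + z * (1 / 120)))));

  // 2^x = 2^k * 2^(i/N) * 2^r; scaling by 2^k is exact for integer k.
  return TABLE[i] * p * Math.pow(2, k);
}
```

The design point is that the expensive part of the range reduction is replaced by one table load, leaving only a short polynomial on a very small argument.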

@dcodeIO dcodeIO merged commit ab1e1dd into AssemblyScript:master Jan 1, 2020
@dcodeIO
Member

dcodeIO commented Jan 1, 2020

Great, thanks!

@MaxGraey MaxGraey deleted the speedup-log-exp-pow branch January 1, 2020 23:16
2 participants