First round of performance improvements #138

Closed
9 tasks done
svaarala opened this issue Mar 19, 2015 · 5 comments

@svaarala
Owner

  • Add a barebones performance test set
  • Change refcount macros to manipulate refcounts directly instead of calling a helper
  • Remove NULL checks from refcounts (they're almost always unnecessary and an assert suffices; add explicit NULL checks where that is necessary)
  • Improve hex encode/decode
  • Rework value pushing to require fewer helper steps
  • Add array write fast path to avoid string intern on array index writes
  • Add fastint support for soft float platforms (also seems to improve performance on some hard float platforms, e.g. x64); see "Integer arithmetic support for soft float platforms", #105
  • Rework pre/post inc/dec opcodes for better loop performance
  • Update debugger bytecode dump for pre/post inc/dec opcode rework
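
As a rough illustration of the two refcount items in the list above, the sketch below contrasts a NULL-checking helper call with macros that touch the refcount field directly and only assert on NULL. All names (toy_heaphdr, TOY_INCREF, and so on) are hypothetical and chosen for this example; this is not Duktape's actual header layout or macro set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap header; not Duktape's actual layout. */
struct toy_heaphdr {
    size_t refcount;
    int freed;                                /* set by the example finalizer */
    void (*on_zero)(struct toy_heaphdr *h);   /* called when refcount hits zero */
};

/* Example finalizer: only records that the object would be freed. */
void toy_mark_freed(struct toy_heaphdr *h) {
    h->freed = 1;
}

/* "Before" style: every change is a function call with a NULL check. */
void toy_incref_helper(struct toy_heaphdr *h) {
    if (h == NULL) {
        return;
    }
    h->refcount++;
}

/* "After" style: the macro manipulates the field directly. A NULL argument
 * is treated as a caller bug and is only caught by assert(), which compiles
 * away when NDEBUG is defined. Note the argument is evaluated more than once.
 */
#define TOY_INCREF(h) do { \
        assert((h) != NULL); \
        (h)->refcount++; \
    } while (0)

#define TOY_DECREF(h) do { \
        assert((h) != NULL); \
        assert((h)->refcount > 0); \
        if (--(h)->refcount == 0) { \
            (h)->on_zero((h)); \
        } \
    } while (0)

/* Where NULL is a legitimate value, the check is explicit at the call site. */
#define TOY_INCREF_ALLOWNULL(h) do { \
        if ((h) != NULL) { \
            TOY_INCREF((h)); \
        } \
    } while (0)
```

The direct macros save a call and a branch on every refcount update, at the cost of requiring callers to know whether a pointer can be NULL.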
svaarala added this to the v1.2.0 milestone Mar 19, 2015
@svaarala
Owner Author

These are done and merged locally. I'll push them to master early next week when I'm back from a mini-vacation so that I don't break master without being there to fix it :)

I'll post some results also at that time.

@svaarala
Owner Author

The first round of performance improvements is now in master; the changes are listed in the issue description. Below is a rough measurement of the effects on several small testcases:

  • Test host is a Lenovo X1 Carbon Intel i7-4600U, 2.1GHz, x64, gcc-4.8.1
  • duk.112 is Duktape 1.1.2 compiled with -Os
  • duk-Os, duk-O2, duk-O3, duk-O4 are master compiled with different
    optimization levels, with -DDUK_OPT_FASTINT enabled
  • Measurement is the minimum of 3 runs of time -f %U, i.e. user CPU time in seconds (not terribly accurate)
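
The minimum-of-3 idea can be sketched in C as follows. This is an in-process approximation only: the figures in this comment were taken externally with time -f %U on whole interpreter runs, whereas the sketch uses clock(), which reports processor time for the calling process. The function names, the MAX_RUNS limit, and the spin() workload are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_RUNS 16  /* arbitrary upper bound for this sketch */

/* Return the smallest of n samples; n must be at least 1. */
double min_sample(const double *samples, size_t n) {
    double best;
    size_t i;

    assert(samples != NULL && n > 0);
    best = samples[0];
    for (i = 1; i < n; i++) {
        if (samples[i] < best) {
            best = samples[i];
        }
    }
    return best;
}

/* Run `work` `runs` times and return the lowest CPU time in seconds.
 * Taking the minimum discards runs that were slowed down by other
 * activity on the machine.
 */
double min_cpu_seconds(void (*work)(void), size_t runs) {
    double samples[MAX_RUNS];
    size_t i;

    assert(work != NULL && runs > 0 && runs <= MAX_RUNS);
    for (i = 0; i < runs; i++) {
        clock_t start = clock();
        work();
        samples[i] = (double) (clock() - start) / CLOCKS_PER_SEC;
    }
    return min_sample(samples, runs);
}

/* Stand-in workload: an empty counting loop. */
void spin(void) {
    volatile unsigned long i;
    for (i = 0; i < 1000000UL; i++) {
    }
}
```

Calling min_cpu_seconds(spin, 3) mirrors the minimum-of-3 approach; the result is still noisy for short workloads.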

Overall performance is mostly close to e.g. Python which has a similar refcount and mark-and-sweep approach. Call handling is still somewhat slow and wasn't addressed in this round of changes. There's a considerable gap to Lua with some of that gap fixable (e.g. bytecode executor inefficiencies) and some not (e.g. additional processing needed by reference counting, more complex property and scope semantics, Unicode handling, etc).

Note that these performance testcases are not representative of real applications - the intent is to test individual aspects so that it's easier to see potential areas of improvement.

test-array-read.js            : duk-Os  6.10 duk-O2  6.17 duk-O3  6.02 duk-O4  6.04 duk.112 11.11 rhino  1.07 mujs 230.12 lua  1.32 python  5.82 perl  5.84 ruby  4.92
test-array-write.js           : duk-Os  6.87 duk-O2  6.71 duk-O3  6.49 duk-O4  6.51 duk.112 30.24 rhino  1.75 mujs 247.67 lua  1.56 python  6.88 perl  5.81 ruby  8.27
test-bitwise-ops.js           : duk-Os  1.25 duk-O2  1.22 duk-O3  1.17 duk-O4  1.16 duk.112  7.13 rhino  8.76 mujs  3.96 lua   n/a python   n/a perl   n/a ruby   n/a
test-call-basic.js            : duk-Os 18.84 duk-O2 17.39 duk-O3 16.81 duk-O4 16.80 duk.112 28.26 rhino  4.38 mujs 17.24 lua  2.68 python 10.40 perl 12.70 ruby  7.45
test-empty-loop.js            : duk-Os  3.36 duk-O2  3.37 duk-O3  3.22 duk-O4  3.22 duk.112  8.14 rhino  0.74 mujs  6.93 lua  1.24 python  4.53 perl  3.67 ruby  3.65
test-fib.js                   : duk-Os 10.79 duk-O2 10.46 duk-O3  9.33 duk-O4  9.21 duk.112 14.12 rhino  1.51 mujs  3.89 lua  1.51 python  2.77 perl  7.36 ruby  1.69
test-hello-world.js           : duk-Os  0.00 duk-O2  0.00 duk-O3  0.00 duk-O4  0.00 duk.112  0.00 rhino  0.24 mujs  0.00 lua  0.00 python  0.00 perl  0.00 ruby  0.00
test-hex-decode.js            : duk-Os  6.17 duk-O2  5.72 duk-O3  5.98 duk-O4  6.11 duk.112 12.08 rhino  0.24 mujs  0.00 lua   n/a python 13.66 perl   n/a ruby   n/a
test-hex-encode.js            : duk-Os 34.87 duk-O2 29.16 duk-O3 33.81 duk-O4 33.42 duk.112 53.90 rhino  0.26 mujs  0.00 lua   n/a python  3.25 perl   n/a ruby   n/a
test-json-serialize.js        : duk-Os  5.26 duk-O2  3.55 duk-O3  3.01 duk-O4  3.07 duk.112  4.78 rhino  4.62 mujs  0.67 lua   n/a python  0.62 perl   n/a ruby   n/a
test-json-string-bench.js     : duk-Os  7.12 duk-O2  6.07 duk-O3  5.72 duk-O4  5.76 duk.112  8.65 rhino  2.52 mujs 56.40 lua   n/a python   n/a perl   n/a ruby   n/a
test-prop-read.js             : duk-Os 11.03 duk-O2  9.55 duk-O3  8.99 duk-O4  8.97 duk.112 16.98 rhino  1.22 mujs 10.66 lua  1.38 python  6.88 perl  7.74 ruby 17.56
test-prop-write.js            : duk-Os  9.71 duk-O2  8.52 duk-O3  7.95 duk-O4  8.05 duk.112 16.24 rhino  2.40 mujs 10.58 lua  1.56 python  7.30 perl  7.59 ruby 21.11
test-reg-readwrite-object.js  : duk-Os  5.71 duk-O2  5.60 duk-O3  5.38 duk-O4  5.48 duk.112  8.14 rhino  0.40 mujs 10.08 lua  1.66 python  5.45 perl 33.66 ruby  4.38
test-reg-readwrite-plain.js   : duk-Os  4.24 duk-O2  4.17 duk-O3  3.95 duk-O4  3.99 duk.112  6.14 rhino  0.41 mujs  9.66 lua  1.68 python  5.55 perl 35.29 ruby  4.48
test-string-array-concat.js   : duk-Os 10.51 duk-O2  8.47 duk-O3  7.70 duk-O4  7.75 duk.112 33.56 rhino  2.23 mujs 257.04 lua  2.31 python  3.33 perl  8.37 ruby  9.16
test-string-compare.js        : duk-Os  5.49 duk-O2  5.06 duk-O3  5.08 duk-O4  5.05 duk.112  7.97 rhino  6.45 mujs 829.17 lua  3.21 python  5.40 perl 16.48 ruby  5.80
test-string-intern-match.js   : duk-Os  2.95 duk-O2  2.40 duk-O3  3.02 duk-O4  2.95 duk.112  3.00 rhino  0.28 mujs  0.00 lua   n/a python   n/a perl   n/a ruby   n/a
test-string-intern-miss.js    : duk-Os  3.02 duk-O2  2.67 duk-O3  3.17 duk-O4  3.12 duk.112  3.47 rhino  0.29 mujs  0.00 lua   n/a python   n/a perl   n/a ruby   n/a
test-string-plain-concat.js   : duk-Os  5.15 duk-O2  4.08 duk-O3  5.57 duk-O4  5.48 duk.112  5.21 rhino  0.33 mujs  1.15 lua  0.64 python  0.01 perl  0.43 ruby  0.83
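
To read the table as speedups, divide the duk.112 time by the master time at the same optimization level (duk-Os, since duk.112 was built with -Os). The sketch below does this for four rows copied from the table above; the struct and function names are made up for the example. A ratio below 1 means master is slower than 1.1.2 for that test at -Os, which is the case for test-json-serialize.js.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct row {
    const char *test;
    double duk112_os;  /* duk.112 column, seconds */
    double master_os;  /* duk-Os column, seconds */
};

/* Figures copied from the duk.112 and duk-Os columns of the table. */
static const struct row rows[] = {
    { "test-array-write.js",         30.24,  6.87 },
    { "test-bitwise-ops.js",          7.13,  1.25 },
    { "test-string-array-concat.js", 33.56, 10.51 },
    { "test-json-serialize.js",       4.78,  5.26 },
};

/* Speedup factor: how many times faster `after` is than `before`. */
double speedup(double before, double after) {
    assert(after > 0.0);
    return before / after;
}

/* Print one line per row with the speedup rounded to two decimals. */
void print_speedups(void) {
    size_t i;

    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
        printf("%-30s %.2fx\n", rows[i].test,
               speedup(rows[i].duk112_os, rows[i].master_os));
    }
}
```

Given the rough measurement method, small differences between columns should not be over-interpreted.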

@svaarala
Owner Author

The changes in this issue are now in master and follow-up work is in #139, so I'll close this.

@fatcerberus
Contributor

Holy crap is MuJS slow! As in, 1 to 2 orders of magnitude slower compared to Duktape...

I am curious how the string interning tests could be O(0) in MuJS though...

@svaarala
Owner Author

@fatcerberus Well, it's slower in some tests and faster in others; there's a trade-off for every design decision :) AFAIK mujs doesn't implement a "fast array", which naturally impacts all tests that deal with arrays. How that translates to practical code is of course a different matter, and depends on what kind of code one is running.
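
A toy model of what a "fast array" buys: integer-indexed writes that fit a dense array part skip the string-keyed property path entirely, while everything else converts the index to a string key first. The sketch below is an assumption-laden simplification with invented names (toy_object, toy_put_index); it does not reflect how Duktape or mujs actually store properties, and it omits details such as tracking which dense slots are present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_FAST_CAP 8  /* size of the dense array part */
#define TOY_SLOW_CAP 8  /* size of the string-keyed part */

struct toy_object {
    double fast[TOY_FAST_CAP];        /* values indexed directly by integer */
    char slow_key[TOY_SLOW_CAP][24];  /* string keys for the generic path */
    double slow_val[TOY_SLOW_CAP];
    size_t slow_count;
    unsigned slow_writes;             /* how often the generic path ran */
};

/* Generic path: look the key up by string comparison, insert if missing.
 * Returns 0 on success, -1 if the table is full or the key is too long.
 */
int toy_put_string(struct toy_object *o, const char *key, double val) {
    size_t i;

    assert(o != NULL && key != NULL);
    o->slow_writes++;
    for (i = 0; i < o->slow_count; i++) {
        if (strcmp(o->slow_key[i], key) == 0) {
            o->slow_val[i] = val;
            return 0;
        }
    }
    if (o->slow_count == TOY_SLOW_CAP ||
        strlen(key) >= sizeof(o->slow_key[0])) {
        return -1;
    }
    strcpy(o->slow_key[o->slow_count], key);
    o->slow_val[o->slow_count] = val;
    o->slow_count++;
    return 0;
}

/* Index write: small indices go straight to the dense part; larger ones
 * are converted to a string key and take the generic path.
 */
int toy_put_index(struct toy_object *o, unsigned long idx, double val) {
    char key[24];

    assert(o != NULL);
    if (idx < TOY_FAST_CAP) {
        o->fast[idx] = val;  /* fast path: no string conversion or lookup */
        return 0;
    }
    snprintf(key, sizeof(key), "%lu", idx);
    return toy_put_string(o, key, val);
}
```

In a loop that writes a[i] for many i, an engine without the dense part pays the conversion and lookup on every iteration, which is the kind of cost the array tests expose.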

The string intern tests are 0.00 seconds because they are Duktape-specific (they use a buffer value to set up data to intern), so they error out with mujs.
