Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upMove allocation out of _Utils_compare (improving performance) #918
Conversation
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
process-bot
Nov 1, 2017
Thanks for the pull request! Make sure it satisfies this checklist. My human colleagues will appreciate it!
Here is what to expect next, and if anyone wants to comment, keep these things in mind.
process-bot
commented
Nov 1, 2017
|
Thanks for the pull request! Make sure it satisfies this checklist. My human colleagues will appreciate it! Here is what to expect next, and if anyone wants to comment, keep these things in mind. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
evancz
Nov 1, 2017
Member
I think it would be best to refer to elm_lang$core$Basics$LT such that it gets minified better. This also means the runtime representation can be chosen by the compiler more easily.
I'll keep this open as a reminder of this change, but there are a few unknown things that make it hard to make a change at this exact moment.
|
I think it would be best to refer to I'll keep this open as a reminder of this change, but there are a few unknown things that make it hard to make a change at this exact moment. |
Skinney
referenced this pull request
Nov 1, 2017
Closed
`get` function heads in the wrong direction #1
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
justinmimbs
Nov 2, 2017
Regarding the idea of using primitives at runtime to represent argument-less union types: I hand-modified the JS output of Robin's Dict.LLRB.get function to use cmp instead of compare, and it saw a 77% increase in ops/sec in Chrome (using dicts of 100 elements with ints as keys).
i.e. from
var _p16 = A2(_elm_lang$core$Basics$compare, targetKey, _p15._1);
switch (_p16.ctor) {
case 'LT':to
var _p16 = _elm_lang$core$Native_Utils.cmp(targetKey, _p15._1);
switch (_p16) {
case -1:
justinmimbs
commented
Nov 2, 2017
|
Regarding the idea of using primitives at runtime to represent argument-less union types: I hand-modified the JS output of Robin's i.e. from var _p16 = A2(_elm_lang$core$Basics$compare, targetKey, _p15._1);
switch (_p16.ctor) {
case 'LT':to var _p16 = _elm_lang$core$Native_Utils.cmp(targetKey, _p15._1);
switch (_p16) {
case -1: |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
evancz
Mar 7, 2018
Member
The final version looks like this: https://github.com/elm-lang/core/blob/master/src/Elm/Kernel/Utils.js#L127-L131
There are bunches of inlining tricks we can do to avoid allocations, but the compiler infrastructure is not sophisticated enough to do them right now. So it's not that I don't think of them; it's that they cannot be done without a great deal of infrastructure work to get type information on all expressions. That is something I have my eye on as well.
|
The final version looks like this: https://github.com/elm-lang/core/blob/master/src/Elm/Kernel/Utils.js#L127-L131 There are bunches of inlining tricks we can do to avoid allocations, but the compiler infrastructure is not sophisticated enough to do them right now. So it's not that I don't think of them; it's that they cannot be done without a great deal of infrastructure work to get type information on all expressions. That is something I have my eye on as well. |
Skinney commentedNov 1, 2017
In my work on a new
Dictimplementation, I discovered that one of the functions being called the most is_Utils_compare. By moving the object allocation out of the function (pre-allocating) we can improve performance of the function.In my benchmark,
Dict.getgoes from a mean runtime of ~30ns per call (for aDictof 100 key-value pairs) to ~26ns. If my math is correct (it rarely is) that is a 13% performance increase.I also tried the following, but it did not offer better performance and is more code:
(The actual benchmarks were performed by modifying a 0.18 compiled output by hand. Even though the code looks slightly different in this 0.19-branch, I don't see why the results would be any different).