List Functor: mix unrolled and reverse map #135

Open · wants to merge 9 commits

6 participants
matthewleon (Contributor) commented Nov 21, 2017

Addresses #131

The relevant chunk sizes (5 for the initial list segment, 3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) over lists of various sizes.

Relevant figures:
list of 1000 elems: 142.61 μs -> 36.97 μs
list of 2000 elems: 275.17 μs -> 55.33 μs
list of 10000 elems: 912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.
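For orientation, here is a sketch of the whole function, reconstructed from the diff fragments quoted in the review below. The unrollLimit value and the short-list fallback cases are assumptions filled in to make it self-contained, and the chunkedRevMap body is replaced with a simple tail-recursive stand-in (a faithful chunked version is sketched near the end of the thread):

module ListMapSketch where

import Data.List.Types (List(..), (:))

-- chunked list Functor inspired by OCaml
-- https://discuss.ocaml.org/t/a-new-list-map-that-is-both-stack-safe-and-fast/865
listMap :: forall a b. (a -> b) -> List a -> List b
listMap f = startUnrolledMap unrollLimit
  where
  -- cap on the number of stack-growing unrolled iterations (assumed value)
  unrollLimit :: Int
  unrollLimit = 1000

  -- fast path: five elements per (non-tail) recursive call
  startUnrolledMap :: Int -> List a -> List b
  startUnrolledMap 0 (x : xs) = f x : chunkedRevMap xs
  startUnrolledMap n (x1 : x2 : x3 : x4 : x5 : xs) =
    f x1 : f x2 : f x3 : f x4 : f x5 : startUnrolledMap (n - 1) xs
  startUnrolledMap _ (x1 : x2 : x3 : x4 : Nil) = f x1 : f x2 : f x3 : f x4 : Nil
  startUnrolledMap _ (x1 : x2 : x3 : Nil) = f x1 : f x2 : f x3 : Nil
  startUnrolledMap _ (x1 : x2 : Nil) = f x1 : f x2 : Nil
  startUnrolledMap _ (x1 : Nil) = f x1 : Nil
  startUnrolledMap _ Nil = Nil

  -- slow-path stand-in: map tail-recursively in reverse, then reverse
  -- back, so the stack no longer grows past this point
  chunkedRevMap :: List a -> List b
  chunkedRevMap = goMap Nil
    where
    goMap :: List b -> List a -> List b
    goMap acc (x : xs) = goMap (f x : acc) xs
    goMap acc Nil = reverseOnto Nil acc

    reverseOnto :: List b -> List b -> List b
    reverseOnto acc (y : ys) = reverseOnto (y : acc) ys
    reverseOnto acc Nil = acc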

matthewleon (Contributor) commented Nov 21, 2017
Will update this with details on my machine setup.


src/Data/List/Types.purs

+-- chunked list Functor inspired by OCaml
+-- https://discuss.ocaml.org/t/a-new-list-map-that-is-both-stack-safe-and-fast/865
+-- chunk sizes determined through experimentation
+listMap :: forall a b. (a -> b) -> List a -> List b

paf31 (Member) commented Nov 22, 2017

This is exported right now, is that deliberate?


matthewleon (Contributor) commented Nov 22, 2017

No. It isn't. Good catch.


src/Data/List/Types.purs
+ startUnrolledMap :: Int -> List a -> List b
+ startUnrolledMap 0 (x : xs) = f x : chunkedRevMap xs
+ startUnrolledMap n (x1 : x2 : x3 : x4 : x5 : xs) =
+ f x1 : f x2 : f x3 : f x4 : f x5 : startUnrolledMap (n - 1) xs

paf31 (Member) commented Nov 22, 2017

I'm not sure if OCaml has tail calls modulo cons, but isn't this going to lead to a stack overflow for us?

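Aside: "tail calls modulo cons" refers to optimizing calls like f x : go xs, where the only work left after the recursive call is applying a constructor to its result. PureScript only turns self tail calls into loops, so each unrolled call above costs a stack frame. A toy illustration of the difference, with hypothetical names:

module TailCallSketch where

import Data.List.Types (List(..), (:))

-- not tail recursive: Cons is applied after the recursive call returns,
-- so every element costs a stack frame
mapNaive :: forall a b. (a -> b) -> List a -> List b
mapNaive f (x : xs) = f x : mapNaive f xs
mapNaive _ Nil = Nil

-- tail recursive: the accumulator is extended before recursing, so this
-- compiles to a loop, at the cost of producing the result in reverse
mapReverse :: forall a b. (a -> b) -> List b -> List a -> List b
mapReverse f acc (x : xs) = mapReverse f (f x : acc) xs
mapReverse _ acc Nil = acc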

matthewleon (Contributor) commented Nov 22, 2017

This is why I've limited the "unrolled map" to 1000 iterations. It should avoid stack overflow. Are you suggesting that there is a way to make it stack overflow despite that limit?


paf31 (Member) commented Nov 22, 2017

I guess not, unless f was already on the verge of overflowing the stack. It can only make an existing problem a finite amount worse :)


src/Data/List/Types.purs
+
+ startUnrolledMap :: Int -> List a -> List b
+ startUnrolledMap 0 (x : xs) = f x : chunkedRevMap xs
+ startUnrolledMap n (x1 : x2 : x3 : x4 : x5 : xs) =

paf31 (Member) commented Nov 22, 2017

This can probably be made faster if you hand-optimize the cases. The JS generated here isn't going to be particularly optimal. We can reuse the branches.


matthewleon (Contributor) commented Nov 22, 2017

Yes, I thought the same thing. I'll look at it if the option you suggest with mutation isn't the way to go.


paf31 (Member) commented Nov 22, 2017

No, I just mean rewriting this so that common cases use the same branch, instead of having multiple branches which recognize lists of length >= 1 for example.


paf31 (Member) commented Nov 22, 2017

If you look at the generated JS, you'll hopefully see what I mean.


natefaubion (Contributor) commented Nov 22, 2017
case xs of
  x : xs' -> ...
    case xs' of
      x' : xs'' -> ...
      _ -> ...
  _ -> ...

PureScript doesn't perform this optimization, you have to do it manually.

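Concretely, the rewrite being suggested looks something like this hedged sketch (a two-at-a-time toy version; the PR unrolls five at a time):

module NestedCaseSketch where

import Data.List.Types (List(..), (:))

-- Each Cons cell is inspected exactly once: the inner case picks up
-- where the outer match left off, instead of a separate top-level
-- clause re-testing the same cells.
unrolledMap2 :: forall a b. (a -> b) -> List a -> List b
unrolledMap2 f = go
  where
  go :: List a -> List b
  go xs = case xs of
    Nil -> Nil
    x1 : rest1 -> case rest1 of
      Nil -> f x1 : Nil
      x2 : rest2 -> f x1 : f x2 : go rest2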

paf31 (Member) commented Nov 22, 2017

Thanks!

Another thing to try is a straight traversal of the spine using a mutable cons cell, a la Scheme. I think we could make this safe. What do you think?


matthewleon (Contributor) commented Nov 22, 2017

> straight traversal of the spine using a mutable cons cell, a la Scheme

I will research this and see if I can come up with something crafty. Is the idea here that we jump into ST?


paf31 (Member) commented Nov 22, 2017

I think it would be FFI instead of ST unfortunately, unless we can come up with a way to modify cons cells in ST, which would mean some sort of weird STList data type.

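For illustration, one conceivable shape for such a type, sketched against today's purescript-st API (entirely hypothetical; nothing like this exists in the library):

module STListSketch where

import Prelude
import Control.Monad.ST (ST)
import Control.Monad.ST.Ref (STRef)
import Control.Monad.ST.Ref as Ref

-- a "weird STList": a cons cell whose tail is a mutable reference
data STList r a = STNil | STCons a (STRef r (STList r a))

-- destructively replace a tail, Scheme's set-cdr!
setTail :: forall r a. STList r a -> STRef r (STList r a) -> ST r Unit
setTail tl ref = void (Ref.write tl ref)

A map in this style would still need to convert to and from the ordinary List, and, as noted below, would pay the cost of stack-safe recursion in Eff.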

matthewleon (Contributor) commented Nov 22, 2017

> unless we can come up with a way to modify cons cells in ST, which would mean some sort of weird STList data type.

I was thinking along those lines. I'll try it and benchmark it.


matthewleon (Contributor) commented Nov 22, 2017

Though upon further thought, I don't see how to do that without running into the problem of stack-safe recursion in Eff being very slow. Perhaps the saner thing to do is to try the mutable cons cell in FFI... I'll have to use JS anyway to optimize the unroll.


paf31 (Member) commented Nov 23, 2017

Yes, agreed, it doesn't seem worth it to make the code non-portable.


matthewleon (Contributor) commented Nov 23, 2017

> Yes, agreed, it doesn't seem worth it to make the code non-portable.

So do you prefer that the code be largely left as-is, because this version is portable and has much better performance than what we had before, at least in Node?

Or should I go ahead and try to do the mutable cons cell + unrolling in FFI JavaScript?

The answer really isn't obvious to me, as I'm unaware of PS's stance regarding library portability. As it stands, this lib doesn't use any FFI directly. I believe that Lazy, however, uses FFI, which makes at least the lazy lists non-portable, no?


paf31 (Member) commented Nov 23, 2017

Well the lazy data structure doesn't rely on the compiler internals in any way.

Here is a quick FFI experiment:

exports.mapList = function(f) {
  return function(l) {
    if (l.constructor.name === 'Nil') {
      return l;
    } else {
      var nil = require('Data.List.Types').Nil;
      var cons = require('Data.List.Types').Cons;
      var result = new cons(null, null);

      var input = l;
      var output = result;

      while (true) {
        output.value0 = f(input.value0);
        output.value1 = new cons(null, null);

        if (input.value1 instanceof nil) {
          break;
        }

        input = input.value1;
        output = output.value1;
      }

      output.value1 = nil.value;

      return result;
    }
  }
}

Performance-wise, it's about twice as fast, at the expense of being less portable:

> import Main
> import Data.List
> import Performance.Minibench 
> xs = range 1 100000

> benchWith 1000 \_ -> map (\x -> x * 2) xs
mean   = 3.26 ms
stddev = 1.78 ms
min    = 2.58 ms
max    = 53.19 ms
unit

> benchWith 1000 \_ -> mapList (\x -> x * 2) xs
mean   = 1.55 ms
stddev = 1.54 ms
min    = 1.30 ms
max    = 47.51 ms
unit

I think it's better to be portable though, so I might put this in a separate library as a curiosity, but I wouldn't merge it into lists as it is now.


matthewleon (Contributor) commented Nov 23, 2017

Recapping conversation with @paf31 on Slack: Though considerable speed gains can be made by using FFI with a mutable Cons cell, the List library is currently portable, and we don't want to introduce FFI just to optimize map further than this.


paf31 (Member) commented Nov 23, 2017

I've put my code in a repo here, along with zipWith and filter replacements.


matthewleon (Contributor) commented Nov 25, 2017

I've added benchmarking to this branch.


matthewleon referenced this pull request in paf31/purescript-lists-fast, Nov 25, 2017: benchmarks #2 (Closed)

safareli (Contributor) commented Nov 25, 2017

package-lock.json was committed in the last commit


matthewleon (Contributor) commented Nov 26, 2017

@safareli removed it. Thanks!


src/Data/List/Types.purs
+-- chunk sizes determined through experimentation
+listMap :: forall a b. (a -> b) -> List a -> List b
+listMap f = startUnrolledMap unrollLimit
+ where

paf31 (Member) commented Nov 30, 2017

Nitpick: could you please indent things past the where, or move the where up to the end of the previous line?

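That is, either of these layouts (a toy example, not the PR's code):

module WhereStyleSketch where

import Prelude

-- bindings indented past the where...
addTen :: Int -> Int
addTen n = n + ten
  where
    ten = 10

-- ...or the where moved up to the end of the previous line
addTen' :: Int -> Int
addTen' n = n + ten where
  ten = 10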

paf31 (Member) commented Nov 30, 2017

Thanks!

What do other people think? @garyb @hdgarrood

Especially about the potential stack overflow issue mentioned above.


garyb (Member) commented Nov 30, 2017

@paf31 what was the stack overflow issue? Just the fact that if the stack is already near overflow, it might get pushed over the edge?


hdgarrood (Contributor) commented Nov 30, 2017

Yeah, if f makes enough recursive calls on average, I think that could cause stack overflows. This looks really good though! If you could address @paf31's recent comments and do a little testing to check that stack-safety isn't significantly worse than it is now, I'm 👍 to merge.

Come to think of it, do we have any testing for stack overflows in map in this library already?

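For illustration, such a test might look like this hedged sketch (hypothetical, using today's Effect and Test.Assert APIs; not code from this PR):

module StackSafetySketch where

import Prelude
import Data.List (List, length, range)
import Effect (Effect)
import Test.Assert (assert)

-- mapping over a large list should complete without a RangeError
main :: Effect Unit
main = do
  let xs = range 1 100000 :: List Int
  assert (length (map (_ + 1) xs) == 100000)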

paf31 (Member) commented Nov 30, 2017

To be clear, the issue is that we go from "map never grows the stack" to "map always grows the stack, but maybe not by very much". The growth should be linear in the size of the list.


natefaubion (Contributor) commented Nov 30, 2017

With this, isn't it "map grows the stack by a fixed maximum amount"?


paf31 (Member) commented Nov 30, 2017

Well startUnrolledMap calls itself recursively, but not tail-recursively, so no, I think the stack usage grows with the size of the input.


natefaubion (Contributor) commented Nov 30, 2017

Right, but it has a maximum threshold defined by unrollLimit, which causes it to fall back to the slower, stack-safe variant, halting the growth. So at most, map itself will consume 1000 frames.


paf31 (Member) commented Nov 30, 2017

Oh, of course, sorry.


safareli (Contributor) commented Nov 30, 2017

If the frame limit is 10,000 and 9,001 frames are already in use before map is called, then using 1k more frames will overflow the stack, right? Is this fine?


paf31 (Member) commented Nov 30, 2017

Nested maps are also quite common. I don't think this is enough to cause a problem on its own, but it could worsen it.

The question for me is not whether this code should be included, but whether it should be the default. Do we want a default which works in every case but is slow in most cases, or one which works in most cases but overflows the stack in a tiny number of cases?

I think it's probably okay to pick this as the default, but I just want to make sure the trade off is clear now as we decide.


natefaubion (Contributor) commented Nov 30, 2017

There's also the question of what the stack limit is. It does 5 items at a time, so it will eat stack up to 5000 items. Maybe we don't need to handle that many, and we can reduce it to 200 frames if it's a concern?


matthewleon (Contributor) commented Nov 30, 2017

> There's also the question of what the stack limit is. It does 5 items at a time, so it will eat stack up to 5000 items. Maybe we don't need to handle that many, and we can reduce it to 200 frames if it's a concern?

I wouldn't mind doing that. Maybe it's worth me doing some exploratory tests to see how much breathing room Node gives us?


matthewleon (Contributor) commented Nov 30, 2017

Regarding the call stack size: http://2ality.com/2014/04/call-stack-size.html


matthewleon (Contributor) commented Nov 30, 2017

> we can reduce it to 200 frames if it's a concern?

I've made a commit that does this. I figure that this means we still get a big speed-up for common List use-cases, while if people are serious about having faster iteration through longer lists, they can use the fast list repo.


matthewleon added some commits Nov 21, 2017

List Functor: mix unrolled and reverse map
Addresses #131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.

Benchmarked on 2017 MacBook Pro, 2.3 GHz Intel Core i5, 8 GB RAM,
macOS Sierra 10.12.6, node v8.9.1.

initial benchmarks for List.map

2017 MacBook Pro 2.3 GHz Intel Core i5, 8 GB 2133 MHz LPDDR3
Node v8.9.1

List
====
map
---
map: empty list
mean   = 1.31 μs
stddev = 11.87 μs
min    = 799.00 ns
max    = 375.82 μs
map: singleton list
mean   = 2.40 μs
stddev = 11.03 μs
min    = 1.03 μs
max    = 342.18 μs
map: list (1000 elems)
mean   = 143.41 μs
stddev = 225.12 μs
min    = 97.16 μs
max    = 2.03 ms
map: list (2000 elems)
mean   = 274.16 μs
stddev = 295.84 μs
min    = 199.66 μs
max    = 2.06 ms
map: list (5000 elems)
mean   = 531.84 μs
stddev = 512.61 μs
min    = 229.45 μs
max    = 2.95 ms
map: list (10000 elems)
mean   = 895.24 μs
stddev = 777.87 μs
min    = 464.59 μs
max    = 2.94 ms
map: list (100000 elems)
mean   = 33.45 ms
stddev = 7.65 ms
min    = 22.07 ms
max    = 63.47 ms
lower unrolled map iteration limit

this lowers the probability of stack-size troubles
matthewleon (Contributor) commented Dec 4, 2017

rebased.


paf31 (Member) commented Dec 10, 2017

Given the possible stack-safety issue, what do people think about providing the naive (safe) implementation as a separate named function? Or vice versa?


matthewleon (Contributor) commented Dec 11, 2017

One possibility is to just use the stack-safe unrolled operation, which should still give a speed boost, and recommend that people turn to purescript-lists-fast if speed is important.


paf31 (Member) commented Dec 11, 2017

Would it be possible to try that and compare the performance easily?


matthewleon (Contributor) commented Dec 12, 2017

I made a branch that only uses the stack-safe reverse map and... It's actually faster.

map: list (2000 elems)
mean = 53.33 μs -> 31.24 μs

map: list (10000 elems)
mean = 107.73 μs -> 98.17 μs

I've merged it in.

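The branch itself isn't quoted here; a minimal sketch of what a reverse-only chunked map of this shape can look like (chunks of three per the earlier diff, both passes tail recursive so stack use is constant; the details are assumptions, not the branch's actual code):

module RevOnlyMapSketch where

import Data.List.Types (List(..), (:))

listMap :: forall a b. (a -> b) -> List a -> List b
listMap f = chunkedRevMap Nil
  where
  -- pass 1: tail-recursively push each three-element suffix onto an
  -- accumulator, then map whatever remainder is left (0 to 2 elements)
  chunkedRevMap :: List (List a) -> List a -> List b
  chunkedRevMap chunksAcc chunk@(_ : _ : _ : xs) =
    chunkedRevMap (chunk : chunksAcc) xs
  chunkedRevMap chunksAcc xs = reverseUnrolledMap chunksAcc (mapRemainder xs)
    where
    mapRemainder :: List a -> List b
    mapRemainder (x1 : x2 : Nil) = f x1 : f x2 : Nil
    mapRemainder (x1 : Nil) = f x1 : Nil
    mapRemainder _ = Nil

    -- pass 2: walk the accumulator (most recent chunk first), consing
    -- mapped chunks onto the result, which restores the original order
    reverseUnrolledMap :: List (List a) -> List b -> List b
    reverseUnrolledMap ((x1 : x2 : x3 : _) : cs) acc =
      reverseUnrolledMap cs (f x1 : f x2 : f x3 : acc)
    reverseUnrolledMap _ acc = acc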

matthewleon (Contributor) commented Dec 12, 2017

Just realized I can also inline the unrolledMap, will do that and push.


matthewleon added some commits

make map stack safe(r) again
begin with reverse unrolled map
matthewleon (Contributor) commented Dec 12, 2017

Actually, inlining unrolledMap slows things down significantly. Mysteries of JS... This should be good for review.


paf31 (Member) commented Dec 12, 2017

Even better, thanks!

Any other comments before I merge this?

