New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Generators #165

Closed
jonathanhefner opened this Issue Feb 8, 2015 · 4 comments

Comments

Projects
None yet
3 participants
@jonathanhefner
Contributor

jonathanhefner commented Feb 8, 2015

I have an experimental branch I've been playing with. After the discussion in #152, I've been trying to come up with an elegant and performant way of expressing loops. The concept that I kept coming back to is generators. You can see how they are implemented, and also how list functions can be implemented in terms of them. Note that I was able to remove several native List functions in favor of just the 3 in Native.Generator. It should be possible to do the same with loops in other native modules, e.g. Native.Regex.

So how do they perform? Good, although not as good as I would like. From preliminary testing, the biggest slowdown seems to be allocation / garbage collection of Maybe objects. The good news is that if we implement Maybe elision (i.e. Just x => x and Nothing to null in JavaScript) in the compiler, Generators should perform extremely close to their non-abstracted counterparts. (Other code will get faster too, of course.)

Below are some benchmark numbers (time in seconds; lower is better). All benchmarking was done on an old laptop to magnify performance gaps. With the exception of the last column, all benchmarks were done over a 500,000 element list. For the last column (foldl/100), I instead ran the test over 100 5,000 element lists. All higher-order functions (e.g. the f for map or the reduce for foldl) were as simple and dumb as possible to put emphasis on the speed of iteration.

branch map take filterMap repeat
master 4.15 4.05 4.20 1.25
generators 4.05 3.90 4.00 1.25
branch any foldl foldr drop foldl/100
master 1.05 1.05 1.10 1.0 0.95
generators 1.25 1.20 1.25 1.15 1.05

The takeaways, in my opinion, are:

  • For anything like mapping, Generators are a win. They do this by cheating (using mutation in JavaScript), but said cheating is confined to a single native method and every Generator benefits from it.
  • For pure iteration (involving no computation overhead), Generators are ~15% slower than data structure-specific native loops.
  • Performance could be improved by implementing Maybe elision as a compiler optimization, making the wins greater and virtually eliminating the losses.
@TheSeamau5

This comment has been minimized.

Show comment
Hide comment
@TheSeamau5

TheSeamau5 Feb 12, 2015

Contributor

This looks like super cool stuff. Plus generators give us super simple laziness if we want to. We could for example use this to implement a LazyList and a LazyArray.

Contributor

TheSeamau5 commented Feb 12, 2015

This looks like super cool stuff. Plus generators give us super simple laziness if we want to. We could for example use this to implement a LazyList and a LazyArray.

@evancz

This comment has been minimized.

Show comment
Hide comment
@evancz

evancz Feb 22, 2015

Member

I'm not sure exactly what the goal is here. Is this primarily about performance? Or about public APIs?

Exposing them as a public API seems questionable for a number of reasons. I haven't looked into JS generators a ton, but I know that some languages have generators that mutate as you use them. Furthermore, you can do it in the language itself.

If it's primarily about performance, under the hood, are we getting something over for loops? If so, is that worth using features that are not super-widely supported?

As for the idea of "Maybe elision", I don't think that's really possible. How can you tell the difference between Nothing and Just Nothing and Just (Just Nothing) and so on? This information cannot be lost and maintain the meaning of a program.

Member

evancz commented Feb 22, 2015

I'm not sure exactly what the goal is here. Is this primarily about performance? Or about public APIs?

Exposing them as a public API seems questionable for a number of reasons. I haven't looked into JS generators a ton, but I know that some languages have generators that mutate as you use them. Furthermore, you can do it in the language itself.

If it's primarily about performance, under the hood, are we getting something over for loops? If so, is that worth using features that are not super-widely supported?

As for the idea of "Maybe elision", I don't think that's really possible. How can you tell the difference between Nothing and Just Nothing and Just (Just Nothing) and so on? This information cannot be lost and maintain the meaning of a program.

@jonathanhefner

This comment has been minimized.

Show comment
Hide comment
@jonathanhefner

jonathanhefner Feb 22, 2015

Contributor

There were a few goals here:

  • Tiny / simple API for iteration, for both internal and external use
  • Progress towards a generic collection API, so that e.g. List.any and Array.any share an implementation
  • Minimize performance costs, improve performance where possible

However, I've been heads-down writing a library that I think better addresses these goals: elm-seq. It needs more fleshing out (a README, more unit tests, etc), but everything works so far. Key points:

  • Seq a is a lazy sequence of a
  • Mapping, filtering, etc are all done via behind-the-scenes transducers
    • Many typically stateful transducers (e.g. drop) are instead implemented by threading an index into the reducer function. There's more cleverness to make this work in chained operations, but that's the main idea.
  • Iter is a function signature that any collection can implement to work with Seq (or use the transducers directly)
  • A generic Iter implementation that calls a state -> state function could provide a simple iteration API for external use
  • I haven't tested performance yet, but there should at least be some benefit when doing chained operations (e.g. map => filter => reduce). I am concerned about how curried functions are currently handled, but that could either be avoided (with explicit lambda creation) or optimized (in the elm-core F* and A* JavaScript functions).

I will eventually post a link on the mailing list, but any and all feedback is greatly appreciated.

Contributor

jonathanhefner commented Feb 22, 2015

There were a few goals here:

  • Tiny / simple API for iteration, for both internal and external use
  • Progress towards a generic collection API, so that e.g. List.any and Array.any share an implementation
  • Minimize performance costs, improve performance where possible

However, I've been heads-down writing a library that I think better addresses these goals: elm-seq. It needs more fleshing out (a README, more unit tests, etc), but everything works so far. Key points:

  • Seq a is a lazy sequence of a
  • Mapping, filtering, etc are all done via behind-the-scenes transducers
    • Many typically stateful transducers (e.g. drop) are instead implemented by threading an index into the reducer function. There's more cleverness to make this work in chained operations, but that's the main idea.
  • Iter is a function signature that any collection can implement to work with Seq (or use the transducers directly)
  • A generic Iter implementation that calls a state -> state function could provide a simple iteration API for external use
  • I haven't tested performance yet, but there should at least be some benefit when doing chained operations (e.g. map => filter => reduce). I am concerned about how curried functions are currently handled, but that could either be avoided (with explicit lambda creation) or optimized (in the elm-core F* and A* JavaScript functions).

I will eventually post a link on the mailing list, but any and all feedback is greatly appreciated.

@jonathanhefner

This comment has been minimized.

Show comment
Hide comment
@jonathanhefner

jonathanhefner Feb 22, 2015

Contributor

Also, as clarification, this proposal had nothing to do with JavaScript generators (which, indeed, do mutate and are evil).

Contributor

jonathanhefner commented Feb 22, 2015

Also, as clarification, this proposal had nothing to do with JavaScript generators (which, indeed, do mutate and are evil).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment