concat_map inlining for BatStream and BatEnum #526

Merged
merged 2 commits into from Apr 14, 2014

Conversation

Projects
None yet
5 participants
Contributor

Drup commented Jan 31, 2014

I have run some benchmarks with lilis, the result is ~1.4 time faster for BatStream and ~1.2 time faster for BatEnum.
The tests seem to pass (if make test is enough).
Are the tests for BatEnum.concat_map enough ?
Should I add test for BatStream ? The file is quite deserted ...

Here are the benchs. BatSeq is here for reference.
Before :

 --- Lsystem Von_koch for 7 iterations ---
           Rate         Stream     Enum      Seq
  Stream 2.42+-0.02/s       --     -80%     -94%
    Enum 12.0+- 0.4/s     395%       --     -68%
     Seq 37.3+- 0.2/s    1442%     211%       --

 --- Lsystem dragon for 15 iterations ---
           Rate         Stream     Enum      Seq
  Stream 1.40+-0.01/s       --     -84%     -94%
    Enum 8.56+-0.12/s     513%       --     -64%
     Seq 23.7+- 0.3/s    1596%     177%       --

 --- Lsystem Tetradragon for 12 iterations ---
            Rate          Stream     Enum      Seq
  Stream 0.209+-0.001/s       --     -79%     -94%
    Enum  1.02+- 0.02/s     386%       --     -69%
     Seq  3.27+- 0.01/s    1465%     222%       --

After

 --- Lsystem Von_koch for 7 iterations ---
           Rate         Stream     Enum      Seq
  Stream 4.46+-0.02/s       --     -67%     -89%
    Enum 13.7+- 0.1/s     207%       --     -66%
     Seq 40.7+- 1.0/s     814%     198%       --

 --- Lsystem dragon for 15 iterations ---
           Rate         Stream     Enum      Seq
  Stream 2.80+-0.01/s       --     -73%     -89%
    Enum 10.3+- 0.1/s     267%       --     -61%
     Seq 26.4+- 0.1/s     844%     157%       --

 --- Lsystem Tetradragon for 12 iterations ---
            Rate          Stream     Enum      Seq
  Stream 0.364+-0.005/s       --     -70%     -90%
    Enum  1.20+- 0.01/s     229%       --     -66%
     Seq  3.51+- 0.01/s     863%     193%       --

P.S. : I just saw the benchsuite, Do I add something there ?

Owner

gasche commented Jan 31, 2014

I would be interested in seeing the benchmark code you use to measure speed. I don't understand why Seq would be so slow for example, is that a workflow with lots of sharing?

Owner

gasche commented Jan 31, 2014

PS: no, the Enum test coverage is currently not good enough to get us any confidence of correctness. The really hard part is clone, and testing clone is nearly as difficult as getting a proper specification for it, which we have failed to achieve so far.

Contributor

Drup commented Jan 31, 2014

Seq is not slow, it's the fastest of the three.

Member

c-cube commented Feb 4, 2014

20% speedup for Enum sounds a lot. It makes the not-much-more-complicated new version very attractive imho. Anyone against merging?

Member

rgrinberg commented Feb 4, 2014

I personally don't care as I don't use Enum unless absolutely forced nowadays. But yeah the change seems good and Enum is important to lots of people so LGTM

Owner

thelema commented Feb 5, 2014

@gasche what's the problem with clone x being specified to 1) not
observably modify x and 2) return an enumeration that is equal to x (in
that the sequence of values enumerated is identical)?

On Fri, Jan 31, 2014 at 3:44 AM, gasche notifications@github.com wrote:

PS: no, the Enum test coverage is currently not good enough to get us any
confidence of correctness. The really hard part is clone, and testing
clone is nearly as difficult as getting a proper specification for it,
which we have failed to achieve so far.


Reply to this email directly or view it on GitHubhttps://github.com/ocaml-batteries-team/batteries-included/pull/526#issuecomment-33769595
.

Member

c-cube commented Feb 5, 2014

The point of @gasche is, again, that Enum has a complicated semantics that makes its code awful (and slow, apparently), but it is still useful. I like the semantics proposed by @thelema but I think it's unreasonable w.r.t performances -- you shouldn't expect cloning to be useful in the general case; instead, a distinction should be made between one-shot iterators (IO streams for instance, or iterating over a mutable structure) and cloned/persistent iterators that can be used as many times as needed. A conversion operation (basically, store all elements in a hidden structure) could be used to convert the former into the latter if really needed.

Owner

thelema commented Feb 5, 2014

I agree that the semantics I propose don't have super efficient solutions
for IO streams (or other one-shot iterators), as the emitted values from
either Enum have to be cached somewhere so they can be emitted from the
other correctly. That said, this is not much worse than lazy list. Using
clone on a one-shot iterator can be a known performance drain.

E.

On Wed, Feb 5, 2014 at 12:47 PM, Simon Cruanes notifications@github.comwrote:

The point of @gasche https://github.com/gasche is, again, that Enum has
a complicated semantics that makes its code awful (and slow, apparently),
but it is still useful. I like the semantics proposed by @thelemahttps://github.com/thelemabut I think it's unreasonable w.r.t performances -- you shouldn't expect
cloning to be useful in the general case; instead, a distinction should be
made between one-shot iterators (IO streams for instance, or iterating over
a mutable structure) and cloned/persistent iterators that can be used as
many times as needed. A conversion operation (basically, store all elements
in a hidden structure) could be used to convert the former into the latter
if really needed.


Reply to this email directly or view it on GitHubhttps://github.com/ocaml-batteries-team/batteries-included/pull/526#issuecomment-34216357
.

Contributor

Drup commented Feb 9, 2014

@thelema : "Not much worse than lazylist" is already very bad, performance wise, for short-lived streams. I'm not sure why try to fix something when you have another better solution (cf my email on the ML, or @c-cube 's Sequence library).

Member

c-cube commented Feb 9, 2014

Agreed. If LazyList is faster than Enum then Enum is useless because the former is much simpler and easier to understand (cloning is trivial, for instance).
Which makes me think that maybe, lightweight iterators that provide a freeze : 'a gen -> 'a lazy_list clone-like operator may be very nice!

Owner

thelema commented Feb 10, 2014

This is a worst case performance for a case that enum wasn't really
designed for. The alternative is to give up and raise a not-supported
exception when someone tries to clone a stream that would need this kind of
special handling. I don't mind having known-slow behavior of Enum if the
benefit is more functionality for users.

On Sat, Feb 8, 2014 at 11:02 PM, Drup notifications@github.com wrote:

@thelema https://github.com/thelema : "Not much worse than lazylist" is
already very bad, performance wise, for short-lived streams. I'm not sure
why try to fix something when you have another better solution (cf my email
on the ML, or @c-cube https://github.com/c-cube 's Sequence library).


Reply to this email directly or view it on GitHubhttps://github.com/ocaml-batteries-team/batteries-included/pull/526#issuecomment-34564900
.

Member

c-cube commented Feb 10, 2014

Then, why not use LazyList instead of Enum everywhere? It can provide at least as much functionality and doesn't seem much slower...
@thelema weren't you planning to provide, in each module, conversion functions that can use different styles of iterators, at some point? Like of_gen, to_sequence, of_enum, etc.?

Member

c-cube commented Feb 10, 2014

Actually, let me detail my opinion. We've seen that Enum is complicated and quite slow (which is understandable given the amount of things it's trying to provide). On the other hand having a very convenient and compositional way of iterating over values is undoubtedly important and useful.

What Enum tried to provide is three-fold:

  • consumable iterator (allows for things like zip, or merge, cannot be done with my Sequence module's style of iterators)
  • ability to count elements quickly if possible
  • ability to clone the iterator so that its copy can be used without affecting (consuming) the original.

I have no idea of how to deal with the second point in general (other than cloning, then counting), but the two other points are imho dealt with in my module Gen in a good enough way (yes, shameless self-advertising). The idea is that cloning can basically work in two ways:

  1. the Enum is built from an underlying structure (list, array, tree, etc.) and cloning just requires copying some specific state of "where" the Enum points into the structure. I believe that in this case, using "restartable" iterators (which make cloning meaningless since you can create new iterators easily) is good enough. I really don't see the point of cloning in the middle of an iteration (use drop or drop_while to remove elements if required).
  2. the Enum is built from a consumable resource (Stream, some in_channel, etc.). In this case cloning is done by reading the whole input into a hidden structure (list), and then iterating on this list. Here it's the same as using Gen.persistent, which returns a restartable generator (cloning for free thereafter); an hypothetical conversion to LazyList might also be very interesting.

I'm promoting my own code because I like reinventing the wheel (according to many people on some IRC channel), but also because I believe it's much more readable and understandable in this case (and faster, according to benchmarks, but performance is always more tricky to debate). Take a look at the source of both Gen and Enum to compare. Another benefit is that the type 'a Gen.t = unit -> 'a option is transparent and can be used by modules (imported/exported) without depending on Gen itself, which would make splitting batteries into pieces easier.

Cheers!

Owner

gasche commented Feb 10, 2014

Instead of promoting your own code, you could promote Batteries code, after having sent pull requests to implement Gen-style iterators for Batteries datastructures. As you well know, I think we should provide the control-inverted dual as well, but that's another story.

Enum is not going away easily. But we can propose better alternatives.

Member

c-cube commented Feb 10, 2014

Well, sure! I can try to do that (it's much simpler than providing to_enum / of_enum anyway), soon.

Owner

gasche commented Apr 14, 2014

The benchmark seems flawed (in absolute time difference, Seq's speed improves more than Enum's speed, while the former is supposed to be unaffected by the patch), but we've argued enough about it. Drup says that, independently of the benchmark, he did observe a speed difference, and for this one time it will be enough.

@gasche gasche added a commit that referenced this pull request Apr 14, 2014

@gasche gasche Merge pull request #526 from Drup/master
concat_map inlining for BatStream and BatEnum
49b11ed

@gasche gasche merged commit 49b11ed into ocaml-batteries-team:master Apr 14, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment