Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial transducers implementation #878

Merged
merged 1 commit into from
Mar 11, 2015
Merged

Conversation

kevinbeaty
Copy link
Contributor

  • Squashed from buzzedecafe:transduce
  • Rebased to master
  • Discussed in ramduce
  • Credits to @buzzdecafe and @kedashoe

Squashed Commits:

  • we have a map transducer
  • Apply all arguments to transducer as function.
  • Use switch case dispatch for transducer function
  • port @kedashoe and @kevinbeaty changes
  • Fix build and tests.
  • _appendXf as single expression and fix require order for bulid
  • foldl uses _xwrap for use with functions
  • _dispatchable accepts function for use in _groupBy
  • _isIterable uses _symIterator
  • Fix filter/map to use _dispatchable appropriately
  • Curry _xany, _xfilter, _xmap for use with _dispatchable
  • Fix idx lint warning in _arrayReduce
  • Remove no longer supported transformer dispatch test
  • Transducers for take, takeWhile, drop, dropWhile
  • get tests working again
  • transduce tests
  • improve map test coverage
  • Add into
  • all and any working
  • Add find, findLast, findIndex and findLastIndex transducers
  • Step default value in any/all
  • stepLast no longer requires initial value since transducers always step single value
  • Allow null/undefined accumulators when checking reduced
  • groupBy transducer

@kevinbeaty
Copy link
Contributor Author

Alternative to #865

@kevinbeaty
Copy link
Contributor Author

Note, the mocha tests pass, but it is failing on creation of index.js ... I still need to look into this. If anybody could offer any pointers, it would be appreciated.

@buzzdecafe
Copy link
Member

thanks for doing this so fast @kevinbeaty .

it is failing on creation of index.js

have you tried make clean and then try again?

@kevinbeaty kevinbeaty force-pushed the transducers branch 3 times, most recently from a3fff44 to 7439d16 Compare March 6, 2015 03:02
@kevinbeaty
Copy link
Contributor Author

it is failing on creation of index.js

All is well now.

@buzzdecafe
Copy link
Member

Man, this is a lot of work! I dropped the ball on this because i got very discouraged by the perf numbers. However, there is much that is good about this impl (if I say so myself) and there may be opportunities to optimize later.

I am thinking more and more about using properties on functions to store metadata about them, as recommended long ago by @megawac . I wonder if we could include an iteration strategy on a function, s.t. when called e.g. lazily(map(f)) it would employ a lazy iteration strategy on map; or more germane to this case, transduce(map(f), filter(g)) would use the transducer strategy over the composition. I don't know that this is possible,or how much work it would be, or how invasive. But if there is a way to thread the needle, it could be a way to get the best of several worlds.

That would address one of my primary concerns about this approach as it stands today: It is very invasive, essentially routing all iteration through reduce. It costs a lot in terms of typical-use performance, and you will not see the benefits until you get up to very large list sizes.

@svozza
Copy link
Contributor

svozza commented Mar 6, 2015

Wow. This is amazing.

@buzzdecafe buzzdecafe mentioned this pull request Mar 6, 2015
@kevinbeaty
Copy link
Contributor Author

I wonder if we could include an iteration strategy on a function

I think we could change the iteration strategy internally to the _transduceDispatch so I don't think it would be too invasive. My initial thought would be to include a Ramda meta data property using a Symbol on the function with fallbacks similar to how we are handling iterator and transformer symbols.

Something like:

var symRamdaMeta = typeof Symbol !== 'undefined' ? Symbol('ramdaMeta') : '@@ramdaMeta'
fn[symRamdaMeta] = {}

// in _transduceDispatch or transduce... whatever makes sense
fn[symRamdaMeta].iterationStrategy = 'transduce'

@buzzdecafe
Copy link
Member

My initial thought would be to include a Ramda meta data property using a Symbol on the function with fallbacks similar to how we are handling iterator and transformer symbols.

👍

fn[symRamdaMeta].iterationStrategy = 'transduce'

I was imagining a reference to a function, maybe something like:

fn[symRamdaMeta].strategy = {
    lazy: lzIter,
    xf: xfIter,
    //... etc.
};

then the implementation of lazily could query the function and grab the correct iteration strategy. We can fight over design details later.

@kevinbeaty
Copy link
Contributor Author

I was imagining a reference to a function

Ah. Gotcha. That could work well.

We can fight over design details later.

Do you want me to propose something and update the PR, or are you saying this is a separate issue?

@buzzdecafe
Copy link
Member

Do you want me to propose something and update the PR, or are you saying this is a separate issue?

I would like to build/see a smaller scale proof of concept first. So I think a separate issue/PR makes sense

@kevinbeaty kevinbeaty force-pushed the transducers branch 2 times, most recently from ada9d83 to ae33328 Compare March 7, 2015 14:28
@kevinbeaty
Copy link
Contributor Author

I updated the PR with a few minor changes to _dispatchable to help reduce the intrusion/performance concerns. _dispatchable now resembles _checkForMethod with the exception of dispatching on a transformer given in list position. Like _checkForMethod, the function to call as a fallback is passed in to allow an optimized version in replace of using transducers and reduce for the common case. I changed map to use the internal version. Note I did not need to do anything with function meta data. We might be able to make of use of that later, but it was not necessary with this change.

Using old version (map through reduce in common case on my laptop):

Running suite map [lib/bench/map.bench.js]...
_.map x 2,915,839 ops/sec ±4.59% (85 runs sampled)
map(sq, nums) x 339,775 ops/sec ±4.09% (84 runs sampled)
map(sq)(nums) x 307,052 ops/sec ±4.41% (80 runs sampled)
mapSq(nums) x 321,318 ops/sec ±4.33% (81 runs sampled)
Fastest test is _.map at 8.6x faster than map(sq, nums)

Using new version (map using internal _map in common case):

Running suite map [lib/bench/map.bench.js]...
_.map x 3,464,050 ops/sec ±2.51% (91 runs sampled)
map(sq, nums) x 2,057,462 ops/sec ±0.89% (100 runs sampled)
map(sq)(nums) x 1,715,320 ops/sec ±0.95% (100 runs sampled)
mapSq(nums) x 1,977,764 ops/sec ±0.96% (98 runs sampled)
Fastest test is _.map at 1.68x faster than map(sq, nums)

Does this make you feel better about things? I can update the other functions to fallback to the internal versions if you like this approach.

@kevinbeaty kevinbeaty force-pushed the transducers branch 3 times, most recently from fc7730e to 4c6fcf8 Compare March 7, 2015 16:42
@buzzdecafe
Copy link
Member

wow @kevinbeaty that is quite an improvement

@kevinbeaty
Copy link
Contributor Author

Latest update, after a few more tweaks to _dispatchable

Running suite map [lib/bench/map.bench.js]...
_.map x 3,526,052 ops/sec ±0.92% (100 runs sampled)
map(sq, nums) x 3,165,482 ops/sec ±0.93% (100 runs sampled)
map(sq)(nums) x 2,337,636 ops/sec ±3.02% (93 runs sampled)
mapSq(nums) x 3,297,122 ops/sec ±1.16% (94 runs sampled)
_.map at 1.07x faster than mapSq(nums)

@buzzdecafe
Copy link
Member

🎆

@kevinbeaty
Copy link
Contributor Author

Yeah. So TIL that typeof checks can make a big difference in micro benchmarks.

So is that a 👍 to convert the rest of the _dispatchable functions in this PR to this new approach? Does it alleviate your concerns?

@buzzdecafe
Copy link
Member

i'm pretty gobsmacked actually. well done! and yes, let's proceed. Anyone else have $0.02 to contribute?

@kevinbeaty
Copy link
Contributor Author

Ok. Converted the other functions and removed no longer needed helper functions (including _transduceDispatch). The diff is cleaner now. Probably also a good time for interested parties to review.

A few notes:

  • Some functions now have "functor method dispatch" that didn't before all any drop dropWhile find findIndex findLast findLastIndex groupBy ... If this is controversial, I can remove this by passing null as methodname for these functions and handle appropriately in _dispatchable. Say the word and I'll make this change.
  • reduce (and by extension groupBy) still has a performance hit due to addition of iterable dispatching and __reduced__ checks. Maybe this is OK. I haven't looked at it much so there may be opportunity for improvement.

@buzzdecafe
Copy link
Member

all any drop dropWhile find findIndex findLast findLastIndex groupBy

do any other libs have these object methods that we want to enable point-free composition for? most of these seem to be pretty array specific.

@buzzdecafe
Copy link
Member

took a quick look:

  • highland has find; that would be nice support;
  • lazy.js: every (all); some (any); this would mean 👽 which is ok by me, but may be a fight; also drop and dropWhile; also find
  • etc.

so for the most part i guess we do want to dispatch on these; in some cases we may have to alias 😱 if we want interoperability

@kevinbeaty
Copy link
Contributor Author

Incorporated latest tweaks from @AutoSponge and @davidchambers

* used to convert the final accumulator into the return type and in most cases is R.identity.
* The init function is used to provide the initial accumulator.
*
* The iteration is performed with R.reduce after initializing the transducer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terrific description!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@megawac
Copy link
Contributor

megawac commented Mar 11, 2015

👍 looks pretty much good to me. I would say making a X base class would be something worth doing in the future to remove some of the redundancy tho

* @func
* @memberOf R
* @category List
* @sig a -> b -> c -> a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite right. Perhaps something like this:

a -> (b -> b) -> [c] -> a

I sense there's some relationship between a, b, and c, but this might only be apparent in b's methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change here and similarly in transduce.

@CrossEye
Copy link
Member

This is fantastic. I've finally gotten through a once-over of everything and managed to run some performance tests (no significant changes for current benchmarks; presumably there would be speed-ups in combination ones we never wrote.)

I'm very, very impressed!

I was barely listening as the initial passes at this stuff was going on. I knew what you were talking about but not much about how you were going about it. This looks really well-done.

I wouldn't delay merging the code for this, as we can address it later. But I'd like to either have a discussion or to learn about reasons already discussed for choosing function constructors over Object.create (even if we shimmed it) or a possibility like this, for _xmap:

var _curry2 = require('./_curry2');


module.exports = _curry2(function _xmap(f, xf) {
    return {
        xf: xf,
        f: f,
        init: function() {
            return this.xf.init();
        },
        result: function(result) {
            return this.xf.result(result);
        },
        step: function(result, input) {
            return this.xf.step(result, this.f(input));
        }
    }; 
});

It's not at all that I think one of these other options is necessarily better, but I think we should consider them... unless this was already hashed out when I was not paying attention.

Really nice work!

- Squashed from buzzedecafe:transduce
- Rebased to master
- Discussed in ramduce
- Credits to @buzzdecafe and @kedashoe

Squashed Commits:
- we have a map transducer
- reduce uses _xwrap for use with functions
- _dispatchable as transducers when passed a transformer
- _isIterable uses _symIterator
- Fix filter/map to use _dispatchable appropriately
- Curry _xany, _xfilter, _xmap for use with _dispatchable
- Transducers for take, takeWhile, drop, dropWhile
- transduce tests
- Add into
- all and any working
- Add find, findLast, findIndex and findLastIndex transducers
- Allow null/undefined accumulators when checking reduced
- groupBy transducer
@kevinbeaty
Copy link
Contributor Author

@kevinbeaty
Copy link
Contributor Author

@megawac

I would say making a X base class would be something worth doing in the future to remove some of the redundancy tho

@CrossEye

But I'd like to either have a discussion or to learn about reasons already discussed for choosing function constructors over Object.create

All options worth exploring.

I wouldn't delay merging the code for this, as we can address it later.

👍 Please :)

@CrossEye
Copy link
Member

Everyone ok with merging this one? This is one of our most sweeping changes to date, so I'd really like to hear from @buzzdecafe, @davidchambers, @kedashoe, and (I suppose 😉) @kevinbeaty if there any objections. Of course anyone is free to chime in. (Especially you @megawac. 😉)

@buzzdecafe
Copy link
Member

let's do it -- big thanks to @kevinbeaty and @kedashoe for their great work -- particularly reviving this when i had dropped the ball.

One caveat -- this now means we have clojure-style iterators when transducing. SO they support array-like, string, iterator-protocol, and object; this means weird unspecified behavior when iterating objects because of indeterminate iteration order.

@AutoSponge
Copy link
Contributor

👍

Great work. One question, would this constitute a major or minor version?

@buzzdecafe
Copy link
Member

would this constitute a major or minor version?

since we're pre-1.0 I think we can get away with a minor bump

@kevinbeaty
Copy link
Contributor Author

One caveat -- this now means we have clojure-style iterators when transducing. SO they support array-like, string, iterator-protocol, and object; this means weird unspecified behavior when iterating objects because of indeterminate iteration order.

Actually, there is no implicit object iteration in this PR for this very reason (we could certainly add it in the future). It currently supports merging objects on the accumulator side but iteration of objects must be explicit (using ES6 style Symbol.iterator or @@iterator)

@buzzdecafe
Copy link
Member

i stand corrected! thanks @kevinbeaty

@davidchambers
Copy link
Member

🌳

buzzdecafe added a commit that referenced this pull request Mar 11, 2015
Initial transducers implementation
@buzzdecafe buzzdecafe merged commit 950dc5a into ramda:master Mar 11, 2015
@buzzdecafe
Copy link
Member

🎉

@megawac
Copy link
Contributor

megawac commented Mar 11, 2015

🎊 👯 🎈

@svozza
Copy link
Contributor

svozza commented Mar 11, 2015

Fantastic work!

@kevinbeaty kevinbeaty deleted the transducers branch March 11, 2015 17:49
@kevinbeaty
Copy link
Contributor Author

Awesome.

I have to admit, hitting that "Delete Branch" button felt pretty good ;)

Thanks all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants