Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Shortened version of unzip that doesn't create sparse arrays #1047

Merged
merged 4 commits into from

4 participants

@almost

@jdalton informed me that some browsers have problems with sparse arrays1 so I re-wrote _.unzip to not create them. In the process I managed to reduce the amount of code and I'm quite pleased with the result :)

@jdalton

_.bind(_.pluck, null, tuples) can be _.partial(_.pluck, tuples).

Thanks, have changed it. I didn't know _.partial was in underscore, thought it was just in low-dash

No problem, my "code reads" are free ;)
The _.partial method was added in Underscore v1.4.4, ~9 months after it was added to Lo-Dash.
However, it's still not as robust as Lo-Dash's, which also mimics the behavior of _.bind when called as a constructor.

@jdalton

I spend a lot of my time unrolling code, but I've got to hand it to you.
You've used the methods available to really pack this down.

slow clap

underscore.js
@@ -506,16 +506,8 @@
// `[['a',1],['b',2],['c',3]]` returns the array
// [['a','b','c'],[1,2,3]].
_.unzip = function(tuples) {
- var results = [];
- _.each(tuples, function (tuple, tupleIndex) {
- _.each(tuple, function (value, itemIndex) {
- if (results.length <= itemIndex) {
- results[itemIndex] = [];
- }
- results[itemIndex][tupleIndex] = value;
- });
- });
- return results;
+ var maxLen = _.max(_.pluck(tuples, "length"))
+ return _.map(_.range(maxLen), _.partial(_.pluck, tuples));
@almost
almost added a note

return _.times(maxLen, _.partial(_.pluck, tuples)); almost works except in the case of an empty list input in which case maxLen becomes -Infinity which _.times doesn't like. So close :)

@jdalton
jdalton added a note

return _.times(maxLen, _.partial(_.pluck, tuples)); almost works except in the case of an empty list input in which case maxLen becomes -Infinity which _.times doesn't like. So close :)

It would work in Lo-Dash ;)

@almost
almost added a note

Do you think it should work? I think it probably should, being able to pass in a negative number and have _.times not through an error (just return an empty array) seems like a useful thing and matches up what would happen if you were using a regular for loop. It also matches the behaviour of _.range.

I've created a pull request for the change here #1048

@jashkenas Owner

Now that it does work, want to revise this PR?

@jdalton
jdalton added a note

Here is the perf and size break downs:
http://jsperf.com/unzip-vs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@almost

I've updated the code to use my patch just merged for _.times

@jashkenas jashkenas merged commit 556be78 into from
@michaelficarra
Collaborator

@jashkenas: You sure about this? It's even slower than the already slow implementation underscore already had.

@jashkenas
Owner

I could go either way, but unzip doesn't seem like a super hot path kind of function -- at least for most folks. Feel free to revert if you feel strongly otherwise.

@almost

Happy to provide an iterative version that doesn't suffer from the sparse array problem if you want. But I kind of figured the same as you that most of the time the speed of unzip isn't going to be the limiting thing in a system. But then I always tend to lean more towards clear and less towards fast, and maybe that's not the correct thing to do in a widely used library :p

@michaelficarra
Collaborator

@almost: I agree completely. I would write my code like this if it was my own. But this is a very popular library that some people may expect to be performant. And with such huge differences in performance, it's probably better to opt for the unrolled/inlined/iterative approach. But we shouldn't just apply that technique to unzip. All underscore functions that depend on other underscore functions would probably benefit greatly.

@almost

I guess the question is how far to take it, the original used _.each but could have been written with just for loops. The newer version makes an improvement to the older version in that it doesn't produce sparse arrays. To write it completely iteratively would be quite a bit larger than it currently is. May well be worth it though, will have a go.

@almost

Ok, just pushed up an iterative version, and it's actually quite short :)

Here are performance comparisons (iterative is fastest by quite a way of course): http://jsperf.com/various-versions-of-unzip

@almost

See #1049

@jdalton

@michaelficarra

I agree completely. I would write my code like this if it was my own. But this is a very popular library that some people may expect to be performant. And with such huge differences in performance, it's probably better to opt for the unrolled/inlined/iterative approach. But we shouldn't just apply that technique to unzip. All underscore functions that depend on other underscore functions would probably benefit greatly.

Nice, you've been picking up what I've been laying down :3

@michaelficarra
Collaborator

@jdalton: Your JSConf talk a few years ago convinced me of that fact. Also, pulling feature tests out of functions that provide conditional functionality and instead using separate functions. And the convenient JS technique that allows us to have functions that are a combo of both: they have the conditional in them for the first call, but redefine themselves for more efficient subsequent calls. It was a good talk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 2 additions and 10 deletions.
  1. +2 −10 underscore.js
View
12 underscore.js
@@ -506,16 +506,8 @@
// `[['a',1],['b',2],['c',3]]` returns the array
// [['a','b','c'],[1,2,3]].
_.unzip = function(tuples) {
- var results = [];
- _.each(tuples, function (tuple, tupleIndex) {
- _.each(tuple, function (value, itemIndex) {
- if (results.length <= itemIndex) {
- results[itemIndex] = [];
- }
- results[itemIndex][tupleIndex] = value;
- });
- });
- return results;
+ var maxLen = _.max(_.pluck(tuples, "length"))
+ return _.times(maxLen, _.partial(_.pluck, tuples));
};
// Converts lists into objects. Pass either a single array of `[key, value]`
Something went wrong with that request. Please try again.