New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adds R.stableSort and tests #1456
Conversation
if (comparator(a, b) === 0) { | ||
return -(copyList.indexOf(a) - copyList.indexOf(b)); | ||
} else { | ||
return comparator(a, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why call comparator
twice?
Should we update |
|
||
return copyList.sort(function(a, b) { | ||
if (comparator(a, b) === 0) { | ||
return -(copyList.indexOf(a) - copyList.indexOf(b)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative to scanning the list twice with indexOf
here would be to zip up the input list with the index of each field and then use the index to perform the secondary comparison.
e.g. something like:
var stableSort = R.curry(function (comparator, list) {
var withIndex = R.zip(list, R.range(0, list.length));
var sortedWithIndex = withIndex.sort(function(a, b) {
var order = comparator(a[0], b[0]);
return order === 0 ? a[1] - b[1] : order;
});
return R.map(R.head, sortedWithIndex);
});
I'd be interested to see how these two compare for different sized lists.
👍 |
Agreed. I've used that before for stable sorts. |
@scott-christopher Heres a benchmark of the different methods http://jsperf.com/stable-sort-ramda |
Hey guys, where are you on this? |
Nowhere at the moment. It's fallen between the cracks. @ConorLinehan: Any interest in addressing the concerns brought up in comments here? I would like to include this. |
@CrossEye Yup will do :) |
After doing the benchmarks (http://jsperf.com/stable-sort-ramda) with both discussed methods do we still want to update
|
I would look for ways to optimize the solution, but at a guess, we may need to have a separate |
I've added an extra benchmark case here which uses the same method, but hand-rolls the zip-with-index, etc. It shows a slight boost of performance in Chrome, but negligible in FF. 🤷 var stableSort3 = R.curry(function(comparator, list) {
var sorted = [];
var idx;
idx = 0;
while (idx < list.length) {
sorted.push([list[idx], idx]);
idx += 1;
}
sorted.sort(function(a, b) {
var order = comparator(a[0], b[0]);
return order === 0 ? a[1] - b[1] : order;
});
idx = 0;
while (idx < list.length) {
sorted[idx] = sorted[idx][0];
idx += 1;
}
return sorted;
}); |
* @memberOf R | ||
* @category List | ||
* @sig (a,a -> Number) -> [a] -> [a] | ||
* @param {Function} comparator A sorting function :: a -> b -> Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove a -> b -> Int
here. It's confusing that it doesn't match (a,a -> Number)
above, and at any rate it's unnecessary to duplicate the information.
I don't like the sound of this. Users would then need to choose between the "fast" sorting function which doesn't always do the right thing and the "slow" sorting function which does. This is incidental complexity. I'm imagining the entries in @svozza's I want to… table. I want to sort a list quickly and I want to stably sort a list (and I'm not in a hurry). ;) |
Which could also be written as, I want to sort a list and I want to sort a list so that equivalent items retain their original order, without mentioning speed or with just trailing parenthetical (faster) and (slower) notations. We can certainly look for more performant sorters. There is no reason to rely on Or we can live with the performance hit. That might well be the best, but it certainly would seem worth pursuing the options before making that decision. |
My view is that we should make a decision on behalf of our users. If we consider stable sorting to be important enough to justify a significant performance hit, we should update |
I can probably agree to that. It's probably not worth having a second function. But I think there is more investigation necessary to find if we can have our cake and still eat at least some of it. |
Paraphrasing from this, http://stackoverflow.com/questions/3026281/array-sort-sorting-stability-in-different-browsers, it seems chrome uses a none stable sort and are sticking to it(https://bugs.chromium.org/p/v8/issues/detail?id=90)
All the other browsers seem to use a stable sort. So from the above benchmarks here we would only be able to get near to firefox. Currently we're roughly 50% slower than firefox 47, So I suppose it's picking an acceptable tolerance? |
But see also a comparison with various merge-sort implementation: http://jsperf.com/stable-sort-comparison/5 Timsort is quite a bit faster, but it also is substantially more complex to implement. But most merge-sort implementation are stable, I believe. That might be enough. If we want stability, we obviously cannot use the native sort. Even if all the current crop of browsers did use a stable version, there is simply no guarantee. We could build atop the native version as above, or we could use our own, mergesort, timsort, or one of many other choices. |
So, should we consider an implementation like this instead?: module.exports = _curry2(function sort(comparator, list) {
return msort(_slice(list), 0, list.length);
function msort(list, begin, end) {
var size = end - begin;
if (size < 2) return;
var middle = begin + Math.floor(size / 2);
msort(list, begin, middle);
msort(list, middle, end);
merge(list, begin, middle, end);
return list;
}
function merge(list, begin, middle, end) {
while (begin < middle) {
if (comparator(list[begin], list[middle]) > 0) {
var val = list[begin];
list[begin] = list[middle];
insert(list, middle, end, val);
}
begin += 1;
}
}
function insert(list, begin, end, val) {
while (begin + 1 < end && comparator(list[begin + 1], val) < 0) {
swap(list, begin, begin + 1);
begin += 1;
}
list[begin] = val;
}
function swap(list, a, b) {
var tmp = list[a];
list[a] = list[b];
list[b] = tmp;
}
}); (Perhaps this would need to be pulled into an _internal version to be reused by Of course this depends hugely on internal mutation, but that doesn't escape the function, so I don't see any issues. This is a really basic merge sort and maybe one of the ones at http://jsperf.com/stable-sort-comparison/5 would be faster, but anything based on the same ideas might not only add stability but also actually be more performant (for now) than |
I'd be keen to see this move ahead. I think a not-guaranteed-to-be-stable sort implementation is a bit of a foot-gun in JavaScript, and I think Ramda would do well to guard users from that trap. |
Addresses #1135.