Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE][ADD] Add stableSort function #592

Merged
merged 7 commits into from Feb 18, 2018

Conversation

simov
Copy link
Contributor

@simov simov commented Feb 6, 2018

New Snippet stableSort

Most browser engines support stable sorting via Array.sort() but not Chrome/NodeJS.

More information:

What does your PR belong to?

  • Website
  • Snippets
  • General / Things regarding the repository (like CI Integration)
  • Tests

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (non-breaking improvement of a snippet)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have checked that the changes are working properly
  • I have checked that there isn't any PR doing the same
  • I have read the CONTRIBUTING document.

@Chalarangelo
Copy link
Owner

I think, due to the fact that this is natively implemented in most environments, we can either not include it or archive it, but I'm pretty certain it should not go on the main snippet list.

@Chalarangelo Chalarangelo added the blocked Blocked by some other action. label Feb 6, 2018
@simov
Copy link
Contributor Author

simov commented Feb 6, 2018

Chrome and NodeJS represent a fairly large portion of the JavaScript community, and the stable sort is definitely useful. It's up to you.

Copy link
Contributor

@fejes713 fejes713 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! I will leave the decision to more experienced members of the core team. 👍

However, I would kindly suggest you take a look at ES6 syntax. There is a lot of fun stuff out there and we're implementing our codebase only with the latest. Look for const and let, they are the great improvement over standard var.

@skatcat31
Copy link
Contributor

let a = [1,2,3,7,6,5]
for(let i = 0;i < 100000;i++){console.log(a.sort(()=>0))}

If you can prove it is not stable I'm more than happy to reopen this, but after several tests like this I noticed no change in the order. If sort were unstable the order would have changed

@skatcat31 skatcat31 closed this Feb 7, 2018
@simov
Copy link
Contributor Author

simov commented Feb 7, 2018

@skatcat31 the proof is in the example here is how you can reproduce it:

  1. Open up Firefox
  2. Copy/paste the stableSort function in the developer console
  3. Copy/paste the example in the developer console

The result is true for both the built-in Array.sort() and for the stableSort implementations.

Now open up Chrome and do the same. You'll notice that the results are different according to my comments in the example.

The exact same example can be found in the tests as well. There is a link to the original test, sorting [1,2,3,7,6,5] is just not enough to test it properly.

Also what exactly do you check with this .sort(()=>0)? You are not moving anything anywhere. TBH I'm not sure what exactly the bitwise comparator does: var compare = (a, b) => ~~(str.indexOf(b) / 2.3) - ~~(str.indexOf(a) / 2.3);, all I know is that the exact same code yields different results in different browsers.


@fejes713 I thought the linter would do it, just like the extra semicolons. Again from the example, this time with const:

const str = 'abcdefghijklmnopqrstuvwxyz';
const compare = (a, b) => ~~(str.indexOf(b) / 2.3) - ~~(str.indexOf(a) / 2.3);
const arr1 = str.split('');
const result = arr1.sort(compare)

Now if you copy/paste the above code in your browser and inspect arr1 and result they both hold the same value even if arr1 was declared as const and that's not fun.

@simov
Copy link
Contributor Author

simov commented Feb 7, 2018

BTW I've just made the stableSort function even shorter also added const where appropriate i.e only the line with the built-in Array.sort() is using var because it sorts the array in-place. No idea why the commit isn't showing up here.

@skatcat31
Copy link
Contributor

skatcat31 commented Feb 7, 2018

@simov it is a double negated bit-wise not operation. It is considered unsafe for comparison purposes on floats.

The compare function is double negating EVERY BIT of the division, which in ES are unsafe on floats due to precision. This is not an unstable sort, but an unstable comparison operator in MOST JS engines that properly implement the float definition as per ES. Case in point, change 2.3 to Math.PI and it will NEVER work because at that point FF needs to switch to the larger Float as per ES, and it is not accurate. In fact your stable sort fails as well even though Math.PI is a constant value.

@skatcat31
Copy link
Contributor

@simov What I check with .sort(()=>0) is the stability of the sort as per your definition. Yes I am not moving any values, but if sort were unstable order would not be preserved because the values, to sort, are told to be equivalent, regardless of their actual value.

If sort did the behavior you describe, it would have changed their order.

@simov
Copy link
Contributor Author

simov commented Feb 7, 2018

@skatcat31 Thanks for the explanations. I though the comparison might be faulty. Here I found a module that claims to be stable sorting. It have 1.8M downloads in the last month, so at least a few people think that's a stable sort. My function passes their tests, unfortunately the built in Array.sort() also passes them. So I'll look around for a better test or implement one, probably you can help with that too?

My current test however is a script that runs on every 10 minutes for months now that sorts Facebook events by their start_time. Of course some events start at the same time - date/hour/minutes. The comparison is made using new Date(event.start_time).getTime() so no floating points there. As you might guess the built in Array.sort() changes the events order for events with equal start_time. It happens randomly on NodeJS. Then I implemented the above function and the sorting is stable since then - for a couple of months now.

Also that's not my definition for stable sorting you can look it up on the internet.

@skatcat31
Copy link
Contributor

For your timed app it would depend on the comparison operation used. Without seeing what the exact code is I wouldn't be able to tell you more( feel free to contact me on gitter about it if you don't want to post it here ).

As for that stable sort, it's a bubble sort implementation compared to a fast sort like ES uses by default. It's also dependent on the comparison being consistent. Irrational numbers still ruin it.

@simov
Copy link
Contributor Author

simov commented Feb 7, 2018

Here is very simple explanation of how stable sort works.

And the implementation:

// https://github.com/mziccard/node-timsort/#stability

var t = require('assert')

// sorted by weight
var input = [ 
  { height: 100, weight: 80 },
  { height: 90, weight: 90 },
  { height: 70, weight: 95 },
  { height: 100, weight: 100 },
  { height: 80, weight: 110 },
  { height: 110, weight: 115 },
  { height: 100, weight: 120 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 100, weight: 135 },
  { height: 75, weight: 140 },
  { height: 70, weight: 140 } 
]

// sorted by height (using stableSort)
var target = [
  { height: 70, weight: 95 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 70, weight: 140 },
  { height: 75, weight: 140 },
  { height: 80, weight: 110 },
  { height: 90, weight: 90 },
  { height: 100, weight: 80 },
  { height: 100, weight: 100 },
  { height: 100, weight: 120 },
  { height: 100, weight: 135 },
  { height: 110, weight: 115 }
]

var stableSort = (arr, compare) => arr
  .map((item, index) => ({item, index}))
  .sort((a, b) => compare(a.item, b.item) || a.index - b.index)
  .map(({item}) => item)

var compare = (a, b) => a.height - b.height

// passes
t.deepStrictEqual(stableSort(input, compare), target)
// throws
t.deepStrictEqual(input.sort(compare), target)

// Array.sort() expects something like this
var unstable = [
  { height: 70, weight: 140},
  { height: 70, weight: 95 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 75, weight: 140 },
  { height: 80, weight: 110 },
  { height: 90, weight: 90 },
  { height: 100, weight: 100 },
  { height: 100, weight: 80 },
  { height: 100, weight: 135 },
  { height: 100, weight: 120 },
  { height: 110, weight: 115 }
]

I've uploaded a gist for convenience.

@skatcat31 is this test enough to convince you that my implementation is correct?

@skatcat31
Copy link
Contributor

skatcat31 commented Feb 7, 2018

No because it's comparing a different comparison on an unrelated index. There is a fundamental problem with that comparison. What you just did was compare two different conditionals in your sorts.

Lets go back to the definition of a stable sort:

  • When told the values are the same, order is preserved

This is tested by taking an array, calling sort, and telling it ALL values are the same and seeing if the order changes. There is no other way to test that type of stability.

What you want is an indexed preserving sort, which does not care about stability, but enforces it through relativity. That is a VERY helpful snippet, and is exactly what you've implemented here. Now that we've reached that conclusion, I'd be happy to re-open this with a request for changes review that includes changing the name, and the snippet to the version you placed above if you'd like. It does mean this PR name is a little misleading though.

@skatcat31
Copy link
Contributor

As a side note: bit-wise modification comparisons are actually considered unsafe for ANY comparison due to changing the Byte size as it turns out, and is a problem that persists in ALL languages that do not preserve ALL bytes with a bit-wise modification( coercion or modification ), of which ES is one of those languages.

I dug a little deeper into the theory behind the constructs and stumbled upon that ES and bit-wise negate coerce a copy value in byte form, which leads to changing sizes on anything but small enough integers directly

@skatcat31
Copy link
Contributor

After re-reading the thread I did miss the intent of your snippet based on the description. As such I offer an apology on that matter and should have recognized that it was a different algorithm in the first place. I am sorry for that. I'll try to take a closer look in the future, but as it stands I'm doing this in between things at work

@simov
Copy link
Contributor Author

simov commented Feb 7, 2018

Thanks @skatcat31 ! No problem. If you re-open this PR so that at least it's up for consideration, I'll update it with the latest test so that it's easier to understand and useful to others.

@skatcat31
Copy link
Contributor

Sure. I'll submit the request for changes as I reopen it too

@skatcat31 skatcat31 reopened this Feb 7, 2018
Copy link
Contributor

@skatcat31 skatcat31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just update the name and change it to a const where needed and it's good to go. The only question now is what would be a good name

Returns new array without modifying the initial one.

```js
var stableSort = (arr, compare) =>

This comment was marked as spam.

@skatcat31 skatcat31 removed the blocked Blocked by some other action. label Feb 7, 2018
@skatcat31
Copy link
Contributor

Now the question is what should it's name be... preservedSort? indexedSort?

@simov
Copy link
Contributor Author

simov commented Feb 7, 2018

I'm not sure according to Wikipedia stableSort is the correct name:

.. if two items compare as equal, like the two 5 cards, then their relative order will be preserved, so that if one came before the other in the input, it will also come before the other in the output.

@skatcat31
Copy link
Contributor

Fair I guess. Then just update to the const delcerations and let declerations and I'm happy and so will fejes be

Copy link
Contributor

@skatcat31 skatcat31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Var should be avoided

const compare = (a, b) => ~~(str.indexOf(b) / 2.3) - ~~(str.indexOf(a) / 2.3);

// default Array.sort() is unstable in Chrome and NodeJS + modifies the input array
var arr1 = str.split('');

This comment was marked as spam.

This comment was marked as spam.

console.log(arr1.sort(compare).join('') === 'xyzvwtursopqmnklhijfgdeabc'); // false

// stable sort + returns new array
const arr2 = str.split('');

This comment was marked as spam.

@skatcat31
Copy link
Contributor

skatcat31 commented Feb 7, 2018

```js
var a;
for(let i = 0; i < 1000; i++){
  a = [1,2,3,4,5,6,7,8,9]
  console.log(a.sort((a,b) => a%2 - b%2)) // [2,4,6,8,1,3,5,7,9]
}
```
BAH Chrome uses several different sorting algorithms while FF decided to just always use merge sort

Copy link
Owner

@Chalarangelo Chalarangelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples are definitely excessive. Please provide just one example with less data, so that it's easy to read quickly and understand the result.

@skatcat31
Copy link
Contributor

After doing some DEEEP diving into the code for V8 and how it handles sorts, it is apparent that there are many cases where it could be stable and many cases where it could be unstable. Due to the absolute absurdity of it this is a useful snippet. We do need however a much more concise example that isn't so long

@skatcat31
Copy link
Contributor

let a = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
a.sort((a,b) => a%2 - b%2) // [8, 14, 12, 4, 10, 6, 2, 7, 9, 5, 11, 3, 13, 1]
a = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
a.sort(() => 0) // [8, 1, 3, 4, 5, 6, 7, 2, 9, 10, 11, 12, 13, 14]

@skatcat31
Copy link
Contributor

The two example above should prove that sort is unstable. I had to dig deep to discover that V8 has 2 different sorting algorithms

@Chalarangelo
Copy link
Owner

What I meant is that we should just link to an article or something explaining why and how sorting is unstable and provide a shorter example showcasing what we achieve with this. If someone is looking for this, they should already have a basic understanding of the problem anyways. Also, this should definitely be tagged advanced as it appears worthless to the casual reader who has not faced this issue (ie a lot of people, provided I have never heard of this before and I have been coding in JS for over 3 years now).

@skatcat31 skatcat31 dismissed fejes713’s stale review February 8, 2018 01:03

Requested changes were implimented

@skatcat31
Copy link
Contributor

dismissed @fejes713 request for changes since those were addressed. Now it's just waiting for a smaller example or clarification on the purpose and then it'll be ready

@simov
Copy link
Contributor Author

simov commented Feb 8, 2018

I've added the advanced tag and removed the Array.sort() part of the example. The example data causes the built-in sort to fail on the rule of stable sorting. So it'll be easier for the user to test and see what exactly is the problem. Probably a line or two can be removed but overall the data set should be this.

@Chalarangelo
Copy link
Owner

Chalarangelo commented Feb 8, 2018

I would also like to run some tests before merging, because I think that mapping to arrays with a value-index pair and mapping back from them might perform slightly faster than using objects. Granted, the object technique is pretty cool and it showcases implicit property naming, but if the performance is a lot better with arrays, we might want to stick with that.

@kingdavidmartins
Copy link
Contributor

kingdavidmartins commented Feb 8, 2018

Hey All I know I am joining in pretty late into the convo.

But why not just implement a mergeSort, quickSort & heapSort snippet. They are all self explanatory and has their specific trade offs.

I def agree with @Chalarangelo statement

If someone is looking for this, they should already have a basic understanding of the problem anyways.

And will most likely implement a mergeSort especially since in it's entirety it's a stable sort.

Typically in CS the 3 basic sorts before branching into esoteric & heuristic based sorts.

  • quickSort Time: worst case: O (N^2) | Time: average case: O (N LOG (N)) | Space: worst case: O (N LOG (N)) | unstable
  • mergeSort Time: worst case: O (N LOG (N)) | Time: average case: O (N LOG (N)) | Space: worst case: O (N) |stable
  • heapSort Time: worst case: O (N LOG (N)) | Time: average case: O (N LOG (N)) | | Space: worst case: O (N)

And so on.... Please do a quick look up if you want to dive deeper

They all trade offs from time complexity, to space complexity, state management and etc.

Excluding non-comparative integer sorting algorithms like radix sort & etc

I am def against using .sort() to implement a stable sort for the following reason as stated by MDN

The sort() method sorts the elements of an array in place and returns the array. The sort is not necessarily stable. The default sort order is according to string Unicode code points.

I believe if we truly want to have people learn how to implement a "stable sort" it shouldn't be dependent on .sort() and should show them the simple implementation in it's entirety which uses recursion which isn't really complicated to implement

In regards to how languages which have native sorting implementations. It's never 1 specific algorithm and relies on a plethora of checks, optimizations, branching, and etc. But on average it relies on a much more complicated variant quickSort then branches from there.

@simov
Copy link
Contributor Author

simov commented Feb 8, 2018

@kingdavidmartins thanks for your input! I really like this type of conversation, however the point of this snippet is:

The simplest possible way to implement stable sort, and you need less than 30 seconds to understand it.

Besides that this implementaion does not modify the input array, because of the map prior sorting it.

@Chalarangelo we can micro benchmark Array vs Object initialization time + destructuring, but I'm afraid that's not my point either.

@Chalarangelo
Copy link
Owner

@kingdavidmartins Implementing commonly used algorithms is out of the scope of the project, as we have discussed before, but I am not against adding said snippets in the archive, as they can be extremely useful to some people. Using the native sortigg lowers the barrier of entry for people and makes it easier to understand, as well as showcasing what can be done in order to improve the native sorting. Usually, native implementations are optimized a lot more than what we can code in a short snippet, so I'd rather we stuck with the native Array.sort() for stable sort and separately showcase sorting algorithms in the archive, if we decide to do so.

@simov I understand the problem, in fact I think I might have faced it in the past once. JS's sorting implementations are not exactly perfect in regards to when the sorting function returns 0, so stable sorting is something that could definitely be added as a snippet (hece your PR).

What I meant about the performance is that we want to make snippets as useful as possible, so we might want to check if arrays are faster than objects in the mapping, so that sorting huge collections can be as fast as possible. It's a tiny change anyways and will not harm the snippet at all (if arrays are in fact faster, which I still have not checked).

@Chalarangelo
Copy link
Owner

Apparently, as the test indicated, the object version is barely faster, so nevermind my suggestion. Also, the object version showcases some nice techniques such as implicit object property naming, which is pretty cool, so we're sticking with that.

@simov
Copy link
Contributor Author

simov commented Feb 8, 2018

@Chalarangelo I've stumbled upon it only a few months ago, and it was really annoying. So I think it's handy to have this little function at your disposal.

@skatcat31
Copy link
Contributor

skatcat31 commented Feb 8, 2018

@simov a shorter example was already posted that shows that sort is unstable when using it's secondary algorithm. If the length of the array is only 10 it uses one algorithm that is stable. If it's longer than that it uses a different, not stable algorithm. The smallest case to demonstrate this is as follows based on the current V8 source code definition of it's sorting algorithm:

let a = [0,1,2,3,4,5,6,7,8,9,10]
stableSort(a, () => 0) // [0,1,2,3,4,5,6,7,8,9,10]
a.sort(() => 0) // [5, 0, 2, 3, 4, 1, 6, 7, 8, 9, 10]

I think if we use that example it will be enough and short enough

@simov
Copy link
Contributor Author

simov commented Feb 9, 2018

@skatcat31 that's clever! Now I'm wondering which one would be easier to understand for most of the people, otherwise your example fits the requirement to be short better and I personally like it.

BTW the spread operator can be used to deal with the in-place sort of the built-in method:

let a = [0,1,2,3,4,5,6,7,8,9,10]
stableSort(a, () => 0) // [0,1,2,3,4,5,6,7,8,9,10]
;[...a].sort(() => 0) // [5, 0, 2, 3, 4, 1, 6, 7, 8, 9, 10]

or new Array(...a) depending on which one do you prefer.

@skatcat31
Copy link
Contributor

Or Array.from

let a = [0,1,2,3,4,5,6,7,8,9,10]
stableSort(a, () => 0) // [0,1,2,3,4,5,6,7,8,9,10]
Array.from(a).sort(() => 0) // [5, 0, 2, 3, 4, 1, 6, 7, 8, 9, 10]

@kingdavidmartins
Copy link
Contributor

@Chalarangelo sure. I will implement them and add them to archive folder. Hopefully it helps

@simov
Copy link
Contributor Author

simov commented Feb 11, 2018

@Chalarangelo, as suggested by @skatcat31, ee64565

@Chalarangelo Chalarangelo merged commit cfede03 into Chalarangelo:master Feb 18, 2018
@lock
Copy link

lock bot commented Dec 18, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for any follow-up tasks.

@lock lock bot locked as resolved and limited conversation to collaborators Dec 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants