Change the shuffle to the Fisher-Yates shuffle #72

Merged
merged 1 commit into from

2 participants

@d8uv
Collaborator

The current "Shuffling Array Elements" page recommends a naive algorithm, which produces a slow and biased shuffle. Included is a highly idiomatic refactorization of the Fisher-Yates shuffle (aka the Knuth Shuffle), a less-idiomatic-but-better refactorization, and a version of the algorithm that adds to Array.prototype, for those who program in that style.

@d8uv d8uv Change the shuffle to the Fisher-Yates shuffle
fa13f3b
@dbrady
Owner

THIS. This is what a fantastic recipe entry looks like. I saw this email and thought "oh no, they've deleted the simple shuffle and replaced it with a needlessly complex one". I came in here expecting to write a message asking you to preserve the old shuffle, but then explain when and why it might be unsuitable, and propose the Fisher-Yates shuffle as a replacement. When I read your diff I wanted to cry with joy. d8uv, this is awesome. THANK YOU.

This is the happiest "I hereby give you commit rights" message I've written to date. :-) Welcome!

@dbrady dbrady merged commit d0ee5cb into coffeescript-cookbook:master
@dbrady
Owner

Actually, rereading it, it might be better if it started with array.shuffle(), presented for fairly simple cases of randomness. Is it slightly biased, or is it obviously biased? I haven't run the chi-squared analysis on it to see how broken it is. I wonder if we could give some examples for when it works as a naive example, along with a specific use case that shows just how bad it can be, sort of like how you should never use modulus on linear congruential generators, because rand() % 2 in some C standard library implementations will return 0,1,0,1,0,1,... etc.

@d8uv d8uv deleted the unknown repository branch
@d8uv
Collaborator

The chi-square analysis for the random.sort vs. fisher-yates has been done by Rob Weir. I haven't done a proper formal analysis of my refactorization, so if you're gonna drag out R you might as well check to see if my math-fu is strong, but my informal tests seem to pan out wonderfully.

The reason I didn't write it in a "this is what everyone uses, these are the problems, and this is the solution" way is... I find that style to be a little complex when you're trying to learn something specific. People are here to learn how best to shuffle in CoffeeScript, and the Knuth Shuffle is the best way. People aren't here to learn about a bunch of different methods and their pitfalls; they just want the goods, and to get out. Adding that supplementary stuff at the end is great, and I'd be really curious as to what that'd look like, but the solution should stay at the top.

As for whether the naive solution is ever good enough... I'd say probably never. Doing it properly only takes up 5 lines of source, is a lot faster, and is free from bias. When you want something random, you really want it to be random, not random-ish.
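
For readers who want to see the bias question from this thread for themselves, here is a minimal sketch, not part of the pull request, that tallies how often each ordering of a three-element array comes up under both approaches. The names `naiveShuffle`, `fisherYates`, and `tally`, and the trial count, are illustrative assumptions. It is an informal check only, not the chi-square analysis mentioned above.

```coffeescript
# Naive approach discussed in the thread: sort with a random comparator.
naiveShuffle = (a) -> a.sort -> 0.5 - Math.random()

# Fisher-Yates, counting down explicitly so short arrays are left alone.
fisherYates = (a) ->
  for i in [a.length - 1..1] by -1
    j = Math.floor Math.random() * (i + 1)
    [a[i], a[j]] = [a[j], a[i]]
  a

# Count how often each ordering of [1, 2, 3] appears over `trials` runs.
# (`tally` and the default trial count are hypothetical, for illustration.)
tally = (shuffle, trials = 60000) ->
  counts = {}
  for n in [1..trials]
    key = shuffle([1, 2, 3]).join ''
    counts[key] = (counts[key] or 0) + 1
  counts

console.log 'naive:       ', tally naiveShuffle
console.log 'fisher-yates:', tally fisherYates
```

With an unbiased shuffle, each of the six orderings should land near `trials / 6`. How far the random-sort counts stray depends on the sort algorithm the JavaScript engine uses, so the exact numbers will vary between platforms.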

@dbrady
Owner
@d8uv d8uv referenced this pull request from a commit
@d8uv d8uv Make Array::shuffle safer
As mentioned by @dbrady in #72 we shouldn't overwrite a native `Array::shuffle`
0648bcf
@d8uv
Collaborator

I... actually disagree with that. If Array::shuffle gets into native JavaScript, then the code I wrote in 0648bcf automatically becomes a polyfill. It'll use the native method if it's available, and will use our method if it isn't. And regardless, most of the time, we should be using a utility library like Lo-dash or Underscore, because they smooth out inconsistencies when the native methods differ from platform to platform.
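
The polyfill behaviour described here can be sketched with CoffeeScript's existential assignment operator. This is an assumption about the general shape of such a guard, not a copy of commit 0648bcf, whose contents are not shown on this page.

```coffeescript
# Hypothetical guarded definition: `?=` assigns only when Array::shuffle
# is currently null or undefined, so an existing native method would win.
Array::shuffle ?= ->
  # Count down from the last index to 1; `by -1` keeps short arrays untouched.
  for i in [@length - 1..1] by -1
    j = Math.floor Math.random() * (i + 1)
    [@[i], @[j]] = [@[j], @[i]]
  this

console.log [1..9].shuffle()
```

The trade-off is the one debated in this thread: if a native `shuffle` with different semantics ever appeared, the guarded code would silently pick it up instead of the Fisher-Yates version.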

Commits on Feb 28, 2013
  1. @d8uv

    Change the shuffle to the Fisher-Yates shuffle

    d8uv authored
Showing with 82 additions and 5 deletions.
  1. +82 −5 chapters/arrays/shuffling-array-elements.md
87 chapters/arrays/shuffling-array-elements.md
@@ -9,17 +9,94 @@ You want to shuffle the elements in an array.
## Solution
-The JavaScript Array `sort()` method accepts a custom sort function. We can write a `shuffle()` method to add some convenience.
+The [Fisher-Yates shuffle] is an efficient, unbiased way to randomize the elements in an
+array. It's a fairly simple method: start at the end of the list, and swap the last
+element with a randomly chosen element at or before it. Move down one position and
+repeat until you reach the beginning of the list, leaving the shuffled elements behind
+you at the end of the list. This [Fisher-Yates shuffle Visualization] may help you
+understand the algorithm.
{% highlight coffeescript %}
-Array::shuffle = -> @sort -> 0.5 - Math.random()
+shuffle = (a) ->
+  # Walk `i` from the last index down to 1; `by -1` keeps arrays shorter than 2 untouched.
+  for i in [a.length-1..1] by -1
+    # Choose a random element `j` at or before `i` to swap with.
+    j = Math.floor Math.random() * (i + 1)
+    # Swap `j` with `i`, using destructured assignment.
+    [a[i], a[j]] = [a[j], a[i]]
+  # Return the shuffled array.
+  a
-[1..9].shuffle()
+shuffle([1..9])
# => [ 3, 1, 5, 6, 4, 8, 2, 9, 7 ]
{% endhighlight %}
+[Fisher-Yates shuffle]: http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
+[Fisher-Yates Shuffle Visualization]: http://bost.ocks.org/mike/shuffle/
+
## Discussion
-For more background on how this shuffle logic works, see this [discussion at StackOverflow](http://stackoverflow.com/questions/962802/is-it-correct-to-use-javascript-array-sort-method-for-shuffling).
+### The Wrong Way to do it
+
+There is a common, but terribly wrong, way to shuffle an array: sorting by a random
+number.
+
+{% highlight coffeescript %}
+shuffle = (a) -> a.sort -> 0.5 - Math.random()
+{% endhighlight %}
+
+If you do a sort randomly, it should give you a random order, right? Even [Microsoft used
+this random-sort algorithm][msftshuffle]. Turns out, [this random-sort algorithm produces
+biased results][naive], because it only has the illusion of shuffling. Randomly sorting
+will not result in a neat, tidy shuffle; it will result in a wild mass of inconsistent
+sorting.
+
+[msftshuffle]: http://www.robweir.com/blog/2010/02/microsoft-random-browser-ballot.html
+[naive]: http://www.codinghorror.com/blog/2007/12/the-danger-of-naivete.html
+
+### Optimizing for speed and space
+
+The solution above isn't as fast, or as lean, as it can be. The list comprehension, when
+compiled to JavaScript, is far more complex than it needs to be, and the
+destructured assignment is slower than dealing with bare variables. The following
+code is less idiomatic, and takes up more source-code space, but will compile down
+smaller and run a bit faster:
+
+{% highlight coffeescript %}
+shuffle = (a) ->
+  i = a.length
+  while --i > 0
+    j = ~~(Math.random() * (i + 1)) # ~~ truncates, matching Math.floor for non-negative values
+    t = a[j]
+    a[j] = a[i]
+    a[i] = t
+  a
+{% endhighlight %}
+
+### Extending JavaScript to include this shuffle
+
+The following code adds the shuffle function to the Array prototype, which means that
+you are able to run it on any array you wish, in a much more direct manner.
+
+{% highlight coffeescript %}
+Array::shuffle = ->
+  # `by -1` forces a downward count, so arrays shorter than 2 are left untouched.
+  for i in [@length-1..1] by -1
+    j = Math.floor Math.random() * (i + 1)
+    [@[i], @[j]] = [@[j], @[i]]
+  @
+
+[1..9].shuffle()
+# => [ 3, 1, 5, 6, 4, 8, 2, 9, 7 ]
+{% endhighlight %}
+
+**Note:** Although it's quite common in languages like Ruby, extending native objects is
+often considered bad practice in JavaScript (see: [Maintainable JavaScript: Don’t modify
+objects you don’t own][dontown]; [Extending built-in native objects. Evil or not?]
+[extendevil]).
+
+Also, if you think you'll be using a lot of these utility functions, consider using a
+utility library, like [Lo-dash](http://lodash.com/). They include a lot of nifty
+features, like maps and forEach, in a cross-browser, lean, high-performance way.
-**Note:** Although it's quite common in languages like Ruby, extending native objects is often considered bad practice in JavaScript (see: [Maintainable JavaScript: Don’t modify objects you don’t own](http://www.nczonline.net/blog/2010/03/02/maintainable-javascript-dont-modify-objects-you-down-own/); [Extending built-in native objects. Evil or not?](http://perfectionkills.com/extending-built-in-native-objects-evil-or-not/)).
+[dontown]: http://www.nczonline.net/blog/2010/03/02/maintainable-javascript-dont-modify-objects-you-down-own/
+[extendevil]: http://perfectionkills.com/extending-built-in-native-objects-evil-or-not/