This repository has been archived by the owner. It is now read-only.
Permalink
Browse files

Fix maintainance of invariant for length of runs

Timsort maintains a list of sorted segments (called 'runs') in an list
called 'runLen'. To ensure that a small array suffices for this, timsort
imposes an invariant on the lengths of runs: each run must be shorter
than the preceding run in 'runs' and the sum of the lengths of two
consecutive runs must be lower than the length of the run preceding the
two runs. In effect, the (reverted) sequence of run lengths grows at
least as fast as the fibonacci sequence. The operation that adds a new
run to the list of runs (and merges runs, if necessary) makes sure that
the new list 'runLen' satisfies

    runLen [n-2] > runLen [n-1] + runLen [n]
    runLen [n-1] > runLen [n]

It has recently been discovered that this doesn't suffice to maintain
the invariant for all triples of consecutive runs:

http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

This patch fixes this problem.

Correctness of the new implementation has been checked with QuickCheck:
https://gist.github.com/timjb/cca12b004a0c782ca622
  • Loading branch information...
timjb committed Feb 25, 2015
1 parent 9def096 commit d9c3e1e454c7d338320f8c2aa12e882921d42828
Showing with 4 additions and 1 deletion.
  1. +4 −1 src/Data/Vector/Algorithms/Tim.hs
@@ -52,7 +52,10 @@ sortBy cmp vec =
performMerges (c:b:a:ss) i tmpBuf | runLengthInvariantBroken a b c i =
if i - c <= b - a
then merge cmp vec b c i tmpBuf >>= performMerges (b:a:ss) i
else merge cmp vec a b c tmpBuf >>= performMerges (c:a:ss) i
else do
tmpBuf' <- merge cmp vec a b c tmpBuf
(ass', tmpBuf'') <- performMerges (a:ss) c tmpBuf'
performMerges (c:ass') i tmpBuf''
performMerges s _ tmpBuf = return (s, tmpBuf)
performRemainingMerges (b:a:ss) tmpBuf =
merge cmp vec a b len tmpBuf >>= performRemainingMerges (a:ss)

0 comments on commit d9c3e1e

Please sign in to comment.