Switch branches/tags
Nothing to show
Find file History
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
..
Failed to load latest commit information.
README
batch.sh
clj-compile.sh
clj-run.sh
fannkuch.clj-1.clj
fannkuch.clj-10.clj
fannkuch.clj-11.clj
fannkuch.clj-2.clj
fannkuch.clj-3.clj
fannkuch.clj-4.clj
fannkuch.clj-5.clj
fannkuch.clj-6.clj
fannkuch.clj-6b.clj
fannkuch.clj-7.clj
fannkuch.clj-8.clj
fannkuch.clj-9.clj
fannkuch.ghc-5.ghc
fannkuch.java
fannkuch.perl
fannkuch.sbcl-2.sbcl
fannkuch.sbcl-2.sbcl_compile
fannkuch.sbcl-2.sbcl_run
ghc-compile.sh
ghc-run.sh
java-compile.sh
java-run.sh
sbcl-compile.sh
sbcl-run.sh

README

----------------------------------------------------------------------
fannkuch.clj-1.clj

Fairly straightforward Clojure implementation, using lex-permutations
from clojure.contrib.combinatorics library to generate a sequence of
all permutations of the numbers from 1 to n, which is used for
iterating over them, computing the maximum number of flips any of them
have before 1 is at the front of the sequence.

----------------------------------------------------------------------
fannkuch.clj-2.clj

Minor variation on clj-1 to see if permutations-in-fannkuch-order in
place of lex-permutations is faster.  It is slower.

----------------------------------------------------------------------
fannkuch.clj-3.clj

Minor variation of clj-1 where I replace a call to max with an if.  I
looked at some profiling data from clj-1 runs and saw reduce taking a
significant fraction of the time.  I didn't know where it was coming
from at the time, and assumed incorrectly it was the call to max.  It
turns out to have been not there, but the call to into in
fannkuch-of-permutation.

----------------------------------------------------------------------
fannkuch.clj-4.clj

Minor variation of clj-1, where I changed the "reverse first N
elements" implementation inside of fannkuch-of-permutation so that it
uses sequence operations instead of vector-specific operations.

I don't recall whether this was faster or slower than clj-1.

----------------------------------------------------------------------
fannkuch.clj-5.clj

Bigger variation of clj-1 than clj-4.  This one has a new function
reverse-first-n for reversing the first n elements.  It uses
loop/recur and an accumulator for the reversed part of the list, in
hopes of being faster.

----------------------------------------------------------------------
fannkuch.clj-6b.clj

clj-5's reverse-first-n checked for some boundary conditions on its
arguments that never occur in this program.  clj-6b replaces it with
reverse-first-n-restricted that does not check these conditions, and
is thus a bit faster.

MacBook Pro run times for medium length test, multiple runs:
real	1m12.593s  0m59.537s  0m56.020s
user	1m12.967s  1m 1.349s  0m56.762s
sys	0m 6.280s  0m 5.829s  0m 5.809s
----------------------------------------------------------------------
fannkuch.clj-6.clj

Identical to clj-6b, except it replaces the loop in function fannkuch
with an equivalent reduce / map expression.  It seems a bit faster
than clj-6b, which surprises me.  Not much, though.

MacBook Pro run times for medium length test, multiple runs:
real	0m55.899s  0m47.597s  0m45.530s
user	1m 0.501s  0m47.003s  0m45.682s
sys	0m 3.307s  0m 1.288s  0m 1.241s
----------------------------------------------------------------------
fannkuch.clj-7.clj

Like clj-6b, but fannkuch-of-permutation now uses a mutable Java array
for doing its flipping, using a new function reverse-first-n!

Not much faster than clj-6, if at all.

----------------------------------------------------------------------
fannkuch.clj-8.clj

Like clj-7, but also uses Java mutable arrays to generate the sequence
of permutations, instead of lex-permutations, like most previous
implementations have.

----------------------------------------------------------------------
fannkuch.clj-9.clj

Does not use any Java mutable arrays.  Most like clj-6, except
modified to use the recently-added transient/assoc!/persistent!
functions in reverse-first-n-restricted and next-lex-permutation.  It
is significantly faster than clj-6, but at least on medium test seems
slightly slower than clj-8.  Need to run the long test to compare.

Measurements of multi-threaded version on MacBook Pro with 2 cores

Why so much more user time with 2 or 3 threads vs. 1?  Is this a case
of a shared on-chip cache between the 2 cores being big enough for 1
thread, but getting thrashed for 2 or more?

                         Threads
Test             1          2          3
------       ---------  ---------  ---------
medium  real 0m32.852s  0m35.374s  0m25.966s
run #1  user 0m33.403s	0m56.752s  0m45.090s
        sys  0m 1.128s	0m 2.075s  0m 1.374s
				   	 
medium  real 0m32.619s	0m26.669s  0m31.447s
run #2  user 0m33.232s	0m42.485s  0m53.814s
        sys  0m 1.078s	0m 1.469s  0m 1.827s
				   	 
medium  real 0m33.979s	0m29.133s  0m29.376s
run #3  user 0m34.469s	0m47.837s  0m50.388s
        sys  0m 1.140s	0m 1.382s  0m 1.744s
				   	 
			0m29.736s  0m32.329s
			0m49.199s  0m51.199s
			0m 1.420s  0m 1.836s
----------------------------------------------------------------------
fannkuch.clj-10.clj

Like clj-8, but change reverse-first-n! to a macro, and add #^ints
type annotation to fannkuch-of-permutation's argument.  Also use
aclone instead of slower copy-java-int-array (at least, aclone is
faster when it has the appropriate type annotations for its argument).

----------------------------------------------------------------------