sparse vector math benchmark
Makefile
README
vec_in.clj
vec_in.cpp
vec_in.hs
vec_in.lua
vec_in.pir
vec_in.pl
vec_in.scm
vectors
vectors2

README

Benchmark scripts for sparse vector math 
(stress-test hash tables)
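
To make "sparse vector math" concrete, here is a minimal C++ sketch. It assumes a sparse vector is a hash map from dimension index to value and that the measured operation is a squared Euclidean distance between two such vectors. The README does not spell out the exact operation, so `SparseVec`, `squared_distance` and `example_usage` are hypothetical names for illustration, not code taken from vec_in.cpp.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical representation: dimension index -> non-zero value.
using SparseVec = std::unordered_map<int, float>;

// Squared Euclidean distance between two sparse vectors.
// A missing key counts as 0, so every step is a hash table lookup.
float squared_distance(const SparseVec& a, const SparseVec& b) {
    float sum = 0.0f;
    // Dimensions present in a (and possibly also in b).
    for (const auto& kv : a) {
        auto it = b.find(kv.first);
        float bv = (it != b.end()) ? it->second : 0.0f;
        sum += (kv.second - bv) * (kv.second - bv);
    }
    // Dimensions present only in b.
    for (const auto& kv : b) {
        if (a.find(kv.first) == a.end()) sum += kv.second * kv.second;
    }
    return sum;
}

// Usage example: call this from main() as a quick self-check.
void example_usage() {
    SparseVec a{{1, 1.0f}, {2, 2.0f}};
    SparseVec b{{2, 4.0f}, {3, 3.0f}};
    // 1^2 + (2 - 4)^2 + 3^2 = 14
    assert(squared_distance(a, b) == 14.0f);
}
```

Swapping `std::unordered_map` for `std::map` in the alias is enough to compare a hash table against a tree, which is the kind of difference the two c++ rows in the results reflect.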

Running
time ./vec_in_cpp < vectors
time ./vec_in.pl < vectors
time ./vec_in.lua
time ./vec_in_hs vectors2
time clojure1.3 ./vec_in.clj 
time parrot vec_in.pir < vectors2
time racket vec_in.scm

There are some differences between scripts:

1. Some of them parse the "vectors" file (PostgreSQL hstore format), others
   use the "vectors2" file (line- and space-separated). Some languages lack
   regexes in the core, so I prepared the second file, which can be
   parsed with splits alone. Parsing the input data takes under 2
   seconds in every script, so this difference does not add much to the
   final result.
NOTE: the Haskell version spends too much time on input, but I do not
know how to make it better.

2. Some of the scripts sum the resulting distance vectors. We are not
   benchmarking this operation, but we need it in lazy languages like
   Haskell in order to force them to perform the computation.
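
The two input formats from item 1 could be parsed roughly as follows. This is a sketch under assumptions: the hstore line is taken to look like `"1"=>"0.5", "7"=>"2"`, and the "vectors2" line is assumed to hold alternating key and value tokens separated by spaces, which may not match the real file. The function names are hypothetical and neither parser uses regexes.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <unordered_map>

using SparseVec = std::unordered_map<int, float>;

// Parse one hstore-style line such as: "1"=>"0.5", "7"=>"2"
// Quoted strings are consumed in (key, value) pairs; escaped quotes
// inside a string are not handled in this sketch.
SparseVec parse_hstore_line(const std::string& line) {
    SparseVec v;
    std::string key;
    bool have_key = false;
    std::size_t pos = 0;
    while ((pos = line.find('"', pos)) != std::string::npos) {
        std::size_t end = line.find('"', pos + 1);
        if (end == std::string::npos) break;  // unbalanced quote: stop
        std::string token = line.substr(pos + 1, end - pos - 1);
        if (have_key) {
            v[std::atoi(key.c_str())] =
                static_cast<float>(std::atof(token.c_str()));
            have_key = false;
        } else {
            key = token;
            have_key = true;
        }
        pos = end + 1;
    }
    return v;
}

// Parse one line of alternating "key value" tokens (assumed layout).
SparseVec parse_split_line(const std::string& line) {
    SparseVec v;
    std::istringstream in(line);
    int key;
    float value;
    while (in >> key >> value) v[key] = value;
    return v;
}

// Usage example: both formats should yield the same vector.
void parse_example() {
    SparseVec h = parse_hstore_line("\"1\"=>\"0.5\", \"7\"=>\"2\"");
    SparseVec s = parse_split_line("1 0.5 7 2");
    assert(h == s);
}
```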

These are my first programs in some of these languages, so if you find
mistakes or ways to improve the programs, or if you have an
implementation in another language, please send me a pull request.

Here are the results of running them on Linux/Core 2 Duo E8200 @ 2.66GHz:

        time    mem
c++     6.4     38012       map<int,float>
c++     3.6     34900       unordered_map<int,float>
clojure 32      449536
haskell 22      1042884     Data.IntMap Float
lua     26      108540
luajit  6       68072
parrot  28      360992      Hash PMC_keys PMC_vals
parrot  15      263628      Hash int_keys PMC_vals
perl    33      117904
racket  30      148692

Time was measured with the "time" command (seconds); memory is the
resident set size (RSS), presumably in kilobytes.
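
For comparison, a similar measurement can be taken from inside a C++ program. This is a different technique from the external "time" command used above: the sketch below measures wall-clock time only and reports peak RSS through POSIX `getrusage`, which may differ from however RSS was sampled for the table. `peak_rss_kb` and `seconds_taken` are hypothetical helper names.

```cpp
#include <cassert>
#include <chrono>
#include <sys/resource.h>  // getrusage (POSIX)

// Peak resident set size of this process.
// On Linux, ru_maxrss is reported in kilobytes.
long peak_rss_kb() {
    struct rusage usage;
    int rc = getrusage(RUSAGE_SELF, &usage);
    assert(rc == 0);
    (void)rc;  // silence unused warning when NDEBUG is defined
    return usage.ru_maxrss;
}

// Wall-clock seconds spent running a callable.
template <typename F>
double seconds_taken(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A caller would wrap the benchmark body in a lambda, pass it to `seconds_taken`, and read `peak_rss_kb()` once the work is done.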