This repository has been archived by the owner. It is now read-only.
erlang experiment in distributed median finding
.gitignore
Capfile
README
controller.erl
erl.sh
generate_binary_dicts.erl
generate_generate_scripts.rb
generate_test_data.rb
generate_test_data_single.rb
median.erl
median.rb
parse_file.erl
setup_users.rb
spread_across_files.rb
worker.erl
worker_freq.erl

README

finding the median of a trillion numbers
http://matpalm.com/median
see above page for walkthrough, basic instructions follow
 
-----------------
generate a list of approximately 1000 numbers whose min is 1, max is 100 and median is 40,
then spread it across 4 files: numbers.0, numbers.1, numbers.2, numbers.3

bash> ./generate_test_data.rb 1 40 100 1000 > numbers.all
bash> cat numbers.all | ./spread_across_files.rb numbers 4
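
the spreading step can be pictured with a minimal sketch. it assumes a round-robin split, which may not be what spread_across_files.rb actually does; the helper name spread_round_robin is hypothetical and the sketch works on in-memory arrays instead of writing files.

```ruby
# hypothetical helper: deal the items of `numbers` round-robin into `n` buckets.
# assumption: the real spread_across_files.rb may distribute lines differently.
def spread_round_robin(numbers, n)
  buckets = Array.new(n) { [] }
  numbers.each_with_index { |x, i| buckets[i % n] << x }
  buckets
end

# show which values would land in numbers.0 .. numbers.3
shards = spread_round_robin((1..10).to_a, 4)
shards.each_with_index { |shard, i| puts "numbers.#{i}: #{shard.inspect}" }
```

any split works for median finding as long as every number lands in exactly one file; round-robin just keeps the files close to the same size.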

-----------------
run ruby version

bash> ./median.rb < numbers.all
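
as a point of reference, a single-machine median can be sketched by sorting and taking the middle element. this is an assumed baseline, not necessarily how median.rb works; the name median_by_sort is hypothetical, and returning the lower middle value for an even count is a choice made here.

```ruby
# hypothetical baseline: sort and pick the middle element.
# for an even count this returns the lower of the two middle values;
# median.rb may follow a different convention.
def median_by_sort(numbers)
  raise ArgumentError, "empty input" if numbers.empty?
  sorted = numbers.sort
  sorted[(sorted.size - 1) / 2]
end

puts median_by_sort([5, 1, 40, 100, 40])
```

sorting needs the whole list in memory on one machine, which is the limitation the multi-process versions are meant to get around.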

-----------------
run erlang single process version

bash> erl -noshell -run median from_file numbers.all

-----------------
run erlang multi-process version (all processes on the local machine) using the full list implementation (worker)

bash> erl -noshell -sname bob -run controller init worker numbers.[0-3]
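
one way a controller could find a median over lists held by separate workers is pivot-based selection. the sketch below is an assumption about the general technique, not a transcription of controller.erl or worker.erl; distributed_median is a hypothetical name and the workers are simulated as plain arrays in one process.

```ruby
# hypothetical sketch of pivot-based selection over sharded lists.
# `shards` stands in for the lists held by separate worker processes.
def distributed_median(shards)
  total = shards.sum(&:size)
  raise ArgumentError, "empty input" if total.zero?
  rank = (total - 1) / 2 # 0-based rank of the lower median

  loop do
    # "controller" step: take a pivot from any non-empty shard
    pivot = shards.find { |s| !s.empty? }.first

    # "worker" step: each shard reports counts relative to the pivot
    less  = shards.sum { |s| s.count { |x| x < pivot } }
    equal = shards.sum { |s| s.count { |x| x == pivot } }

    if rank < less
      # the answer is below the pivot: workers keep only smaller values
      shards = shards.map { |s| s.select { |x| x < pivot } }
    elsif rank < less + equal
      return pivot
    else
      # the answer is above the pivot: workers keep only larger values
      shards = shards.map { |s| s.select { |x| x > pivot } }
      rank -= less + equal
    end
  end
end

puts distributed_median([[1, 40, 40], [5, 100], [40, 7, 99]])
```

only counts and a single pivot value cross between controller and workers in each round, so no process ever needs the full list.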

-----------------
run erlang multi-process version (all processes on the local machine) using the list frequency implementation (worker_freq)

bash> erl -noshell -run generate_binary_dicts main numbers.all numbers.dict 4
bash> erl -noshell -sname bob -run controller init worker_freq numbers.dict.[0-3]
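
the idea behind a frequency-based approach can be sketched as follows: each shard is summarised as a value => count table, the tables are merged, and the median is found by walking the values in order. this is an assumption about what a frequency implementation might look like, not a description of worker_freq.erl or the binary dict format; all function names are hypothetical.

```ruby
# hypothetical sketch: summarise a shard as value => count.
def frequencies(numbers)
  numbers.each_with_object(Hash.new(0)) { |x, freq| freq[x] += 1 }
end

# merge per-shard tables by summing the counts of equal values.
def merge_frequencies(tables)
  tables.each_with_object(Hash.new(0)) do |table, merged|
    table.each { |value, count| merged[value] += count }
  end
end

# walk values in ascending order until the running count reaches
# the middle rank (lower median for an even total).
def median_from_frequencies(freq)
  total = freq.values.sum
  raise ArgumentError, "empty input" if total.zero?
  target = (total + 1) / 2
  seen = 0
  freq.keys.sort.each do |value|
    seen += freq[value]
    return value if seen >= target
  end
end

shards = [[1, 40, 40], [5, 100], [40, 7, 99]]
merged = merge_frequencies(shards.map { |s| frequencies(s) })
puts median_from_frequencies(merged)
```

when the range of values is small compared to the number of values (for example 1 to 100 over many numbers), the tables stay tiny no matter how long the lists are.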