
more readme updates and info

commit 95c413a44eaa2ee3355ca4cd7ba89da6a92ad7df 1 parent dda7cef
@erikfrey authored
Showing with 18 additions and 3 deletions.
  1. +0 −1  README
  2. +18 −2 README.textile
1  README
@@ -1 +0,0 @@
-bashreduce: Map an input file to many hosts, sort/reduce, merge
20 README.textile
@@ -8,7 +8,9 @@ bashreduce lets you apply your favorite unix tools in a mapreduce fashion across
h2. Configuration
-If you wish, you may edit @/etc/br.hosts@ and enter the machines you wish to use as workers. You can also specify this at runtime using @br -m "host1 host2 host3..."@
+If you wish, you may edit @/etc/br.hosts@ and enter the machines you wish to use as workers. You can also specify this at runtime:
+
+<pre>br -m "host1 host2 host3"</pre>
To take advantage of multiple cores, repeat the host name as many times as you wish.
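As a sketch of what that looks like in practice (hostnames are hypothetical, and the file is written locally here rather than to `/etc/br.hosts`), a worker list that uses two cores each on host1 and host2 and one core on host3 might be:

```shell
# Hypothetical worker list: each line is one worker slot, so listing a
# host twice tells br to run two worker processes (two cores) on it.
cat <<'EOF' > br.hosts.example
host1
host1
host2
host2
host3
EOF
```

The same list could be passed inline with `br -m "host1 host1 host2 host2 host3"`.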
@@ -24,4 +26,18 @@ h3. word count
h3. count words that begin with 'b'
-<pre>br -r "grep ^b | uniq -c" < input > output</pre>
+<pre>br -r "grep ^b | uniq -c" < input > output</pre>
+
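To see what the reduce step above computes without any distribution, the single-machine equivalent is a plain sort-then-reduce pipeline (the sample words here are made up for illustration):

```shell
# Local equivalent of: br -r "grep ^b | uniq -c" < input > output
# br sorts the input across workers first; locally that is just `sort`,
# then the reduce command counts adjacent duplicates starting with 'b'.
printf '%s\n' banana apple banana berry | sort | grep '^b' | uniq -c
```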
+h2. Performance
+
+These numbers are completely spurious; all they show is how useful br is to *me* :-)
+
+I have four compute machines and I'm usually relegated to using one core on one machine to sort. How about when I use br?
+
+|_. command |_. time |_. MB/s |
+| sort -k1,1 4gb_file > 4gb_file_sorted | 82m54.746s | 0.82 |
+| br -i 4gb_file -o 4gb_file_sorted | 9m49s | 6.95 |
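The throughput column is just file size divided by wall time; for the br row, for example:

```shell
# Sanity check on the br row of the table:
# 4 GB = 4096 MB, 9m49s = 589 s, and 4096 / 589 ≈ 6.95 MB/s.
awk 'BEGIN { printf "%.2f\n", 4096 / (9 * 60 + 49) }'
```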
+
+When I have more time I'll compare this apples to apples. There's also still a ton of room for improvement.
+
+I'm also interested in seeing how bashreduce compares to hadoop. Of course it won't win... but how close does it come?