more readme updates and info
erikfrey committed Mar 8, 2009
1 parent dda7cef commit 95c413a
Showing 2 changed files with 18 additions and 3 deletions.
1 change: 0 additions & 1 deletion README

This file was deleted.

20 changes: 18 additions & 2 deletions README.textile
@@ -8,7 +8,9 @@ bashreduce lets you apply your favorite unix tools in a mapreduce fashion across

h2. Configuration

If you wish, you may edit @/etc/br.hosts@ and enter the machines you wish to use as workers. You can also specify this at runtime using @br -m "host1 host2 host3..."@
If you wish, you may edit @/etc/br.hosts@ and enter the machines you wish to use as workers. You can also specify this at runtime:

<pre>br -m "host1 host2 host3"</pre>

To take advantage of multiple cores, repeat the host name as many times as you wish.
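
As a rough sketch of the repetition described above, the host list can be built by a small helper instead of typed by hand. @repeat_hosts@ is a hypothetical function, not part of br; only the @-m@ flag comes from this README.

```shell
# Hypothetical helper (not part of br): print each host name COUNT times,
# separated by spaces, so the result can be passed to `br -m`.
repeat_hosts() {
  count=$1
  shift
  out=""
  for host in "$@"; do
    i=0
    while [ "$i" -lt "$count" ]; do
      out="$out$host "
      i=$((i + 1))
    done
  done
  # Drop the trailing space before printing.
  printf '%s\n' "${out% }"
}

# Assumed usage, two cores on each of two machines:
#   br -m "$(repeat_hosts 2 host1 host2)" -r "uniq -c" < input > output
```

The helper only builds the string; how many times to repeat each host is left to you, since br does not detect core counts in this sketch.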

@@ -24,4 +26,18 @@ h3. word count

h3. count words that begin with 'b'

<pre>br -r "grep ^b | uniq -c" < input > output</pre>
<pre>br -r "grep ^b | uniq -c" < input > output</pre>

h2. Performance

Completely spurious numbers; all this shows you is how useful br is to *me* :-)

I have four compute machines and I'm usually relegated to using one core on one machine to sort. How about when I use br?

|_. command |_. time |_. throughput |
| sort -k1,1 4gb_file > 4gb_file_sorted | 82m54.746s | 843 kBps |
| br -i 4gb_file -o 4gb_file_sorted | 9m49s | 6.95 MBps |
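
The throughput column can be reproduced from the times with a little arithmetic. This is a sketch that assumes the "4gb" file is 4096 MB, which the README does not state exactly; @throughput_mb_s@ and @speedup@ are hypothetical helper names.

```shell
# Assumption: the 4gb file is 4096 MB; times are converted to seconds by hand.
#   82m54.746s = 82*60 + 54.746 = 4974.746 s
#   9m49s      = 9*60 + 49      = 589 s

# Print megabytes per second for a given size (MB) and duration (s).
throughput_mb_s() {
  awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

# Print how many times faster the second duration is than the first.
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

throughput_mb_s 4096 4974.746   # single-core sort
throughput_mb_s 4096 589        # br across the workers
speedup 4974.746 589
```

Under that size assumption the two rates come out at roughly 0.82 and 6.95 MB per second, in line with the table, and the wall-clock ratio is a bit over 8x.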

When I have more time I'll compare this apples to apples. There's also still a ton of room for improvement.

I'm also interested in seeing how bashreduce compares to hadoop. Of course it won't win... but how close does it come?
