
few more performance figures

1 parent 8bd1abb commit af82aba72b283953645fe1295d9d6a5f2285fc5f @erikfrey committed Mar 24, 2009
Showing with 18 additions and 7 deletions.
  1. +18 −7 README.textile
@@ -30,15 +30,26 @@ h3. count words that begin with 'b'
h2. Performance
-Completely spurious numbers, all this shows you is how useful br is to *me* :-)
+h3. big honkin' local machine
-I have four compute machines and I'm usually relegated to using one core on one machine to sort. How about when I use br?
+Let's start with a simpler scenario: I have a machine with multiple cores and for many operations I'm relegated to using just one core. How does br help us here? Here's br on an 8-core machine, essentially operating as a poor man's multi-core sort:
|_. command |_. using |_. time |_. rate |
-| sort -k1,1 4gb_file > 4gb_file_sorted | coreutils | 82m54.746s | 843 kBps |
-| br -i 4gb_file -o 4gb_file_sorted | coreutils | 9m17.431s | 7.35 MBps |
-| br -i 4gb_file -o 4gb_file_sorted | brp/brm | 4m13.816s | 16.1376745 MBps |
+| pv 4gb_file | sort -k1,1 -S2G > 4gb_file_sorted | coreutils | 30m32.078s | 2.24 MBps |
+| br -i 4gb_file -o 4gb_file_sorted | coreutils | 11m3.111s | 6.18 MBps |
+| br -i 4gb_file -o 4gb_file_sorted | brp/brm | 7m13.695s | 9.44 MBps |
-When I have more time I'll compare this apples to apples. There's also still a ton of room for improvement.
+The job completely saturates i/o, but that's still a reasonable gain!
-I'm also interested in seeing how bashreduce compares to hadoop. Of course it won't win... but how close does it come?
+h3. many cheap machines
+
+Here lies the promise of mapreduce: rather than use my big honkin' machine, I have a bunch of cheaper machines lying around that I can distribute my work to. How does br do when I add four cheaper 4-core machines into the mix?
+
+|_. command |_. using |_. time |_. rate |
+| pv 4gb_file | sort -k1,1 -S2G > 4gb_file_sorted | coreutils | 30m32.078s | 2.24 MBps |
+| br -i 4gb_file -o 4gb_file_sorted | coreutils | 8m30.652s | 8.02 MBps |
+| br -i 4gb_file -o 4gb_file_sorted | brp/brm | 4m7.596s | 16.54 MBps |
+
+We have a new bottleneck: we're limited by how quickly we can partition/pump our dataset out to the nodes. awk and sort begin to show their limitations (our clever awk script is a bit cpu bound, and @sort -m@ can only merge so many files at once). So we use two little helper programs written in C (yes, I know! it's cheating! if you can think of a better partition/merge using core unix tools, contact me) to remove these bottlenecks.
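For readers who want to experiment with the partition/merge step using only core unix tools, here is a minimal sketch. It is not br's actual code: the function name @partition_sort_merge@, the round-robin split, and the temp-file layout are all assumptions made for illustration, and everything runs on one machine rather than across nodes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not br's implementation: split, sort in parallel, merge.
export LC_ALL=C   # byte-wise ordering so every sort invocation agrees

partition_sort_merge() {
  local input=$1 output=$2 parts=${3:-4}
  local tmp f
  tmp=$(mktemp -d) || return 1

  # Partition: deal lines round-robin into $parts files with awk.
  awk -v n="$parts" -v dir="$tmp" '{ print > (dir "/part." (NR % n)) }' "$input"

  # Sort every partition on its first field, one background job each.
  for f in "$tmp"/part.*; do
    sort -k1,1 "$f" -o "$f.sorted" &
  done
  wait

  # Merge the already-sorted runs; sort -m only merges, it does not re-sort.
  sort -m -k1,1 "$tmp"/part.*.sorted > "$output"
  rm -rf "$tmp"
}
```

Something like @partition_sort_merge 4gb_file 4gb_file_sorted 8@ would then use up to eight sort processes. The sketch assumes a non-empty input and a partition count small enough for awk and @sort -m@ to keep all the files open at once, which is the same kind of limit described above.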
+
+4 machines contribute a ~3.6x speedup, and with two little compiled C files, we eke out a ~7.4x speedup. Not bad for bash, eh?
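The rates and speedups in the tables can be rechecked from the @time@ output with a few lines of shell. This is a small sketch: the helper names are made up here, and it assumes the 4gb_file is exactly 4096 MB, which is the size that reproduces the rates listed in the tables.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the tables; not part of br.

# Convert a `time`-style duration such as 30m32.078s into seconds.
to_seconds() {
  awk -v t="$1" 'BEGIN { split(t, p, /[ms]/); printf "%.3f\n", p[1] * 60 + p[2] }'
}

# Throughput in MBps for a file of $1 MB processed in $2 seconds.
rate_mbps() {
  awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

# Speedup of a run taking $2 seconds over a baseline taking $1 seconds.
speedup() {
  awk -v base="$1" -v run="$2" 'BEGIN { printf "%.1f\n", base / run }'
}

baseline=$(to_seconds 30m32.078s)   # plain sort on one core
cluster=$(to_seconds 4m7.596s)      # br with brp/brm across the machines
echo "rate:    $(rate_mbps 4096 "$cluster") MBps"
echo "speedup: $(speedup "$baseline" "$cluster")x"
```

Swapping in any other time from the tables gives the matching rate and speedup for that row.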
