It's neither the prettiest code, nor a rigorous benchmark. It was written to get a solid feel for the relative performance of these packages for our particular workload. See the links above for more details.
- Go to EC2 and provision an m2.4xlarge instance with AMI ami-31d41658 (Standard Redhat 6.1 64-bit).
- ssh in as root.
- Download Maven 3.0.x binaries from a mirror and install per the instructions.
If you want to set up the hash table 'services' comparison:
yum install gcc-c++ zlib-devel java-1.6.0-openjdk-devel.x86_64
Download the sources for the data stores and install them:
make
make install

./configure
make
make install

# make sure you have $JAVA_HOME set appropriately!
./configure
make
make install
berkeley-db-5.2.36 (You'll need to log in to get this one.)
cd build_unix
../dist/configure --enable-java
make
make install
Update /etc/ld.so.conf so that Java can see the shared Kyoto Cabinet libraries.
Run ldconfig to make sure the links and caches are fresh.
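Before moving on, it can be worth confirming that the two environment-dependent steps above actually took effect. The following is a minimal sketch, not part of hashperf: `java_home_ok` and `lib_in_cache` are hypothetical helper names, and the library name in the usage comment is an assumption about what the Kyoto Cabinet build installs.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; neither function ships with hashperf.

# Succeeds if the given directory (default: $JAVA_HOME) looks like a JDK,
# i.e. it contains the JNI header that native Java bindings compile against.
java_home_ok() {
    jdk_dir="${1:-${JAVA_HOME:-}}"
    [ -n "$jdk_dir" ] && [ -f "$jdk_dir/include/jni.h" ]
}

# Reads an `ldconfig -p` listing on stdin and succeeds if any entry
# contains the given library name (matched as a fixed string).
lib_in_cache() {
    grep -qF -- "$1"
}

# Example usage (the library name is an assumption; adjust as needed):
#   java_home_ok || echo "JAVA_HOME does not point at a JDK" >&2
#   ldconfig -p | lib_in_cache libkyotocabinet || echo "library not in cache" >&2
```

Reading the cache listing from stdin keeps `lib_in_cache` independent of where `ldconfig` lives on a given system.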
If you want to set up the hash table 'libraries' comparison:
yum install java-1.6.0-openjdk-devel.x86_64
Get the test code source
git clone git://github.com/aggregateknowledge/hashperf.git
Grab dependencies and compile
cd hashperf
mvn compile
mkdir deps
mvn dependency:copy-dependencies -DoutputDirectory=deps
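The run commands later in this README put deps/* and target/classes on the classpath, so a quick check that both exist can save a confusing ClassNotFoundException. This is a rough sketch: `build_ready` is a hypothetical helper, and it assumes the copied dependencies are .jar files.

```shell
#!/bin/sh
# Hypothetical helper, not part of hashperf.
# Succeeds if DIR (default: current directory) contains compiled classes
# in target/classes and at least one .jar file in deps/.
build_ready() {
    root="${1:-.}"
    [ -d "$root/target/classes" ] || return 1
    for jar in "$root"/deps/*.jar; do
        # If the glob matched nothing, $jar is the literal pattern and -f fails.
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Example usage, from inside the hashperf checkout:
#   build_ready . || echo "run the mvn steps first" >&2
```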
If you want to run the hash table 'services' comparison:
Start the daemons
java -server -Djava.library.path=/usr/local/lib/:/usr/local/BerkeleyDB.5.2/lib/ -classpath deps/*:target/classes net.agkn.hashperf.services.FullPerformanceTestSuite /path/to/data.csv /path/to/stats/dir/ warmupCount obsCount pollingInterval
In my case, this was:
mkdir /dev/shm/stats/
mv data.csv /dev/shm/hash_test.csv
java -server -Djava.library.path=/usr/local/lib/:/usr/local/BerkeleyDB.5.2/lib/ -classpath deps/*:target/classes net.agkn.hashperf.services.FullPerformanceTestSuite /dev/shm/hash_test.csv /dev/shm/stats/ 10 30 1000000
Note that the line count in your test file divided by the
pollingInterval should be less than or equal to
If you want to run the hash table 'libraries' comparison:
java -server -XmxNNNg -classpath deps/*:target/classes net.agkn.hashperf.libs.PerformanceTestSuite /path/to/data.csv /path/to/stats/dir/ warmupCount obsCount pollingInterval sizeHint
In my case, an example of this was:
mkdir /dev/shm/stats/
mv data.csv /dev/shm/hash_test.csv
java -server -Xmx50g -classpath deps/*:target/classes net.agkn.hashperf.libs.PerformanceTestSuite /dev/shm/hash_test.csv /dev/shm/stats/ 2 2 10000000 976000000
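Both suites take a pollingInterval, and the note in the services section relates it to the line count of the data file. As a rough aid for picking values, the sketch below reports how many whole polling intervals a file spans. `poll_count` is a hypothetical helper, not part of hashperf, and it only does the division; how the result should compare with warmupCount and obsCount is up to the test suite.

```shell
#!/bin/sh
# Hypothetical helper, not part of hashperf.
# Prints the number of whole polling intervals covered by FILE:
#   poll_count FILE POLLING_INTERVAL
poll_count() {
    # Reject a zero, negative, or non-numeric interval before dividing.
    [ "$2" -gt 0 ] 2>/dev/null || return 1
    lines=$(wc -l < "$1") || return 1
    echo $(( lines / $2 ))
}

# Example usage with the paths from the commands above:
#   poll_count /dev/shm/hash_test.csv 10000000
```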
This program and the accompanying materials are made available under the terms of the Eclipse Public License v1.0 which accompanies this distribution, and is available at http://www.eclipse.org/legal/epl-v10.html.
Aggregate Knowledge - implementation