Permalink
Switch branches/tags
Nothing to show
Find file Copy path
f694413 Dec 10, 2013
1 contributor

Users who have contributed to this file

98 lines (85 sloc) 5.29 KB
Fast Concurrent Memoization Map (fcmm)
http://projects.giacomodrago.com/fcmm
This file contains the results of some benchmarks comparing fcmm and
tbb::concurrent_hash_map. You can execute the same benchmark on your machine by running the
following commands in the "test" directory:
$ make benchmark_tbb # Needs the Intel Threading Building Blocks (TBB) library
$ export FCMM_NUM_THREADS=4 # Change as needed
$ ./benchmark_tbb_batch.sh | tee benchmark_results.txt # Will write a file named benchmark_results.txt
# containing a table similar to the ones below
Actually, fcmm and tbb::concurrent_hash_map provide different features (e.g. fcmm does not support
erase and update) so the benchmark is not meant to be an 'absolute' comparison between the two.
fcmm may provide a better performance when:
- There is no need to delete or update the entries
- The map shall store a large number of entries and you can provide a reasonable estimate of their number
- Values are relatively fast to compute
- Concurrency is high (4 threads or more)
Notes:
----------
- For the serial run std::unordered_map is used
- Times are expressed in milliseconds
Machine 1:
----------
Intel Core i5-2415M @2.30GHz
4GB RAM
Mac OS X 10.8.4
MacPorts gcc 4.7.3
Number of threads: 4
Operations % of Time Time Time fcmm speedup fcmm speedup
(mln) insertions (serial) (TBB) (fcmm) over serial over TBB
-----------------------------------------------------------------------------------
8 10 2192 1284 711 3.08 1.81
8 20 2376 1287 765 3.11 1.68
8 30 2676 1407 891 3.00 1.58
8 40 3122 1699 959 3.26 1.77
8 50 3337 1990 1019 3.27 1.95
16 10 4426 1864 1364 3.24 1.37
16 20 5274 2489 1579 3.34 1.58
16 30 5790 2973 1730 3.35 1.72
16 40 6454 4353 1865 3.46 2.33
16 50 6928 4342 2049 3.38 2.12
32 10 9953 4189 2810 3.54 1.49
32 20 11137 5555 3172 3.51 1.75
32 30 12360 7212 3635 3.40 1.98
32 40 13859 9430 4193 3.31 2.25
32 50 16754 11493 4540 3.69 2.53
Machine 2:
----------
Google Compute Engine
n1-highmem-8 (Intel Xeon @2.60GHz, 8 virtual cores, 52 GB RAM)
Debian 7 64 bit
gcc 4.7.2
Number of threads: 8
Operations % of Time Time Time fcmm speedup fcmm speedup
(mln) insertions (serial) (TBB) (fcmm) over serial over TBB
-----------------------------------------------------------------------------------
8 10 1403 375 267 5.25 1.40
8 20 1862 600 356 5.23 1.69
8 30 2187 803 405 5.40 1.98
8 40 2542 1044 472 5.39 2.21
8 50 2674 1229 496 5.39 2.48
16 10 3469 884 623 5.57 1.42
16 20 4408 1374 767 5.75 1.79
16 30 4707 1710 822 5.73 2.08
16 40 5148 2166 925 5.57 2.34
16 50 5499 2577 1058 5.20 2.44
32 10 7974 2025 1358 5.87 1.49
32 20 8877 2837 1585 5.60 1.79
32 30 10105 3585 1702 5.94 2.11
32 40 10604 4384 1861 5.70 2.36
32 50 11940 5248 1886 6.33 2.78
64 10 16761 4226 2804 5.98 1.51
64 20 18493 5814 3259 5.67 1.78
64 30 20556 7530 3752 5.48 2.01
64 40 21699 8798 3739 5.80 2.35
64 50 25286 10675 4184 6.04 2.55
128 10 34087 8638 5651 6.03 1.53
128 20 38353 11808 6150 6.24 1.92
128 30 42238 15731 6781 6.23 2.32
128 40 47761 19437 7763 6.15 2.50
128 50 55805 21800 7901 7.06 2.76
256 10 72337 17640 11223 6.45 1.57
256 20 86691 25384 13253 6.54 1.92
256 30 98354 32520 15987 6.15 2.03
256 40 110349 38903 15953 6.92 2.44
256 50 128414 42968 20337 6.31 2.11