Skip to content
Fast and memory-compact (especially on primitive types) scala HashMap and HashSet
Find file
Pull request Compare This branch is 17 commits ahead, 8 commits behind alex14n:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


Strong points:
 * All data is stored in 2-3 arrays, no HashEntry objects => less memory, more GC friendly
 * No dynamic memory allocation when adding a new key/value mapping (only during resize)
 * Iteration over elements is a sequental array read => faster
 * Adding a new key/value is a sequental array write => faster
 * (Scala only) Preserves iteration order:
   if no keys were removed it's the same order they were inserted
   Java version also breaks iteration order on null keys and defragmentation
 * HashCode bits are stored in index array => less random reads when looking for missing key
 * (Scala only) Primitive types are stored in primitive arrays => saves a lot of memory

Weak points:
 * entrySet().iterator() each time creates a new Entry object, which can be slow
   but in many cases this is helped with Server VM and -XX:+DoEscapeAnalysis
 * iterators remove() method is slower than java.util.HashMap
 * (Scala only) keys and values are stored in different arrays which is slower
 * (Scala only) key remove operation is not optimized


Scala version will be rewritten with type specialization after it's released (version 2.8?)

If performance is more important than memory footprint you can modify:

  final val DEFAULT_LOAD_FACTOR = .75f

at the very end of CompactHashSet.scala file.

Some microbenchmark results:

1.5 million new Objects, 32bit JVM:

fastWrite            - 0.766s; Mem:  62.44 Mb
fastReadFull         - 0.906s; Mem:  62.44 Mb
fastReadEmpty        - 0.312s; Mem:  36.44 Mb
compactWrite         - 0.843s; Mem:  64.43 Mb
compactReadFull      - 1.000s; Mem:  64.43 Mb
compactReadEmpty     - 0.312s; Mem:  36.43 Mb
javaWrite            - 1.250s; Mem:  80.55 Mb
javaReadFull         - 0.922s; Mem:  80.55 Mb
javaReadEmpty        - 0.406s; Mem:  36.55 Mb
scalaWrite           - 1.718s; Mem:  86.55 Mb
scalaReadFull        - 1.234s; Mem:  86.59 Mb
scalaReadEmpty       - 0.687s; Mem:  36.59 Mb

1.5 million new Objects, 64bit JVM:

fastWrite            - 0.313s; Mem: 110.94 Mb
fastReadFull         - 0.302s; Mem: 110.94 Mb
fastReadEmpty        - 0.143s; Mem:  72.94 Mb
compactWrite         - 0.320s; Mem: 112.95 Mb
compactReadFull      - 0.315s; Mem: 112.95 Mb
compactReadEmpty     - 0.143s; Mem:  72.95 Mb
javaWrite            - 0.532s; Mem: 161.28 Mb
javaReadFull         - 0.313s; Mem: 161.28 Mb
javaReadEmpty        - 0.257s; Mem:  73.28 Mb
scalaWrite           - 0.786s; Mem: 149.24 Mb
scalaReadFull        - 0.451s; Mem: 149.24 Mb
scalaReadEmpty       - 0.436s; Mem:  73.24 Mb

1.5 million Ints (=i*123), 32bit JVM:

compactWrite         - 0.766s; Mem:  28.50 Mb
compactReadFull      - 0.796s; Mem:  28.50 Mb
compactReadEmpty     - 0.312s; Mem:   0.36 Mb
compactIntWrite      - 0.656s; Mem:  28.41 Mb
compactIntReadFull   - 0.453s; Mem:  28.41 Mb
compactIntReadEmpty  - 0.203s; Mem:   0.34 Mb
fastutilIntWrite     - 1.062s; Mem:  24.72 Mb
fastutilIntReadFull  - 0.422s; Mem:  24.72 Mb
fastutilIntReadEmpty - 0.406s; Mem:   0.34 Mb
troveIntWrite        - 1.453s; Mem:  39.75 Mb
troveIntReadFull     - 0.438s; Mem:  39.75 Mb
troveIntReadEmpty    - 0.407s; Mem:   0.34 Mb
fastWrite            - 1.406s; Mem:  76.94 Mb
fastReadFull         - 0.844s; Mem:  76.94 Mb
fastReadEmpty        - 0.250s; Mem:   0.59 Mb
scalaWrite           - 4.906s; Mem:  96.65 Mb
scalaReadFull        - 1.079s; Mem:  92.65 Mb
scalaReadEmpty       - 0.734s; Mem:   0.66 Mb

fast* is this with default (0.75) load factor
compact* is this CompactHashMap.scala with 0.75 load factor
compactInt* is this CompactHashMap.scala with Int accessors and 0.75 load factor
java* is java.util.HashMap (jdk7b59)
scala* is scala.collection.mutable.HashMap (2.7.4)
fastutilInt* is Int2IntOpenHashMap (5.1.5) with 0.6 load factor,
troveInt* is TIntIntHashMap (2.0.4) with 0.5 load factor,

*Write is adding a new key/value mapping
*ReadFull is reading an existing key/value mapping
*ReadEmpty is looking for a non-existing key
Something went wrong with that request. Please try again.