import "github.com/barbell-math/smoothbrain-hashmap"
A very simple library that implements a generic, linear probing map.
Example (Custom Eq And Hash Funcs)
seed := maphash.MakeSeed()
h := NewCustom[string, string](
4,
func(l, r string) bool { return strings.ToLower(l) == strings.ToLower(r) },
func(v string) uint64 { return maphash.String(seed, strings.ToLower(v)) },
)
h.Put("one", "one")
h.Put("two", "two")
h.Put("three", "three")
h.Put("ThReE", "four")
if val, ok := h.Get("three"); ok {
fmt.Println(val)
}
h.Remove("tWO")
fmt.Println(h.Len())
fmt.Println("Keys:")
keys := slices.Collect(h.Keys())
slices.Sort(keys)
fmt.Println(keys)
//Output:
// four
// 2
// Keys:
// [one three]
Example (Simple)
h := New[int, string]()
h.Put(1, "one")
h.Put(2, "two")
h.Put(3, "three")
if val, ok := h.Get(1); ok {
fmt.Println(val)
}
if _, ok := h.Get(4); !ok {
fmt.Println("4 was not in the map!")
}
h.Remove(1)
fmt.Println(h.Len())
fmt.Println("Keys:")
keys := slices.Collect(h.Keys())
slices.Sort(keys)
fmt.Println(keys)
//Output:
// one
// 4 was not in the map!
// 2
// Keys:
// [2 3]
- func ComparableEqual[T comparable](l T, r T) bool
- func ComparableHash[T comparable]() func(v T) uint64
- type Map
- func New[K comparable, V comparable]() Map[K, V]
- func NewCap[K comparable, V comparable](_cap int) Map[K, V]
- func NewCustom[K any, V any](_cap int, eq func(l K, r K) bool, hash func(v K) uint64) Map[K, V]
- func (m *Map[K, V]) Clear()
- func (m *Map[K, V]) Copy() *Map[K, V]
- func (m *Map[K, V]) Get(k K) (V, bool)
- func (m *Map[K, V]) Keys() iter.Seq[K]
- func (m *Map[K, V]) Len() int
- func (m *Map[K, V]) PntrVals() iter.Seq[*V]
- func (m *Map[K, V]) Put(k K, v V)
- func (m *Map[K, V]) Remove(k K)
- func (m *Map[K, V]) Vals() iter.Seq[V]
- func (m *Map[K, V]) Zero()
func ComparableEqual
func ComparableEqual[T comparable](l T, r T) bool
An equality function that can be passed to NewCustom when using a comparable type. If the key type is comparable you can simply use New instead of NewCustom, and this function will be used by default.
func ComparableHash
func ComparableHash[T comparable]() func(v T) uint64
A hash function that can be passed to NewCustom when using a comparable type. If the key type is comparable you can simply use New instead of NewCustom, and this function will be used by default.
type Map
type Map[K any, V any] struct {
// contains filtered or unexported fields
}
func New
func New[K comparable, V comparable]() Map[K, V]
Creates a Map where K is the key type and V is the value type. The ComparableEqual and ComparableHash functions will be used by the returned Map. To create a Map with non-comparable types or custom hash and equality functions, refer to NewCustom.
func NewCap
func NewCap[K comparable, V comparable](_cap int) Map[K, V]
Creates a Map where K is the key type and V is the value type, with a capacity of `_cap`. The ComparableEqual and ComparableHash functions will be used by the returned Map. To create a Map with non-comparable types or custom hash and equality functions, refer to NewCustom.
func NewCustom
func NewCustom[K any, V any](_cap int, eq func(l K, r K) bool, hash func(v K) uint64) Map[K, V]
Creates a Map where K is the key type and V is the value type, with a capacity of `_cap`. The supplied `eq` and `hash` functions will be used by the Map. If two values are equal, the `hash` function must return the same hash for both values.
func (*Map[K, V]) Clear
func (m *Map[K, V]) Clear()
Removes all values from the map but keeps the map's underlying capacity.
func (*Map[K, V]) Copy
func (m *Map[K, V]) Copy() *Map[K, V]
Creates a copy of the hash map. All values are copied with memcpy, meaning a shallow copy is made of the values.
func (*Map[K, V]) Get
func (m *Map[K, V]) Get(k K) (V, bool)
Gets the value that is related to the supplied key. If the key is found the boolean return value will be true and the value will be returned. If the key is not found the boolean return value will be false and a zero-initialized value of type V will be returned.
func (*Map[K, V]) Keys
func (m *Map[K, V]) Keys() iter.Seq[K]
Iterates over all of the keys in the map. Uses the stdlib `iter` package, so this function can be used in a standard `for` loop.
func (*Map[K, V]) Len
func (m *Map[K, V]) Len() int
Returns the number of elements in the hash map. This is different from the map's capacity.
func (*Map[K, V]) PntrVals
func (m *Map[K, V]) PntrVals() iter.Seq[*V]
Iterates over all of the values in the map. Uses the stdlib `iter` package, so this function can be used in a standard `for` loop. The values may be mutated through the returned pointers and the changes will be seen by the hash map.
func (*Map[K, V]) Put
func (m *Map[K, V]) Put(k K, v V)
Places the supplied key/value pair in the hash map. If the key was already present in the map, the old value is overwritten. The map will rehash as necessary.
func (*Map[K, V]) Remove
func (m *Map[K, V]) Remove(k K)
Removes the supplied key and associated value from the hash map if it is present. If the key is not present in the map then no action will be taken.
func (*Map[K, V]) Vals
func (m *Map[K, V]) Vals() iter.Seq[V]
Iterates over all of the values in the map. Uses the stdlib `iter` package, so this function can be used in a standard `for` loop.
func (*Map[K, V]) Zero
func (m *Map[K, V]) Zero()
Removes all values from the map and resets the map's capacity to the default initial capacity.
Generated by gomarkdoc
Simply include the package and you will be able to use the default map with no SIMD intrinsics. If you wish to use SIMD intrinsics, you must compile your project with one of the following tags: sbmap_simd128, sbmap_simd256, or sbmap_simd512. In testing, sbmap_simd128 has shown the greatest performance.
Note: Please keep in mind that the hash and equality functions you provide to the map can have a significant impact on performance. For this map to perform well, the functions you provide must be correct and efficient.
There are a couple of scenarios where this map makes sense to use:
- You want to support maps with key types outside of the comparable type set
- You want a map that performs fewer allocations than the builtin map
- Your hardware has SIMD support and you can take advantage of it
To build the build system:
go build -o ./bs/bs ./bs
The build system can then be used as usual:
./bs/bs --help
./bs/bs buildbs # Builds the build system!
To run unit tests:
go test -v ./... # or...
go test -tags=sbmap_simd128 -v ./... # or...
go test -tags=sbmap_simd256 -v ./... # or...
go test -tags=sbmap_simd512 -v ./...
To run the delve debugger:
dlv test github.com/barbell-math/smoothbrain-hashmap # or...
dlv test --build-flags="-tags=sbmap_simd128" github.com/barbell-math/smoothbrain-hashmap # or...
dlv test --build-flags="-tags=sbmap_simd256" github.com/barbell-math/smoothbrain-hashmap # or...
dlv test --build-flags="-tags=sbmap_simd512" github.com/barbell-math/smoothbrain-hashmap
To run gdb on a selected package's tests:
go test -gcflags "-N" -ldflags="-compressdwarf=false" -c ./... # or...
./bs/bs unitTestExe default
go test -tags=sbmap_simd128 -gcflags "-N" -ldflags="-compressdwarf=false" -c ./... # or...
./bs/bs unitTestExe 128
go test -tags=sbmap_simd256 -gcflags "-N" -ldflags="-compressdwarf=false" -c ./... # or...
./bs/bs unitTestExe 256
go test -tags=sbmap_simd512 -gcflags "-N" -ldflags="-compressdwarf=false" -c ./... # or...
./bs/bs unitTestExe 512
gdb slotProbes.test # or...
gdb smoothbrain-hashmap.test
The delve debugger is better suited for dealing with Go constructs. It handles accessing variables, printing values, and breakpoints a bit more smoothly than gdb. However, delve cannot see the xmm/ymm/zmm registers, which is a major problem considering that this hash map uses SIMD. gdb, on the other hand, does not handle accessing and printing Go variables and constructs as well, and setting breakpoints is not as easy, but it can see the xmm/ymm/zmm registers. Pick your poison.
By default, running the LargishDataset unit test will run profiling and produce a file: ./bs/testProf.prof. This can then be viewed using Go's builtin pprof tool:
go tool pprof ./bs/tmp/testProf.prof
Writing assembly is not a trivial task. These are some helpful resources:
- amd64 instruction list: Lists all of the available instructions in the amd64 instruction set. However, not all of the instructions listed in this resource are available for use with the go assembler. To check which instructions are available, it is often helpful to search the go repo for them.
- available instructions search: This link provides a way to search the go repo for assembly instructions.
- godbolt: Sometimes it is difficult to know where to start. In that case it can help to look at what the compiler would do. It is often helpful to write a small code sample in C and see what the compiler produces at high optimization levels.
- Algorithmica: A resource for better understanding SIMD instructions and how to use them. If you have not dealt with SIMD instructions before, it is recommended to read this first.
Golang has a builtin map. Why make a new one? Because I wanted to support arbitrary types as keys. This is a small issue I have run into before when trying to write generic code: the builtin map being restricted to comparable key types prevents it from being used as a set in many scenarios. So, this map implementation allows the user to supply their own custom equality and hash functions. It is assumed that the functions the user supplies are valid for use with a map.
The map in this package is implemented as a swiss table with one contiguous backing array for storage. This is similar to the implementation that the builtin map uses, except the builtin map does not use one contiguous array for backing storage, it uses the more widely accepted bucket approach. On the go dev blog it is claimed that the builtin map was implemented this way to make it so adding a value would not create a sudden increase in processing time if the map needed to resize the backing array and rehash all of the elements.
Go is frequently used for latency-sensitive servers, so we don’t want operations on built-in types that can have arbitrarily large impact on tail latency. Instead, Go maps grow incrementally, so that each insertion has an upper bound on the amount of growth work it must do. This bounds the latency impact of a single map insertion. Source: https://go.dev/blog/swisstable
However, this bucket approach comes with a significant increase in the number of allocations made by the builtin map. (Nothing is free after all.) Plot 1 shows the number of allocations made by the builtin map compared to the number of allocations made by the map in this package.
Plot 1: A plot showing the total number of allocations to add N elements to
the map.
So, the question can be asked: is there a world where the reduced number of allocations performed by the map in this package allows it to have greater performance than the builtin map? After writing some benchmarks, the data can be examined. Plot 2, shown below, shows the time taken to put N elements into a map.
Plot 2: A plot showing the time taken to add N elements.
The map in this package has very clear jumps in processing time, which are associated with the map needing to resize and rehash. These jumps occur at higher numbers of elements for higher load factor percentages, which makes sense: given a higher load factor, more elements can be added before triggering a resize and rehash. The builtin map does not show the same jumps, but to my surprise it does show more controlled increases in processing time at the same points where the map in this package jumps. Seeing this large but smooth increase in processing time tells me that the builtin map still has similar limitations around resizing and rehashing, even if those are not the exact actions occurring.
Plot 2 also shows the map in this package operating at differing load factors. The load factors in the range of 60%-75% perform the best in relation to the builtin map. The default load factor for the map in this package is 75%, the highest percentage that can be used before performance significantly decreases.
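The relationship between load factor and resize frequency can be sketched with simple arithmetic. (The package's actual growth policy is internal; this hypothetical sketch assumes a power-of-two capacity that resizes once the element count exceeds capacity × load factor.)

```go
package main

import "fmt"

// insertsBeforeResize reports how many elements fit before a resize would
// trigger, assuming growth occurs when len > cap * lf / 100.
func insertsBeforeResize(cap, lf int) int {
	return cap * lf / 100
}

func main() {
	// Higher load factors let more elements in before a rehash, which is
	// why the jumps in Plot 2 shift right as the load factor rises.
	for _, lf := range []int{60, 75, 90} {
		fmt.Printf("load factor %d%%: %d inserts before a 1024-slot table resizes\n",
			lf, insertsBeforeResize(1024, lf))
	}
}
```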
So, the original question of whether reducing the number of allocations can result in an increase in performance can now be answered: yes, it can. The simd128 version of the map defined in this package shows a slight performance increase over the builtin map. Up until now, only the simd128 version of the map has been shown in the graphs. This was done to reduce clutter and because the simd128 version actually performs the best, as shown in plot 3. The jagged section exists because data points within that region were collected every few hundred elements; after that region, sampling dropped to powers of 10 to reduce the time taken to gather the benchmark results.
Plot 3: A plot showing the time taken to add N elements normalized
against the time taken to add N elements to the builtin map across various
simd tags.
Plot 3 shows that
- simd128 performs the best.
- simd256 has a slight edge, especially towards the end of the graph, to put it in second place.
- the default implementation with no SIMD is about as good as the simd512 version, if not better.
The pattern of worse performance as the SIMD register size increases is due to the nature of how a swiss table works. The larger the SIMD register the larger the potential number of collisions at each stage of comparison. Yes, SIMD will allow more values to be compared at once but having more values also results in more collisions. These collisions have to be iterated over sequentially which results in slower performance.