This is yet another Go cache. I need the fastest possible implementation with optional synchronization.
- Targets DNS servers and domain-name lookups
- Carefully avoids heap allocations in the Store()/Load() API
- Uses runtime.nanotime()
- Synchronization is optional
- All items share the same expiration period
- Minimalistic API
- Eviction usually via expiration
- Eviction by reference
- "Unsafe" memory pool
BenchmarkPoolAlloc-4             10000000     9.68 ns/op
BenchmarkStore-4                 50000000      292 ns/op
BenchmarkLoad-4                  50000000      129 ns/op
BenchmarkEvict-4                 50000000      222 ns/op
BenchmarkAllocStoreEvictFree-4   10000000      358 ns/op    0 B/op    0 allocs/op
BenchmarkHashtableStore-4        20000000     99.9 ns/op    0 B/op    0 allocs/op
BenchmarkHashtableLoad-4         20000000      165 ns/op    0 B/op    0 allocs/op
BenchmarkMapStore-4              10000000      144 ns/op    0 B/op    0 allocs/op
BenchmarkRandomMemoryAccess-4    50000000     34.5 ns/op
This implementation allows 5-10M cache operations/s on a single core. The full round trip "allocate from the pool - store in the cache - evict from the cache - free back to the pool" takes ~350ns, so the theoretical peak on a single core is ~3M round trips/s. With 64-byte packets this code is expected to keep up with a 100Mb/s line.
The cache API is a thin wrapper around a custom hashtable and an expiration queue. The key is a string and the data is an unsafe.Pointer. See TestAddCustomType() for a usage example.
The cache keeps unsafe.Pointer values instead of Go references. This means the application cannot rely on
Go memory management: for example, you cannot let the Go GC free the memory allocated for a cached object.
You have two fast options:
- Allocate an array of objects, keep index or reference to the object in the cache.
- Use unsafepool.
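The first option can be sketched as follows. The preallocated slice stays reachable from a package variable, so the GC keeps the backing array alive while the cache holds raw pointers into it; the struct fields and sizes here are assumptions for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// dnsEntry is a hypothetical cached record.
type dnsEntry struct {
	ip     uint32
	expiry int64
}

// entries is preallocated once and never grows; the GC will not move
// or free it while this package variable keeps it reachable.
var entries = make([]dnsEntry, 1024)

// entryPtr returns a raw pointer to slot i, suitable for handing to
// a Store()-style API that keeps unsafe.Pointer values.
func entryPtr(i int) unsafe.Pointer { return unsafe.Pointer(&entries[i]) }

func main() {
	entries[7].ip = 0x08080808
	p := entryPtr(7)
	fmt.Println((*dnsEntry)(p).ip == 0x08080808) // true
}
```

Storing the slot index instead of the pointer works the same way and is even safer, since an index survives any future change of the backing array.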
The unsafe pool's New() allocates the specified number of memory blocks. Each raw memory block is large enough to fit an object of the specified type. The pool's Alloc()/Free() API operates on pointers to the blocks (at this point the Go crowd runs away crying to things like https://github.com/patrickmn/go-cache). Both approaches target sub-100ns operation times.
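A pool in this spirit can be sketched as below: one arena allocation carved into fixed-size blocks, with a LIFO free list of raw pointers, so Alloc()/Free() never touch the Go heap after New(). The names and layout are assumptions, not the actual unsafepool code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Pool hands out fixed-size blocks carved from a single arena.
type Pool struct {
	arena []byte           // one allocation that owns all blocks
	free  []unsafe.Pointer // LIFO free list of block pointers
}

// NewPool allocates count blocks of blockSize bytes up front.
func NewPool(blockSize, count int) *Pool {
	p := &Pool{arena: make([]byte, blockSize*count)}
	for i := 0; i < count; i++ {
		p.free = append(p.free, unsafe.Pointer(&p.arena[i*blockSize]))
	}
	return p
}

// Alloc pops a block from the free list; nil means the pool is empty.
func (p *Pool) Alloc() unsafe.Pointer {
	if len(p.free) == 0 {
		return nil
	}
	ptr := p.free[len(p.free)-1]
	p.free = p.free[:len(p.free)-1]
	return ptr
}

// Free pushes the block back for reuse.
func (p *Pool) Free(ptr unsafe.Pointer) { p.free = append(p.free, ptr) }

func main() {
	type entry struct{ ip uint32 }
	pool := NewPool(int(unsafe.Sizeof(entry{})), 2)
	e := (*entry)(pool.Alloc())
	e.ip = 42
	pool.Free(unsafe.Pointer(e))
	fmt.Println(pool.Alloc() != nil) // true: the block was recycled
}
```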
I want the hash function to cluster the most popular keys. The idea is that popular lookups can then fit in a single 4K memory page and rarely trigger a data-cache miss. The simplest way to do this is to keep a small hashtable (a cache) for the popular lookups and run two lookups: one in the large main table and one in the small cache. The small cache can use a very simple and fast hash function, for example the first two characters of the key. In another approach I could precompute and permanently keep the hash keys for the top 10K domain names.
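The small front cache can be sketched as a tiny direct-mapped table indexed by the first two characters, so the whole table stays within a few cache lines; collisions simply overwrite and a miss falls through to the main hashtable (not shown). Table size, mixing constant, and names are assumptions.

```go
package main

import "fmt"

// frontSize is a power of two so the modulo reduces to a cheap mask.
const frontSize = 256

type frontEntry struct {
	key string
	ip  uint32
}

// front is a direct-mapped table small enough to stay cache-resident.
var front [frontSize]frontEntry

// frontHash mixes at most the first two characters of the key.
func frontHash(key string) int {
	h := 0
	for i := 0; i < len(key) && i < 2; i++ {
		h = h*31 + int(key[i])
	}
	return h & (frontSize - 1)
}

// frontStore overwrites whatever occupied the slot (no chaining).
func frontStore(key string, ip uint32) {
	front[frontHash(key)] = frontEntry{key: key, ip: ip}
}

// frontLoad reports a hit only on an exact key match; on a miss the
// caller would consult the main hashtable instead.
func frontLoad(key string) (uint32, bool) {
	e := &front[frontHash(key)]
	if e.key == key {
		return e.ip, true
	}
	return 0, false
}

func main() {
	frontStore("google.com", 1)
	ip, ok := frontLoad("google.com")
	fmt.Println(ip, ok) // 1 true
}
```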
- http://www.cs.cornell.edu/courses/cs3110/2008fa/lectures/lec21.html - start here