Enabled Issue Auto-close (#174)
* Enable Issue Auto-close
* Update URL and README.md
Make item public. Add a new onReject call for rejected items. (#180)
- Making Item public makes the onEvict and onReject function calls more readable.
- Adding onReject allows us to tightly track every Set that happens, so we can avoid memory leaks in manually allocated memory.
Add life expectancy histogram (#182)
If the cache is too small, keys can enter and leave very quickly, which results in poor cache usage. This PR adds a life expectancy histogram that tracks a sample of keys from admission into the cache to eviction. If too many keys are evicted quickly, that (along with the miss ratio) is a clear signal that the cache is too small. This helps a user tune the cache size.
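A minimal sketch of the idea, not Ristretto's implementation: the `lifeHistogram` type, its bucket bounds (in seconds), and the `ShortLivedRatio` helper are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// lifeHistogram counts how long sampled keys stayed in the cache.
// The bucket bounds are illustrative, not taken from Ristretto.
type lifeHistogram struct {
	bounds []int64 // exclusive upper bound of each bucket, ascending
	counts []int64 // len(bounds)+1 entries; the last one is the overflow bucket
}

func newLifeHistogram(bounds []int64) *lifeHistogram {
	return &lifeHistogram{bounds: bounds, counts: make([]int64, len(bounds)+1)}
}

// Record adds one sampled key's lifetime (eviction time minus admission time).
func (h *lifeHistogram) Record(admittedAt, evictedAt int64) {
	life := evictedAt - admittedAt
	for i, b := range h.bounds {
		if life < b {
			h.counts[i]++
			return
		}
	}
	h.counts[len(h.bounds)]++
}

// ShortLivedRatio returns the fraction of samples that fell in the first bucket.
func (h *lifeHistogram) ShortLivedRatio() float64 {
	var total int64
	for _, c := range h.counts {
		total += c
	}
	if total == 0 {
		return 0
	}
	return float64(h.counts[0]) / float64(total)
}

func main() {
	h := newLifeHistogram([]int64{2, 16, 128})
	h.Record(100, 101)  // lived 1s
	h.Record(100, 101)  // lived 1s
	h.Record(100, 110)  // lived 10s
	h.Record(100, 1000) // lived 900s
	fmt.Println(h.counts, h.ShortLivedRatio())
}
```

A high short-lived ratio combined with a high miss ratio would suggest the cache is undersized.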
Add OnExit handler which can be used for manual memory management (#183)
Add a new OnExit handler, which is called every time a value accepted by Ristretto is let go. This is useful for manual memory management. Move Calloc from Badger over to Ristretto in the z package, so Ristretto, Badger and Dgraph can all use it.
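To illustrate why an exit hook helps with manual memory management, here is a toy sketch. The `store` type and its methods are hypothetical and are not Ristretto's API; the point is only that every accepted value passes through the hook exactly once when it leaves, so allocations and releases can be balanced.

```go
package main

import "fmt"

// store is a toy key-value store, not Ristretto. onExit is invoked
// whenever a previously accepted value leaves the store.
type store struct {
	data   map[string][]byte
	onExit func(val []byte)
}

func newStore(onExit func([]byte)) *store {
	return &store{data: map[string][]byte{}, onExit: onExit}
}

// Set stores val; an overwritten value exits the store.
func (s *store) Set(key string, val []byte) {
	if old, ok := s.data[key]; ok {
		s.onExit(old)
	}
	s.data[key] = val
}

// Del removes key; the removed value exits the store.
func (s *store) Del(key string) {
	if old, ok := s.data[key]; ok {
		s.onExit(old)
		delete(s.data, key)
	}
}

// Clear drops everything; each remaining value exits the store.
func (s *store) Clear() {
	for k, v := range s.data {
		s.onExit(v)
		delete(s.data, k)
	}
}

// run exercises the store and reports how many values were allocated and released.
func run() (allocs, frees int) {
	s := newStore(func([]byte) { frees++ }) // a real handler would free the memory here
	alloc := func(n int) []byte { allocs++; return make([]byte, n) }
	s.Set("a", alloc(8))
	s.Set("a", alloc(8)) // the overwrite releases the first value
	s.Set("b", alloc(8))
	s.Del("b")
	s.Clear()
	return allocs, frees
}

func main() {
	fmt.Println(run())
}
```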
Add mechanism to wait for items to be processed. (#184)
Cache operations are handled asynchronously. Calling the Wait method will add an item to the process queue and block until that item is processed. Useful to ensure all the previous items have been processed before proceeding.
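The mechanism can be sketched with a marker item that travels through the same queue as regular operations. This is an assumption-laden illustration, not Ristretto's code: `asyncStore`, `item`, and the queue size are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// item is a queued operation. A non-nil wait channel marks a marker item
// that carries no data and exists only for synchronization.
type item struct {
	key  string
	val  int
	wait chan struct{}
}

type asyncStore struct {
	queue chan *item
	mu    sync.Mutex
	data  map[string]int
}

func newAsyncStore() *asyncStore {
	s := &asyncStore{queue: make(chan *item, 32), data: map[string]int{}}
	go s.process()
	return s
}

// process drains the queue in order; a marker item just signals its waiter.
func (s *asyncStore) process() {
	for it := range s.queue {
		if it.wait != nil {
			close(it.wait)
			continue
		}
		s.mu.Lock()
		s.data[it.key] = it.val
		s.mu.Unlock()
	}
}

// Set enqueues a write and returns without waiting for it to be applied.
func (s *asyncStore) Set(key string, val int) {
	s.queue <- &item{key: key, val: val}
}

// Wait enqueues a marker and blocks until the processor reaches it,
// which means every item queued before it has been processed.
func (s *asyncStore) Wait() {
	w := make(chan struct{})
	s.queue <- &item{wait: w}
	<-w
}

func (s *asyncStore) Get(key string) (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.data[key]
	return v, ok
}

func main() {
	s := newAsyncStore()
	s.Set("a", 1)
	s.Wait() // without this, the Get below could race ahead of the Set
	fmt.Println(s.Get("a"))
}
```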
Introduce Calloc: Manual Memory Management via jemalloc (#186)
Introduce z.Calloc, z.CallocNoRef and z.Free, to use jemalloc for memory allocation and deallocation to reduce pressure from Go GC. Introduce y.Buffer which can use manual memory management. y.Buffer also has a way to encode lots of smaller buffers into this big buffer and access them via offsets. This can be used for sorting them, as we do in bulk loader.
Changes:
* Add a hack to disable free
* Add a memtest
* Add modes to showcase the memory leak problem.
* Add a C program to verify that memory fragmentation is not an issue
* Bind port to host, so I can access it from my laptop
* Use jemalloc for Calloc and Free.
* Add a way to print jemalloc stats.
* Switch jemalloc prefix to je_
* Use a new jemalloc build tag
* Move Buffer class over to z
* Add a new func called CallocNoRef to deal with object allocations.
* Don't do memory tracking in Go mode.
* Move memtest to contrib
* Add godocs because Daniel made me write them
Add histogram.Mean() method (#188)
Move Closer from y to z (#191)
Allocator helps allocate memory to be used by unsafe structs (#192)
Internally it uses Calloc, so the memory can be allocated via either Go or jemalloc, depending upon what's enabled. The allocated memory can then be safely cast to Go structs via unsafe pointer casts.
Add ReadMemStats function (#193)
jemalloc is used to manually allocate memory. This PR adds a `ReadMemStats` function (similar to runtime.ReadMemStats) that can be used to fetch jemalloc statistics at runtime. This PR supports fetching the `Allocated`, `Active`, `Retained`, and `Resident` memory statistics. Fixes DGRAPH-2382
Introduce Mmapped buffers and Merge Sort (#194)
Buffers can now be mmapped as well as Calloc'd. This PR also copies over all the mmap files from Badger to allow mmap support on various platforms. This PR also introduces Merge Sort to sort the buffer, using extra temporary space half the size of the original buffer, currently allocated via Calloc. We can't use quick sort: each entry is variable length, so we can't just swap entries in the buffer. Merge Sort allows us to iterate over them linearly, hence is a better fit.
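A rough sketch of why merge sort suits variable-length entries. It is not the z.Buffer code: the 4-byte big-endian length prefix, the helper names, and the per-merge Go allocation (standing in for a Calloc'd temporary buffer) are assumptions. Only the left run is copied aside, so the temporary space at the top level is about half the buffer, and both runs are consumed strictly left to right.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// appendEntry appends a 4-byte length prefix followed by the payload.
func appendEntry(buf, payload []byte) []byte {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	return append(append(buf, hdr[:]...), payload...)
}

// next returns the payload of the first entry in buf and the entry's total size.
func next(buf []byte) (payload []byte, size int) {
	n := int(binary.BigEndian.Uint32(buf))
	return buf[4 : 4+n], 4 + n
}

// entryEnds scans buf linearly and returns the end offset of every entry.
func entryEnds(buf []byte) []int {
	var ends []int
	for off := 0; off < len(buf); {
		_, sz := next(buf[off:])
		off += sz
		ends = append(ends, off)
	}
	return ends
}

// mergeSort sorts entries [lo, hi) in place. ends holds the original entry
// boundaries; the byte boundary between two halves stays valid after each
// half is sorted, because sorting does not change a half's total size.
func mergeSort(buf []byte, ends []int, lo, hi int) {
	if hi-lo < 2 {
		return
	}
	mid := lo + (hi-lo)/2
	mergeSort(buf, ends, lo, mid)
	mergeSort(buf, ends, mid, hi)

	start := 0
	if lo > 0 {
		start = ends[lo-1]
	}
	split, end := ends[mid-1], ends[hi-1]

	// Copy the left run aside, then merge it with the right run back into buf.
	// The write position never passes the right run's read position.
	left := append([]byte(nil), buf[start:split]...)
	right := buf[split:end]
	out := start
	for len(left) > 0 && len(right) > 0 {
		lp, ls := next(left)
		rp, rs := next(right)
		if bytes.Compare(lp, rp) <= 0 {
			out += copy(buf[out:], left[:ls])
			left = left[ls:]
		} else {
			out += copy(buf[out:], right[:rs])
			right = right[rs:]
		}
	}
	copy(buf[out:], left) // leftover right entries are already in place
}

// sortedPayloads builds a buffer from words, sorts it, and reads it back.
func sortedPayloads(words ...string) []string {
	var buf []byte
	for _, w := range words {
		buf = appendEntry(buf, []byte(w))
	}
	ends := entryEnds(buf)
	mergeSort(buf, ends, 0, len(ends))
	var out []string
	for off := 0; off < len(buf); {
		p, sz := next(buf[off:])
		out = append(out, string(p))
		off += sz
	}
	return out
}

func main() {
	fmt.Println(sortedPayloads("pear", "fig", "banana", "kiwi", "apple"))
}
```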
Have a way to automatically mmap a growing buffer (#196)
Allow a buffer allocated via Calloc to switch to Mmap after it grows beyond a certain size.
Improve memory performance (#195)
- Use an int64 instead of a time.Time struct to represent the time.
- By default, include the cost of the storeItem in the cost calculation.
Related to DGRAPH-1378
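The saving from the first change can be seen by comparing struct sizes. The two item types below are hypothetical stand-ins rather than Ristretto's storeItem, and the choice of Unix seconds with 0 meaning "no expiration" is an assumption made for the example.

```go
package main

import (
	"fmt"
	"time"
	"unsafe"
)

// itemWithTime stores the expiration as a time.Time, which carries a wall
// clock reading, a monotonic reading, and a location pointer.
type itemWithTime struct {
	key        uint64
	expiration time.Time
}

// itemWithInt stores the expiration as Unix seconds; 0 means "no expiration"
// (a convention assumed for this sketch).
type itemWithInt struct {
	key        uint64
	expiration int64
}

// expiredAt reports whether the item has expired at the given Unix time.
func (it itemWithInt) expiredAt(nowUnix int64) bool {
	return it.expiration != 0 && nowUnix >= it.expiration
}

// sizes returns the in-memory size of each item type in bytes.
func sizes() (withTime, withInt uintptr) {
	return unsafe.Sizeof(itemWithTime{}), unsafe.Sizeof(itemWithInt{})
}

func main() {
	withTime, withInt := sizes()
	fmt.Printf("time.Time item: %d bytes, int64 item: %d bytes\n", withTime, withInt)

	it := itemWithInt{key: 1, expiration: time.Now().Add(time.Minute).Unix()}
	fmt.Println("expired now:", it.expiredAt(time.Now().Unix()))
}
```

Multiplied across millions of stored items, the per-item difference adds up.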
Buffer: Use 256GB mmap size instead of MaxInt64 (#198)
MaxInt64 is about 9.2 exabytes, and the test fails with `cannot allocate memory` on my computer. This PR also fixes the build (it is failing on master).
z: Add TotalSize method on bloom filter (#197)
This PR adds a TotalSize function which returns the total size of the bloom filter.
Public methods must not panic after Close() (#202)
The process crashes when other public methods are called after `Close()`. That must be handled gracefully.
```
panic: send on closed channel
goroutine 24430 [running]:
code.uber.internal/infra/statsdex/vendor/github.com/dgraph-io/ristretto.(*defaultPolicy).Push(0xc000676060, 0xc1417ed200, 0x40, 0x40, 0x3)
	code.uber.internal/infra/statsdex/vendor/github.com/dgraph-io/ristretto/policy.go:112 +0x64
```
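One common way to avoid this class of panic is to guard the channel with a closed flag. The sketch below shows that general pattern under assumed names (`queue`, `Push`, `Close`); it is not a copy of the fix that went into Ristretto.

```go
package main

import (
	"fmt"
	"sync"
)

// queue wraps a channel so that Push after Close is a no-op, not a panic.
type queue struct {
	mu     sync.RWMutex
	closed bool
	items  chan int
}

func newQueue(size int) *queue {
	return &queue{items: make(chan int, size)}
}

// Push reports false if the queue is closed or full. Holding the read lock
// guarantees the channel cannot be closed between the check and the send.
func (q *queue) Push(v int) bool {
	q.mu.RLock()
	defer q.mu.RUnlock()
	if q.closed {
		return false
	}
	select {
	case q.items <- v:
		return true
	default:
		return false
	}
}

// Close is idempotent: calling it twice does not close the channel twice.
func (q *queue) Close() {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closed {
		return
	}
	q.closed = true
	close(q.items)
}

func main() {
	q := newQueue(4)
	fmt.Println(q.Push(1)) // accepted
	q.Close()
	fmt.Println(q.Push(2)) // rejected gracefully instead of panicking
}
```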
Zbuffer: Add LenNoPadding and make padding 8 bytes (#204)
Show count when printing histogram (#201)
Add IncrementOffset API for z.buffers (#206)
Add an IncrementOffset API for z.buffers, which is a thread-safe API for incrementing the buffer offset.
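A thread-safe offset increment can be built on an atomic add, as in the sketch below. The `buffer` type here is hypothetical and the real z.Buffer method may have a different signature and semantics; in this sketch the method returns the offset after the reservation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// buffer is a stand-in with only the field needed for the example.
type buffer struct {
	offset uint64
}

// IncrementOffset atomically reserves n bytes and returns the new offset,
// so the caller owns the range [new-n, new).
func (b *buffer) IncrementOffset(n int) uint64 {
	return atomic.AddUint64(&b.offset, uint64(n))
}

// reserveConcurrently has each worker reserve size bytes perWorker times
// and returns the final offset.
func reserveConcurrently(workers, perWorker, size int) uint64 {
	var b buffer
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				b.IncrementOffset(size)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadUint64(&b.offset)
}

func main() {
	fmt.Println(reserveConcurrently(8, 100, 16))
}
```

Because each reservation is a single atomic operation, no two goroutines can be handed overlapping ranges.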
This PR adds a custom mmapped B+ tree. This data structure creates a mapping from uint64 to uint64.
Structure of node: Each node in the tree is of size pageSize. There are two kinds of nodes: leaf nodes and internal nodes. Leaf nodes only contain the data. Internal nodes contain the key and the offset to the child node.
An internal node would have its entries as: <0 offset to child>, <1000 offset>, <5000 offset>, and so on...
Leaf nodes would just have: <key, value>, <key, value>, and so on...
The last 16 bytes of the node are off limits:
| pageID (8 bytes) | metaBits (1 byte) | 3 free bytes | numKeys (4 bytes) |
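The node layout described above can be sketched as accessors over a raw page. Several details are assumptions made for illustration and are not stated in the commit message: a 4096-byte page, little-endian encoding, 16-byte <key, value> entries packed from the start of the page, and the meaning of the `bitLeaf` flag.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	pageSize  = 4096                             // assumed page size
	metaSize  = 16                               // last 16 bytes hold the metadata
	entrySize = 16                               // uint64 key + uint64 value (or child offset)
	maxKeys   = (pageSize - metaSize) / entrySize // entries that fit before the metadata
	bitLeaf   = byte(1)                          // hypothetical metaBits flag for leaf nodes
)

// node is one page. Metadata layout within the last 16 bytes:
// | pageID (8 bytes) | metaBits (1 byte) | 3 free bytes | numKeys (4 bytes) |
type node []byte

func (n node) meta() []byte { return n[pageSize-metaSize:] }

func (n node) pageID() uint64      { return binary.LittleEndian.Uint64(n.meta()[0:8]) }
func (n node) setPageID(id uint64) { binary.LittleEndian.PutUint64(n.meta()[0:8], id) }

func (n node) isLeaf() bool { return n.meta()[8]&bitLeaf != 0 }
func (n node) setLeaf()     { n.meta()[8] |= bitLeaf }

func (n node) numKeys() int     { return int(binary.LittleEndian.Uint32(n.meta()[12:16])) }
func (n node) setNumKeys(k int) { binary.LittleEndian.PutUint32(n.meta()[12:16], uint32(k)) }

// key and val read the i-th entry; in an internal node, val is a child offset.
func (n node) key(i int) uint64 { return binary.LittleEndian.Uint64(n[entrySize*i:]) }
func (n node) val(i int) uint64 { return binary.LittleEndian.Uint64(n[entrySize*i+8:]) }

// set writes the i-th entry; the caller must keep i below maxKeys.
func (n node) set(i int, k, v uint64) {
	binary.LittleEndian.PutUint64(n[entrySize*i:], k)
	binary.LittleEndian.PutUint64(n[entrySize*i+8:], v)
}

func main() {
	n := make(node, pageSize)
	n.setPageID(7)
	n.setLeaf()
	n.set(0, 1000, 42)
	n.setNumKeys(1)
	fmt.Println(n.pageID(), n.isLeaf(), n.numKeys(), n.key(0), n.val(0), maxKeys)
}
```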