Active defrag v2 #4691
17f747a to d5f5b83
d5f5b83 to 03917bd
- big keys are not defragged in one go from within the dict scan; instead they are scanned in parts after the main dict hash bucket is done
- add latency monitor sample for defrag
- change default active-defrag-cycle-min to induce lower latency
- make active defrag start a new scan right away if needed, so it's easier (for the test suite) to detect when it's done
- make active defrag quit the current cycle after each db / big key
- defrag some non key long term global allocations
- some refactoring for smaller functions and more reusable code
- during dict rehashing, one scan iteration of the dict can end up scanning one bucket in the smaller dict and many buckets in the larger dict, so waiting for 16 scan iterations before checking the time may be much too long
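The last point in the list above (checking the clock based on work done, not only on the number of scan iterations) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the code from this PR: `sim_dict`, `sim_step`, `budgeted_scan`, and the thresholds passed to it are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none of them are Redis APIs. */

/* Simulated keyspace: every scan step visits a fixed number of entries and
 * advances a fake clock, standing in for a dict that is mid-rehash. */
typedef struct {
    unsigned long total_steps;  /* steps until the cursor wraps to 0 */
    size_t entries_per_step;    /* entries visited by one scan step */
    uint64_t us_per_entry;      /* simulated cost of visiting one entry */
    uint64_t clock_us;          /* simulated monotonic clock */
    unsigned long next;         /* next cursor value */
} sim_dict;

/* Perform one scan step; returns how many entries were visited. */
size_t sim_step(sim_dict *d, unsigned long *cursor) {
    d->clock_us += (uint64_t)d->entries_per_step * d->us_per_entry;
    d->next++;
    *cursor = (d->next < d->total_steps) ? d->next : 0;
    return d->entries_per_step;
}

typedef struct {
    unsigned long steps;         /* scan steps performed */
    size_t visited;              /* entries visited in total */
    unsigned long clock_checks;  /* times the clock was consulted in the loop */
    int finished;                /* 1 if the whole keyspace was scanned */
} scan_result;

/* Scan until done or until the time budget is used up. The clock is read
 * after max_steps steps OR after max_visited entries, whichever comes first. */
scan_result budgeted_scan(sim_dict *d, uint64_t budget_us,
                          unsigned long max_steps, size_t max_visited) {
    assert(max_steps > 0 && max_visited > 0);
    scan_result r = {0, 0, 0, 0};
    uint64_t deadline = d->clock_us + budget_us;
    unsigned long cursor = 0, steps_since = 0;
    size_t visited_since = 0;

    for (;;) {
        size_t n = sim_step(d, &cursor);
        r.steps++;
        r.visited += n;
        steps_since++;
        visited_since += n;
        if (cursor == 0) { r.finished = 1; break; }
        if (steps_since >= max_steps || visited_since >= max_visited) {
            r.clock_checks++;
            if (d->clock_us >= deadline) break;  /* budget used up */
            steps_since = 0;
            visited_since = 0;
        }
    }
    return r;
}
```

With a simulated dict where each step visits 1000 entries at 1 µs each and a 5000 µs budget, `budgeted_scan(&d, 5000, 16, SIZE_MAX)` (iteration count only) runs for 16000 µs before it first looks at the clock, while `budgeted_scan(&d, 5000, 16, 64)` stops after 5000 µs.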
…ve defrag test

other fixes / improvements:
- LUA script memory isn't taken from zmalloc (it is taken from libc malloc), so it can cause a falsely high fragmentation ratio to be displayed
- there was a problem with "fragmentation" info being calculated from RSS and used_memory sampled at different times (now sampling them together)

other details:
- adding a few more allocator info fields to INFO and MEMORY commands
- improve defrag test to measure defrag latency of big keys
- increasing the accuracy of the defrag test (by looking at real frag info); this way we can use an even lower threshold and still avoid false positives
- keep the old (total) "fragmentation" field unchanged, but add new ones for specific things
- add these to the MEMORY DOCTOR command
- deduct LUA memory from the rss in case of non jemalloc allocator (one for which we don't have "allocator active/used")
- reduce sampling rate of the rss and allocator info
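To make the idea of keeping the old total figure while adding more specific ones concrete, here is a minimal sketch of one way such a breakdown could be computed. The exact definitions used by the new fields are not quoted in this message, so the formulas, the `mem_sample` and `frag_breakdown` types, and `frag_compute` are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: field names are illustrative, not the INFO field names. */
typedef struct {
    size_t process_rss;         /* RSS reported by the OS */
    size_t used_memory;         /* bytes requested through the allocator wrapper */
    size_t allocator_allocated; /* bytes the allocator has handed out */
    size_t allocator_active;    /* bytes in pages the allocator has in use */
    size_t allocator_resident;  /* bytes the allocator keeps resident */
} mem_sample;

typedef struct {
    double total_ratio;    /* old-style figure: process_rss / used_memory */
    double allocator_frag; /* assumed: active / allocated */
    double allocator_rss;  /* assumed: resident / active */
    double rss_overhead;   /* assumed: process_rss / resident */
} frag_breakdown;

/* Compute every ratio from ONE sample so the numbers are mutually consistent. */
frag_breakdown frag_compute(const mem_sample *s) {
    assert(s->used_memory > 0 && s->allocator_allocated > 0);
    assert(s->allocator_active > 0 && s->allocator_resident > 0);
    frag_breakdown f;
    f.total_ratio    = (double)s->process_rss / (double)s->used_memory;
    f.allocator_frag = (double)s->allocator_active / (double)s->allocator_allocated;
    f.allocator_rss  = (double)s->allocator_resident / (double)s->allocator_active;
    f.rss_overhead   = (double)s->process_rss / (double)s->allocator_resident;
    return f;
}
```

Under these assumed definitions, when `used_memory` equals `allocator_allocated` the three specific ratios multiply back to the total, which shows how a single large "fragmentation" number can be attributed to separate causes.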
03917bd to 806736c
Hello @oranagra, I'm going to merge this because I believe it's too complex for me to dig into the defragmentation code every time, so I nominate you, if you accept, as the maintainer of this Redis sub-system, and I will just merge anything you PR. However, if I understand correctly, this also changes things that are not only defrag related, like the way fragmentation is computed. Questions:
Remarks:
Thank you
Hi. I've split that PR into two commits so that it'll be easier for you to review just the second one (which has changes outside the defrag code). Anyway, regarding the second commit and the changes it brings to the Redis INFO / MEMORY commands: the idea this time was not to change the behavior of the old info fields (they report the same measurement that they used to report), but to add new info fields with more detail as to what this RSS overhead (incorrectly called "fragmentation") is made of. The thing about sampling memory usage and RSS at the same time is actually a bugfix; otherwise the ratio you calculate can be wrong if the RSS sampled at the server cron was very different from the memory usage at the time of infoCommand. I will obviously draw your attention to risky things I'm adding, even if you won't review them. All the other changes in that commit are just adding more fields to the MEMORY and INFO commands and improving the DOCTOR.
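The sampling bugfix described above can be sketched as follows. This is an illustration only: `server_state`, `cron_sample`, `ratio_mixed`, and `ratio_consistent` are hypothetical names, not the functions touched by this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state, not the Redis server struct. */
typedef struct {
    size_t rss;   /* RSS at sampling time */
    size_t used;  /* allocator-reported usage at the same moment */
} mem_snapshot;

typedef struct {
    size_t live_rss;    /* what the OS would report right now */
    size_t live_used;   /* what the allocator would report right now */
    mem_snapshot cron;  /* values cached by the periodic cron */
} server_state;

/* Periodic sampling: capture RSS and used memory together. */
void cron_sample(server_state *s) {
    s->cron.rss = s->live_rss;
    s->cron.used = s->live_used;
}

/* Problematic variant: cached RSS divided by the live usage figure. */
double ratio_mixed(const server_state *s) {
    assert(s->live_used > 0);
    return (double)s->cron.rss / (double)s->live_used;
}

/* Fixed variant: both operands come from the same snapshot. */
double ratio_consistent(const server_state *s) {
    assert(s->cron.used > 0);
    return (double)s->cron.rss / (double)s->cron.used;
}
```

For example, if the cron samples an RSS of 1000 while usage is 800, and usage then drops to 400 before the ratio is requested, the mixed variant reports 2.5 while the consistent one reports 1.25.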
@antirez I forgot to tag you for the above reply.
Thank you @oranagra, so I'll follow your advice, and today I'll read both commits and then merge or comment. Also thanks for splitting the PR into two commits for easy reading.
Hello @oranagra, I read the two commits. On the big one implementing defrag V2, I only scanned the different functions to get an idea of what it does. The code seems well written to me and the improvement in latency very important; however, I cannot claim that mine was a review able to uncover any bug. It would be cool if we could reserve some "office time" in San Francisco, in case you are willing to give me a 20 minute tour of how exactly it works. Also, it is pretty clear that if we go with radix-tree+listpacks for all the data types (except sorted sets), this will be a big win from the POV of the simplicity that can be achieved inside the defragmentation code: a single code path for Streams, Lists (we should move away from quicklists as well), Hashes, and Sets would be cool. About the second commit, it looks cool as well, and the hints in MEMORY DOCTOR are useful. I'm merging everything as it is, but the one thing I would like to improve before Redis 5.0 RC1 is the naming of the fields: mem_fragmentation vs allocator_fragmentation is complex to understand. I wonder if we can pick more specific names and document somewhere the difference between all of them. So, starting with merging the PR, we can continue later from here. Thanks for your work!
@antirez thanks for merging. Regarding the names, I originally wanted to change the meaning of mem_fragmentation so that it shows what's currently displayed in allocator_fragmentation, and add a field named rss_overhead that shows what's currently displayed in mem_fragmentation. If you want to suggest some alternative names, please give an example, and we'll go from there.
First commit: Avoid latency spikes when defragging huge keys
Second commit: Adding real allocator fragmentation to INFO and MEMORY command + active defrag test