Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

defragger improvements around large bins #12963

Closed
oranagra opened this issue Jan 18, 2024 · 8 comments · Fixed by #12996
Closed

defragger improvements around large bins #12963

oranagra opened this issue Jan 18, 2024 · 8 comments · Fixed by #12996
Assignees

Comments

@oranagra
Copy link
Member

jemalloc/jemalloc#1098 (comment)
few things that can be improved:

  1. don't attempt to defrag (wasting CPU cycles), if the fragmentation in small bins doesn't cross the threshold (ignore the active pages used by large bins)
  2. consider adding INFO metrics for retained and muzzy memory (stats.arenas.<i>.pmuzzy)
  3. and used memory breakdown of large vs small bins.
@zuiderkwast
Copy link
Contributor

This looks like your notes to yourself. :) Do you need the rest of us to get involved?

I see #12315. Didn't that already get rid of fragmentation on large bins?

@oranagra
Copy link
Member Author

right, this was a "draft" note in the 8.0 project and i promoted it to an issue so that it can be assigned or discussed.

what's fixed in #12315 does reduce some case of "fragmentation" in large bins, but i was worried that it could be others, and wanted to somehow measure only fragmentation inside small bins to trigger defrag.
i don't remember how i wanted to gather that metric, i see that it doesn't exist in mallctl, but we do have it (per bin) in malloc_stats (so in theory we can add an API to expose it to redis)

@sundb sundb self-assigned this Jan 23, 2024
@sundb
Copy link
Collaborator

sundb commented Jan 23, 2024

@zuiderkwast did you start? if not, i'd like to start.

@zuiderkwast
Copy link
Contributor

@sundb please go ahead! I didn't start.

@sundb
Copy link
Collaborator

sundb commented Feb 1, 2024

@oranagra eliminating large bins actually results in a higher frag rate.
when there are far more large bins than small bins, the frag rate will approach zero.
I inserted 1000 strings of 100k into Redis, and the fragmentation rate would be 0% before eliminating, but when I eliminating the large bins, the fragmentation rate was 130%.

@oranagra
Copy link
Member Author

oranagra commented Feb 1, 2024

right. but that's indeed the fragmentation ratio in the area we can defrag.
on the other hand, if that memory consumption is negligible compared to what we consume, maybe we shouldn't bother to defrag it.

the reason i suggested this change was because i wanted to hide any memory overheads in large bins (or other non-defraggable overheads), i.e. anything that's not defraggable.
so maybe we can take a different formula and get that, without suffering from the amplification of the fact most memory is in large bins.

e.g. we can sum all the memory wasted in small bins (i.e. explicit calculation of small bin fragmentation), and then divide that by the total memory usage, rather than the small-bin memory usage.

WDYT?

@sundb
Copy link
Collaborator

sundb commented Feb 2, 2024

@oranagra why is the formula for fragmentation percent ((float)resident / allocated)*100 - 100, instead of (active - allocated) / active?

@oranagra
Copy link
Member Author

oranagra commented Feb 3, 2024

you mean why isn't it (active-allocated)*100/allocated (which gives the same result as the one in the code)
the formula you suggested (the one who uses resident isn't actually used (just a print).
and also, yours results in a scale of 0..1, not 0..100

the other difference is that the one in the code measures the fragmentation overhead as a portion of the allocated memory (dividing by allocated), and the one you gave would give the fragmentation overhead as a portion of the total active memory.
e.g. if we have 200gb active, and 150gb allocated, is the fragmentation 33% or 25%. (current code considers it to be 33%, i.e. 50 out of 150).

am i missing anything?

@oranagra oranagra linked a pull request Feb 3, 2024 that will close this issue
oranagra added a commit that referenced this issue Feb 20, 2024
Implement #12963

## Changes
1. large bins don't have external fragmentation or are at least
non-defraggable, so we should ignore the effect of
large bins when measuring fragmentation, and only measure fragmentation
of small bins. this affects both the allocator_frag* metrics and also
the active-defrag trigger
2. Adding INFO metrics for `muzzy` memory, which is memory returned to
the OS but still shows as RSS until the OS reclaims it.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants