Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbuf: evict harder when under memory pressure #15659

Closed
wants to merge 1 commit into from

Conversation

robn
Copy link
Contributor

@robn robn commented Dec 11, 2023

Motivation and Context

In an OOM situation, there's no path for the kernel to signal to the dbuf cache that it should evict things, so it continues to operate as normal, oblivious to the fact that the world is burning down around it.

Absent such a signal, this commit instead just makes the evict thread keep going as long as the ARC is in distress.

Description

In dbuf_evict_thread() when deciding if there's more to evict, also call arc_reclaim_needed(), and keep going if its true.

This is made somewhat less necessary by #15511, since the dbuf cache will no longer be pinning vastly more memory than it should have been, but its still useful if there's a sudden demand for a huge amount of memory.

I don't know if this is the right approach. On the one hand, the ARC is accounting for a lot of memory in the system that it isn't directly responsible for, so asking it if it believes there is a problem seems reasonable. On the other hand, its not accounting for everything (eg kmem caches) and doesn't have a complete picture, and isn't able to evict a lot of things anyway.

This leads me to a design question: what is the ARCs role in overall memory management? Is it expected that its the source of truth for all memory ZFS is using? (and if so, why isn't it?). Or, is it only interested in its own memory usage? (and if so, why account for memory from other sources (eg dnodes)?). The answer is somewhere in between, but I haven't been able to draw the line.

If the ARC is supposed to be at the centre of it all, then possibly the better approach here would be for it to understand the dbuf cache more directly and signal it when it its time to drop things. But that also leads to madcap things, like the ARC also being able to evict kmem caches, so I'd appreciate some guidance on this question regardless of whether or not this PR is sensible.

How Has This Been Tested?

Minimal, mostly by causing OOM conditions (tail -f /dev/zero) and observing the cache kstats.

Its quite possible to write a test that uses the same method, but first I wanted to make sure that the idea is right.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

In an OOM situation, there's no path for the kernel to signal to the
dbuf cache that it should evict things, so it continues to operate as
normal, oblivious to the fact that the world is burning down around it.

Absent such a signal, this commit instead just makes the evict thread
keep going as long as the ARC is in distress.

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should be needed, not that the implementation is right.

For the first part, dbuf cache size is set to 3% of ARC target. Under memory pressure ARC will reduce its target down to arc_min, that should proportionally reduce dbuf cache to almost nothing already.

For the second part, most of dbuf cache evictions are just moved to ARC and no memory is freed immediately (aside of uncacheable and user buffers). It means arc_reclaim_needed() will still be true and dbuf cache eviction continue until it is dropped completely, that is not our goal.

@behlendorf behlendorf added Component: Memory Management kernel memory management Status: Code Review Needed Ready for review and testing labels Dec 11, 2023
@behlendorf
Copy link
Contributor

For the first part, dbuf cache size is set to 3% of ARC target. Under memory pressure ARC will reduce its target down to arc_min, that should proportionally reduce dbuf cache to almost nothing already.

Exactly this. I think we need to better understand why this existing mechanism is insufficient.

@robn
Copy link
Contributor Author

robn commented Dec 13, 2023

Alright, I'll pull this for now. I was working on it at the same time as at #15511, and my notes from the time suggest it was necessary to really get a lot of memory back around OOM-time, but I'm struggling to reproduce that right now. I'll come back to it another time.

I may have further question about ARC memory accounting though :)

Thanks!

@robn robn closed this Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Memory Management kernel memory management Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants