Feature Request - Adopt OracleZFS's merged Data-Metadata ARC model #3946

Closed
Sachiru opened this issue Oct 23, 2015 · 4 comments
Labels: Component: Memory Management (kernel memory management); Status: Design Review Needed (architecture or design is under discussion); Type: Feature (feature request or new feature)

Comments

@Sachiru

Sachiru commented Oct 23, 2015

Reference article here: https://blogs.oracle.com/roch/entry/rearc

Recently, Oracle changed the way they handle ARC. Some concepts are interesting, specifically the merged data and metadata ARC lists. To quote:

Previously, the ARC claimed to use a two-state model:

"most recently used" (MRU)

"most frequently used" (MFU)

But it further subdivided these states into data and metadata lists.

That model, using 4 main memory lists, created a problem for ZFS. The ARC algorithm gave us only 1 target size for each of the 2 MRU and MFU states. The fact that we had 2 lists (data and metadata) but only 1 target size for the aggregate meant that when we needed to adjust the list down, we just didn't have the necessary information to perform the shrink. This led to the presence of an ugly tunable, arc_meta_limit, which was impossible to set properly and was a source of problems for customers.
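To make the "not enough information" point concrete, here is a minimal toy sketch in Python. It is not ZFS code: the function name `feasible_shrinks` and the unit-sized blocks are assumptions made purely for illustration. Given one state (say MRU) with a data list, a metadata list, and a single target size for the pair, it lists every way of splitting the required eviction between the two lists.

```python
def feasible_shrinks(data_size, meta_size, target):
    """Return every (data_evict, meta_evict) pair that brings one
    state (data list + metadata list) down to its single target size.

    Hypothetical helper for illustration only; sizes are in blocks.
    """
    overage = data_size + meta_size - target
    if overage <= 0:
        return [(0, 0)]  # already at or under target, nothing to evict
    splits = []
    for data_evict in range(overage + 1):
        meta_evict = overage - data_evict
        # A split is only possible if each list holds that many blocks.
        if data_evict <= data_size and meta_evict <= meta_size:
            splits.append((data_evict, meta_evict))
    return splits


# 6 data blocks + 4 metadata blocks, but the state may only hold 7.
splits = feasible_shrinks(data_size=6, meta_size=4, target=7)
print(splits)
```

Every pair in the result satisfies the target equally well, so the target alone cannot say which list to shrink. Some extra policy has to pick one, which is the role the quoted text attributes to arc_meta_limit.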

This problem raises an interesting point and a pet peeve of mine. Many people I've interacted with over the years defended the position that metadata was worth special protection in a cache. After all, metadata is necessary to get to data, so it has intrinsically higher value and should be kept around more. The argument is certainly sensible on the surface, but I was on the fence about it.

ZFS manages every access through a least recently used scheme (LRU). New access to some block, data or metadata, puts that block back to the head of the LRU list, very much protected from eviction, which happens at the tail of the list.
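The LRU behaviour described above can be sketched with a toy model. This is an assumption-laden illustration, not the ARC implementation: the class name `ToyLRU`, the block ids, and the fixed block-count capacity are all hypothetical. The point it shows is that a re-accessed block moves to the head whether it is data or metadata, and that eviction only looks at the tail.

```python
from collections import OrderedDict


class ToyLRU:
    """Toy LRU list: the last item is the head (most recently used),
    the first item is the tail (next eviction candidate)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> "data" or "metadata"

    def access(self, block_id, kind):
        """Record an access; return the id of the evicted block, or None."""
        if block_id in self.blocks:
            # A hit puts the block back at the head, whatever its kind.
            self.blocks.move_to_end(block_id)
            return None
        self.blocks[block_id] = kind
        if len(self.blocks) > self.capacity:
            # Eviction happens at the tail and ignores the block's kind.
            evicted_id, _kind = self.blocks.popitem(last=False)
            return evicted_id
        return None


cache = ToyLRU(capacity=2)
cache.access("meta-1", "metadata")
cache.access("data-1", "data")
cache.access("meta-1", "metadata")        # re-access: meta-1 returns to the head
evicted = cache.access("data-2", "data")  # cache is full, so the tail is evicted
print(evicted)
```

In this run the recently re-accessed metadata block survives and the untouched data block is the one evicted, without any metadata-specific rule.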

When considering special protection for metadata, I've always stumbled on this question:

If some buffer, be it data or metadata, has not seen any accesses for a sufficient amount of time, such that the block is now at the tail of an eviction list, what is the argument that says that I should protect that block based on its state?

I came up blank on that question. If it hasn't been used, it can be evicted, period. Furthermore, even after taking this stance, I was made aware of an interesting fact about ZFS. Indirect blocks, the blocks that hold a set of block pointers to the actual data, are non-evictable inasmuch as any of the block pointers they reference are currently in the ARC. In other words, if some data is in cache, its metadata is also in the cache and, furthermore, is non-evictable. This fact really reinforced my position that in our LRU cache handling, metadata doesn't need special protection from eviction.
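The claim about indirect blocks can be illustrated with another toy sketch. Everything here is hypothetical (the `ToyCache` class, the per-parent child counter, the block ids) and says nothing about how ZFS actually tracks these holds. It only models the stated rule: an indirect block cannot be evicted while any block it points to is still cached.

```python
from collections import OrderedDict


class ToyCache:
    """Toy cache where an indirect block is pinned while any child is cached."""

    def __init__(self):
        self.lru = OrderedDict()   # block id -> parent indirect block id or None
        self.cached_children = {}  # indirect block id -> number of cached children

    def insert(self, block_id, parent=None):
        if block_id in self.lru:
            self.lru.move_to_end(block_id)  # already cached: just refresh it
            return
        self.lru[block_id] = parent
        if parent is not None:
            self.cached_children[parent] = self.cached_children.get(parent, 0) + 1

    def evict_one(self):
        """Evict the block nearest the tail that is not pinned; return its id."""
        for block_id in list(self.lru):  # iterate from tail to head
            if self.cached_children.get(block_id, 0) > 0:
                continue  # still referenced by cached children, so skip it
            parent = self.lru.pop(block_id)
            if parent is not None:
                self.cached_children[parent] -= 1
            return block_id
        return None


cache = ToyCache()
cache.insert("indirect-A")
cache.insert("data-1", parent="indirect-A")
cache.insert("data-2", parent="indirect-A")
order = [cache.evict_one() for _ in range(3)]
print(order)
```

Although the indirect block sits at the tail, it is skipped until both of its data blocks are gone, so the metadata outlives the data it describes without a separate metadata list or limit.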

And so, the reARC project actually took the same path. No more separation of data and metadata and no more special protection. This improvement led to fewer lists to manage and simpler code, such as shorter lock hold times for eviction. If you are tuning arc_meta_limit for legacy reasons, I advise you to try without this special tuning. It might be hurting you today and should be considered obsolete.

This concept is very interesting, and from the view of a layperson like me, very reasonable. Perhaps it is appropriate that we adopt the same idea of merging the two?

@ryao
Contributor

ryao commented Oct 26, 2015

@ahrens This idea seems worth evaluating. What do you think?

@behlendorf behlendorf added the "Type: Feature" label Oct 21, 2016
@Sachiru
Author

Sachiru commented Jul 17, 2017

Any news on this front?

@behlendorf behlendorf removed the "Type: Question" label Dec 21, 2020
@ednadolski-ix
Contributor

@ahrens @brian @amotin @mmaybee This has been open for quite some time. Is there anything still worth considering here (especially in light of very recent changes), or should it be closed out?

@amotin
Member

amotin commented Nov 3, 2023

It should be closed. There were other threads on this topic that ended in a decision to properly separate data and metadata instead, which I did in 2.2.

@amotin amotin closed this as not planned Nov 3, 2023
6 participants
@behlendorf @ryao @Sachiru @amotin @ednadolski-ix and others