-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP - ZAP Shrinking #14088
WIP - ZAP Shrinking #14088
Conversation
This is awesome! As far as i understood @ahrens on the OpenZFS DevSummit Hackathon 2022 correctly, this could finally fix long waits for simple "ls" on dirs, which contained many files in the past. In case this gets merged, does this only "work" with only new created dirs or also with already existing dirs after upgrading zfs to new release which contains this feature? Cant wait to get that upstreamed! :) Thanks in advance for all contributors! Referencing a github discussion regarding zap shrinking: #8420 |
@@ -110,6 +111,8 @@ typedef enum zap_flags { | |||
* already randomly distributed. | |||
*/ | |||
ZAP_FLAG_PRE_HASHED_KEY = 1 << 2, | |||
/* XXX */ | |||
ZAP_FLAG_NO_SHRINK = 1 << 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see where this is ever passed in; can we remove it?
} | ||
zap_put_leaf(l); | ||
return (err); | ||
} | ||
|
||
#define ZAP_PREFIX_HASH(pref, pref_len) ((pref) << (64 - (pref_len))) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] one blank line is enough here
zc->zc_leaf = NULL; | ||
|
||
/* | ||
* The leaf was either shrunk or splitted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] "split" is the correct past tense.
(ZAP_HASH_IDX(zc->zc_hash, | ||
zap_leaf_phys(zc->zc_leaf)->l_hdr.lh_prefix_len) != | ||
zap_leaf_phys(zc->zc_leaf)->l_hdr.lh_prefix)) { | ||
// XXX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is correct, and the commented-out code can be removed
* | ||
* Any ZAP leaf might have a sibling - a leaf with the same prefix length and | ||
* with the prefix, which differes only by 1 least significant (sibling) bit. | ||
* If both leaves are empty, we can remove one of them. For simplicity, we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove one of them if only one of them is empty (not both empty)? Maybe it doesn't matter much in practice, since you would have to remove the vast majority (>99%?) of entries in order to get either one or two adjacent leaves empty.
|
||
/* | ||
* Instead of calling zap_unlockdir(); zap_lockdir(); | ||
* we do it in more optimized way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an unnecessary optimization, given that we don't make this optimization in zap_expand_leaf()
, which is probably called much more often than this.
@sdimitro Sorry for bothering you, can you may adress the feedback and rebase this? I am really looking forward to this. Much thanks in advance for all participants! |
Politely pinging @amotin since recent work on ZAP code and to may bring more attention/review to this. Just in case iX is may interested in this. To get that integrated would be awesome. Anyway much thanks! |
@amotin Feel free to pick this up! BTW I was planning on trying out a few other designs as this PR is not really my code (I uncovered it from an old illumos PR). Maybe something along the lines of recreating a the whole ZAP once too many entries are gone (1/4?) potentially converting it to a microZAP too. |
@allanjude I saw your recent presentation on the june 2023 OpenZFS leadership meeting regarding the "rework" dedup stuff. There (as far as i understood), you also mentioned some ZAP optimizations, including ZAP shrinking. Do you plan to use this PR or do you even/may consider finishing this PR before/separate from the dedup stuff? Getting ZAP shrinking/optimizations would be great. In any case much thanks. |
We expect to post an updated version of ZAP shrinking in the next week or two. |
Hello @allanjude, can I may ask the progress on this/can you may give an update on this? In any case, much thanks for working on zfs! |
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
I'd just like to say here, that this code seems solid so far. We've been running it (after a few testing rounds) in production for a while now, no problems to chase, just works. I think there are quite a number of users, who would benefit from this being upstreamed. Not sure it's worth the wait for a newer version... That can eventually still land in master even after this gets merged, can't it? (also sorry for the spam, I'll remove the reference to this PR from the commit) |
@snajpa it's good to know this is holding up well in your testing. There's still some outstanding feedback to tackle, that work just needs to be picked up by someone and a fresh PR opened. |
Klara's improved version of ZAP shrinking should get a pull request before the end of February. |
Replaced by #15888 |
Ported from: