New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WT-3149 Have eviction choose a random point when walking a tree. #3285
Conversation
Only choose a random point when there is no saved walk point.
src/btree/row_srch.c
Outdated
@@ -785,7 +785,7 @@ __wt_row_random_leaf(WT_SESSION_IMPL *session, WT_CURSOR_BTREE *cbt) | |||
* Find a random leaf page in a row-store tree. | |||
*/ | |||
int | |||
__wt_row_random_descent(WT_SESSION_IMPL *session, WT_CURSOR_BTREE *cbt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing much here that is row-store specific. If we change the page->type
check to use WT_PAGE_IS_INTERNAL(page)
, can we use this on any tree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this, and I've pushed that change in 7666562.
I left the function itself in row_srch.c
because I can't think of a better place for it.
@keithbostic Could you take a look at this change and let me know what you think? There are two things I don't like stylistically about this change:
A more invasive change could fix that by adding a general random search function that returns ENOTSUP for non row-stores. I'm happy to go down that path if you think it'd be better? I'm also a bit apprehensive about how long the search could take in a sparse tree, but can't think of a good way to check that (and it should be rare that we don't have an existing walk point. |
@@ -1662,6 +1662,21 @@ __evict_walk_file(WT_SESSION_IMPL *session, | |||
FLD_SET(walk_flags, WT_READ_PREV); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keithbostic I think we probably also want to revisit evict_walk_reverse
- I suspect we only want to set a walk direction when setting from a NULL evict_ref.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably also want to revisit
evict_walk_reverse
- I suspect we only want to set a walk direction when setting from a NULL evict_ref.
If by "revisit", you mean "remove", I think that's a good idea.
Frankly, the less reverse traversal of the tree we do, the happier I am.
Updated per @michaelcahill's suggestion.
I assume you mean the There is a retry in here (retry 100 times if we don't find a valid leaf page), that may not be right for eviction, depending on how often this function is called? If there are other changes to this function to support eviction, we could create an eviction-specific version: it's only about 80 lines of code.
I think that question is resolved with @michaelcahill's suggested change. The
If this function is only called rarely, I don't think sparse trees will be an issue (including any questions about the retry loop in the code). |
walking it sequentially weren't the same. Changed that, which caused the test_compact02 test to fail. There's an underlying bug in this code, if we return WT_NOTFOUND, we can lose a hazard pointer on the page of the tree we unsucessfully searched. Add a page-release in the case of returning not-found. Simplify the termination conditions as well: there are bunch of ref states I'd just as soon not worry about, for example, WT_REF_SPLIT. I think they'd all work, except split, but as I said, I'd just as soon not worry about them. Limit our valid returns to in-memory if we're running on behalf of eviction, disk and in-memory for everybody else.
point should be a more effective solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keithbostic I'm pretty happy with this change now, but it needs perf testing (and your agreement) before merging
src/btree/row_srch.c
Outdated
*/ | ||
int | ||
__wt_row_random_descent(WT_SESSION_IMPL *session, WT_CURSOR_BTREE *cbt) | ||
__wt_random_descent(WT_SESSION_IMPL *session, WT_REF **refp, bool eviction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote would be for this to be in a different location. Either a new file; maybe src/btree/btree_random.c
or maybe into bt_cursor.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In response to the other question - I'm inclined to keep the single function for now. The function isn't that long, but I can easily imaging bug fixes coming here, and I don't see a way to share the common code paths.
I'd like to talk more about the walk direction when walks start at a random point. Can we have that discussion early next week? |
@agorrod, lgtm. I agree with you & @michaelcahill, it would be good to move this somewhere else. I don't want this code in We could create (If we do that, let's merge PR #3289 first, it changes |
@keithbostic, agreed and done. UPDATE: sorry, I lost track of branches and didn't merge #3289 first. I think git will handle this just fine, I'll sort it out tomorrow assuming #3289 is ready to go. |
This is designed to keep even coverage of the tree (if we always did forward walks, the left-most pages in the tree would rarely be visited).
@agorrod, I resurrected the code to reverse the direction of eviction walks and make it reverse each time we start a new random walk. Let's talk about whether we want to go this way: in my local testing of |
Thanks @keithbostic and @michaelcahill these changes look good to me. I'm going to merge and watch the perf jobs over night. |
Only choose a random point when there is no saved walk point.