Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WT-3149 Have eviction choose a random point when walking a tree. #3285

Merged
merged 11 commits into from Feb 14, 2017

Conversation

agorrod
Copy link
Member

@agorrod agorrod commented Feb 9, 2017

Only choose a random point when there is no saved walk point.

Only choose a random point when there is no saved walk point.
@@ -785,7 +785,7 @@ __wt_row_random_leaf(WT_SESSION_IMPL *session, WT_CURSOR_BTREE *cbt)
* Find a random leaf page in a row-store tree.
*/
int
__wt_row_random_descent(WT_SESSION_IMPL *session, WT_CURSOR_BTREE *cbt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing much here that is row-store specific. If we change the page->type check to use WT_PAGE_IS_INTERNAL(page), can we use this on any tree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this, and I've pushed that change in 7666562.

I left the function itself in row_srch.c because I can't think of a better place for it.

@agorrod
Copy link
Member Author

agorrod commented Feb 9, 2017

@keithbostic Could you take a look at this change and let me know what you think?

There are two things I don't like stylistically about this change:

  1. The eviction server is aware of row storeness.
  2. The extra flags to the search feel a bit clunky.

A more invasive change could fix that by adding a general random search function that returns ENOTSUP for non row-stores. I'm happy to go down that path if you think it'd be better?

I'm also a bit apprehensive about how long the search could take in a sparse tree, but can't think of a good way to check that (and it should be rare that we don't have an existing walk point.

@@ -1662,6 +1662,21 @@ __evict_walk_file(WT_SESSION_IMPL *session,
FLD_SET(walk_flags, WT_READ_PREV);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keithbostic I think we probably also want to revisit evict_walk_reverse - I suspect we only want to set a walk direction when setting from a NULL evict_ref.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agorrod

I think we probably also want to revisit evict_walk_reverse - I suspect we only want to set a walk direction when setting from a NULL evict_ref.

If by "revisit", you mean "remove", I think that's a good idea.

Frankly, the less reverse traversal of the tree we do, the happier I am.

@keithbostic
Copy link
Contributor

@agorrod

The eviction server is aware of row storeness.

Updated per @michaelcahill's suggestion.

The extra flags to the search feel a bit clunky.

I assume you mean the mem_only flag to __wt_row_random_descent(). Yes, I agree it's a little clunky, but I can't think of anything better.

There is a retry in here (retry 100 times if we don't find a valid leaf page), that may not be right for eviction, depending on how often this function is called?

If there are other changes to this function to support eviction, we could create an eviction-specific version: it's only about 80 lines of code.

A more invasive change could fix that by adding a general random search function that returns ENOTSUP for non row-stores.

I think that question is resolved with @michaelcahill's suggested change. The WT_CURSOR.next_random call stack doesn't call into this function for anything other than row-stores (it checks both when setting the cursor's next_random method, and again when the underlying next_random function is called).

I'm also a bit apprehensive about how long the search could take in a sparse tree, but can't think of a good way to check that (and it should be rare that we don't have an existing walk point).

If this function is only called rarely, I don't think sparse trees will be an issue (including any questions about the retry loop in the code).

walking it sequentially weren't the same. Changed that, which caused the
test_compact02 test to fail.

There's an underlying bug in this code, if we return WT_NOTFOUND, we can
lose a hazard pointer on the page of the tree we unsucessfully searched.
Add a page-release in the case of returning not-found.

Simplify the termination conditions as well: there are bunch of ref
states I'd just as soon not worry about, for example, WT_REF_SPLIT.  I
think they'd all work, except split, but as I said, I'd just as soon not
worry about them. Limit our valid returns to in-memory if we're running
on behalf of eviction, disk and in-memory for everybody else.
@keithbostic
Copy link
Contributor

@agorrod
A couple of fixes chasing the test suite failures: b9dd358 and ea9f063.

Copy link
Member Author

@agorrod agorrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keithbostic I'm pretty happy with this change now, but it needs perf testing (and your agreement) before merging

*/
int
__wt_row_random_descent(WT_SESSION_IMPL *session, WT_CURSOR_BTREE *cbt)
__wt_random_descent(WT_SESSION_IMPL *session, WT_REF **refp, bool eviction)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vote would be for this to be in a different location. Either a new file; maybe src/btree/btree_random.c or maybe into bt_cursor.c

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In response to the other question - I'm inclined to keep the single function for now. The function isn't that long, but I can easily imaging bug fixes coming here, and I don't see a way to share the common code paths.

@michaelcahill
Copy link
Contributor

I'd like to talk more about the walk direction when walks start at a random point. Can we have that discussion early next week?

@keithbostic
Copy link
Contributor

@agorrod, lgtm.

I agree with you & @michaelcahill, it would be good to move this somewhere else.

I don't want this code in bt_cursor.c, that already has too much stuff.

We could create bt_random.c, and put __wt_btcur_next_random(), __wt_random_descent() and __wt_row_random_leaf() there?

(If we do that, let's merge PR #3289 first, it changes __wt_btcur_next_random().)

@michaelcahill
Copy link
Contributor

michaelcahill commented Feb 13, 2017

@keithbostic, agreed and done.

UPDATE: sorry, I lost track of branches and didn't merge #3289 first. I think git will handle this just fine, I'll sort it out tomorrow assuming #3289 is ready to go.

This is designed to keep even coverage of the tree (if we always did
forward walks, the left-most pages in the tree would rarely be visited).
@michaelcahill
Copy link
Contributor

@agorrod, I resurrected the code to reverse the direction of eviction walks and make it reverse each time we start a new random walk. Let's talk about whether we want to go this way: in my local testing of evict-btree.wtperf, I don't measure any performance impact.

@agorrod
Copy link
Member Author

agorrod commented Feb 14, 2017

Thanks @keithbostic and @michaelcahill these changes look good to me. I'm going to merge and watch the perf jobs over night.

@agorrod agorrod merged commit df64d27 into develop Feb 14, 2017
@agorrod agorrod deleted the wt-3149-evict_rand_walk branch February 14, 2017 05:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants