New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WT-2262 Have random sampling walk the tree so it isn't biased in skewed trees. #2363
Conversation
if a page is a leaf page because it's not set if the page is in memory, we don't ever hold a hazard pointer on the root page, but that has to be enforced outside of the page-in code. Switch __wt_btcur_next_random from WT_RET to WT_ERR. __wt_ref_is_leaf calls __wt_ref_info (and now, __wt_page_in), which can fail, propagate those errors up to their callers.
out until we've skipped the appropriate number of leaf pages. Flip the test order, and add some comments to clarify what's going on.
simplifies things to push it all down there. Split the check for in-memory/on-disk into two parts, depending on what we access there is at our current point in the code, __wt_ref_is_leaf() goes back to being a simple on-page cell check. Pages read for data sampling aren't "useful"; don't update the read generation of pages already in memory, and if a page is read, set its generation to a low value so it is evicted quickly.
the WT_WITH_PAGE_INDEX macro down a level, out of the high-level cursor code.
descend label, test for a leaf page and go to the else clause, then break out of the loop and access the pindex and slot variables without initializing them. We only ever jump to descend when starting from the root, which is guaranteed to be an internal page, but the compilers don't know that. Restructure things slightly to clarify the code flow.
more sense once the debugging code is removed).
variable in order to skip some number of leaf pages.
…ages, but there's no reason to disallow that, now we have the WT_READ_SKIP_LEAF flag. If no "ignore leaf page count" is specified, skip all leaf pages.
WT-2262: Have random sampling walk the tree so it isn't biased in skewed trees.
specifies a percentage of the tree to skip over before retrieving each new record. If next_random_sample_percent is configured, we first retrieve a random record from the tree as a whole, and on subsequent retrievals, skip forward the specified percentage of the tree; if next_random_sample_percent is not configured, we use the previous method of choosing a new leaf page from the entire tree on each retrieval. Clarify bulk and next-random cursor documentation: there are several methods allowed on bulk and next-random cursors (for example, reset, reconfigure and close), don't try to list them. Instead reference the principle method the application is expected to use (insert for bulk cursors, next for next-random cursors). Make cursor-open with next_random configured fail for column-store objects, waiting until the next method is called seems unkind. Rework the static random-cursor tests to smoke-test both forms of random cursors.
@agorrod, @michaelcahill, I've added a new configuration, Alex, I think this one is ready for merge after your final review. |
Sorry we didn't talk this through in detail, I expected the parameter to be called "sample_size" and to be a count of the number of samples that the application will perform (which might be 1000, or 0.1%). |
…fcntp variable to be skipleafcntp, and it continues to skip some number of leaf pages (but is no longer dependent on WT_READ_SKIP_LEAF being set).
number of samples the application will perform (which might be 1000, or 0.1%).
@agorrod, @michaelcahill: I've pushed the changes Michael wanted. |
Allow a value of 0 to next_random_sample_size, it's the default so it doesn't make sense to disallow it.
size field indicates the number of records in the bitmap (as | ||
specified by the object's \c value_format configuration). | ||
Bulk-loaded bitmap values must end on a byte boundary relative | ||
to the bit count (except for the last set of values loaded)'''), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just line wrapping changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just line wrapping changes?
No; the bulk cursor docs said "... and cursors configured for bulk-load only support the WT_CURSOR::insert and WT_CURSOR::close methods.", which isn't true (they also support reconfigure and reset).
Rather than try and enumerate the supported methods, I changed it to mention WT_CURSOR::insert as the "primary" method applications will use.
I noticed because the "next_random" wording had the same problem.
@keithbostic I'm happy with this change, I made a couple of mostly style based changes. Merge if you are happy with them, or I'll be happy to talk them through in your evening. |
* !!! | ||
* Ideally, the number would be prime to avoid restart issues. | ||
*/ | ||
if (cbt->next_random_sample_size != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keithbostic I changed your version because I preferred the idea of duplicating a condition check than using a goto.
WT-2262 Have random sampling walk the tree so it isn't biased in skewed trees.
@agorrod, I think this one is done, at least as far as develop goes, I'm merging it. Do we have any sample unbalanced trees around we can use to test this fix in 3.0? |
uninitialized in __ref_is_leaf() (based on a call to __wt_ref_info()). It's not really possible because the path where type isn't set is a path where we panic because the WT_ADDR structure has an impossible type. We already ignore the __wt_ref_info() error return in one path, and there are only two paths that care about the returned type; remove the error check from __wt_ref_info() and set type to 0 in the failing case (the same value we use when there's no WT_REF addr to check), the code that calls this function already checks addr on return. This simplifies __ref_is_leaf() slightly, it now returns a boolean instead of an error code with a boolean pointer argument.
WT-2262 Have random sampling walk the tree so it isn't biased in skewed trees.
No description provided.