Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WT-2262 Have random sampling walk the tree so it isn't biased in skewed trees. #2363

Merged
merged 29 commits into from Dec 14, 2015

Conversation

michaelcahill
Copy link
Contributor

No description provided.

michaelcahill and others added 23 commits December 4, 2015 17:53
if a page is a leaf page because it's not set if the page is in memory,
we don't ever hold a hazard pointer on the root page, but that has to
be enforced outside of the page-in code.

Switch __wt_btcur_next_random from WT_RET to WT_ERR.

__wt_ref_is_leaf calls __wt_ref_info (and now, __wt_page_in), which can
fail, propagate those errors up to their callers.
out until we've skipped the appropriate number of leaf pages. Flip the
test order, and add some comments to clarify what's going on.
simplifies things to push it all down there.

Split the check for in-memory/on-disk into two parts, depending on what
we access there is at our current point in the code, __wt_ref_is_leaf()
goes back to being a simple on-page cell check.

Pages read for data sampling aren't "useful"; don't update the read
generation of pages already in memory, and if a page is read, set its
generation to a low value so it is evicted quickly.
the WT_WITH_PAGE_INDEX macro down a level, out of the high-level cursor
code.
descend label, test for a leaf page and go to the else clause, then
break out of the loop and access the pindex and slot variables without
initializing them. We only ever jump to descend when starting from the
root, which is guaranteed to be an internal page, but the compilers
don't know that. Restructure things slightly to clarify the code flow.
more sense once the debugging code is removed).
variable in order to skip some number of leaf pages.
…ages,

but there's no reason to disallow that, now we have the WT_READ_SKIP_LEAF
flag.  If no "ignore leaf page count" is specified, skip all leaf pages.
WT-2262: Have random sampling walk the tree so it isn't biased in skewed trees.
specifies a percentage of the tree to skip over before retrieving each
new record.  If next_random_sample_percent is configured, we first
retrieve a random record from the tree as a whole, and on subsequent
retrievals, skip forward the specified percentage of the tree; if
next_random_sample_percent is not configured, we use the previous method
of choosing a new leaf page from the entire tree on each retrieval.

Clarify bulk and next-random cursor documentation: there are several
methods allowed on bulk and next-random cursors (for example, reset,
reconfigure and close), don't try to list them. Instead reference the
principle method the application is expected to use (insert for bulk
cursors, next for next-random cursors).

Make cursor-open with next_random configured fail for column-store
objects, waiting until the next method is called seems unkind.

Rework the static random-cursor tests to smoke-test both forms of random
cursors.
@keithbostic
Copy link
Contributor

@agorrod, @michaelcahill, I've added a new configuration, next_random_sample_percent that specifies a percentage of the tree to skip over before retrieving each new record, and reworked the code so we use the original algorithm of selecting from the entire tree on each retrieval if that new configuration isn't specified.

Alex, I think this one is ready for merge after your final review.

@michaelcahill
Copy link
Contributor Author

Sorry we didn't talk this through in detail, I expected the parameter to be called "sample_size" and to be a count of the number of samples that the application will perform (which might be 1000, or 0.1%).

…fcntp

variable to be skipleafcntp, and it continues to skip some number of leaf
pages (but is no longer dependent on WT_READ_SKIP_LEAF being set).
number of samples the application will perform (which might be 1000, or 0.1%).
@keithbostic
Copy link
Contributor

@agorrod, @michaelcahill: I've pushed the changes Michael wanted.

Allow a value of 0 to next_random_sample_size, it's the default so
it doesn't make sense to disallow it.
size field indicates the number of records in the bitmap (as
specified by the object's \c value_format configuration).
Bulk-loaded bitmap values must end on a byte boundary relative
to the bit count (except for the last set of values loaded)'''),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just line wrapping changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just line wrapping changes?

No; the bulk cursor docs said "... and cursors configured for bulk-load only support the WT_CURSOR::insert and WT_CURSOR::close methods.", which isn't true (they also support reconfigure and reset).

Rather than try and enumerate the supported methods, I changed it to mention WT_CURSOR::insert as the "primary" method applications will use.

I noticed because the "next_random" wording had the same problem.

@agorrod
Copy link
Member

agorrod commented Dec 14, 2015

@keithbostic I'm happy with this change, I made a couple of mostly style based changes. Merge if you are happy with them, or I'll be happy to talk them through in your evening.

* !!!
* Ideally, the number would be prime to avoid restart issues.
*/
if (cbt->next_random_sample_size != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keithbostic I changed your version because I preferred the idea of duplicating a condition check than using a goto.

keithbostic added a commit that referenced this pull request Dec 14, 2015
WT-2262 Have random sampling walk the tree so it isn't biased in skewed trees.
@keithbostic keithbostic merged commit 6212aea into develop Dec 14, 2015
@keithbostic keithbostic deleted the WT-2262 branch December 14, 2015 12:39
@keithbostic
Copy link
Contributor

@agorrod, I think this one is done, at least as far as develop goes, I'm merging it.

Do we have any sample unbalanced trees around we can use to test this fix in 3.0?

keithbostic referenced this pull request Dec 14, 2015
uninitialized in __ref_is_leaf() (based on a call to __wt_ref_info()).
It's not really possible because the path where type isn't set is a path
where we panic because the WT_ADDR structure has an impossible type.

We already ignore the __wt_ref_info() error return in one path, and
there are only two paths that care about the returned type; remove the
error check from __wt_ref_info() and set type to 0 in the failing case
(the same value we use when there's no WT_REF addr to check), the code
that calls this function already checks addr on return.

This simplifies __ref_is_leaf() slightly, it now returns a boolean
instead of an error code with a boolean pointer argument.
keithbostic added a commit that referenced this pull request Dec 16, 2015
WT-2262 Have random sampling walk the tree so it isn't biased in skewed trees.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants