pre load ondemand tablets for linear scans #3429
keith-turner merged 3 commits into apache:elasticity
Wondering if working on #3430 before continuing with this would be best.

With commit ddf749e I am seeing a big speedup when running BulkSplitOptimizationIT.

In #3437, processing of tablet hosting requests was moved from the tserver to the manager. For #3437, a subset of the changes in this PR were taken and improved. This PR still has client-side changes that try to keep the next few tablets for a scan hosted. With only the changes in #3437, the scan in BulkSplitOptimizationIT takes 17 seconds, as it makes a hosting request for each of the ~50 tablets being scanned and waits each time. With the changes in this PR, which include the changes in #3437 plus the changes that try to keep the next few tablets hosted, the same scan of ~50 tablets takes only 2.7 seconds. So I still need to continue experimenting with the changes in this PR.
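To make the host-ahead idea concrete, here is a minimal sketch, not the code in this PR. It assumes tablets are identified by plain strings in scan order; `HostAheadSketch`, `tabletsToRequest`, and the `hostAhead` parameter are hypothetical names used only for illustration. The point is that one request can cover the current tablet plus the next few, instead of one request and wait per tablet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class HostAheadSketch {

  /**
   * Returns the tablets that are not yet hosted among the current tablet and
   * the next hostAhead tablets in scan order. All names here are hypothetical
   * stand-ins, not Accumulo APIs.
   */
  static List<String> tabletsToRequest(List<String> tabletsInScanOrder, Set<String> hosted,
      int current, int hostAhead) {
    List<String> toRequest = new ArrayList<>();
    // Window covers the current tablet plus hostAhead following tablets,
    // clipped to the end of the table.
    int end = Math.min(tabletsInScanOrder.size(), current + 1 + hostAhead);
    for (int i = current; i < end; i++) {
      String tablet = tabletsInScanOrder.get(i);
      if (!hosted.contains(tablet)) {
        toRequest.add(tablet);
      }
    }
    return toRequest;
  }

  public static void main(String[] args) {
    List<String> tablets = List.of("t1", "t2", "t3", "t4", "t5");
    // Scan is at t1, which is already hosted; ask for the next two tablets.
    System.out.println(tabletsToRequest(tablets, Set.of("t1"), 0, 2));
  }
}
```

With a window like this, the scan only has to wait when it catches up with tablets that were requested but are not yet online, which is consistent with the gap between the 17 second and 2.7 second runs described above.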
f30f8dc to
2a4b318
return null;
if (tl != null && locationNeed == LocationNeed.REQUIRED) {
  Map<KeyExtent,CachedTablet> extentsToHost =
      findExtentsToHost(context, minimumHostAhead * 2, hostAheadRange, lcSession, tl);
Why multiply hostAhead by 2?
}

CachedTablet followingTablet = _findTablet(context, currTablet.endRow(), true, false, true,
    lcSession, LocationNeed.REQUIRED);
The caller of this method requests tablet hosting for the result; won't REQUIRED bring the tablets online?
The method is only called when the calling method has location need of required. I had thought of passing location need to this method and having it fail if it was not required. I can try that and see what it looks like.
I added a preconditions check, which helps tightly couple the method to its caller. Looking into the code, the _findTablet method does not host tablets; it uses the location need to determine whether it needs to do a metadata lookup for cached tablets that have no location.
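The behavior described in the comment above can be sketched as follows. This is an illustrative model only, under the assumption that the cache and the metadata table are plain maps; `LocationNeedSketch`, `findTablet`, `findFollowingTablet`, and the `CachedTablet` class here are hypothetical stand-ins rather than the real Accumulo classes. It shows a lookup that consults metadata only when a location is required and the cached entry has none, and that never hosts anything itself.

```java
import java.util.HashMap;
import java.util.Map;

public class LocationNeedSketch {

  enum LocationNeed { REQUIRED, NOT_REQUIRED }

  /** Hypothetical cached entry; a null location means the tablet has no known location. */
  static final class CachedTablet {
    final String extent;
    final String location;

    CachedTablet(String extent, String location) {
      this.extent = extent;
      this.location = location;
    }
  }

  final Map<String,CachedTablet> cache = new HashMap<>();
  // Stand-in for the metadata table: extent -> current location, absent if unhosted.
  final Map<String,String> metadata = new HashMap<>();
  int metadataLookups = 0;

  /** Returns the cached entry when it is good enough, otherwise refreshes it from metadata. */
  CachedTablet findTablet(String extent, LocationNeed need) {
    CachedTablet cached = cache.get(extent);
    boolean usable =
        cached != null && (cached.location != null || need == LocationNeed.NOT_REQUIRED);
    if (usable) {
      return cached;
    }
    // The lookup only reads metadata. It does not request hosting, so the
    // refreshed entry may still have no location.
    metadataLookups++;
    CachedTablet fresh = new CachedTablet(extent, metadata.get(extent));
    cache.put(extent, fresh);
    return fresh;
  }

  /** Only valid when the caller itself needs a location, enforced by a precondition. */
  CachedTablet findFollowingTablet(String extent, LocationNeed callerNeed) {
    if (callerNeed != LocationNeed.REQUIRED) {
      throw new IllegalArgumentException("caller must have a location need of REQUIRED");
    }
    return findTablet(extent, LocationNeed.REQUIRED);
  }
}
```

Failing fast in `findFollowingTablet` is one way to express the coupling to the caller; passing the location need through and checking it keeps the assumption visible at the call site.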
The change in #3429 modified ClientTabletCache so that it would preload tablets for Scanners. It appears this change may have caused the test in #3809 to fail. I modified the comparison of the range to the tablet end key to use the same logic that ThriftScanner uses, and that fixed the test. Fixes #3809
Revert change introduced in apache#3654 to ManagerAssignmentIT for an extra tablet being brought online, something that was introduced in apache#3429 and fixed in apache#3812.
This PR contains my experimentation with making scans pre load ondemand tablets. Below are some log messages from running BulkSplitOptimizationIT, which creates around 57 splits in an on demand table and scans it. The scan is faster than it was before this change, but still slower than if all the tablets were loaded.
These changes are not ready to commit; just posting in case anyone wants to look over them.