Skip to content
This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Compile WHERE k IN (1,2,3) into batched get #7

Closed
jtaylor-sfdc opened this issue Jan 26, 2013 · 2 comments
Closed

Compile WHERE k IN (1,2,3) into batched get #7

jtaylor-sfdc opened this issue Jan 26, 2013 · 2 comments

Comments

@jtaylor-sfdc
Copy link
Contributor

When an IN (or the equivalent OR) appears in a query using the leading row key columns, compile it into a batched get to more efficiently retrieve the query results. Currently, the scan key will be set to the min value and max value in the IN instead.

@ghost ghost assigned ryang-sfdc Feb 27, 2013
@jtaylor-sfdc
Copy link
Contributor Author

This could be staged in two parts:

  1. handle the case where the query ends up being a batched get of fully qualified row keys instead of a scan. There are two potential queries constructs over which this optimization could be done: IN and multiple ORs with an equality on the same column. For example, if your primary key is composed of a host column, the following could be optimized to become a batched get:
    1. SELECT ... FROM ... WHERE host IN ('na1','eu1')
    2. SELECT ... FROM ... WHERE host = 'na1' OR host = 'eu1'
  2. handle the case where the query ends up being a set of sequential scans that would skip from row key range to row key range. This is the case where you don't have a set of keys, but they're not fully qualified.
    • same as above, but say the row key is made up of host and something else, like date. In this case, you could have a scan with a start key of 'na1' and a filter that has a skip hint to 'eu1' after the 'na1' rows are exhausted.

My take would be to start with implementing just 1.1 - folks can always rewrite a query like 1.2 using the IN syntax. For 1.2, you'd need to detect when you can't optimize it, so it'd be a little more work.

My take on how 1.1 would be implemented:

  1. Mostly isolated to changes in WhereOptimizer.
    1. Change KeyExpressionVisitor.visitLeave(InListExpression node, List<KeyParts> childKeyExprs) to collect all of the values in the IN in the KeyParts returned. We're currently getting the min value and max value instead (which will cause the scan to have a start key and stop key from the smallest value to the largest value).
    2. In WhereOptimizer.pushKeyExpressionsToScan()
    3. Instead of calculating a single start and stop key, we need to track a start and stop List<List<byte[]>>. Today, we'd have a list of singleton lists where each byte[] is the key being concatenated together to form the start key and stop key. With this optimization, though, we'll have the possibility that we'll have multiple keys (one per value in the IN).
    4. Get rid of the assert(keyExpr.getKeyRanges().size() == 1) and have an inner loop over the keyExpr.getKeyRanges() key ranges. Each one gets added to the List<byte[]> for that key slot.
    5. Instead of the part at the end that creates a final start/stop key, we'd want to delay doing that and hold on to the List<List<byte[]>> on the context object. It'd be good to have a class encapsulate this list.
  2. In AggregatePlan.newScanner() and QueryPlan.newScanner(), create and use a new implementation of ResultIterators for the batched get case. You wouldn't use the ParallelIterators class, but you might push the handling of the batched get into the SerialLimitingIterators class.
    1. The SerialLimitingIterators class is the class that handles the case when a limit has been defined on the query (to stop the scan early). Maybe it makes sense for the batched get to be done per region until the limit is reached? Or do it for a group of the batch get that's as big as the limit (and then do another one if you don't get back rows for some of the gets).
    2. Instead of ParallelIterators, you'd implement your own that did a batch get by having a nested loop over the List<List <byte[]>> structure, essentially expanding it to the complete combination of keys (for the case like this: SELECT ... FROM ... WHERE host IN ('a','b') AND feature IN ('c','d') you'd end up with 'ac','ad','bc', 'bd')

So somewhat involved, but somewhat isolated. I'm sure they'll be gotchas, though. If you're still up for it, let me know and we can talk further over a whiteboard.

@jtaylor-sfdc
Copy link
Contributor Author

Fixed (by Ron)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants