New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHOENIX-2722 support mysql limit,offset clauses #154
Conversation
Would you mind reviewing this, @maryannxue as you'd be the best person to catch any corner cases missed, in particular around joins? |
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this would be correct. Suppose we have "select * from (select a from t offset 2 limit 8) offset 3", guess we should return the 5th row instead. If you consider the logic is too complicated to optimize at compiletime, you can simply quit flattening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out. I thought a little and have updated the logic to
if (offsetRewrite != null || (limitRewrite != null & offset != null)) {
return select;
} else {
offsetRewrite = offset;
}
Let me know if any optimization is possible.
@ankitsinghal Thank you very much for the pull request! I am impressed by how many details you have taken into account in your commit. It took me quite a while to go through all of the changes. That said, it would be great if you could add more test cases covering most (if not all) of the code changes, e.g. the join case (left outer join w/ offset and w/wo limit), the derived table case, the join with subquery case, etc. Check JoinCompiler.isFlat(SelectStatement), it returns true only when limit == null, think it should be the same for offset. That function is called when a join contains a subquery, e.g. "select * from (select a, b from t1 offset 3) join (select c, d from t2 limit 10)". |
@maryannxue Thank you so much for the review! I have incorporated the changes as per your comments and added more test cases related to join, derived table ,group by case and join with subquery as you asked. |
@@ -134,7 +134,7 @@ public void testDerivedTableWithWhere() throws Exception { | |||
Connection conn = DriverManager.getConnection(getUrl(), props); | |||
try { | |||
// (where) | |||
String query = "SELECT t.eid, t.x + 9 FROM (SELECT entity_id eid, b_string b, a_byte + 1 x FROM aTable WHERE a_byte + 1 < 9) AS t"; | |||
String query = "SELECT t.eid, t.x + 9 FROM (SELECT entity_id eid, b_string b, a_byte + 1 x FROM aTable WHERE a_byte + 1 < 9 ) AS t"; | |||
PreparedStatement statement = conn.prepareStatement(query); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary diff
@ankitsinghal Thank you very much for making the suggested changes! Added a few more suggestions, most of them being minor. |
Thanks @maryannxue for suggestions , I have incorporated the review comments. |
@@ -85,7 +85,8 @@ public void close() throws SQLException { | |||
|
|||
@Override | |||
public Tuple next() throws SQLException { | |||
while (!this.closed && (offset != null && count++ < offset) && tupleIterator.hasNext()) { | |||
while (!this.closed && (offset != null && count < offset) && tupleIterator.hasNext()) { | |||
count++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant was this "count++" doesn't seem to work with LIMIT(below) together. LIMIT should be number of rows returned starting from the OFFSET row, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was in a notion that I updated limit to limit+offset already.
Thanks for catching it, I have corrected it now.
@ankitsinghal Some (maybe final) comments added. |
updated LiteralResultIterationPlan updated test in erivedTableIT.java
I have taken care the last review comments in my latest commit. |
@maryannxue , what do you think .. is it good to go now? |
Hi @ankitsinghal, sorry about the delay. I don't get github notification in mailbox... It looks good to me now. Thank you for all the work! @JamesRTaylor What do you think? |
subSelect = NODE_FACTORY.select(subSelect, select.getOrderBy(), select.getLimit(), null); | ||
} else { | ||
subSelect = NODE_FACTORY.select(subSelect, select.getOrderBy(), null, null); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than this if statement, can we do the following to simplify it?
subSelect = NODE_FACTORY.select(subSelect, select.getOrderBy(), select.getLimit(), select.getOffset())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually , In case of union , we need to apply offset at the final result so we can't apply limit or offset in subselect.
but, if offset is not present , limit can be applied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually , In case of union , we need to apply offset at the final result so we can't apply limit or offset in subselect.
but, if offset is not present , limit can be applied
} | ||
if (limit >= 0 && count++ >= limit) { return null; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be just {{count}} or {{count++}} ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it should count++ only, but there is a bug in offset count .. which I have updated now.
Thanks @JamesRTaylor for review.. I have made the changes are per your review comments. |
* Default implementation that creates a scan group if a plan is row key ordered (which requires a merge sort), | ||
* or if a scan crosses a region boundary and the table is salted or a local index. | ||
*/ | ||
public class OffsetScanGrouper implements ParallelScanGrouper { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this class needed as it doesn't appear to do anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is needed to avoid scans in parallel even in serialIterators. should I use a anonymous class while creating SerialIterators to do offset on server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scans aren't done in parallel for SerialIterator, they're done serially. Have you tried it without this? I can't figure out what purpose it's serving, but perhaps I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JamesRTaylor , It seems for serialIterators also , we use DefaultParallelScanGrouper for splitting scans in groups by condition (isSalted || table.getIndexType() == IndexType.LOCAL) && ScanUtil.shouldRowsBeInRowKeyOrder(orderBy, context)
and run each group in parallel.
Anyways, I have modified the plan to run offset on server by including this condition and re-use the DefaultParallelScanGrouper
A few minor items, but otherwise it looks good. Nice work, @ankitsinghal! If you could file a JIRA for the corresponding website/doc changes and submit a patch for that when we're closer to the 4.8 release, that'd be much appreciated! |
No description provided.