Support pushdown sort by simple expressions#4071
Support pushdown sort by simple expressions#4071yuancu merged 12 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
yuancu
left a comment
There was a problem hiding this comment.
It seems that when the sorted field is not projected in the final output, there will be an added sort at the top.
E.g.
-
source=opensearch-sql_test_index_account | eval b = balance + 1 | sort b | fields b:EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[+($t0, $t1)], balance=[$t0], $f1=[$t2]) CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[SORT->[{ "balance" : { "order" : "asc", "missing" : "_first" } }], LIMIT->10000, PROJECT->[balance]], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":10000,"timeout":"1m","_source":{"includes":["balance"],"excludes":[]},"sort":[{"balance":{"order":"asc","missing":"_first"}}]}, requestedTotalSize=10000, pageSize=null, startFrom=0)]) -
but for
source=opensearch-sql_test_index_account | eval b = balance + 1 | sort b | fields bEnumerableSort(sort0=[$0], dir0=[ASC-nulls-first]) EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[+($t0, $t1)], $f0=[$t2]) CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[PROJECT->[balance], SORT->[{ "balance" : { "order" : "asc", "missing" : "_first" } }], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":10000,"timeout":"1m","_source":{"includes":["balance"],"excludes":[]},"sort":[{"balance":{"order":"asc","missing":"_first"}}]}, requestedTotalSize=10000, pageSize=null, startFrom=0)])
...c/main/java/org/opensearch/sql/opensearch/planner/physical/SortProjectExprTransposeRule.java
Outdated
Show resolved
Hide resolved
Nice catch! It happens when project doesn't carry the equivalent input field in its project node expressions. In I need to check if it's safe to use Project.getInput() collation instead of collations on Project itself. The problem should be solved with that fix. |
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
| // Cast from low precision to high precision | ||
| srcType = typeFactory.createSqlType(SqlTypeName.DECIMAL); | ||
| srcInput = rexBuilder.makeInputRef(srcType, 1); | ||
| dstType = | ||
| typeFactory.createSqlType( | ||
| SqlTypeName.DECIMAL, srcType.getPrecision() + 4, srcType.getScale() + 4); | ||
| cast = rexBuilder.makeCast(dstType, srcInput); | ||
| result = OpenSearchRelOptUtil.getOrderEquivalentInputInfo(cast); | ||
| assertExpectedInputInfo(result, 1, false); |
There was a problem hiding this comment.
A converse test like the following may help:
// Cast from high precision to low precision
srcType = typeFactory.createSqlType(SqlTypeName.DECIMAL);
srcInput = rexBuilder.makeInputRef(srcType, 1);
dstType =
typeFactory.createSqlType(
SqlTypeName.DECIMAL, srcType.getPrecision() - 4, srcType.getScale());
cast = rexBuilder.makeCast(dstType, srcInput);
result = OpenSearchRelOptUtil.getOrderEquivalentInputInfo(cast);
assertFalse(result.isPresent());* Support pushdown sort by simple expressions Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix IT for no pushdown case Signed-off-by: Songkan Tang <songkant@amazon.com> * Add minor case to allow sort pushdown for casted floating number Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix the issue of using wrong fromCollation Signed-off-by: Songkan Tang <songkant@amazon.com> * Add some unit tests for OpenSearchRelOptUtil Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix checkstyle Signed-off-by: Songkan Tang <songkant@amazon.com> --------- Signed-off-by: Songkan Tang <songkant@amazon.com>
* Support pushdown sort by simple expressions Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix IT for no pushdown case Signed-off-by: Songkan Tang <songkant@amazon.com> * Add minor case to allow sort pushdown for casted floating number Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix the issue of using wrong fromCollation Signed-off-by: Songkan Tang <songkant@amazon.com> * Add some unit tests for OpenSearchRelOptUtil Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix checkstyle Signed-off-by: Songkan Tang <songkant@amazon.com> --------- Signed-off-by: Songkan Tang <songkant@amazon.com>
… (#4243) * Support pushdown sort by simple expressions (#4071) * Support pushdown sort by simple expressions Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix IT for no pushdown case Signed-off-by: Songkan Tang <songkant@amazon.com> * Add minor case to allow sort pushdown for casted floating number Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix the issue of using wrong fromCollation Signed-off-by: Songkan Tang <songkant@amazon.com> * Add some unit tests for OpenSearchRelOptUtil Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix checkstyle Signed-off-by: Songkan Tang <songkant@amazon.com> --------- Signed-off-by: Songkan Tang <songkant@amazon.com> * Fix compile issue Signed-off-by: Songkan Tang <songkant@amazon.com> --------- Signed-off-by: Songkan Tang <songkant@amazon.com>
Description
This PR aims to resolve simple sort expression pushdown problem without prerequisite of project pushdown optimization.
A PPL query may contain a sort over projected trivial expression like:
source = test | eval b = a + 1 | sort bThis optimization PR will optimize this PPL into
source = test | sort a | eval b = a + 1so that the problem is translated to pushdown field sort on column a. Also, sorting by field is supposed to be faster than sorting by script considering OpenSearch internal optimization.Related Issues
Resolves #3990
Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.