Pagination Phase 2: Support WHERE clause, column list in SELECT clause and for functions and expressions in the query. #1500

Merged
merged 5 commits into from
Jun 14, 2023

Conversation

@Yury-Fridlyand Yury-Fridlyand commented Apr 6, 2023

Pagination in V2: Phase 2. See Phase 1 in #1497.

Doc: https://github.com/Bit-Quill/opensearch-project-sql/blob/61767f2b200b2a7f8c8b2df32de209b3c30caa61/docs/dev/Pagination-v2.md

curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query": "select highlight(\"Body\") from beer.stackexchange where simple_query_string([\"Tags\" ^ 1.5, \"Title\", \"Body\" 4.2], \"taste\") and Tags like \"% % %\" and Title like \"%\";", "fetch_size": 20 }'

You can also use the attached script for testing. Command line: ./cursor_test.sh <table> <page size>. Requires jq. Rename the script before use.
cursor_test.sh.txt

./cursor_test.sh 'SELECT date0, DATE_ADD(CAST(date0 AS date), INTERVAL 1 HOUR), time0, DATE_ADD(datetime(cast(time0 AS string)), INTERVAL 1 HOUR), time1, DATE_ADD(CAST(time1 AS time), INTERVAL 1 HOUR), datetime0, DATE_ADD(CAST(datetime0 AS timestamp), INTERVAL 1 HOUR) FROM calcs where int3 + num1 < 20;' 2
Got 2 rows
Got 2 rows
Got 2 rows
Got 2 rows
Got 2 rows
Fetched 10 rows in 6 requests in .346s
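
The run above reports 6 requests for 10 rows at a page size of 2. The following is a minimal Java sketch of a client loop that follows a cursor until none is returned. It assumes that the server returns a cursor with every non-empty page and signals the end with a final empty page; that assumption is one plausible reading of the request count, not something this page confirms. The server is an in-memory stand-in, and `fakeServer`, `Page`, and `Summary` are hypothetical names that belong neither to the plugin nor to cursor_test.sh.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of a client loop that follows a cursor until it runs out. */
final class CursorPagingSketch {

  /** One response page: its rows and the cursor for the next page (null when done). */
  static final class Page {
    final List<Integer> rows;
    final String cursor;

    Page(List<Integer> rows, String cursor) {
      this.rows = rows;
      this.cursor = cursor;
    }
  }

  /** Totals collected by the loop. */
  static final class Summary {
    final int rows;
    final int requests;

    Summary(int rows, int requests) {
      this.rows = rows;
      this.requests = requests;
    }
  }

  /**
   * In-memory stand-in for the server: serves totalRows rows, pageSize at a time.
   * Assumption: every non-empty page carries a cursor, so the client learns
   * that it is done only from a final empty page.
   */
  static Function<String, Page> fakeServer(int totalRows, int pageSize) {
    return cursor -> {
      int offset = cursor == null ? 0 : Integer.parseInt(cursor);
      int end = Math.min(offset + pageSize, totalRows);
      List<Integer> rows = new ArrayList<>();
      for (int i = offset; i < end; i++) {
        rows.add(i);
      }
      return new Page(rows, rows.isEmpty() ? null : String.valueOf(end));
    };
  }

  /** Sends the first request without a cursor, then keeps following the returned cursor. */
  static Summary fetchAll(Function<String, Page> server) {
    int rows = 0;
    int requests = 0;
    String cursor = null;
    do {
      Page page = server.apply(cursor);
      requests++;
      rows += page.rows.size();
      cursor = page.cursor;
    } while (cursor != null);
    return new Summary(rows, requests);
  }

  public static void main(String[] args) {
    Summary s = fetchAll(fakeServer(10, 2));
    System.out.println("Fetched " + s.rows + " rows in " + s.requests + " requests");
  }
}
```

Under that assumption, 10 rows at page size 2 take five full pages plus one empty page, which gives 6 requests.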

Description

Add support for more complex queries in paged requests.
Supported things:

  • WHERE clause
  • HAVING clause
  • column names in SELECT clause
  • functions in SELECT and WHERE:
    • built-in functions
    • SQL-specific functions, such as IN, BETWEEN, CASE .. WHEN ..
    • OpenSearch functions (relevancy search and highlight)
  • literals
  • metafields like _id, _score, etc

Unsupported things:

  • System queries SHOW and DESCRIBE
  • NESTED function
  • aggregation functions like min, max
  • In-memory aggregation with window function OVER

Issues Resolved

Full support for the SELECT clause and support for the WHERE clause, including functions and expressions.

DEMO

Pagination.with.WHERE.clause.mp4

TODOs:

  • Demo
  • More integration tests

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov-commenter

codecov-commenter commented Apr 6, 2023

Codecov Report

Merging #1500 (bd27566) into main (691012d) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #1500      +/-   ##
============================================
+ Coverage     97.27%   97.32%   +0.04%     
- Complexity     4330     4418      +88     
============================================
  Files           388      388              
  Lines         10807    10976     +169     
  Branches        761      775      +14     
============================================
+ Hits          10513    10682     +169     
  Misses          287      287              
  Partials          7        7              
| Flag | Coverage Δ |
|---|---|
| sql-engine | 97.32% <100.00%> (+0.04%) ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ |
|---|---|
| ...sql/opensearch/request/OpenSearchQueryRequest.java | 100.00% <ø> (ø) |
| ...arch/sql/opensearch/request/OpenSearchRequest.java | 100.00% <ø> (ø) |
| ...ch/sql/executor/pagination/CanPaginateVisitor.java | 100.00% <100.00%> (ø) |
| ...ql/opensearch/request/OpenSearchScrollRequest.java | 100.00% <100.00%> (ø) |

... and 3 files with indirect coverage changes

@Yury-Fridlyand Yury-Fridlyand changed the base branch from feature/pagination/P2 to feature/pagination/integ April 27, 2023 21:06
@Yury-Fridlyand Yury-Fridlyand added the pagination Pagination feature, ref #656 label May 1, 2023
// SELECT max(age) OVER (PARTITION BY city) ...
var projections = node.getProjectList();
if (projections.size() != 1) {
public Boolean visitSort(Sort node, Object context) {

Would it be better to override the defaultResult instead of returning Boolean.FALSE for all unsupported nodes?

Or on the flip side, you could make the defaultResult Boolean.TRUE so that you don't have to make duplicated canPaginate(node, context) calls. Doing this can eliminate canPaginate and would use the default visitor of the AbstractNodeVisitor, as it traverses through its children anyway.
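
A toy Java sketch of the second suggestion follows: the visitor answers TRUE by default and recurses into children, so only unsupported node kinds need an explicit rule. None of these types are the plugin's real classes; `Node`, `PaginateVisitor`, and the kind labels "Limit" and "Aggregation" are hypothetical stand-ins chosen to match the unsupported features named in this PR.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the review suggestion; not the plugin's AbstractNodeVisitor. */
final class DefaultResultSketch {

  /** Minimal AST node: a kind label plus child nodes. */
  static final class Node {
    final String kind;
    final List<Node> children;

    Node(String kind, Node... children) {
      this.kind = kind;
      this.children = Arrays.asList(children);
    }
  }

  /** Deny-list visitor: unknown kinds are accepted if all of their children are. */
  static class PaginateVisitor {

    /** Answer for node kinds that have no explicit rule. */
    boolean defaultResult() {
      return true;
    }

    boolean visit(Node node) {
      switch (node.kind) {
        case "Limit":        // hypothetical label for a LIMIT clause
        case "Aggregation":  // hypothetical label for min, max, and similar
          return false;
        default:
          return defaultResult() && visitChildren(node);
      }
    }

    /** Stops at the first child that cannot be paginated. */
    boolean visitChildren(Node node) {
      for (Node child : node.children) {
        if (!visit(child)) {
          return false;
        }
      }
      return true;
    }
  }

  public static void main(String[] args) {
    PaginateVisitor visitor = new PaginateVisitor();
    Node supported = new Node("Project", new Node("Filter", new Node("Relation")));
    Node unsupported = new Node("Project", new Node("Limit", new Node("Relation")));
    System.out.println(visitor.visit(supported));
    System.out.println(visitor.visit(unsupported));
  }
}
```

One trade-off of a TRUE default is that a newly added node kind is accepted until someone adds a rule for it, whereas a FALSE default rejects it until it is explicitly allowed.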

@GumpacG

GumpacG commented Jun 7, 2023

Queries with aliases seem to be falling back to legacy.
For example,
SELECT int0 as 'a' FROM calcs

@Yury-Fridlyand

Queries with aliases seem to be falling back to legacy. For example, SELECT int0 as 'a' FROM calcs

@GumpacG, there is a syntax issue. The correct form is SELECT int0 as `a` FROM calcs or SELECT int0 as a FROM calcs.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
@@ -17,6 +17,7 @@
import java.io.IOException;
import org.json.JSONObject;
import org.junit.Assert;
import org.junit.Ignore;

Probably missed this.


// Queries with LIMIT clause are not supported
@Override
public Boolean visitLimit(Limit node, Object context) {

@forestmvey forestmvey Jun 8, 2023

What's the point of having these explicit cases for what isn't supported if it will return false without the override? Perhaps we should log something stating that the feature is unsupported. Maybe add a TODO comment for future work.

forestmvey
forestmvey previously approved these changes Jun 8, 2023
GumpacG
GumpacG previously approved these changes Jun 8, 2023
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
@Yury-Fridlyand Yury-Fridlyand dismissed stale reviews from GumpacG and forestmvey via bd27566 June 14, 2023 00:13
@Yury-Fridlyand Yury-Fridlyand changed the title Add support for WHERE clause, column list in SELECT clause and for functions and expressions in the query. Pagination Phase 2: Support WHERE clause, column list in SELECT clause and for functions and expressions in the query. Jun 14, 2023
@Yury-Fridlyand

CI fails on BWC, ignoring it for now.

@Yury-Fridlyand Yury-Fridlyand merged commit da386e5 into main Jun 14, 2023
17 of 22 checks passed
@Yury-Fridlyand Yury-Fridlyand deleted the feature/pagination/where branch June 14, 2023 22:41
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 14, 2023
…lause and for functions and expressions in the query. (#1500)

* Add support for `WHERE` clause, column list in `SELECT` clause and for functions and expressions in the query.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Fix merge issue and address PR feedback by updating comments.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* More comments.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Add extra check for unset `initialSearchRequest`.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

---------

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
(cherry picked from commit da386e5)
Yury-Fridlyand added a commit that referenced this pull request Jun 15, 2023
…lause and for functions and expressions in the query. (#1500) (#1741)


Co-authored-by: Yury-Fridlyand <yury.fridlyand@improving.com>
@Yury-Fridlyand Yury-Fridlyand added this to In Release 2.9 in SQL/PPL Epic Roadmap Jul 11, 2023
Labels
backport 2.x pagination Pagination feature, ref #656

8 participants