
[Bugfix] Support "starrocks.filter.query" for Spark SQL read #92

Merged: 1 commit merged into StarRocks:main on Nov 3, 2023

Conversation

feihengye (Contributor) opened this pull request:

What type of PR is this:

BugFix

Which issues does this PR fix:

Fixes #91

Problem Summary (Required):

CLAassistant commented on Nov 1, 2023:

CLA assistant check
All committers have signed the CLA.

banmoy (Collaborator) left a comment:

Thanks for the PR. I think it's OK to support user-defined filters in the configuration. Just curious why not add the filter in the join clause, like
select * from spark_starrocks_4 a join spark_starrocks_4 b on a.name = b.name where a.name = 'jhon' or a.name = 'Lucy'
The filter will be pushed down to the source automatically, and that seems more flexible. Defining the filter in the configuration is generally used for DataFrame programs.

@@ -83,7 +83,11 @@ private[sql] class StarrocksRelation(
     }

     if (filters != null && filters.length > 0) {
-      paramWithScan += (ConfigurationOptions.STARROCKS_FILTER_QUERY -> filterWhereClause)
+      val userFilters = paramWithScan.get(ConfigurationOptions.STARROCKS_FILTER_QUERY)
banmoy (Collaborator) commented on the line above:

  1. What if the user configures the filter, but the framework's filters is null?
  2. Add a test to verify it. Please refer to ReadWriteITTest#testSql for an example.

feihengye (Contributor, Author) replied:

  1. In that case the old config, ConfigurationOptions.STARROCKS_FILTER_QUERY, is still used (see the sketch below).
  2. OK.
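
For illustration, a minimal sketch of the merge behavior being discussed, assuming the fix ANDs a user-configured "starrocks.filter.query" with the WHERE clause Spark pushes down; the helper name and exact combination logic are illustrative, not the PR's actual code:

    // Hypothetical helper (not the PR's actual code): merge the user-configured
    // "starrocks.filter.query" with the WHERE clause Spark pushed down.
    def mergeFilters(userFilter: Option[String], pushedDownWhere: Option[String]): Option[String] =
      (userFilter.filter(_.trim.nonEmpty), pushedDownWhere.filter(_.trim.nonEmpty)) match {
        case (Some(u), Some(p)) => Some(s"($u) and ($p)") // both present: AND them together
        case (Some(u), None)    => Some(u)                // only the user config is set
        case (None, Some(p))    => Some(p)                // only Spark's pushdown exists
        case (None, None)       => None                   // nothing to filter
      }

    // Example: the user config survives even when Spark pushes nothing down.
    mergeFilters(Some("name = 'Lucy'"), None)             // Some("name = 'Lucy'")
    mergeFilters(Some("name = 'Lucy'"), Some("age > 18")) // Some("(name = 'Lucy') and (age > 18)")

Whatever exact form the merged code takes, the point is that a user-configured filter is no longer silently overwritten when Spark also pushes filters down, and it still applies when Spark pushes nothing down.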

banmoy (Collaborator) commented on Nov 3, 2023:

Please resolve the DCO and CLA check failures.

feihengye (Contributor, Author) commented on Nov 3, 2023:

(Quoting banmoy's review comment above about pushing the filter down via the join clause instead.)

Thanks for your answer.
The user has already created a temp view with the pushdown filter configured, so it is effectively a table now, and I want the filter to be pushed down to that table as well. A sketch of the use case is below.
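
For illustration only, a minimal sketch of that use case, assuming a Spark session and the connector on the classpath; all option values, and every option name except "starrocks.filter.query", are placeholders rather than settings taken from this PR:

    // Hypothetical sketch of the use case, not code from the PR.
    // Only "starrocks.filter.query" is the option under discussion; the other
    // option names and values are placeholder connection settings.
    val df = spark.read
      .format("starrocks")
      .option("starrocks.table.identifier", "test.spark_starrocks_4") // placeholder
      .option("starrocks.fenodes", "127.0.0.1:8030")                  // placeholder
      .option("starrocks.filter.query", "name = 'Lucy'")              // user-defined filter
      .load()

    // Register the DataFrame as a temp view; with this fix, Spark SQL joins
    // over the view should still honor the configured filter.
    df.createOrReplaceTempView("spark_starrocks_4")
    spark.sql(
      "select * from spark_starrocks_4 a join spark_starrocks_4 b on a.name = b.name").show()

Before the fix, the predicates pushed down from such a join would overwrite the view's configured "starrocks.filter.query", which appears to be the behavior reported in #91.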

feihengye force-pushed the opensource-main-jhonye branch 2 times, most recently from 68f0d30 to 2234006, on November 3, 2023 at 03:12
Signed-off-by: feihengye <qiangsheng.yet@gmail.com>
feihengye changed the title from 'Fix spark multi join tmp view issue.' to '[Bugfix] Fix spark multi join tmp view issue(#91).' on Nov 3, 2023
banmoy changed the title from '[Bugfix] Fix spark multi join tmp view issue(#91).' to '[Bugfix] Support "starrocks.filter.query" for Spark SQL read' on Nov 3, 2023
banmoy merged commit 8bbc057 into StarRocks:main on Nov 3, 2023
3 checks passed
banmoy (Collaborator) commented on Nov 3, 2023:

LGTM. Thanks.

Successfully merging this pull request may close issue #91: [Bug] Error result for Spark multi TMP view join