Fix failing strict wildcard pushdown tests #1683
See also elastic/elasticsearch#71751
  def testDataSourcePushDown12And() {
    val df = esDataSource("pd_and")
-   var filter = df.filter(df("reason").isNotNull.and(df("airport").endsWith("O")))
+   var filter = df.filter(df("reason").isNotNull.and(df("tag").equalTo("jan")))
Could you help me understand why df("airport").endsWith("O") changed to df("tag").equalTo("jan") here and on sql-20/AbstractScalaEsSparkSQL.scala:1104?
The endsWith predicate translates into a wildcard query. Usually we would lowercase the O character so that the query string becomes *o. Wildcard queries are term-level queries and do not have analyzers applied to them. When run in "strict" mode, the query keeps the character uppercased (*O), which does not match the term in Lucene. Lowercasing the O in the test instead would lead to Spark filtering the data out of the results (since "o" != "O").
I swapped the endsWith predicate on the And-testing-methods because I wanted to bring back positive matches from the query in all test cases instead of assuming empty results when just strict mode is enabled.
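The mismatch described above can be sketched in plain Scala, without Spark or Elasticsearch on the classpath. This is only an illustrative model, not the connector's code: the object and helper names are hypothetical, the sample value "SFO" is made up, and it assumes the field was indexed through an analyzer that lowercases terms.

```scala
object WildcardPushdownSketch {
  // Hypothetical helper: the wildcard pattern built for endsWith(suffix).
  // Strict mode passes the suffix through unchanged; otherwise it is lowercased.
  def endsWithPattern(suffix: String, strict: Boolean): String =
    "*" + (if (strict) suffix else suffix.toLowerCase)

  // Simplified term-level match: handles only a single leading '*'
  // and applies no analysis to the pattern.
  def wildcardMatches(pattern: String, indexedTerm: String): Boolean =
    indexedTerm.endsWith(pattern.stripPrefix("*"))

  // Spark evaluates the same predicate on the original value, case-sensitively.
  def sparkEndsWith(value: String, suffix: String): Boolean =
    value.endsWith(suffix)

  def main(args: Array[String]): Unit = {
    val original = "SFO"                 // assumed source value
    val indexed  = original.toLowerCase  // assumed analyzed term: "sfo"

    // Non-strict: "*o" matches the term, and Spark's own check also passes.
    println(wildcardMatches(endsWithPattern("O", strict = false), indexed))
    println(sparkEndsWith(original, "O"))

    // Strict: "*O" is sent as-is and does not match the lowercased term.
    println(wildcardMatches(endsWithPattern("O", strict = true), indexed))

    // Lowercasing the test literal: the query matches, but Spark drops the row.
    println(wildcardMatches(endsWithPattern("o", strict = true), indexed))
    println(sparkEndsWith(original, "o"))
  }
}
```

Under these assumptions, no spelling of the suffix yields a positive match in both strict and non-strict mode, which is why the And tests switch to a predicate that does not go through a wildcard query.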
More concisely: the bug deals with strict mode being enabled, and this test shouldn't be affected by that setting. Removing the feature that does depend on the setting from the test (hopefully) makes it simpler to reason about.
Ah, ok, thanks for the explanation. LGTM!
Thanks!
* Handle changes to the wildcard query in strict mode
* fix more test
A change that went in a while ago altered the wildcard query functionality, which now behaves as it did historically. This PR updates the test logic so that the wildcard tests account for the difference.