[CARBONDATA-1703] Refactored code for creation of filter for IN expression #1650

Closed
geetikagupta16 wants to merge 7 commits

Contributor

@geetikagupta16 geetikagupta16 commented Dec 13, 2017

Problem: The result-set counts of the Carbon and Hive tables differed.

Analysis: The difference came from how filters are created in Hive and in Carbon. Hive does not add 'NULL' when creating the filter for an IN expression, whereas Carbon was also adding 'NULL'.

Solution: Stopped adding 'NULL' values when creating the IN filter.
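
As a rough illustration of why a 'NULL' entry in an IN list cannot contribute matching rows, the sketch below models SQL three-valued logic in plain Scala. It is not CarbonData code: `InNullSemantics`, `sqlEq`, `sqlOr`, `sqlIn` and `keeps` are hypothetical names introduced here, and `None` is assumed to stand for NULL/UNKNOWN.

```scala
// Minimal model of SQL three-valued logic; None stands for NULL / UNKNOWN.
// All names here are hypothetical and exist only for illustration.
object InNullSemantics {
  type Tri = Option[Boolean]

  // SQL equality: a NULL on either side yields UNKNOWN.
  def sqlEq(a: Option[Int], b: Option[Int]): Tri =
    for (x <- a; y <- b) yield x == y

  // SQL OR: TRUE wins; otherwise UNKNOWN if any operand is UNKNOWN.
  def sqlOr(a: Tri, b: Tri): Tri = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // `col IN (v1, v2, ...)` is the OR of `col = vi` over the list.
  def sqlIn(col: Option[Int], list: Seq[Option[Int]]): Tri =
    list.map(sqlEq(col, _)).foldLeft(Option(false): Tri)(sqlOr)

  // A WHERE clause keeps a row only when the predicate is TRUE.
  def keeps(p: Tri): Boolean = p.contains(true)

  def main(args: Array[String]): Unit = {
    val rows = Seq(Some(1), Some(2), Some(3), None)
    val withNull    = rows.filter(r => keeps(sqlIn(r, Seq(Some(1), Some(2), None))))
    val withoutNull = rows.filter(r => keeps(sqlIn(r, Seq(Some(1), Some(2)))))
    println(withNull == withoutNull) // prints "true"
  }
}
```

In this model a NULL list entry only ever turns a FALSE result into UNKNOWN, and a WHERE clause discards both, so the filtered rows are the same with or without it.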

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed? No

  • Any backward compatibility impacted? No

  • Document update required? No

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1914/

@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/685/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2247/

@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/703/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1932/

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2263/

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/741/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1972/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2291/

convertToJavaList(values.map(f => getCarbonLiteralExpression(name, f)).toList))))
if (values.length == 1 && values(0) == null) {
Some(new InExpression(getCarbonExpression(name),
new ListExpression(
Contributor

Please make the In and Not In filters on par with TransformExpression. For example, see lines 321-323: when this was corrected there, the same fix was missed in createFilter.

Contributor Author

I have added the null check for the IN expression at line 89.

Contributor

@gvramana gvramana Dec 16, 2017

  1. Why are there two conditions (length == 1 and the others) when the logic is the same? The else condition alone should be sufficient.
  2. For the Not In case, when a null comes in, the behaviour in the TransformExpression method (as sounak pointed out) was similar to a FalseExpression. So we have to check the Not In null case against Hive to decide the behaviour.
  3. Please add test cases for the In and Not In cases with a null expression and compare the behaviour with Hive.
  4. Can the isNullLiteral(_) case ever occur here, as it does in the TransformExpression method?
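
For point 2, standard SQL semantics can be sketched as follows; the behaviour still needs to be confirmed against Hive, as the comment asks. This is a plain-Scala model, not CarbonData code: `NotInNullSemantics`, `sqlNotIn` and `rowsKept` are hypothetical names, and `None` is assumed to stand for NULL/UNKNOWN.

```scala
// Hypothetical model of `col NOT IN (list)`; None stands for NULL / UNKNOWN.
object NotInNullSemantics {
  def sqlNotIn(col: Option[Int], list: Seq[Option[Int]]): Option[Boolean] =
    col match {
      // A NULL column value compared with any list entry is UNKNOWN.
      case None if list.nonEmpty => None
      case None                  => Some(true)
      case Some(c) =>
        if (list.contains(Some(c))) Some(false) // a definite match: FALSE
        else if (list.contains(None)) None      // no match, but a NULL entry: UNKNOWN
        else Some(true)                         // no match and no NULL: TRUE
    }

  // A WHERE clause keeps a row only when the predicate is TRUE.
  def rowsKept(rows: Seq[Option[Int]], list: Seq[Option[Int]]): Seq[Option[Int]] =
    rows.filter(r => sqlNotIn(r, list).contains(true))
}
```

Under this model, once the list contains a NULL the predicate is never TRUE for any row, which is why a Not In filter with a null can be treated like a filter that is always false.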

Contributor Author

The length == 1 condition was added for the case where the IN expression contains only NULL. In that case the left expression was becoming NULL, so we added this branch to avoid that.

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/788/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2022/

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2334/

@sraghunandan
Contributor

ok to test

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/809/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2351/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2040/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2086/

@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/861/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2384/

@geetikagupta16
Contributor Author

retest this please

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2392/

@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/875/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2100/

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/878/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2103/

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/945/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2174/

@geetikagupta16
Contributor Author

@gvramana Please review again, as I have made the required changes

new ListExpression(
convertToJavaList(values.map(f => getCarbonLiteralExpression(name, f)).toList))))
if (values.length == 1 && values(0) == null) {
Some(new InExpression(getCarbonExpression(name),
Contributor

If there is only one value in the In list and it qualifies as Null, why can't we return a False expression, just as in the Not In case, i.e. Some(new FalseExpression(getCarbonExpression(name)))?
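
One way this suggestion could look is sketched below with a tiny stand-in filter tree. These are not the CarbonData `InExpression`, `ListExpression` or `FalseExpression` classes: `Filter`, `AlwaysFalse`, `In` and `buildIn` are hypothetical names, and the choice to drop nulls from the list is an assumption taken from the PR description.

```scala
// Hypothetical stand-ins for a filter tree; NOT the CarbonData expression classes.
object InFilterSketch {
  sealed trait Filter
  case object AlwaysFalse extends Filter
  final case class In(column: String, values: Seq[Any]) extends Filter

  // Build an IN filter, dropping null literals from the value list.
  // If nothing is left (for example `id IN (NULL)`), no row can match,
  // so return a filter that is always false.
  def buildIn(column: String, values: Seq[Any]): Filter = {
    val nonNull = values.filter(_ != null)
    if (nonNull.isEmpty) AlwaysFalse else In(column, nonNull)
  }
}
```

With this shape a separate `values.length == 1` branch is not needed: `buildIn("id", Seq(null))` and `buildIn("id", Seq(null, null))` both yield `AlwaysFalse`, while `buildIn("id", Seq(1, 2, null))` yields `In("id", Seq(1, 2))`.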

@@ -75,6 +78,9 @@ class ExpressionWithNullTestCase extends QueryTest with BeforeAndAfterAll {
checkAnswer(sql("select * from expression_test where id not in ('2')"), sql("select * from expression_test_hive where id not in ('2')"))
checkAnswer(sql("select * from expression_test where id not in (cast('2' as int))"), sql("select * from expression_test_hive where id not in (cast('2' as int))"))
// checkAnswer(sql("select * from expression_test where id not in (cast('null' as int))"), sql("select * from expression_test_hive where id not in (cast('null' as int))"))
Contributor

Please uncomment the test cases and rerun it once.

@@ -75,6 +78,9 @@ class ExpressionWithNullTestCase extends QueryTest with BeforeAndAfterAll {
checkAnswer(sql("select * from expression_test where id not in ('2')"), sql("select * from expression_test_hive where id not in ('2')"))
checkAnswer(sql("select * from expression_test where id not in (cast('2' as int))"), sql("select * from expression_test_hive where id not in (cast('2' as int))"))
// checkAnswer(sql("select * from expression_test where id not in (cast('null' as int))"), sql("select * from expression_test_hive where id not in (cast('null' as int))"))
checkAnswer(sql("select * from expression_test where id not in (1,2,NULL)"), sql("select * from expression_test_hive where id not in (1,2,NULL)"))
checkAnswer(sql("select * from expression_test where id not in (NULL)"), sql("select * from expression_test_hive where id not in (NULL)"))
Contributor

Please add some more test cases with explicit cast and with Nulls

if (values.contains(null)) {
  Some(new FalseExpression(getCarbonExpression(name)))
} else {
  Some(new NotInExpression(getCarbonExpression(name),
Contributor

I think similar changes are also required in transformExpression. If a Cast is not resolved by Spark, transformExpression will handle those expressions. But before making any changes we need supporting test cases.

Contributor Author

Can you please provide a scenario for this?

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2387/

@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1171/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2394/

@geetikagupta16
Contributor Author

retest this please

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2409/

@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1185/

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2593/

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2594/

@geetikagupta16
Contributor Author

retest this please

@jackylk
Contributor

jackylk commented Dec 29, 2017

@geetikagupta16 Can you add more description of this bug? What is the bug, and what is its root-cause analysis?

@geetikagupta16
Contributor Author

@jackylk I have updated the PR description; please check.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2436/

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1212/

@sounakr
Contributor

sounakr commented Jan 3, 2018

LGTM

@sraghunandan
Contributor

retest sdv please

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2696/

@gvramana
Contributor

gvramana commented Jan 9, 2018

LGTM

@asfgit asfgit closed this in 082a355 Jan 9, 2018
anubhav100 pushed a commit to anubhav100/incubator-carbondata that referenced this pull request Jun 22, 2018
7 participants