gatorsmile
Member

Added a test case that exposes data loss in the following scenario:

When the Not operator is applied, the current generation rule for Parquet filters simply applies Not to whatever underlying filters were generated, even when only part of the original predicate was converted.

Note: I will submit the fix after the test case failure is confirmed.

@SparkQA

SparkQA commented Dec 17, 2015

Test build #47891 has finished for PR 10344 at commit 660eef5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

checkAnswer(
    sqlContext.read.parquet(path).where("not (a = 2 and b in ('1'))"),
    (1 to 5).map(i => Row(i, (i%2).toString)))

The above test case failed.

[info]   == Results ==
[info]   !== Correct Answer - 5 ==   == Spark Answer - 4 ==
[info]    [1,1]                      [1,1]
[info]   ![2,0]                      [3,1]
[info]   ![3,1]                      [4,0]
[info]   ![4,0]                      [5,1]
[info]   ![5,1]
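
The missing [2,0] row is consistent with Not being applied to a partially converted And. Below is a minimal, self-contained sketch of that mechanism. It is a toy model only: `Pred`, `EqA`, `InB`, `convertUnsafe`, `convertSafe`, and `scan` are hypothetical names invented for illustration and are not Spark's or Parquet's classes. It assumes that `a = 2` can be converted to a pushed-down filter while `b in ('1')` cannot. `convertSafe` shows one conservative alternative (convert an And only when both sides convert); it is not necessarily the fix proposed in this PR.

```scala
object NotPushdownSketch {
  // Hypothetical toy predicate model; NOT Spark's or Parquet's filter classes.
  sealed trait Pred
  final case class EqA(value: Int) extends Pred            // a = value (assumed convertible)
  final case class InB(values: Set[String]) extends Pred   // b IN (...) (assumed NOT convertible)
  final case class And(left: Pred, right: Pred) extends Pred
  final case class Not(child: Pred) extends Pred

  type Row = (Int, String)

  // Evaluate a predicate against a row (a, b).
  def eval(p: Pred, row: Row): Boolean = p match {
    case EqA(v)    => row._1 == v
    case InB(vs)   => vs.contains(row._2)
    case And(l, r) => eval(l, row) && eval(r, row)
    case Not(c)    => !eval(c, row)
  }

  // Unsafe: And keeps whichever side converts, and Not blindly negates the result.
  def convertUnsafe(p: Pred): Option[Pred] = p match {
    case e: EqA => Some(e)
    case _: InB => None
    case And(l, r) =>
      (convertUnsafe(l), convertUnsafe(r)) match {
        case (Some(a), Some(b)) => Some(And(a, b))
        case (a, b)             => a.orElse(b) // partial conversion
      }
    case Not(c) => convertUnsafe(c).map(Not(_))
  }

  // Conservative: an And is converted only when both sides convert.
  def convertSafe(p: Pred): Option[Pred] = p match {
    case e: EqA    => Some(e)
    case _: InB    => None
    case And(l, r) => for (a <- convertSafe(l); b <- convertSafe(r)) yield And(a, b)
    case Not(c)    => convertSafe(c).map(Not(_))
  }

  // The pushed filter (if any) drops rows at the source; the full predicate
  // is then re-applied to whatever rows remain.
  def scan(rows: Seq[Row], pushed: Option[Pred], full: Pred): Seq[Row] =
    rows.filter(r => pushed.forall(eval(_, r))).filter(eval(full, _))

  // Same data and predicate shape as the test case above.
  val rows: Seq[Row] = (1 to 5).map(i => (i, (i % 2).toString))
  val query: Pred = Not(And(EqA(2), InB(Set("1"))))

  def main(args: Array[String]): Unit = {
    println(scan(rows, convertUnsafe(query), query).size) // 4: row (2,"0") is lost
    println(scan(rows, convertSafe(query), query).size)   // 5: nothing is pushed down
  }
}
```

In this model the unsafe conversion turns `not (a = 2 and b in ('1'))` into `not (a = 2)`, which rejects the row (2, "0") even though the original predicate accepts it.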

@yhuai Could you take a look at it? Should I merge my fix? Thanks!

@yhuai
Contributor

yhuai commented Dec 17, 2015

I see. Thanks! Let's merge your fix.

@gatorsmile
Member Author

This is just a one-line fix. It does not include the pushdown for IN. @yhuai If you want it, I can also merge that part and the related test cases. Thanks!

@SparkQA

SparkQA commented Dec 17, 2015

Test build #47899 has finished for PR 10344 at commit 7d298fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 17, 2015

Test build #47941 has finished for PR 10344 at commit 65c5ad7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile closed this Dec 18, 2015