[SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of struct with a single field named "element" #14013

liancheng
Contributor

What changes were proposed in this pull request?

Please refer to SPARK-16344 for details about this issue.

How was this patch tested?

New test case added in ParquetQuerySuite.

//
// This case branch must appear before the next one. See comments of the next case branch
// for details.
Contributor Author

This case branch is essential for the bug fix. Basically, it matches the standard 3-level layout first before trying to match the legacy 2-level layout, so that the "element" syntactic group in Parquet LIST won't be mistaken for the "element" field in the nested struct.
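The ordering described above can be illustrated with a minimal, self-contained sketch. This is not the code from this patch and does not use the Parquet or Spark APIs: the types `PType`, `Prim`, `Group`, `CType`, `IntT`, `StructT` and the functions `toCatalyst` and `repeatedFieldIsElement` are hypothetical stand-ins, and the rule used to recognize the 3-level layout (a repeated group named "list" with a single child named "element" whose converted type equals the expected element type) is an assumption made for illustration.

```scala
// Toy model only: every name below is hypothetical, not a Parquet or Spark API.
object ListLayoutSketch {
  // Simplified Parquet types: a primitive leaf or a group of named fields.
  sealed trait PType { def name: String }
  final case class Prim(name: String) extends PType
  final case class Group(name: String, fields: List[PType]) extends PType

  // Simplified Catalyst types: every primitive is treated as INT.
  sealed trait CType
  case object IntT extends CType
  final case class StructT(fields: List[(String, CType)]) extends CType

  // Converts a toy Parquet type to the toy Catalyst type it maps to.
  def toCatalyst(t: PType): CType = t match {
    case Prim(_)          => IntT
    case Group(_, fields) => StructT(fields.map(f => f.name -> toCatalyst(f)))
  }

  // Returns true when the repeated field inside a LIST group is itself the
  // element type (legacy 2-level layout), and false when it is only the
  // syntactic wrapper of the standard 3-level layout.
  def repeatedFieldIsElement(repeated: PType, catalystElement: CType): Boolean =
    (repeated, catalystElement) match {
      case (_: Prim, _)                             => true
      case (Group(_, fields), _) if fields.size > 1 => true
      // Standard 3-level layout, checked first: the repeated group only wraps
      // a child named "element" that already has the expected element type.
      case (Group("list", List(child)), elem)
          if child.name == "element" && toCatalyst(child) == elem => false
      // Legacy 2-level layout: the repeated group is a single-field struct
      // whose field name matches the expected struct's only field.
      case (Group(_, List(child)), StructT(List((fieldName, _))))
          if fieldName == child.name => true
      case _ => false
    }
}
```

In this toy model, a repeated group `list` holding a group `element` with one primitive `element` is treated as a wrapper (result `false`) when the expected element type is STRUCT<element: INT>, and as the element itself (result `true`) when the expected type is STRUCT<element: STRUCT<element: INT>>. If the two single-field branches were swapped, the first case would also return `true`, which is the misreading the comment describes.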

@liancheng
Contributor Author

cc @yhuai

@SparkQA

SparkQA commented Jul 1, 2016

Test build #61611 has finished for PR 14013 at commit 9620b48.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng changed the title from [SPARK-16344][SQL] Decoding Parquet array of struct with a single field named "element" to [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of struct with a single field named "element" on Jul 1, 2016
@liancheng
Contributor Author

liancheng commented Jul 1, 2016

@rdblue Would you mind helping review this one? My initial investigation suggested that parquet-avro probably suffers from the same issue. I will file a parquet-mr JIRA ticket soon if that's true.

@SparkQA

SparkQA commented Jul 1, 2016

Test build #61612 has finished for PR 14013 at commit c40bccb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor Author

@rdblue Verified that parquet-avro also suffers from this issue. Filed PARQUET-651 to track it.

@SparkQA

SparkQA commented Jul 6, 2016

Test build #61812 has finished for PR 14013 at commit e44451e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 6, 2016

Test build #61838 has finished for PR 14013 at commit 70b2e9c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng force-pushed the spark-16344-parquet-schema-corner-case branch from 70b2e9c to b942dca on July 10, 2016 08:50
@SparkQA

SparkQA commented Jul 10, 2016

Test build #62052 has finished for PR 14013 at commit b942dca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor Author

I'm closing this one since we decided to fix this in master only.

@liancheng closed this on Jul 20, 2016