@rxin
Contributor

@rxin rxin commented May 20, 2016

What changes were proposed in this pull request?

Many other systems (e.g. Impala) use _xxx paths for staging, and Spark should not read those files.

How was this patch tested?

Added a unit test case.

@rxin
Contributor Author

rxin commented May 20, 2016

cc @liancheng and @marmbrus

@SparkQA

SparkQA commented May 20, 2016

Test build #59017 has finished for PR 13227 at commit 447fe4e.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

LGTM, pending tests.

// because Parquet needs to find those metadata files from leaf files returned by this method.
// We should refactor this logic to not mix metadata files with data files.
(pathName.startsWith("_") || pathName.startsWith(".")) &&
!pathName.startsWith("_common_metadata") && !pathName.startsWith("_metadata")
Contributor


Why `startsWith` instead of `==` here?

Contributor Author


Just in case we add other variants of these metadata file names later.
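
For readers following the thread, the predicate quoted above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the object name `HiddenPathFilter`, the method name `shouldFilterOut`, and the sample file names are hypothetical and are not taken from the patch; only the boolean condition mirrors the quoted diff.

```scala
// Hypothetical standalone sketch of the predicate quoted in the review thread.
// Names (HiddenPathFilter, shouldFilterOut) are illustrative, not Spark's API.
object HiddenPathFilter {
  // Parquet summary files that must remain visible, matched by prefix as in the diff.
  private val parquetMetadataPrefixes = Seq("_common_metadata", "_metadata")

  /** Returns true when a file or directory name should be skipped while listing data files. */
  def shouldFilterOut(pathName: String): Boolean = {
    val looksHidden = pathName.startsWith("_") || pathName.startsWith(".")
    val isParquetMetadata = parquetMetadataPrefixes.exists(p => pathName.startsWith(p))
    looksHidden && !isParquetMetadata
  }

  def main(args: Array[String]): Unit = {
    // Sample names are assumptions chosen for illustration only.
    val names = Seq(
      "part-00000.parquet",
      "_impala_insert_staging",
      ".hidden",
      "_metadata",
      "_common_metadata",
      "_metadata_v2"
    )
    names.foreach(n => println(s"$n -> filtered out: ${shouldFilterOut(n)}"))
  }
}
```

With prefix matching, a hypothetical variant such as `_metadata_v2` stays visible, whereas an `==` comparison would treat it as a hidden file and filter it out. That appears to be the trade-off behind the answer above.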

@liancheng
Contributor

LGTM except for one minor comment.

@SparkQA

SparkQA commented May 20, 2016

Test build #59020 has finished for PR 13227 at commit 705a76f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor Author

rxin commented May 20, 2016

Merging in master/2.0. Thanks.

asfgit pushed a commit that referenced this pull request May 20, 2016
## What changes were proposed in this pull request?
Many other systems (e.g. Impala) use _xxx paths for staging, and Spark should not read those files.

## How was this patch tested?
Added a unit test case.

Author: Reynold Xin <rxin@databricks.com>

Closes #13227 from rxin/SPARK-15454.

(cherry picked from commit dcac8e6)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@asfgit asfgit closed this in dcac8e6 May 20, 2016
@SparkQA

SparkQA commented May 20, 2016

Test build #59018 has finished for PR 13227 at commit 0d3bc7d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
