
[SPARK-23456][SPARK-21783] Turn on `native` ORC impl and PPD by default #20634

Closed
wants to merge 1 commit into from

Conversation

@dongjoon-hyun
Member

dongjoon-hyun commented Feb 17, 2018

What changes were proposed in this pull request?

Apache Spark 2.3 introduced native ORC support with vectorization and many fixes. However, it shipped as a non-default option. This PR enables the native ORC implementation and predicate pushdown by default for Apache Spark 2.4. We will improve and stabilize the ORC data source before Apache Spark 2.4, and eventually Apache Spark will drop the old Hive-based ORC code.
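
For readers who need to pin the ORC behavior explicitly rather than rely on the default, here is a minimal sketch. The description above does not name the configuration keys; `spark.sql.orc.impl` (values `native` or `hive`) and `spark.sql.orc.filterPushdown` are assumed here from the Spark SQL configuration, and the helper names `orc_conf` and `to_submit_args` are hypothetical. The snippet only builds `--conf` arguments for `spark-submit` and does not require Spark to run.

```python
# Assumed Spark SQL config keys; they are not spelled out in the PR description.
ORC_IMPL_KEY = "spark.sql.orc.impl"            # "native" or "hive"
ORC_PPD_KEY = "spark.sql.orc.filterPushdown"   # "true" or "false"


def orc_conf(native: bool = True, pushdown: bool = True) -> dict:
    """Return the two ORC settings as a plain dict of strings."""
    return {
        ORC_IMPL_KEY: "native" if native else "hive",
        ORC_PPD_KEY: str(pushdown).lower(),
    }


def to_submit_args(conf: dict) -> list:
    """Render a config dict as spark-submit '--conf key=value' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Opt back into the Hive-based reader with pushdown disabled.
    print(" ".join(to_submit_args(orc_conf(native=False, pushdown=False))))
```

Calling `orc_conf()` with no arguments mirrors the defaults this PR proposes; passing `native=False, pushdown=False` is one way a job could opt back into the older behavior if a regression appears.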

How was this patch tested?

Passes Jenkins with the existing tests.

@SparkQA

SparkQA commented Feb 17, 2018

Test build #87525 has finished for PR 20634 at commit bde6818.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
Member

gatorsmile left a comment

LGTM

@gatorsmile

Member

gatorsmile commented Feb 20, 2018

Thanks! Merged to master.

@asfgit asfgit closed this in 83c0087 Feb 20, 2018
@dongjoon-hyun

Member Author

dongjoon-hyun commented Feb 20, 2018

Thank you, @gatorsmile .

@dongjoon-hyun dongjoon-hyun deleted the dongjoon-hyun:SPARK-23456 branch Feb 20, 2018
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
Apache Spark 2.3 introduced `native` ORC support with vectorization and many fixes. However, it shipped as a non-default option. This PR enables the `native` ORC implementation and predicate pushdown by default for Apache Spark 2.4. We will improve and stabilize the ORC data source before Apache Spark 2.4, and eventually Apache Spark will drop the old Hive-based ORC code.

Passes Jenkins with the existing tests.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#20634 from dongjoon-hyun/SPARK-23456.

Change-Id: Ib7ec85d2ae6b96451fd28370ef5f5e3924d10de8
3 participants