[SPARK-23456][SPARK-21783] Turn on `native` ORC impl and PPD by default #20634
What changes were proposed in this pull request?
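Apache Spark 2.3 introduced `native` ORC support with vectorization and many fixes. However, it shipped as a non-default option. This PR enables the `native` ORC implementation and predicate pushdown by default for Apache Spark 2.4. We will improve and stabilize the ORC data source before Apache Spark 2.4. Eventually, Apache Spark will drop the old Hive-based ORC code.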
How was this patch tested?
Passes Jenkins with the existing tests.
Apache Spark 2.3 introduced `native` ORC support with vectorization and many fixes. However, it shipped as a non-default option. This PR enables the `native` ORC implementation and predicate pushdown by default for Apache Spark 2.4. We will improve and stabilize the ORC data source before Apache Spark 2.4. Eventually, Apache Spark will drop the old Hive-based ORC code.

Passes Jenkins with the existing tests.

Author: Dongjoon Hyun <email@example.com>

Closes apache#20634 from dongjoon-hyun/SPARK-23456.

Change-Id: Ib7ec85d2ae6b96451fd28370ef5f5e3924d10de8
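
For readers who want to pin the behavior explicitly rather than rely on the new defaults, the two settings involved can be sketched as below. This is a minimal illustration, not code from the patch: the config keys `spark.sql.orc.impl` and `spark.sql.orc.filterPushdown` are the Spark SQL options this change concerns, while the helper names `orc_settings` and `apply_orc_settings` are hypothetical, and the `spark` argument is assumed to be an existing `SparkSession`.

```python
# Spark SQL options affected by this change. The keys are real Spark
# settings; the dict and function names here are illustrative only.
NATIVE_ORC = {
    "spark.sql.orc.impl": "native",          # new default
    "spark.sql.orc.filterPushdown": "true",  # new default (PPD on)
}
LEGACY_ORC = {
    "spark.sql.orc.impl": "hive",            # previous default
    "spark.sql.orc.filterPushdown": "false", # previous default (PPD off)
}


def orc_settings(use_native=True):
    """Return a copy of the ORC settings for the chosen implementation."""
    return dict(NATIVE_ORC if use_native else LEGACY_ORC)


def apply_orc_settings(spark, use_native=True):
    """Set the ORC options on a session through its runtime config.

    `spark` is assumed to expose `conf.set(key, value)`, as a PySpark
    SparkSession does.
    """
    for key, value in orc_settings(use_native).items():
        spark.conf.set(key, value)
    return spark
```

Calling `apply_orc_settings(spark, use_native=False)` on a session would restore the Hive-based reader with pushdown disabled, which may be useful when comparing results between the two implementations.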