[SPARK-11301][SQL] Fix case sensitivity for filter on partitioned col…
## What changes were proposed in this pull request?

`DataSourceStrategy` does not consider the `SQLConf` in `Context` and always matches column names case-sensitively. For instance, `HiveContext` uses a case-insensitive configuration, but it is ignored in `DataSourceStrategy`. This issue was originally reported as SPARK-11301 against 1.6.0 and seemed to be fixed at that time, but Apache Spark 1.6.2 still always handles **partitioned column names** case-sensitively. This is incorrect, as the following shows.

```scala
scala> sql("CREATE TABLE t(a int) PARTITIONED BY (b string) STORED AS PARQUET")
scala> sql("INSERT INTO TABLE t PARTITION(b='P') SELECT * FROM (SELECT 1) t")
scala> sql("INSERT INTO TABLE t PARTITION(b='Q') SELECT * FROM (SELECT 2) t")
scala> sql("SELECT * FROM T WHERE B='P'").show
+---+---+
|  a|  b|
+---+---+
|  1|  P|
|  2|  Q|
+---+---+
```

The result is the same with `set spark.sql.caseSensitive=false`. Here is the result in [Databricks CE](https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/6660119172909095/3421754458488607/5162191866050912/latest.html).

This PR reads the configuration and handles the column name comparison accordingly.

## How was this patch tested?

Passes the Jenkins tests with a modified test.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#14970 from dongjoon-hyun/SPARK-11301.

(cherry picked from commit 958039a)
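The fix described above amounts to choosing a name-comparison function based on the case-sensitivity setting instead of hard-coding `==`. A minimal standalone sketch (not Spark's actual code; the names `resolver` and `caseSensitive` are illustrative assumptions):

```scala
// Sketch: a column-name resolver selected by a case-sensitivity flag,
// mirroring the idea that DataSourceStrategy should consult the
// configuration before matching partitioned column names.
object ResolverSketch {
  // Returns a comparison function appropriate for the configuration.
  def resolver(caseSensitive: Boolean): (String, String) => Boolean =
    if (caseSensitive) (a, b) => a == b
    else (a, b) => a.equalsIgnoreCase(b)

  def main(args: Array[String]): Unit = {
    val insensitive = resolver(caseSensitive = false)
    val sensitive = resolver(caseSensitive = true)
    // With caseSensitive=false, a filter on column "B" matches
    // the partitioned column "b"; with caseSensitive=true it does not.
    println(insensitive("B", "b")) // true
    println(sensitive("B", "b"))   // false
  }
}
```

With a resolver like this, the query `SELECT * FROM T WHERE B='P'` under `spark.sql.caseSensitive=false` would match the partition column `b` and return only the `b='P'` row.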