[SPARK-24716][TESTS][FOLLOW-UP] Test Hive metastore schema and parquet schema are in different letter cases

## What changes were proposed in this pull request?

Since apache#21696, Spark uses the Parquet schema instead of the Hive metastore schema to do pushdown. That change avoids returning wrong records when the Hive metastore schema and the Parquet schema are in different letter cases. This PR adds a test case for it.

More details: https://issues.apache.org/jira/browse/SPARK-25206

## How was this patch tested?

Unit tests.

Closes apache#22267 from wangyum/SPARK-24716-TESTS.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
fjh100456 · Aug 31, 2018 · commit 709cdc6 · 1 parent 7f9bcf7
Showing 1 changed file with 16 additions and 0 deletions.
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
@@ -20,6 +20,7 @@ package org.apache.spark.sql.hive
 import org.apache.spark.sql.{QueryTest, Row}
 import org.apache.spark.sql.execution.datasources.parquet.ParquetTest
 import org.apache.spark.sql.hive.test.TestHiveSingleton
+import org.apache.spark.sql.internal.SQLConf
 
 case class Cases(lower: String, UPPER: String)
 
@@ -76,4 +77,19 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
       }
     }
   }
+
+  test("SPARK-25206: wrong records are returned by filter pushdown " +
+    "when Hive metastore schema and parquet schema are in different letter cases") {
+    withSQLConf(SQLConf.PARQUET_FILTER_PUSHDOWN_ENABLED.key -> true.toString) {
+      withTempPath { path =>
+        val data = spark.range(1, 10).toDF("id")
+        data.write.parquet(path.getCanonicalPath)
+        withTable("SPARK_25206") {
+          sql("CREATE TABLE SPARK_25206 (ID LONG) USING parquet LOCATION " +
+            s"'${path.getCanonicalPath}'")
+          checkAnswer(sql("select id from SPARK_25206 where id > 0"), data)
+        }
+      }
+    }
+  }
 }
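
The letter-case mismatch that the new test exercises can be illustrated without Spark. The sketch below is a toy model, not Spark's implementation: `PushdownNameResolution`, `resolveExact`, and `resolveIgnoringCase` are hypothetical names introduced here. It assumes that a filter can only be pushed down safely when the column name it references can be matched to exactly one column that physically exists in the file. The commit message does not say how Spark matches names, so the case-insensitive variant is shown purely for comparison.

```scala
// Illustrative only: a toy model of name resolution for filter pushdown.
// None of these names are Spark APIs.
object PushdownNameResolution {

  /** Exact-match lookup of a requested column name among the file's columns. */
  def resolveExact(fileColumns: Seq[String], requested: String): Option[String] =
    fileColumns.find(_ == requested)

  /**
   * Case-insensitive lookup against the file's own column names.
   * Returns the physical name only when exactly one column matches;
   * a missing or ambiguous match yields None, read here as "do not push down".
   */
  def resolveIgnoringCase(fileColumns: Seq[String], requested: String): Option[String] =
    fileColumns.filter(_.equalsIgnoreCase(requested)) match {
      case Seq(single) => Some(single)
      case _           => None
    }

  def main(args: Array[String]): Unit = {
    val fileColumns = Seq("id") // column written by spark.range(1, 10).toDF("id")
    val requested = "ID"        // column declared in CREATE TABLE SPARK_25206 (ID LONG)

    println(resolveExact(fileColumns, requested))            // None
    println(resolveIgnoringCase(fileColumns, requested))     // Some(id)
    println(resolveIgnoringCase(Seq("id", "ID"), requested)) // None (ambiguous)
  }
}
```

In this model the metastore name `ID` never matches the file column `id` exactly, so a filter keyed on `ID` has nothing in the file to apply to. Consulting the file's own column names makes the mismatch visible before any filter is built.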