[SPARK-24716][TESTS][FOLLOW-UP] Test Hive metastore schema and parquet schema are in different letter cases

## What changes were proposed in this pull request?

Since apache#21696, Spark uses the Parquet schema instead of the Hive metastore schema for filter pushdown.
That change avoids returning wrong records when the Hive metastore schema and the Parquet schema differ in letter case. This PR adds a test case for it.
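
A minimal sketch of the idea, for illustration only: it is not Spark's implementation, and `CaseInsensitivePushdownSketch` and `resolveField` are hypothetical names rather than Spark APIs. It assumes the problem can be pictured as a name lookup: a filter built from the metastore column name `ID` finds no exact match in a file whose Parquet schema stores `id`, so the lookup has to go through the file's own field names.

```scala
// Hypothetical, Spark-free illustration; none of these names exist in Spark.
object CaseInsensitivePushdownSketch {

  /** Resolve a filter column name against the field names stored in the Parquet file.
    * Returns the file's own spelling of the field, or None if the filter should not be pushed down.
    */
  def resolveField(
      parquetFields: Seq[String],
      name: String,
      caseSensitive: Boolean): Option[String] = {
    if (caseSensitive) {
      // Exact match only.
      parquetFields.find(_ == name)
    } else {
      // Ignore case, but only accept an unambiguous match.
      parquetFields.filter(_.equalsIgnoreCase(name)) match {
        case Seq(single) => Some(single)
        case _ => None
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val parquetFields = Seq("id") // field name as written to the file
    println(resolveField(parquetFields, "ID", caseSensitive = false)) // Some(id)
    println(resolveField(parquetFields, "ID", caseSensitive = true)) // None
    println(resolveField(Seq("id", "ID"), "Id", caseSensitive = false)) // None (ambiguous)
  }
}
```

In this sketch, a `None` result means the filter is simply not pushed down, so the query falls back to filtering after the scan instead of returning wrong records.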

More details:
https://issues.apache.org/jira/browse/SPARK-25206

## How was this patch tested?

unit tests

Closes apache#22267 from wangyum/SPARK-24716-TESTS.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
wangyum authored and fjh100456 committed Aug 31, 2018
1 parent 7f9bcf7 commit 709cdc6
Showing 1 changed file with 16 additions and 0 deletions.
@@ -20,6 +20,7 @@ package org.apache.spark.sql.hive
import org.apache.spark.sql.{QueryTest, Row}
import org.apache.spark.sql.execution.datasources.parquet.ParquetTest
import org.apache.spark.sql.hive.test.TestHiveSingleton
import org.apache.spark.sql.internal.SQLConf

case class Cases(lower: String, UPPER: String)

@@ -76,4 +77,19 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
      }
    }
  }

  test("SPARK-25206: wrong records are returned by filter pushdown " +
    "when Hive metastore schema and parquet schema are in different letter cases") {
    withSQLConf(SQLConf.PARQUET_FILTER_PUSHDOWN_ENABLED.key -> true.toString) {
      withTempPath { path =>
        val data = spark.range(1, 10).toDF("id")
        data.write.parquet(path.getCanonicalPath)
        withTable("SPARK_25206") {
          sql("CREATE TABLE SPARK_25206 (ID LONG) USING parquet LOCATION " +
            s"'${path.getCanonicalPath}'")
          checkAnswer(sql("select id from SPARK_25206 where id > 0"), data)
        }
      }
    }
  }
}
