[MINOR][DOCS] Fix Spark hive example.

## What changes were proposed in this pull request?

The Hive tables example in the documentation is wrong: https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#hive-tables.

Running the example as documented returns only nulls:
```scala
scala> val dataDir = "/tmp/parquet_data"
dataDir: String = /tmp/parquet_data

scala> spark.range(10).write.parquet(dataDir)

scala> sql(s"CREATE EXTERNAL TABLE hive_ints(key int) STORED AS PARQUET LOCATION '$dataDir'")
res6: org.apache.spark.sql.DataFrame = []

scala> sql("SELECT * FROM hive_ints").show()

+----+
| key|
+----+
|null|
|null|
|null|
|null|
|null|
|null|
|null|
|null|
|null|
|null|
+----+
```

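`spark.range` emits a column named `id`, not `key`, so the `key` column declared in `hive_ints` matches nothing in the Parquet files and every row reads as null. The fix declares the table as `hive_bigints(id bigint)` instead.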
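
The mismatch can be confirmed by inspecting the schema that `spark.range` produces. The following is a minimal sketch, assuming a local Spark installation; the object name `RangeSchemaCheck` and the `local[*]` master are illustrative choices, not part of this change.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.LongType

// Hypothetical helper object, used only to illustrate the schema of spark.range.
object RangeSchemaCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("RangeSchemaCheck")
      .master("local[*]") // assumption: run locally for the check
      .getOrCreate()

    val df = spark.range(10)

    // Prints:
    // root
    //  |-- id: long (nullable = false)
    df.printSchema()

    // The only column is `id`, and it is a long (bigint in SQL), not `key int`.
    assert(df.columns.sameElements(Array("id")))
    assert(df.schema("id").dataType == LongType)

    spark.stop()
  }
}
```

Because the written Parquet files carry only `id`, an external table over them has to declare a matching `id bigint` column to read the values back.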

Closes #24657 from ScrapCodes/fix_hive_example.

Lead-authored-by: Prashant Sharma <prashant@apache.org>
Co-authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 5f4b505)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
ScrapCodes authored and HyukjinKwon committed May 21, 2019
1 parent f5310be commit 533d603cebd2fe5197f4b4d40e1f54fa94c74f36
@@ -122,16 +122,16 @@ object SparkHiveExample {
 val dataDir = "/tmp/parquet_data"
 spark.range(10).write.parquet(dataDir)
 // Create a Hive external Parquet table
-sql(s"CREATE EXTERNAL TABLE hive_ints(key int) STORED AS PARQUET LOCATION '$dataDir'")
+sql(s"CREATE EXTERNAL TABLE hive_bigints(id bigint) STORED AS PARQUET LOCATION '$dataDir'")
 // The Hive external table should already have data
-sql("SELECT * FROM hive_ints").show()
+sql("SELECT * FROM hive_bigints").show()
 // +---+
-// |key|
+// | id|
 // +---+
 // |  0|
 // |  1|
 // |  2|
-// ...
+// ... Order may vary, as spark processes the partitions in parallel.

 // Turn on flag for Hive Dynamic Partitioning
 spark.sqlContext.setConf("hive.exec.dynamic.partition", "true")
