Skip to content

Commit

Permalink
[MINOR][DOCS] Fix Spark hive example.
Browse files Browse the repository at this point in the history
## What changes were proposed in this pull request?

Documentation has an error, https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#hive-tables.

The example:
```scala
scala> val dataDir = "/tmp/parquet_data"
dataDir: String = /tmp/parquet_data

scala> spark.range(10).write.parquet(dataDir)

scala> sql(s"CREATE EXTERNAL TABLE hive_ints(key int) STORED AS PARQUET LOCATION '$dataDir'")
res6: org.apache.spark.sql.DataFrame = []

scala> sql("SELECT * FROM hive_ints").show()

+----+
| key|
+----+
|null|
|null|
|null|
|null|
|null|
|null|
|null|
|null|
|null|
|null|
+----+
```

Range does not emit `key`, but `id` instead.

Closes #24657 from ScrapCodes/fix_hive_example.

Lead-authored-by: Prashant Sharma <prashant@apache.org>
Co-authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
  • Loading branch information
2 people authored and HyukjinKwon committed May 21, 2019
1 parent d90c460 commit 5f4b505
Showing 1 changed file with 4 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -122,16 +122,16 @@ object SparkHiveExample {
val dataDir = "/tmp/parquet_data"
spark.range(10).write.parquet(dataDir)
// Create a Hive external Parquet table
sql(s"CREATE EXTERNAL TABLE hive_ints(key int) STORED AS PARQUET LOCATION '$dataDir'")
sql(s"CREATE EXTERNAL TABLE hive_bigints(id bigint) STORED AS PARQUET LOCATION '$dataDir'")
// The Hive external table should already have data
sql("SELECT * FROM hive_ints").show()
sql("SELECT * FROM hive_bigints").show()
// +---+
// |key|
// | id|
// +---+
// | 0|
// | 1|
// | 2|
// ...
// ... Order may vary, as spark processes the partitions in parallel.

// Turn on flag for Hive Dynamic Partitioning
spark.sqlContext.setConf("hive.exec.dynamic.partition", "true")
Expand Down

0 comments on commit 5f4b505

Please sign in to comment.