[SPARK-16803][SQL] SaveAsTable does not work when target table is a Hive serde table

### What changes were proposed in this pull request?

In Spark 2.0, `saveAsTable` fails when the target table is a Hive serde table, although the same call works in Spark 1.6.

**Spark 1.6**

``` Scala
scala> sql("create table sample.sample stored as SEQUENCEFILE as select 1 as key, 'abc' as value")
res2: org.apache.spark.sql.DataFrame = []

scala> val df = sql("select key, value as value from sample.sample")
df: org.apache.spark.sql.DataFrame = [key: int, value: string]

scala> df.write.mode("append").saveAsTable("sample.sample")

scala> sql("select * from sample.sample").show()
+---+-----+
|key|value|
+---+-----+
|  1|  abc|
|  1|  abc|
+---+-----+
```

**Spark 2.0**

``` Scala
scala> df.write.mode("append").saveAsTable("sample.sample")
org.apache.spark.sql.AnalysisException: Saving data in MetastoreRelation sample, sample
 is not supported.;
```

For now, we do not plan to support this in Spark 2.1 because of the risk involved. The call works in Spark 1.6 because `saveAsTable` internally uses `insertInto` there. Reverting to that behavior would break the semantics of `saveAsTable`, which resolves columns by name, whereas `insertInto` resolves them by position. Supporting `hive` as a `format` in `DataFrameWriter` would also require further changes.
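The difference between the two resolution modes can be illustrated with a toy model in plain Scala. This is only a sketch and not Spark code: `ResolutionDemo`, `resolveByPosition`, and `resolveByName` are hypothetical names invented for this illustration, and a row is modeled as a sequence of (column name, value) pairs in DataFrame column order.

```scala
// Toy model of the two resolution modes. Not Spark code: all names here are
// hypothetical and exist only for illustration.
object ResolutionDemo {
  // A row is a sequence of (columnName, value) pairs in DataFrame column order.
  type Row = Seq[(String, Any)]

  // By-position (the insertInto behavior described above): column names in
  // the row are ignored, and the i-th value goes to the i-th table column.
  def resolveByPosition(tableColumns: Seq[String], row: Row): Map[String, Any] =
    tableColumns.zip(row.map(_._2)).toMap

  // By-name (the saveAsTable behavior described above): each table column
  // takes the value of the row column that has the same name.
  def resolveByName(tableColumns: Seq[String], row: Row): Map[String, Any] = {
    val valuesByName = row.toMap
    tableColumns.map(c => c -> valuesByName(c)).toMap
  }

  def main(args: Array[String]): Unit = {
    val tableColumns = Seq("key", "value")
    // The DataFrame lists its columns in the opposite order from the table.
    val row: Row = Seq("value" -> "abc", "key" -> 1)

    // By position, "key" receives "abc" and "value" receives 1 (swapped).
    println(resolveByPosition(tableColumns, row))
    // By name, "key" receives 1 and "value" receives "abc".
    println(resolveByName(tableColumns, row))
  }
}
```

When the DataFrame and the table list their columns in the same order, both modes give the same result, which is why the Spark 1.6 example above behaves as expected.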

Until then, users should use the `insertInto` API instead. This PR corrects the error message so that users can see how to work around the limitation before support is added in a separate PR.
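Because `insertInto` resolves columns by position, a caller who wants by-name behavior can reorder the DataFrame columns to match the table before inserting. The following is a minimal sketch of that idea and is not part of this patch: `HiveSerdeAppend` and `appendByName` are hypothetical names, and the sketch assumes that the target table already exists and that every table column is present in the DataFrame under the same name.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper, not part of this patch.
object HiveSerdeAppend {
  // Appends df to an existing table, lining columns up by name.
  // Assumption: every column of the table exists in df with the same name.
  // Note that col() parses dots as nested-field access, so column names that
  // contain dots would need backtick quoting.
  def appendByName(df: DataFrame, tableName: String): Unit = {
    // Column order of the existing table, as reported by the catalog.
    val targetColumns = df.sparkSession.table(tableName).columns

    // Reorder the DataFrame columns so that the by-position resolution of
    // insertInto matches what by-name resolution would have produced.
    df.select(targetColumns.map(col): _*)
      .write
      .insertInto(tableName)
  }
}
```

With the table from the examples above, `HiveSerdeAppend.appendByName(df, "sample.sample")` would take the place of the failing `saveAsTable` call.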
### How was this patch tested?

Test cases were added to `MetastoreDataSourcesSuite`.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #15926 from gatorsmile/saveAsTableFix5.

(cherry picked from commit 9c42d4a)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
gatorsmile committed Nov 22, 2016
1 parent bd338f6 commit 64b9de9
Showing 2 changed files with 24 additions and 0 deletions.
@@ -175,6 +175,10 @@ case class CreateDataSourceTableAsSelectCommand(
        existingSchema = Some(l.schema)
      case s: SimpleCatalogRelation if DDLUtils.isDatasourceTable(s.metadata) =>
        existingSchema = Some(s.metadata.schema)
      case c: CatalogRelation if c.catalogTable.provider == Some(DDLUtils.HIVE_PROVIDER) =>
        throw new AnalysisException("Saving data in the Hive serde table " +
          s"${c.catalogTable.identifier} is not supported yet. Please use the " +
          "insertInto() API as an alternative.")
      case o =>
        throw new AnalysisException(s"Saving data in ${o.toString} is not supported.")
    }
@@ -413,6 +413,26 @@ class MetastoreDataSourcesSuite extends QueryTest with SQLTestUtils with TestHiv
    }
  }

  test("saveAsTable(CTAS) using append and insertInto when the target table is Hive serde") {
    val tableName = "tab1"
    withTable(tableName) {
      sql(s"CREATE TABLE $tableName STORED AS SEQUENCEFILE AS SELECT 1 AS key, 'abc' AS value")

      val df = sql(s"SELECT key, value FROM $tableName")
      val e = intercept[AnalysisException] {
        df.write.mode(SaveMode.Append).saveAsTable(tableName)
      }.getMessage
      assert(e.contains("Saving data in the Hive serde table `default`.`tab1` is not supported " +
        "yet. Please use the insertInto() API as an alternative."))

      df.write.insertInto(tableName)
      checkAnswer(
        sql(s"SELECT * FROM $tableName"),
        Row(1, "abc") :: Row(1, "abc") :: Nil
      )
    }
  }

  test("SPARK-5839 HiveMetastoreCatalog does not recognize table aliases of data source tables.") {
    withTable("savedJsonTable") {
      // Save the df as a managed table (by not specifying the path).