[SPARK-15025][SQL] fix duplicate of PATH key in datasource table options #12804
xwu0226 wants to merge 6 commits into apache:master
@yhuai @liancheng @hvanhovell Can any of you help take a quick look at this change? Thank you very much!

ok to test.
Test build #57819 has finished for PR 12804 at commit
…lowing up PR12974
…ause following up PR12974" This reverts commit 98a1f804d7343ba77731f9aa400c00f1a26c03fe.
Can we avoid calling metadataHive.runSqlHive? For this case, we want to check that there is a single entry in the table properties with the key PATH and that the location is the one we want, right?
@yhuai Thanks for your input! Yes, you are right. I can get CatalogTable.storage.serdeProperties to check the 'PATH' key. I will modify the test case.
…ng for 'path' key when createDataSourceTable
@yhuai I updated the test case to check the key.
Test build #58167 has finished for PR 12804 at commit
Test build #58166 has finished for PR 12804 at commit
test this please
Test build #58174 has finished for PR 12804 at commit
LGTM. Merging to master and branch 2.0.
## What changes were proposed in this pull request?
The issue is that when the user provides the path option with an uppercase "PATH" key, `options` contains the `PATH` key but no `path` key, so execution takes the non-external branch in the following code in `createDataSourceTables.scala`, where a new "path" key is created with the default table path.
```
val optionsWithPath =
  if (!options.contains("path")) {
    isExternal = false
    options + ("path" -> sessionState.catalog.defaultTablePath(tableIdent))
  } else {
    options
  }
```
As a result, before the Hive table is created, serdeInfo.parameters contains both "PATH" and "path" keys pointing to different directories, and the Hive table's dataLocation holds the value of "path".
The fix in this PR is to convert `options` in the code above to a `CaseInsensitiveMap` before checking whether it contains the "path" key.
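
The difference between the two lookups can be illustrated with a minimal, Spark-free sketch. It does not use Spark's `CaseInsensitiveMap`; the object name, the `defaultTablePath` helper, and the warehouse path are hypothetical stand-ins, and the returned boolean plays the role of `isExternal`.

```scala
object PathOptionSketch {
  // Hypothetical stand-in for sessionState.catalog.defaultTablePath(tableIdent).
  def defaultTablePath(table: String): String = s"/warehouse/$table"

  // Behavior described above: the case-sensitive lookup misses "PATH",
  // so a second "path" entry is added and the table is treated as managed.
  def withPathCaseSensitive(
      options: Map[String, String],
      table: String): (Map[String, String], Boolean) =
    if (!options.contains("path")) {
      (options + ("path" -> defaultTablePath(table)), false)
    } else {
      (options, true)
    }

  // Sketch of the idea behind the fix: compare keys ignoring case
  // before deciding whether a default path has to be added.
  def withPathCaseInsensitive(
      options: Map[String, String],
      table: String): (Map[String, String], Boolean) = {
    val hasPath = options.keys.exists(_.equalsIgnoreCase("path"))
    if (!hasPath) {
      (options + ("path" -> defaultTablePath(table)), false)
    } else {
      (options, true)
    }
  }
}
```

With `Map("PATH" -> "/data/external/t1")` as input, the first function returns two path-like keys with different directories, while the second returns the options unchanged and keeps the table external.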
## How was this patch tested?
A test case is added.
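
The check discussed in the review (a single path entry, whatever its case, pointing at the expected location) could look roughly like the sketch below. This is not the test added in the PR; `PathKeyCheck` and its methods are hypothetical, and a plain `Map[String, String]` stands in for the serde properties read from the catalog table.

```scala
object PathKeyCheck {
  // Collect every entry whose key equals "path" when case is ignored.
  def pathEntries(props: Map[String, String]): Seq[(String, String)] =
    props.toSeq.filter { case (key, _) => key.equalsIgnoreCase("path") }

  // True only when exactly one such entry exists and it holds the expected location.
  def hasSinglePath(props: Map[String, String], expected: String): Boolean =
    pathEntries(props) match {
      case Seq((_, location)) => location == expected
      case _ => false
    }
}
```

A properties map containing both "PATH" and "path" fails this check, which is the situation the patch is meant to prevent.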
Author: xin Wu <xinwu@us.ibm.com>
Closes #12804 from xwu0226/SPARK-15025.
(cherry picked from commit 980bba0)
Signed-off-by: Yin Huai <yhuai@databricks.com>