-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-19148][SQL] do not expose the external table concept in Catalog #16528
Conversation
""" | ||
}.mkString | ||
}) | ||
foldFunctions = _.map { funCall => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor code cleanup, not related to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we revert this change?
case fs: HadoopFsRelation => | ||
if (table.tableType == CatalogTableType.EXTERNAL && fs.location.rootPaths.isEmpty) { | ||
throw new AnalysisException( | ||
"Cannot create a file-based external data source table without path") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will never hit this branch after this PR. There is no public API to set the table type and users can only set custom table path, so we will never create an external table without path
Test build #71123 has finished for PR 16528 at commit
|
f06ed16
to
b0c252a
Compare
Test build #71125 has finished for PR 16528 at commit
|
retest this please |
Test build #71128 has finished for PR 16528 at commit
|
retest this please |
|
||
/** | ||
* :: Experimental :: | ||
* Creates a table from the given path and returns the corresponding DataFrame. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can explain here to say the data files will not be dropped when dropping the table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already documented this in programming guide. If we wanna explain the custom path here, there are a lot of similar places we need to add comments too. So I'd like to leave it as it was.
If we compare the above two interfaces, the Catalog's |
We also need to update the Python interface. See the code |
Test build #71143 has finished for PR 16528 at commit
|
I'd like to do it in follow-ups I'll update the python api too |
Test build #71175 has finished for PR 16528 at commit
|
f1f75ed
to
0d1baf1
Compare
Test build #71191 has started for PR 16528 at commit |
retest this please |
Test build #71198 has finished for PR 16528 at commit
|
0d1baf1
to
8f4e86e
Compare
Test build #71214 has finished for PR 16528 at commit
|
8f4e86e
to
3b59221
Compare
Test build #71238 has finished for PR 16528 at commit
|
3b59221
to
2e1d378
Compare
Test build #71292 has finished for PR 16528 at commit
|
2e1d378
to
d5b3b4f
Compare
Test build #71316 has finished for PR 16528 at commit
|
d5b3b4f
to
55cf0c3
Compare
Test build #71325 has finished for PR 16528 at commit
|
warnings.warn( | ||
"createExternalTable is deprecated since Spark 2.2, please use createTable instead.", | ||
DeprecationWarning) | ||
return self.createTable(tableName, path, source, schema, **options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**options
-> options
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's python syntax, like what we do in scala: func(options: _*)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Got it. I also manually tried it in pyspark
. It works fine.
LGTM |
ok to test |
Test build #71408 has finished for PR 16528 at commit
|
cc @yhuai for final sign-off |
looks good to me. If possible, I'd like to get the code mentioned by https://github.com/apache/spark/pull/16528/files#r96314156 reverted. |
Test build #71469 has finished for PR 16528 at commit
|
thanks for the review, merging to master! |
## What changes were proposed in this pull request? In apache#16296 , we reached a consensus that we should hide the external/managed table concept to users and only expose custom table path. This PR renames `Catalog.createExternalTable` to `createTable`(still keep the old versions for backward compatibility), and only set the table type to EXTERNAL if `path` is specified in options. ## How was this patch tested? new tests in `CatalogSuite` Author: Wenchen Fan <wenchen@databricks.com> Closes apache#16528 from cloud-fan/create-table.
## What changes were proposed in this pull request? In apache#16296 , we reached a consensus that we should hide the external/managed table concept to users and only expose custom table path. This PR renames `Catalog.createExternalTable` to `createTable`(still keep the old versions for backward compatibility), and only set the table type to EXTERNAL if `path` is specified in options. ## How was this patch tested? new tests in `CatalogSuite` Author: Wenchen Fan <wenchen@databricks.com> Closes apache#16528 from cloud-fan/create-table.
…t in Catalog ### What changes were proposed in this pull request? After we renames `Catalog`.`createExternalTable` to `createTable` in the PR: apache#16528, we also need to deprecate the corresponding functions in `SQLContext`. ### How was this patch tested? N/A Author: Xiao Li <gatorsmile@gmail.com> Closes apache#17502 from gatorsmile/deprecateCreateExternalTable.
What changes were proposed in this pull request?
In #16296 , we reached a consensus that we should hide the external/managed table concept to users and only expose custom table path.
This PR renames
Catalog.createExternalTable
tocreateTable
(still keep the old versions for backward compatibility), and only set the table type to EXTERNAL ifpath
is specified in options.How was this patch tested?
new tests in
CatalogSuite