[SPARK-35803][SQL] Support DataSource V2 CreateTempViewUsing #33922
@cloud-fan what do you think? I'm not sure if I'm missing something. Thanks!
ok to test
def loadV2Source(sparkSession: SparkSession, provider: TableProvider,
    userSpecifiedSchema: Option[StructType], extraOptions: CaseInsensitiveMap[String],
    source: String, paths: String*): Option[DataFrame] = {
Suggested change:
- def loadV2Source(sparkSession: SparkSession, provider: TableProvider,
-     userSpecifiedSchema: Option[StructType], extraOptions: CaseInsensitiveMap[String],
-     source: String, paths: String*): Option[DataFrame] = {
+ def loadV2Source(
+     sparkSession: SparkSession,
+     provider: TableProvider,
+     userSpecifiedSchema: Option[StructType],
+     extraOptions: CaseInsensitiveMap[String],
+     source: String,
+     paths: String*): Option[DataFrame] = {
      dsOptions)
    (catalog.loadTable(ident), Some(catalog), Some(ident))
  case _ =>
    // TODO: Non-catalog paths for DSV2 are currently not well defined.
I know this comment already existed, but I wanted to make a note: this isn't a good example of a comment. There's no JIRA, and we don't know what is not well defined.
Yes, I think the same. When I read it, I tried to understand what was missing, but I couldn't work it out. Shall we delete it?
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #143031 has finished for PR 33922 at commit
private def getOptionsWithPaths(extraOptions: CaseInsensitiveMap[String],
    paths: String*): CaseInsensitiveMap[String] = {
Suggested change:
- private def getOptionsWithPaths(extraOptions: CaseInsensitiveMap[String],
-     paths: String*): CaseInsensitiveMap[String] = {
+ private def getOptionsWithPaths(
+     extraOptions: CaseInsensitiveMap[String],
+     paths: String*): CaseInsensitiveMap[String] = {
val analyzedPlan = Dataset.ofRows(
  sparkSession, LogicalRelation(dataSource.resolveRelation())).logicalPlan
val analyzedPlan = DataSource.lookupDataSourceV2(provider, sparkSession.sessionState.conf)
  .map { tblProvider =>
Suggested change:
- .map { tblProvider =>
+ .flatMap { tblProvider =>
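A short sketch of why `flatMap` fits here, assuming (as the signature quoted earlier in this review shows) that `loadV2Source` itself returns an `Option[DataFrame]`: mapping an `Option`-returning function over an `Option` yields a nested `Option[Option[...]]`, while `flatMap` collapses it to a single layer. `Provider`, `Frame`, `lookupV2`, and `loadV2` below are hypothetical stand-ins, not Spark classes or methods.

```scala
// Hypothetical stand-ins for Spark's TableProvider and DataFrame; not the real classes.
final case class Provider(name: String)
final case class Frame(source: String)

object FlatMapSketch {
  // Stand-in for a V2 lookup: None when the name is not a V2 source.
  // The "v2:" prefix convention is an assumption made only for this sketch.
  def lookupV2(name: String): Option[Provider] =
    if (name.startsWith("v2:")) Some(Provider(name)) else None

  // Stand-in for a V2 load: None when the provider cannot be loaded.
  // The "!unreadable" suffix is likewise invented for illustration.
  def loadV2(p: Provider): Option[Frame] =
    if (p.name.endsWith("!unreadable")) None else Some(Frame(p.name))

  // map keeps both layers of Option.
  def nested(name: String): Option[Option[Frame]] = lookupV2(name).map(loadV2)

  // flatMap collapses them into one.
  def flat(name: String): Option[Frame] = lookupV2(name).flatMap(loadV2)
}
```

With the flattened result, a caller needs only one `getOrElse` or one pattern match to handle both "not a V2 source" and "V2 source that could not be loaded".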
test("SPARK-35803: Support datasorce V2 in CREATE VIEW USING") {
  Seq(classOf[SimpleDataSourceV2], classOf[JavaSimpleDataSourceV2]).foreach { cls =>
    withClue(cls.getName) {
      sql(s"CREATE or REPLACE GLOBAL TEMPORARY VIEW s1 USING ${cls.getName}")
We should test with a normal temp view unless there is something special about the global temp view.
Yes, there is no reason for this test to use a global temp view.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #143061 has finished for PR 33922 at commit
thanks, merging to master!
What changes were proposed in this pull request?
Currently, only DataSource V1 sources are supported in the CreateTempViewUsing command. This PR refactors DataFrameReader so that its code for creating a DataFrame from a DataSource V2 source can be reused by this command.
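The reuse described above can be pictured with a small, self-contained model. This is only a sketch under stated assumptions: every name in it (`SharedLoader`, `Reader`, `CreateTempViewUsingModel`, `Plan`) is hypothetical and none is Spark's actual API, and the fallback to a V1 relation when no V2 source is found is an assumption of the sketch rather than something this description states.

```scala
// All names below are hypothetical stand-ins; none of them is Spark's actual API.
final case class Plan(description: String)

object SharedLoader {
  // Pretend registry of V2 sources, invented for this sketch.
  private val v2Sources = Set("simpleV2")

  // Single place that knows how to build a plan from a V2 source.
  def loadV2(provider: String, options: Map[String, String]): Option[Plan] =
    if (v2Sources.contains(provider)) {
      Some(Plan(s"V2 scan of $provider with ${options.size} option(s)"))
    } else {
      None
    }

  // Assumed V1 fallback path.
  def loadV1(provider: String): Plan = Plan(s"V1 relation for $provider")
}

// Models the reader: try V2 first, otherwise fall back to V1.
object Reader {
  def load(provider: String, options: Map[String, String]): Plan =
    SharedLoader.loadV2(provider, options).getOrElse(SharedLoader.loadV1(provider))
}

// Models the command: same shared helper, so V2 support comes without duplicated logic.
object CreateTempViewUsingModel {
  def run(viewName: String, provider: String, options: Map[String, String]): (String, Plan) =
    viewName -> SharedLoader.loadV2(provider, options).getOrElse(SharedLoader.loadV1(provider))
}
```

For example, `CreateTempViewUsingModel.run("s1", "simpleV2", Map("k" -> "v"))` takes the V2 branch of the model, while a provider name outside the pretend registry takes the V1 branch.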
Why are the changes needed?
To improve the support for DataSource V2 in this command.
Does this PR introduce any user-facing change?
It does not change the current behavior; it only adds new functionality.
How was this patch tested?
Unit testing