-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-28668][SQL] Support V2SessionCatalog for ALTER TABLE #25502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #109362 has finished for PR 25502 at commit
|
|
Test build #109683 has finished for PR 25502 at commit
|
|
Test build #109823 has finished for PR 25502 at commit
|
|
Test build #109824 has finished for PR 25502 at commit
|
| } | ||
| } | ||
|
|
||
| test("AlterTable: table does not exist") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests here have been moved to AlterTableTests
|
Test build #109831 has finished for PR 25502 at commit
|
| } | ||
|
|
||
| assert(exc.getMessage.contains("Unsupported table change")) | ||
| assert(exc.getMessage.contains("Cannot drop all fields")) // from the implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also test dropping all the fields of a struct-type column?
| } | ||
| } | ||
|
|
||
| test("AlterTable: remove table property") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we test updating table property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that these are existing tests. Feel free to address these comments in followup.
|
LGTM. @brkyvz is there anything still WIP? |
|
|
||
| private def resolveV2Alter( | ||
| tableName: Seq[String], | ||
| changes: Seq[TableChange]): Option[AlterTable] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the motivation to create the lookupV2Relation method, but I don't quite like the effects that it has on these rules because it mixes identifier resolution (which catalog is responsible?) with table resolution (look up the table in a catalog). This leads to some strange changes, like the addition of AlterTableStatement to checkAnalysis that throws "Table or view not found" when the real problem is that the statement wasn't converted to a v1 or v2 plan.
Before this change, only identifier resolution is done. Table resolution is done by the ResolveTables rule so it is fairly well contained. Plans were created with a place-holder UnresolvedRelation, which avoids the need to catch AlterTableStatement in checkAnalysis
But, to add the fallback to v1 for tables that are loaded by the v2 session catalog, we have to look up the table and only convert to the v2 plan if it isn't an UnresolvedTable.
I'm not sure what the right solution is. Maybe instead of avoiding conversion to the v2 plan, we should convert from v2 to v1 for the fallback case. I think that would make the rules cleaner and more orthogonal to one another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a reason to block this commit. I would just like to follow up with a reasonable solution if there isn't a quick fix that can go in this PR. The important thing is getting this PR in so this makes it into 3.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mixes identifier resolution (which catalog is responsible?) with table resolution (look up the table in a catalog)
that's fair and makes a lot of sense to keep these separate. I'm in favor of merging this as a first step (which will allow the tests to be there), and then cleaning up the logic, since the logic will look similar for INSERT INTO as well. Since this is already a 900loc PR, I'd like to that in a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought about it before. I think the ideal resolution process is:
- rules like
ResolveAlterTableare only responsible for converting XYZStatement to v1 or v2 command ResolveTablesandResolveRelationsare responsible for resolvingUnresolvedRelationto v1 or v2 relations
However, some commands like ALTER TABLE also need to get the catalog instance, which can't be done by ResolveTables or ResolveRelations. Unlike table resolution which replaces UnresolvedRelation with v1/v2 relation and can be done by a rule separately. Catalog resolution needs to be done during the converting from XYZStatement to v1/v2 command and we can't do it in a separated rule.
I don't have a good idea now but we should definitely revisit it later.
|
Test build #109922 has finished for PR 25502 at commit
|
|
thanks, merging to master! |
What changes were proposed in this pull request?
Adds support for the V2SessionCatalog for ALTER TABLE statements.
Implementation changes are ~50 loc. The rest is just test refactoring.
Why are the changes needed?
To allow V2 DataSources to plug in through a configurable plugin interface without requiring the explicit use of catalog identifiers, and leverage ALTER TABLE statements.
How was this patch tested?
By re-using existing tests in DataSourceV2SQLSuite.