[WIP] Introducing a separation of concerns between data sources that are given a dataframe's schema upon write, from FileDataSourceV2, in part to allow data sources that require this notion to be developed in Java (FileDataSourceV2 is not compatible with Java because of its private Table member) #30106
RogerDunn wants to merge 1 commit into apache:master
Can one of the admins verify this patch?
@AmplabJenkins, thank you for the speedy note on this patch. There is no urgency on it at the moment. I marked it as a [WIP] PR because I'm currently reviewing this change (weighed against other alternatives) with another Spark developer. I'll keep the notes on this PR up to date. Thank you!
/**
 * A data source that is given the DataFrame's schema upon write operations.
 */
trait SchemaOnWriteDataSource extends TableProvider with DataSourceRegister {
One option is to add this trait, and another option is to officially make FileDataSourceV2 a public developer API. Does this "schema-on-write" behavior make sense for non-file data sources?
+1, making FileDataSourceV2 a public developer API seems simpler.
The schema inference is already in the DS v2 API. Instead of adding a marker trait, I think we can just reuse
@cloud-fan Your idea sounds just right. Are you proposing to make that change in the work you're already doing (in which case I'll remove this PR)?
@gengliangwang is creating a PR to implement this idea.
I have created #30273 for this.
@gengliangwang, your PR is better, thank you!