[SPARK-47145][SQL] Pass table identifier to row data source scan exec for V2 strategy. #45200

@urosstan-db urosstan-db commented Feb 21, 2024

What changes were proposed in this pull request?

Pass the table identifier to the RowDataSourceScanExec physical plan node in DataSourceV2Strategy.
The table identifier is taken from DataSourceV2Relation.
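
As a rough illustration of the idea, here is a minimal, self-contained sketch. It does not use Spark itself: `V2Identifier`, `TableId`, `V2Relation`, `RowScanNode` and `ScanPlanning` are hypothetical stand-ins invented for this example, not the Spark classes with similar names, and the handling of multi-part namespaces is an assumption, since the PR description does not say how the real patch treats them.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark classes.
final case class V2Identifier(namespace: Seq[String], name: String)
final case class TableId(table: String, database: Option[String])
final case class V2Relation(tableName: String, identifier: Option[V2Identifier])
final case class RowScanNode(relationName: String, tableIdentifier: Option[TableId])

object ScanPlanning {
  // Assumption: only identifiers with zero or one namespace part can be
  // expressed as a (database, table) pair; anything else yields None.
  def toTableId(id: V2Identifier): Option[TableId] = id.namespace match {
    case Seq()   => Some(TableId(id.name, None))
    case Seq(db) => Some(TableId(id.name, Some(db)))
    case _       => None
  }

  // Before the change: the scan node is built without a table identifier.
  def planBefore(rel: V2Relation): RowScanNode =
    RowScanNode(rel.tableName, tableIdentifier = None)

  // After the change: the identifier carried by the relation is forwarded.
  def planAfter(rel: V2Relation): RowScanNode =
    RowScanNode(rel.tableName, tableIdentifier = rel.identifier.flatMap(toTableId))
}
```

With a relation such as `V2Relation("orders", Some(V2Identifier(Seq("sales"), "orders")))`, `planBefore` leaves `tableIdentifier` empty, while `planAfter` fills it in, so the identifier is available to anything that inspects the scan node.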

Why are the changes needed?

It is better to populate all available fields. The table identifier can also be useful for other purposes (logging, debugging, etc.).

Does this PR introduce any user-facing change?

No

How was this patch tested?

No automated tests were added. The change was only checked by manual debugging to confirm that the field is populated properly.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Feb 21, 2024
@urosstan-db urosstan-db changed the title from "Pass table identifier to row data source scan exec for V2 strategy." to "[SPARK-47145][SQL] Pass table identifier to row data source scan exec for V2 strategy." Feb 23, 2024
@urosstan-db urosstan-db marked this pull request as ready for review February 23, 2024 10:45
@urosstan-db urosstan-db force-pushed the pass-table-identifier-to-datasourcescan-v2 branch from 43ff70b to d142e15 Compare February 27, 2024 13:45
@cloud-fan cloud-fan closed this in 5c4b0d1 Feb 28, 2024
@cloud-fan

thanks, merging to master!

TakawaAkirayo pushed a commit to TakawaAkirayo/spark that referenced this pull request Mar 4, 2024

Closes apache#45200 from urosstan-db/pass-table-identifier-to-datasourcescan-v2.

Authored-by: Uros Stankovic <uros.stankovic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
ericm-db pushed a commit to ericm-db/spark that referenced this pull request Mar 5, 2024