-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-45926][SQL] Implementing equals and hashCode which takes into account pushed runtime filters , in InMemoryTable related scans #43808
Conversation
…t the pushed runtime filters
cc @ulysses-you and @peter-toth FYI |
@HyukjinKwon thanks for correcting the titles of the PRs. will take care next time.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for making a PR, @ahshahid .
Do you happen to know why we didn't catch the following? Do you think you can provide a test coverage for that?
If these classes implement equals and hashCode taking into account the pushed runtime filters, we would see that TPCDS Q14b which should ideally be reusing the exchange containing Union , is not happening due to multiple bugs which surface in AQE.
@dongjoon-hyun I think the reason for not catching the issue of reuse of exchange is a mix of multiple things
Yes I have been able to reproduce the issue using InMemoryTableScans as DataSourceV2 impl for tpcds tests. I will checkin a prototype test for reproducing the bug using q14b and if needed all other queries of tpcds can be run. Though I ought to point out that while running my test I also hit the issue of computeStats being called twice ( which throws error only in testing). I have not debugged that... yet. And not sure if the assertion of computeStats occuring only once is maintainable.. |
Thank you so much for the details and a upcoming prototype test case. For that issue, you can file a new JIRA issue.
|
It seems this PR is unrelated to runtime filter. I guess you mean is DS V2 filter pushdown |
Hi @dongjoon-hyun added bug test in PR SPARK-45866 |
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
Implementing equals and hashCode in the InMemoryBatchScan and InMemoryV2FilterBatchScan so that the pushed runtime filters are taken into account.
Why are the changes needed?
These test classes need the change so that the issue of re-use of exchange not happening in AQE when runtime filters are pushed to scan levels, can be reproduced. Currently either the tests are not actually validating the reuse of exchange happening when dynamic pruning expressions are pushed to scan levels in partitioned key based joins, or they are false passing because the equality of the test scans are not taking into account runtime filters.
Does this PR introduce any user-facing change?
No
How was this patch tested?
This is test class change., It is possibe that the existing tests may fail till the dependent fixes are made in source code.
Was this patch authored or co-authored using generative AI tooling?
no