Push the runtime filter from hashjoin to seqscan or AM. #405

Draft: wants to merge 1 commit into main

@zhangyue-hashdata zhangyue-hashdata commented Apr 10, 2024

Change logs
If gp_enable_runtime_filter_pushdown is on, the following happens during query execution:

  1. In ExecInitHashJoin(), we try to find a direct mapping between each var in the hashclauses and a var in the seqscan. If one is found, the mapping is saved in an AttrFilter, and all AttrFilters are pushed to the hash node.
  2. While the hash table is being built, the range/bloom filters in each AttrFilter are populated. Once the build completes, these filters are converted to a list of ScanKeys and pushed to the Seqscan.
  3. If the AM supports SCAN_SUPPORT_RUNTIME_FILTER, the ScanKeys are pushed further down into the AM module; otherwise they are used to filter slots in the Seqscan.
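
To make step 2 concrete, here is a minimal standalone sketch of a combined range/Bloom filter that is filled during the build phase and probed during the scan. It is not the code from this PR: `DemoAttrFilter`, `demo_filter_add`, `demo_filter_may_match`, the 1024-bit size, and the hash mixer are all assumptions chosen for illustration, and the key is simplified to a single `int64_t` column.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOOM_BITS 1024u   /* hypothetical size, chosen for the example */

/* Hypothetical stand-in for the per-attribute filter described above. */
typedef struct DemoAttrFilter
{
    bool     has_rows;                      /* false until the first key is added */
    int64_t  min;                           /* range filter: smallest build key */
    int64_t  max;                           /* range filter: largest build key */
    uint8_t  bloom[DEMO_BLOOM_BITS / 8];    /* Bloom filter bit array */
} DemoAttrFilter;

/* 64-bit mixer (splitmix64 finalizer); any reasonable hash would do. */
static uint64_t
demo_mix(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

static void
demo_filter_init(DemoAttrFilter *f)
{
    assert(f != NULL);
    memset(f, 0, sizeof(*f));
}

/* Called once per inner row while the hash table is being built. */
static void
demo_filter_add(DemoAttrFilter *f, int64_t key)
{
    uint64_t h = demo_mix((uint64_t) key);
    uint32_t b1 = (uint32_t) (h % DEMO_BLOOM_BITS);
    uint32_t b2 = (uint32_t) ((h >> 32) % DEMO_BLOOM_BITS);

    if (!f->has_rows)
    {
        f->min = f->max = key;
        f->has_rows = true;
    }
    else
    {
        if (key < f->min) f->min = key;
        if (key > f->max) f->max = key;
    }
    f->bloom[b1 / 8] |= (uint8_t) (1u << (b1 % 8));
    f->bloom[b2 / 8] |= (uint8_t) (1u << (b2 % 8));
}

/*
 * Called per outer row during the scan. Returns false only when the key
 * definitely has no join partner; true means "possibly matches".
 */
static bool
demo_filter_may_match(const DemoAttrFilter *f, int64_t key)
{
    uint64_t h = demo_mix((uint64_t) key);
    uint32_t b1 = (uint32_t) (h % DEMO_BLOOM_BITS);
    uint32_t b2 = (uint32_t) ((h >> 32) % DEMO_BLOOM_BITS);

    if (!f->has_rows)
        return false;               /* empty build side: nothing can match */
    if (key < f->min || key > f->max)
        return false;               /* rejected by the range filter */
    return (f->bloom[b1 / 8] & (1u << (b1 % 8))) != 0 &&
           (f->bloom[b2 / 8] & (1u << (b2 % 8))) != 0;
}
```

The range check is cheap and exact for out-of-range keys, while the Bloom check can return false positives but never false negatives, so the join result is unchanged and only the number of rows reaching the join shrinks.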

Why are the changes needed?
This commit may improve the performance of queries that include a hash join, because rows that cannot match are filtered out at the scan before they reach the join.

Does this PR introduce any user-facing change?
No.

How was this patch tested?
Test cases were added in src/test/regress/sql/gp_runtime_filter.sql.

Contributor's Checklist
Here are some reminders and checklists before/when submitting your pull request, please check them:

  • Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
  • Sign the Contributor License Agreement as prompted for your first-time contribution (one-time setup).
  • Learn the coding contribution guide, including our code conventions, workflow and more.
  • List your communication in the GitHub Issues or Discussions (if any, or if needed).
  • Document changes.
  • Add tests for the change
  • Pass make installcheck
  • Pass make -C src/test installcheck-cbdb-parallel
  • Feel free to request review and approval from the cloudberrydb/dev team when your PR is ready 🥳

CLAassistant commented Apr 10, 2024

All committers have signed the CLA.

@github-actions github-actions bot left a comment

Hiiii, @zhangyue-hashdata welcome! 🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!

+----------+  AttrFilter   +------+  ScanKey   +------------+
| HashJoin | ------------> | Hash | ---------> | SeqScan/AM |
+----------+               +------+            +------------+

If gp_enable_runtime_filter_pushdown is on, the following happens during query
execution:
1. In ExecInitHashJoin(), we try to find a direct mapping between each var in
   the hashclauses and a var in the seqscan. If one is found, the mapping is
   saved in an AttrFilter, and all AttrFilters are pushed to the hash node.
2. While the hash table is being built, the range/bloom filters in each
   AttrFilter are populated. Once the build completes, these filters are
   converted to a list of ScanKeys and pushed to the Seqscan.
3. If the AM supports SCAN_SUPPORT_RUNTIME_FILTER, the ScanKeys are pushed
   further down into the AM module; otherwise they are used to filter slots in
   the Seqscan.
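
Step 3 can be sketched as a simple dispatch on a capability flag. The standalone example below is illustrative only: `DemoScanKey`, `DemoScan`, `demo_am_next`, `demo_seqscan_next`, and the flag value `DEMO_SCAN_SUPPORT_RUNTIME_FILTER` are hypothetical names, not the structures used by this PR, and rows are simplified to arrays of `int64_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit; the real flag's value is not shown in this PR text. */
#define DEMO_SCAN_SUPPORT_RUNTIME_FILTER 0x1u

typedef enum { DEMO_KEY_GE, DEMO_KEY_LE } DemoStrategy;

/* Simplified scan key: "column attno >= bound" or "column attno <= bound". */
typedef struct DemoScanKey
{
    int          attno;      /* 0-based column index */
    DemoStrategy strategy;
    int64_t      bound;
} DemoScanKey;

typedef struct DemoScan
{
    unsigned           am_flags;       /* capabilities advertised by the AM */
    const int64_t     *rows;           /* nrows x ncols, row-major */
    size_t             nrows, ncols;
    const DemoScanKey *keys;           /* runtime filter keys from the hash node */
    size_t             nkeys;
    size_t             pos;            /* next row to read */
    size_t             am_filtered;    /* rows dropped inside the AM */
    size_t             scan_filtered;  /* rows dropped in the scan node */
} DemoScan;

static bool
demo_keys_pass(const int64_t *row, const DemoScanKey *keys, size_t nkeys)
{
    for (size_t i = 0; i < nkeys; i++)
    {
        int64_t v = row[keys[i].attno];

        if (keys[i].strategy == DEMO_KEY_GE && v < keys[i].bound) return false;
        if (keys[i].strategy == DEMO_KEY_LE && v > keys[i].bound) return false;
    }
    return true;
}

/* AM layer: applies the keys itself only if it advertises support. */
static const int64_t *
demo_am_next(DemoScan *s)
{
    bool am_filters = (s->am_flags & DEMO_SCAN_SUPPORT_RUNTIME_FILTER) != 0;

    while (s->pos < s->nrows)
    {
        const int64_t *row = s->rows + s->pos * s->ncols;

        s->pos++;
        if (am_filters && !demo_keys_pass(row, s->keys, s->nkeys))
        {
            s->am_filtered++;
            continue;
        }
        return row;
    }
    return NULL;
}

/* Scan node: falls back to filtering rows itself when the AM cannot. */
static const int64_t *
demo_seqscan_next(DemoScan *s)
{
    bool am_filters;
    const int64_t *row;

    assert(s != NULL);
    am_filters = (s->am_flags & DEMO_SCAN_SUPPORT_RUNTIME_FILTER) != 0;
    while ((row = demo_am_next(s)) != NULL)
    {
        if (!am_filters && !demo_keys_pass(row, s->keys, s->nkeys))
        {
            s->scan_filtered++;
            continue;
        }
        return row;
    }
    return NULL;
}
```

Either path returns the same rows to the join; the difference is only where the rejected rows are discarded, which matters when the AM can skip work (for example, avoiding materializing a row) that the scan node cannot.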