Modify the orca optimizer's processing of unionall distribution strategy #278

Closed
wants to merge 3 commits into from

Conversation

Light-City
Contributor

@Light-City Light-City commented Nov 2, 2023

Change logs

The ORCA optimizer currently returns the ANY policy for the first child of a UNION ALL-like node. As a result, the node's children are each collected with a Gather Motion, and a 1:n Redistribute Motion is placed above the node.

For example, before the change:
    ->  Redistribute Motion 1:3  (slice2)
       ->  Append  (cost=0.00..863.91 rows=18001 width=12)
          ->  Finalize Vec Aggregate
            ->  Gather Motion 3:1  (slice3; segments: 3)
               ...
       ->  Gather Motion 3:1  (slice4; segments: 3)
          ->  HashAggregate
After the change:
    ->  Append  (cost=0.00..863.91 rows=18001 width=12)
       ->  Result  (cost=0.00..431.06 rows=1 width=12)
          ->  Redistribute Motion 1:3  (slice2)
            ->  Finalize Aggregate
               ->  Gather Motion 3:1  (slice3; segments: 3)
               ...
       ->  HashAggregate

When there are many nodes, the first plan causes a performance bottleneck and needs to be changed. Fortunately, the gpdb community has already fixed this in commit 0cd056a0a3d3c30a1d6d4479e67802b6673118c7.
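
To make the bottleneck concrete, here is a rough toy model (not ORCA code) of how many rows pass through a single node in each plan shape. The `Child` class, the `rows_through_single_node` function, and the `force_singleton_append` flag are hypothetical names invented for this sketch. The row counts are taken from the estimates in the plans above (Append rows=18001, Result rows=1), and the 18000 figure for the second child is an assumption derived from their difference.

```python
from dataclasses import dataclass


@dataclass
class Child:
    name: str
    rows: int        # estimated rows produced by this child
    singleton: bool  # True if the child's output already sits on one node


def rows_through_single_node(children, force_singleton_append):
    """Toy estimate of rows funneled through one node around the Append."""
    if force_singleton_append:
        # "Before" shape: every child is gathered, the Append runs on a
        # single node, and all rows are then redistributed 1:n.
        return sum(c.rows for c in children)
    # "After" shape: the Append stays distributed, so only children that
    # are already singleton need a 1:n Redistribute Motion.
    return sum(c.rows for c in children if c.singleton)


# Hypothetical inputs based on the plan estimates quoted above.
children = [
    Child("Finalize Aggregate", rows=1, singleton=True),
    Child("HashAggregate", rows=18000, singleton=False),
]

before = rows_through_single_node(children, force_singleton_append=True)
after = rows_through_single_node(children, force_singleton_append=False)
print(before, after)
```

Under these assumptions the first plan pushes every row of the Append through one node, while the second only moves the single aggregated row, which is why the gap grows with the number of segments and the size of the distributed children.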

Why are the changes needed?

1. The current plan hurts performance.
2. The current plan is unreasonable.

Does this PR introduce any user-facing change?

Yes, tpcds 167 query.

How was this patch tested?

yes.

Contributor's Checklist

Here are some reminders and checklists before/when submitting your pull request, please check them:

  • Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
  • Sign the Contributor License Agreement as prompted for your first-time contribution (one-time setup).
  • Learn the coding contribution guide, including our code conventions, workflow and more.
  • List your communication in the GitHub Issues or Discussions (if any, or if needed).
  • Document changes.
  • Add tests for the change
  • Pass make installcheck
  • Pass make -C src/test installcheck-cbdb-parallel
  • Feel free to @cloudberrydb/dev team for review and approval when your PR is ready🥳

@CLAassistant

CLAassistant commented Nov 2, 2023

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot left a comment

Hiiii, @Light-City welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!

@Light-City
Contributor Author

related issue is #279.

@tuhaihe
Member

tuhaihe commented Nov 2, 2023

related issue is #279.

One tip: we can use closes #279 here, and then after this PR is merged, the issue will be closed automatically. For more tips see the GitHub doc.

@Light-City Light-City changed the title Modify the orca optimizer's processing of unionall distribution strat… Modify the orca optimizer's processing of unionall distribution strat…#279 Nov 2, 2023
@Light-City Light-City changed the title Modify the orca optimizer's processing of unionall distribution strat…#279 Modify the orca optimizer's processing of unionall distribution strategy Nov 2, 2023
@Light-City
Contributor Author

ok

@my-ship-it my-ship-it marked this pull request as draft November 3, 2023 09:16
@my-ship-it my-ship-it self-requested a review November 3, 2023 09:16
@my-ship-it
Contributor

After discussion, we are closing this PR.
