@englefly
Contributor

@englefly englefly commented Nov 10, 2025

What problem does this PR solve?

pick #57863
After an outer join, the columns from the inner (null-supplying) side must be padded with nulls for the rows that found no match, so the num_nulls estimate for those columns has to grow accordingly. Taking a left outer join as an example, the number of added nulls was previously estimated as left_table_row_count - left_semi_join_row_count.
However, the row count estimate for the left semi join has a large error, which made the estimated number of added nulls inaccurate as well. The estimate is now left_table_row_count - inner_join_row_count.
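
A minimal sketch of the two estimates described above, for illustration only. It is not the Nereids code: the function names, the clamp to zero, and the sample row counts are all assumptions.

```python
def added_nulls_old(left_rows: float, left_semi_join_rows: float) -> float:
    """Previous estimate: left rows minus the estimated left semi join rows."""
    # Clamping at zero is an assumption of this sketch, not taken from the PR.
    return max(0.0, left_rows - left_semi_join_rows)


def added_nulls_new(left_rows: float, inner_join_rows: float) -> float:
    """New estimate: left rows minus the estimated inner join rows."""
    # Clamping at zero is an assumption of this sketch, not taken from the PR.
    return max(0.0, left_rows - inner_join_rows)


# Hypothetical row count estimates, chosen only to show how the formulas differ.
left_rows = 1000.0
left_semi_join_rows = 300.0  # assumed to be badly underestimated
inner_join_rows = 900.0

old_estimate = added_nulls_old(left_rows, left_semi_join_rows)
new_estimate = added_nulls_new(left_rows, inner_join_rows)
print(old_estimate, new_estimate)
```

With these made-up inputs the old formula reports 700 added nulls and the new one reports 100, so the error in the semi join estimate no longer carries over into num_nulls.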

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly englefly changed the title from "estimate num_nulls for outer join" to "[opt](nereids) optimize estimation of num_nulls for outer join" Nov 11, 2025
@englefly englefly force-pushed the est-null-outerjoin-4.0 branch from 4468a04 to d26ed24 November 11, 2025 06:11
@englefly englefly changed the title from "[opt](nereids) optimize estimation of num_nulls for outer join" to "branch-4.0 [opt](nereids) optimize estimation of num_nulls for outer join" Nov 11, 2025
@englefly englefly marked this pull request as ready for review November 13, 2025 01:50
@englefly englefly requested a review from yiguolei as a code owner November 13, 2025 01:50
@englefly
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage `` 🎉
Increment coverage report
Complete coverage report

@englefly
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (10/10) 🎉
Increment coverage report
Complete coverage report

@englefly
Contributor Author

run external

@englefly englefly closed this Nov 25, 2025

2 participants