HIVE-24969: Predicates may be removed when decorrelating subqueries with lateral #2145

Closed
wants to merge 2 commits into from

@dengzhhu653 dengzhhu653 commented Apr 2, 2021

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

@@ -4769,7 +4769,7 @@ POSTHOOK: Input: default@tempty
85768 almond antique chartreuse lavender yellow Manufacturer#1 Brand#12 LARGE BRUSHED STEEL 34 SM BAG 1753.76 refull
86428 almond aquamarine burnished black steel Manufacturer#1 Brand#12 STANDARD ANODIZED STEEL 28 WRAP BAG 1414.42 arefully
90681 almond antique chartreuse khaki white Manufacturer#3 Brand#31 MEDIUM BURNISHED TIN 17 SM CASE 1671.68 are slyly after the sl
Warning: Shuffle Join MERGEJOIN[39][tables = [$hdt$_0, $hdt$_1, $hdt$_2]] in Stage 'Reducer 3' is a cross product
Warning: Shuffle Join MERGEJOIN[39][tables = [$hdt$_0, $hdt$_2]] in Stage 'Reducer 3' is a cross product

Need to figure out why $hdt$_1 is omitted...

@dengzhhu653 dengzhhu653 Jul 31, 2021

Same as above: $hdt$_1 does not participate in Reducer 3 because it produces no output after the semi join (p_name IN (select p_name from part_null)).

@@ -246,7 +246,7 @@ POSTHOOK: Input: default@part_null
85768 almond antique chartreuse lavender yellow Manufacturer#1 Brand#12 LARGE BRUSHED STEEL 34 SM BAG 1753.76 refull
86428 almond aquamarine burnished black steel Manufacturer#1 Brand#12 STANDARD ANODIZED STEEL 28 WRAP BAG 1414.42 arefully
90681 almond antique chartreuse khaki white Manufacturer#3 Brand#31 MEDIUM BURNISHED TIN 17 SM CASE 1671.68 are slyly after the sl
Warning: Shuffle Join MERGEJOIN[57][tables = [$hdt$_0, $hdt$_1, $hdt$_2, $hdt$_3]] in Stage 'Reducer 4' is a cross product
Warning: Shuffle Join MERGEJOIN[57][tables = [$hdt$_0, $hdt$_2, $hdt$_3]] in Stage 'Reducer 4' is a cross product

$hdt$_1 is omitted because $hdt$_0 and $hdt$_1 are combined by a left semi join in Reducer 2, namely p_name IN (select p_name from part_null). $hdt$_1 produces no output after this join, so only $hdt$_0, $hdt$_2(distinct p_brand) and $hdt$_2(count p_name) take part in $hdt$_4.
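
To illustrate why the right-hand side of a left semi join contributes nothing downstream, here is a minimal Python sketch. It is not Hive code: the function `left_semi_join` and the sample rows are hypothetical, and only the table names `part` and `part_null` and the column `p_name` come from the query discussed above.

```python
def left_semi_join(left_rows, right_rows, key):
    """Keep each left row that has at least one match on `key` in right_rows.

    Only left-side columns survive; the right side acts purely as a filter.
    NULL (None) keys never match, mirroring SQL equality semantics.
    """
    right_keys = {row[key] for row in right_rows if row[key] is not None}
    return [dict(row) for row in left_rows if row[key] in right_keys]


# Hypothetical sample data; the column values are made up for illustration.
part = [
    {"p_name": "almond antique", "p_brand": "Brand#12"},
    {"p_name": "almond aquamarine", "p_brand": "Brand#31"},
]
part_null = [
    {"p_name": "almond antique", "p_size": 34},
    {"p_name": None, "p_size": 28},
]

# Models: ... FROM part WHERE p_name IN (SELECT p_name FROM part_null)
result = left_semi_join(part, part_null, "p_name")
print(result)
```

No column of `part_null` (such as `p_size`) appears in `result`, which is consistent with a later join stage having no reason to list that alias among its inputs.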

@@ -0,0 +1,305 @@
Warning: Shuffle Join MERGEJOIN[54][tables = [sq_1_notin_nullcheck]] in Stage 'Reducer 3' is a cross product

One of the parents of Reducer 3 is a union work (alias T). When CrossProductHandler analyzes the union work, work.getAllRootOperators() returns an empty set, so the inputs of t and lv are not added to the reduce sink info of the join. As a result, the warning message is missing some table aliases.

With MR, the warning message is: Warning: Shuffle Join JOIN[34][tables = [lv, t, sq_1_notin_nullcheck]] in Stage 'Stage-1:MAPRED' is a cross product.
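
The effect described in the two comments above can be modelled with a small toy example. This is a sketch under stated assumptions, not Hive's implementation: the `Work` class, the work names "Map 1" to "Map 3", and both helper functions are hypothetical. The only things taken from the thread are the aliases `t`, `lv` and `sq_1_notin_nullcheck`, and the observation that the union work reports no root operators.

```python
class Work:
    """Toy stand-in for a unit of work in a query plan (not a Hive class)."""

    def __init__(self, name, root_aliases=(), parents=()):
        self.name = name
        self.root_aliases = list(root_aliases)  # aliases of root operators
        self.parents = list(parents)            # upstream works


def direct_aliases(join_work):
    """Collect aliases from the immediate parents' root operators only."""
    aliases = set()
    for parent in join_work.parents:
        aliases.update(parent.root_aliases)
    return sorted(aliases)


def transitive_aliases(join_work):
    """Hypothetical variant: look through parents that have no roots."""
    aliases, seen = set(), set()
    stack = list(join_work.parents)
    while stack:
        work = stack.pop()
        if id(work) in seen:
            continue
        seen.add(id(work))
        if work.root_aliases:
            aliases.update(work.root_aliases)
        else:
            stack.extend(work.parents)
    return sorted(aliases)


map_t = Work("Map 1", ["t"])
map_lv = Work("Map 2", ["lv"])
union = Work("Union", [], [map_t, map_lv])       # reports no root operators
map_sq = Work("Map 3", ["sq_1_notin_nullcheck"])
reducer = Work("Reducer 3", [], [union, map_sq])

print(direct_aliases(reducer))
print(transitive_aliases(reducer))
```

In this model the direct lookup yields only `sq_1_notin_nullcheck`, matching the shortened warning, while the look-through variant yields all three aliases, matching the MR message. Whether a similar traversal would be appropriate inside CrossProductHandler is an open question, not something this PR establishes.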

@dengzhhu653

@kgyrtkirk @jcamachor @vineetgarg02, any thoughts here?
Thanks!

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Nov 23, 2021
@github-actions github-actions bot closed this Nov 30, 2021