-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict size of automatically broadcasted tables #11667
Restrict size of automatically broadcasted tables #11667
Conversation
Observations:
Conclusions:
Additional steps for the future:
|
044c6c1
to
60582a6
Compare
@@ -149,6 +151,15 @@ public SystemSessionProperties( | |||
false, | |||
value -> JoinDistributionType.valueOf(((String) value).toUpperCase()), | |||
JoinDistributionType::name), | |||
new PropertyMetadata<>( | |||
JOIN_MAX_BROADCAST_TABLE_SIZE, | |||
"Maximum size of a table that can be broadcast for JOIN (0 disables the limit). The most pessimistic size estimate is taken when doing comparison.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most pessimistic size estimate is taken when doing comparison.
- This doesn't seem to be the case, is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it nullable? So if the property is not set the limit is disabled. 0 rather implies that the broadcast should never be chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 disables the limit
It should be rather +INF
to disable limit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most pessimistic size estimate is taken when doing comparison.
I would prefer to not to go to such details.
Maximum size
Please update to Maximum estimated size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sopel39 Karol, I like this change. It is quite straightforward. When you refer to a similar option in Spark, are you thinking about spark.sql.autoBroadcastJoinThreshold
?
https://spark.apache.org/docs/latest/sql-programming-guide.html
Configures the maximum size in bytes for a table that will be broadcast to all worker nodes when performing a join. By setting this value to -1 broadcasting can be disabled. Note that currently statistics are only supported for Hive Metastore tables where the command ANALYZE TABLE COMPUTE STATISTICS noscan has been run.
@@ -149,6 +151,15 @@ public SystemSessionProperties( | |||
false, | |||
value -> JoinDistributionType.valueOf(((String) value).toUpperCase()), | |||
JoinDistributionType::name), | |||
new PropertyMetadata<>( | |||
JOIN_MAX_BROADCAST_TABLE_SIZE, | |||
"Maximum size of a table that can be broadcast for JOIN (0 disables the limit). The most pessimistic size estimate is taken when doing comparison.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it nullable? So if the property is not set the limit is disabled. 0 rather implies that the broadcast should never be chosen.
} | ||
|
||
@Config("join-max-broadcast-table-size") | ||
@MinDataSize("0MB") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is extra. DataSize cannot be negative.
double buildSideSizeInBytes = buildSideStatsEstimate.getOutputSizeInBytes(buildSide.getOutputSymbols(), context.getSymbolAllocator().getTypes()); | ||
JoinDistributionType joinDistributionType = getJoinDistributionType(context.getSession()); | ||
Optional<DataSize> joinMaxBroadcastTableSize = getJoinMaxBroadcastTableSize(context.getSession()); | ||
return joinDistributionType.canReplicate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shortcircuit this check
JoinDistributionType joinDistributionType = getJoinDistributionType(context.getSession()); | ||
Optional<DataSize> joinMaxBroadcastTableSize = getJoinMaxBroadcastTableSize(context.getSession()); | ||
return joinDistributionType.canReplicate() | ||
&& (!joinMaxBroadcastTableSize.isPresent() || buildSideSizeInBytes <= joinMaxBroadcastTableSize.get().toBytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shortcircuit joinMaxBroadcastTableSize.isPresent()
before doing anything else
p.join( | ||
INNER, | ||
p.values(new PlanNodeId("valuesA"), aRows, p.symbol("A1", BIGINT)), | ||
p.filter(TRUE_LITERAL, p.values(new PlanNodeId("valuesB"), bRows, p.symbol("B1", BIGINT))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just p.values
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's leftover from upper boundary stats calculator.
p.join( | ||
INNER, | ||
p.values(new PlanNodeId("valuesA"), aRows, p.symbol("A1", BIGINT)), | ||
p.filter(TRUE_LITERAL, p.values(new PlanNodeId("valuesB"), bRows, p.symbol("B1", BIGINT))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto about filter
p.join( | ||
INNER, | ||
p.values(new PlanNodeId("valuesA"), aRows, p.symbol("A1")), | ||
p.filter(TRUE_LITERAL, p.values(new PlanNodeId("valuesB"), bRows, p.symbol("B1"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
p.join( | ||
INNER, | ||
p.values(new PlanNodeId("valuesA"), aRows, p.symbol("A1")), | ||
p.filter(TRUE_LITERAL, p.values(new PlanNodeId("valuesB"), bRows, p.symbol("B1"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
% comments
@@ -149,6 +151,15 @@ public SystemSessionProperties( | |||
false, | |||
value -> JoinDistributionType.valueOf(((String) value).toUpperCase()), | |||
JoinDistributionType::name), | |||
new PropertyMetadata<>( | |||
JOIN_MAX_BROADCAST_TABLE_SIZE, | |||
"Maximum size of a table that can be broadcast for JOIN (0 disables the limit). The most pessimistic size estimate is taken when doing comparison.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 disables the limit
It should be rather +INF
to disable limit
presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java
Show resolved
Hide resolved
@@ -149,6 +151,15 @@ public SystemSessionProperties( | |||
false, | |||
value -> JoinDistributionType.valueOf(((String) value).toUpperCase()), | |||
JoinDistributionType::name), | |||
new PropertyMetadata<>( | |||
JOIN_MAX_BROADCAST_TABLE_SIZE, | |||
"Maximum size of a table that can be broadcast for JOIN (0 disables the limit). The most pessimistic size estimate is taken when doing comparison.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most pessimistic size estimate is taken when doing comparison.
I would prefer to not to go to such details.
Maximum size
Please update to Maximum estimated size
private PlanNode getCostBasedJoin(JoinNode joinNode, Context context) | ||
{ | ||
CostProvider costProvider = context.getCostProvider(); | ||
List<PlanNodeWithCost> possibleJoinNodes = new ArrayList<>(); | ||
|
||
if (!mustPartition(joinNode)) { | ||
if (!mustPartition(joinNode) && canReplicate(joinNode, context)) { | ||
possibleJoinNodes.add(getJoinNodeWithCost(costProvider, joinNode.withDistributionType(REPLICATED))); | ||
} | ||
if (!mustReplicate(joinNode, context)) { | ||
possibleJoinNodes.add(getJoinNodeWithCost(costProvider, joinNode.withDistributionType(PARTITIONED))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above two ifs
can be extracted as method and reused below
60582a6
to
634bfb9
Compare
Property defines maximum estimated size of a table that can be broadcast for JOIN. This is used to prevent CBO from broadcasting large tables which could saturate the cluster. Additionally, performance penality is small if CBO incorrectly chooses to broadcast small table. Related PR: prestodb#11667
634bfb9
to
0cd8d78
Compare
@mbasmanova that's the one ( |
Property defines maximum estimated size of a table that can be broadcast for JOIN. This is used to prevent CBO from broadcasting large tables which could saturate the cluster. Additionally, performance penality is small if CBO incorrectly chooses to broadcast small table. Related PR: #11667
Add join-max-broadcast-table-size property
Property defines maximum size of a table
that can be broadcast for JOIN (0 disables the limit).
Supersedes #11570