Partition Pruning - SQL Part [HZ-2515] #24813
This reverts commit cdc1f5e.
…ast#22509)"" This reverts commit 3c180fd.
hazelcast-sql/src/main/java/com/hazelcast/jet/sql/impl/PlanExecutor.java
hazelcast-sql/src/main/java/com/hazelcast/jet/sql/impl/SqlPlanImpl.java
...ain/java/com/hazelcast/jet/sql/impl/opt/prunability/PartitionStrategyConditionExtractor.java
...ast-sql/src/main/java/com/hazelcast/jet/sql/impl/opt/metadata/HazelcastRelMdPrunability.java
Well, I have nothing to add or fix for the MVP version, so I am approving this.
hazelcast/src/main/java/com/hazelcast/jet/impl/execution/init/ExecutionPlanBuilder.java
}
allPartitions = partitionAssignment.values().stream()
        .flatMapToInt(a -> Arrays.stream(a))
        .sorted()
@Fly-Style sorted() increases complexity from O(N) to O(N log N), where N is the number of partitions. Any comments on that? TDD update? Big clusters have lots of partitions, and this change impacts every Jet job.
This implementation was quick and simple for the demo, but an important assumption also changes here: the allPartitions javadoc has to be updated. Also, localPartitions may no longer contain all local partitions. What is the impact of that?
> sorted increases complexity from O(N) to O(NlogN) N - number of partitions. Any comments on that?
You wrote this, so I can't comment on it much. I will remove sorted().
> Also localPartitions no longer may contain all local partitions.
Nothing goes wrong here from the "contain all local partitions" perspective.
> You wrote this
This was a quick fix for the demo, meant to be reviewed by you to check that it is correct and makes sense. Since you did not change it, I assumed that you had analyzed it and determined that it is correct. Is that right?
> I will remove sorted.
You restored it (BTW, we assume that some arrays are sorted even though that is not documented as guaranteed), so my question is still valid.
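The point about relying on an undocumented ordering can be made concrete with a cheap guard. The following is only a sketch, not code from this PR: the class and method names are hypothetical. It checks sortedness in a single O(N) pass and pays the O(N log N) sort only when the input is actually out of order.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Hazelcast code base.
final class SortedCheck {
    // Returns true if the array is in non-decreasing order; one O(N) pass.
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the input itself when it is already sorted,
    // otherwise a sorted copy (the input is left untouched).
    static int[] ensureSorted(int[] a) {
        if (isSorted(a)) {
            return a;
        }
        int[] copy = a.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```

With a guard like this, an already sorted partition array costs one linear scan, and the assumption is checked rather than silently relied upon.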
It seems to me that you avoided analyzing the impact or other solutions, but at least it no longer increases runtime for non-pruned jobs with large partition counts.
I will close this comment when this is described in the TDD.
> I will close this comment when this is described in TDD
It is not related to the design; it's an implementation detail. Why is it worth describing in the TDD?
Increased complexity is not a small deal (everything above O(N) should be treated with care). When using member pruning with thousands of partitions, this could be a place that limits performance. It should at least be mentioned that some code has worse-than-linear runtime when partition pruning is used.
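To illustrate the complexity concern in this thread, here is a minimal sketch of one possible alternative, which is not something the PR is known to implement. Because partition IDs fall in a bounded range, a `BitSet` can produce a sorted array without a comparison sort. The class name, the `String` member keys, the `partitionCount` parameter, and the trailing `toArray()` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Map;

// Hypothetical helper; member keys are Strings only for the sake of the example.
final class PartitionFlatten {
    // Mirrors the reviewed snippet: flatten, then sort. O(N log N).
    static int[] flattenSorted(Map<String, int[]> assignment) {
        return assignment.values().stream()
                .flatMapToInt(Arrays::stream)
                .sorted()
                .toArray();
    }

    // Alternative: mark every assigned partition ID in a bit set, then read
    // the set bits back in ascending order. O(N + partitionCount).
    // Assumes each partition ID appears at most once across all members.
    static int[] flattenViaBitSet(Map<String, int[]> assignment, int partitionCount) {
        BitSet seen = new BitSet(partitionCount);
        for (int[] ids : assignment.values()) {
            for (int id : ids) {
                seen.set(id);
            }
        }
        return seen.stream().toArray();
    }
}
```

Both methods return the same array when the IDs are unique. The second one trades the sort for a pass over a bit set sized by the partition count, which stays linear even for clusters with thousands of partitions.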
Signed-off-by: Sasha Syrotenko <oleksandr.syrotenko@hazelcast.com>
…b for interactive queries
…d local member partition key
@@ -108,6 +108,9 @@ public class CreateTopLevelDagVisitor extends CreateDagVisitorBase<Vertex> {

    private final DagBuildContextImpl dagBuildContext;

    private Integer requiredRootPartitionId;
    private Object coordinatorPartitioningKey;
@Fly-Style requiredRootPartitionId and coordinatorPartitioningKey will be embedded in the DAG, so they will be part of the cached plan. However, when a partition migration happens, they may no longer belong to the local member, and the job will crash.
I am afraid that we have to determine this dynamically; we cannot give up SQL plan caching.
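The concern above can be sketched as follows. This is an illustration under stated assumptions, not the fix adopted in Hazelcast: `PartitionTable`, `CachedPlan`, and all method names are hypothetical, and the sketch assumes the root partition only needs to be some partition currently owned by the local member. The cached plan stores nothing member-specific, and the partition is resolved on every execution against the current partition table.

```java
import java.util.Map;
import java.util.TreeMap;

final class DynamicRootPartition {
    /** Hypothetical view of the current partition ownership; not a Hazelcast API. */
    static final class PartitionTable {
        private final Map<Integer, String> ownerByPartition = new TreeMap<>();

        // Records (or changes, on migration) the owner of a partition.
        void assign(int partitionId, String member) {
            ownerByPartition.put(partitionId, member);
        }

        // Returns the lowest partition ID owned by the member, or -1 if it owns none.
        int anyPartitionOwnedBy(String member) {
            for (Map.Entry<Integer, String> e : ownerByPartition.entrySet()) {
                if (e.getValue().equals(member)) {
                    return e.getKey();
                }
            }
            return -1;
        }
    }

    /** Hypothetical cached plan that holds no partition ID of its own. */
    static final class CachedPlan {
        final String query;

        CachedPlan(String query) {
            this.query = query;
        }

        // Resolved at execution time, so a migration between two executions
        // of the same cached plan is picked up automatically.
        int resolveRootPartition(PartitionTable table, String localMember) {
            int id = table.anyPartitionOwnedBy(localMember);
            if (id < 0) {
                throw new IllegalStateException("No partition owned by " + localMember);
            }
            return id;
        }
    }
}
```

The design choice is to keep the plan cacheable by moving the member-dependent value out of the plan and into the execution step, at the cost of one lookup per execution.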
Approved in order to unblock other PRs; however, the problem with plan caching and migrations has to be fixed.
Fixes HZ-2515