-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-2106] [runtime] add Left-, Right- and Full Outer Join drivers #1052
Conversation
Hi, Thanks for your contribution. I'll review this and leave comments. |
Thanks for your contribution @r-pogalz. Could you please add some more details about the implementation to the PR description. |
final TypeComparator<IT2> comparator2 = this.taskContext.getDriverComparator(1); | ||
|
||
final TypePairComparatorFactory<IT1, IT2> pairComparatorFactory = config.getPairComparatorFactory( | ||
this.taskContext.getUserCodeClassLoader()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line break
Really good work @r-pogalz. I had only some minor comments concerning style and test cases. I like your approach to split the implementation of FLINK-687 into multiple parts. This makes it far easier to review. Concerning the description of FLINK-2106, you haven't integrated the outer sort merge join into the optimizer and the API, yet. I guess this will happen as a next step. Maybe you can update the description of FLINK-2106 accordingly. Other than that, the PR looks good to me :-) |
Hi @tillrohrmann, thanks for your review and comments. I will work through it soon. Regarding the ticket FLINK-2106, @fhueske suggested us to create multiple PRs for runtime, optimizer and API integration. I guess it is an unusual way, but is it possible to have more than one PR for an issue? If not, I will change the description of FLINK-2106 as you said and create two more tasks for the optimizer and API part. |
Hi @r-pogalz, usually you only have one PR for a ticket. Thus, it would be good to create two new sub-tasks for the API and the optimizer integration. |
@r-pogalz I would implement API and optimizer together in one issue. IMO it is easier to start from the API and then work your way through API to optimizer and finally use your runtime code. However, you cannot add the API without adding the optimizer part. So, I would do both in one issue. |
final double fractionAvailableMemory = config.getRelativeMemoryDriver(); | ||
final int numPages = memoryManager.computeNumberOfPages(fractionAvailableMemory); | ||
|
||
// test minimum memory requirements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated comment
a1331e3
to
46f825e
Compare
Hi @tillrohrmann, @fhueske and @chiwanpark, |
addInput(new UniformIntTupleGenerator(keyCnt2, valCnt2, true), this.serializer); | ||
|
||
testDriver(testTask, MockFailingJoinStub.class); | ||
Assert.fail("Driver did not forward Exception."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove this assertion.
The test will fail if the test method does not throw an ExpectedTestException
.
46f825e
to
0cfae3a
Compare
Looks good to merge. 👍 |
I just saw that the OuterJoin runtime classes/tests cause a lot of warnings when compiling. Manyly raw types, unchecked casts. Would be good to fix those. In general, we strive for as little warnings as possible. Generic warnings should be fixed or suppressed (if impossible to fix). Otherwise they crowd the warnings output so that the actual critical warnings get lost. |
0cfae3a
to
9e76ec9
Compare
Thanks for you comment @StephanEwen, good to know that. I tried to minimize the occurrences of warnings where possible by fixing or suppressing it for this issue. Should I also fix it for the previous issue (runtime OuterJoin operators and tests) and commit it together with this one? |
Yeah, adding warnings fix to this PR is good. |
…n iterators and tests
29f95ec
to
9064e2f
Compare
The warnings fix commit is ready for review. |
I did not go through all details, but from what I read it was good. Go ahead from my side... |
Will merge this now |
…INNER_MERGE This closes apache#1052
This PR includes the implementation of 3 drivers for the specific outer joins and adapts the
DriverStrategy
for the merge based outer joins.To make the outer join operators implemented in ticket FLINK-2105 visible to the Java and Scala API, we have started with the integration of the merge based outer join operators bottom up. To reduce duplicated code we implemented an
AbstractOuterJoinDriver
which is responsible for most parts of the drivers setup and running the operators. The specific outer join driver implementations (LeftOuterJoinDriver
,RightOuterJoinDriver
,FullOuterJoinDriver
) set the corresponding Outer Join Iterators based on theExecutionConfig
. Therefore, we had to adapt theDriverStrategy
adding three additional enum fields (LEFT_OUTER_MERGE
,RIGHT_OUTER_MERGE
,FULL_OUTER_MERGE
).Currently, we only support the outer join on a merged based strategy, but hash based will follow (FLINK-2107).
Furthermore,
DriverStrategy.MERGE
has been renamed toDriverStrategy.INNER_MERGE
as requested by @fhueske.Next steps of this issue will be the optimizer and Java/Scala API parts.