[SPARK-546] Add full outer join to RDD and DStream. #1395
Can one of the admins verify this patch?
Jenkins, this is ok to test.
Jenkins, test this please.
@pwendell This patch involves the core RDD API as well. Can you take a look as well?
QA tests have started for PR 1395. This patch merges cleanly.
   * number of partitions.
   */
  def fullOuterJoin[W](other: JavaPairDStream[K, W])
  : JavaPairDStream[K, (Optional[V], Optional[W])] = {
This does not need to be on the next line. It fits in the previous line I think.
It's slightly over the limit at 103 characters
Aaargh. My visual estimation failed. :(
No worries :)
This looks pretty good to me. However, I don't see a Python unit test. Can you add that?
QA results for PR 1395:
Thanks for taking a look. The example provided in the descriptive comment for rdd.py's fullOuterJoin implementation will be executed and checked as part of the Python test suite. Any other concerns?
Oh yeah, right about the Python tests. My bad. I will let @pwendell chime in about the Spark RDD API.
Cool, thanks!
Also adding @rxin since he reported the JIRA
+1, LGTM
I am good to go with this. Need someone to sign off from the RDD and core
On Wed, Sep 3, 2014 at 7:44 PM, Josh Rosen notifications@github.com wrote:
It looks like this was last tested a really long time ago, so I'm concerned that it might fail some of the style checks. Therefore: Jenkins, retest this please.
From a cursory look, the Java API part of this looks fine to me, since the type signatures of the new methods are the same as existing Java API methods.
Can one of the admins verify this patch?
Jenkins, test this please.
QA tests have started for PR 1395 at commit
QA tests have finished for PR 1395 at commit
@staple Could you fix the PEP8 style? ./python/pyspark/join.py:98:1: E302 expected 2 blank lines, found 1
Sure, I fixed the new Python style issue.
Jenkins, retest this please.
QA tests have started for PR 1395 at commit
QA tests have finished for PR 1395 at commit
QA tests have started for PR 1395 at commit
QA tests have started for PR 1395 at commit
QA tests have finished for PR 1395 at commit
@JoshRosen @pwendell If you don't have any objections to this addition to Spark core, I will merge this PR.
Go for it @tdas!
leftOuterJoin and rightOuterJoin are already implemented. This patch adds fullOuterJoin.
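For readers unfamiliar with the semantics, here is a minimal plain-Python sketch of what a full outer join over key-value pairs produces. It is not the code from this patch and does not use Spark: the helper name `full_outer_join` and the sample data are assumptions made for illustration, and `None` stands in for the missing side (the Java signature discussed above uses `Optional` instead).

```python
from collections import defaultdict


def full_outer_join(left, right):
    """Illustrative full outer join over lists of (key, value) pairs.

    Hypothetical helper, not part of Spark. Keys are assumed to be
    hashable and sortable so the output order is deterministic.
    """
    left_by_key = defaultdict(list)
    right_by_key = defaultdict(list)
    for k, v in left:
        left_by_key[k].append(v)
    for k, w in right:
        right_by_key[k].append(w)

    result = []
    # Every key from either side appears in the output.
    for k in sorted(set(left_by_key) | set(right_by_key)):
        # A side with no values for this key contributes a single None.
        vs = left_by_key.get(k) or [None]
        ws = right_by_key.get(k) or [None]
        result.extend((k, (v, w)) for v in vs for w in ws)
    return result


x = [("a", 1), ("b", 4)]
y = [("a", 2), ("c", 8)]
joined = full_outer_join(x, y)
print(joined)
# [('a', (1, 2)), ('b', (4, None)), ('c', (None, 8))]
```

Compared with leftOuterJoin, which would drop key "c", and rightOuterJoin, which would drop key "b", the full outer join keeps both. One caveat of this sketch: using `None` as the placeholder makes a missing side indistinguishable from a real `None` value.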