Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-546] Add full outer join to RDD and DStream. #1395

Closed
wants to merge 4 commits into from

Conversation

staple
Copy link
Contributor

@staple staple commented Jul 13, 2014

leftOuterJoin and rightOuterJoin are already implemented. This patch adds fullOuterJoin.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@tdas
Copy link
Contributor

tdas commented Jul 30, 2014

Jenkins, this is ok to test.

@tdas
Copy link
Contributor

tdas commented Jul 30, 2014

Jenkins, test this please.

@tdas
Copy link
Contributor

tdas commented Jul 30, 2014

@pwendell This patch involves the core RDD API as well. Can you take a look as well?

@SparkQA
Copy link

SparkQA commented Jul 30, 2014

QA tests have started for PR 1395. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17469/consoleFull

* number of partitions.
*/
def fullOuterJoin[W](other: JavaPairDStream[K, W])
: JavaPairDStream[K, (Optional[V], Optional[W])] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not need to be on the next line. It fits in the previous line I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's slightly over the limit at 103 characters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaargh. My visual estimation failed. :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries :)

@tdas
Copy link
Contributor

tdas commented Jul 30, 2014

This looks pretty good to me. However, I dont see a python unit test. Can you add that?

@SparkQA
Copy link

SparkQA commented Jul 30, 2014

QA results for PR 1395:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17469/consoleFull

@staple
Copy link
Contributor Author

staple commented Jul 30, 2014

Thanks for taking a look. The example provided in the descriptive comment for rdd.py's fullOuterJoin implementation will be executed and checked as part of the python test suite. Any other concerns?

@tdas
Copy link
Contributor

tdas commented Jul 30, 2014

Oh yeah, right about the python tests. My bad. I will let @pwendell chime in about the Spark RDD api.

@staple
Copy link
Contributor Author

staple commented Jul 30, 2014

Cool, thanks!

@tdas
Copy link
Contributor

tdas commented Jul 30, 2014

Also adding @rxin since he reported the JIRA

@davies
Copy link
Contributor

davies commented Sep 2, 2014

+1, LGTM

@JoshRosen
Copy link
Contributor

Bump @tdas @pwendell. Does this look okay to you?

@tdas
Copy link
Contributor

tdas commented Sep 4, 2014

I am good to go with this. Need someone to sign off from the RDD and core
point of view.

On Wed, Sep 3, 2014 at 7:44 PM, Josh Rosen notifications@github.com wrote:

Bump @tdas https://github.com/tdas @pwendell
https://github.com/pwendell. Does this look okay to you?


Reply to this email directly or view it on GitHub
#1395 (comment).

@JoshRosen
Copy link
Contributor

It looks like this was last tested a really long time ago, so I'm concerned that it might fail some of the style checks. Therefore:

Jenkins, retest this please.

@JoshRosen
Copy link
Contributor

From a cursory look, the Java API part of this looks fine to me, since the type signatures of the new methods are the same as existing Java API methods.

@SparkQA
Copy link

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@pwendell
Copy link
Contributor

pwendell commented Sep 8, 2014

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Sep 8, 2014

QA tests have started for PR 1395 at commit 084b2d5.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 8, 2014

QA tests have finished for PR 1395 at commit 084b2d5.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor

davies commented Sep 8, 2014

@staple Could you fix the PEP8 style?

./python/pyspark/join.py:98:1: E302 expected 2 blank lines, found 1

@staple
Copy link
Contributor Author

staple commented Sep 8, 2014

Sure, I fixed the new python style issue.

@davies
Copy link
Contributor

davies commented Sep 8, 2014

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Sep 8, 2014

QA tests have started for PR 1395 at commit 1f5595c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 8, 2014

QA tests have finished for PR 1395 at commit 1f5595c.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 9, 2014

QA tests have started for PR 1395 at commit 1f5595c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 17, 2014

QA tests have started for PR 1395 at commit 1f5595c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 17, 2014

QA tests have finished for PR 1395 at commit 1f5595c.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Sep 23, 2014

@JoshRosen @pwendell If you dont have any objections on this addition to Spark core, I will merge this PR.

@pwendell
Copy link
Contributor

Go for it @tdas!

@asfgit asfgit closed this in 8ca4ecb Sep 25, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants