[SPARK-11313][SQL] implement cogroup on DataSets #9279
@@ -513,3 +513,16 @@ case class MapGroups[K, T, U](
  override def missingInput: AttributeSet = AttributeSet.empty
}

case class CoGroup2[K, T, U, V](
Consider adding a factory like in the other cases to simplify implicit passing.
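A minimal sketch of what such a factory could look like. It is not the code from this PR: `Enc` is a hypothetical stand-in for an encoder type class, the operator here is a simplified `CoGroup` with made-up field names, and reading `T`/`U` as the two input element types and `V` as the result type is an assumption. The point is only that a companion `apply` with context bounds lets callers pass the function alone while the encoders are picked up from implicit scope.

```scala
object FactorySketch {
  // Hypothetical stand-in for an encoder type class; this is NOT Spark's Encoder.
  trait Enc[T] { def name: String }

  object Enc {
    implicit val intEnc: Enc[Int] = new Enc[Int] { val name = "int" }
    implicit val stringEnc: Enc[String] = new Enc[String] { val name = "string" }
  }

  // Simplified operator: the encoders are ordinary constructor fields.
  case class CoGroup[K, T, U, V](
      func: (K, Iterator[T], Iterator[U]) => Iterator[V],
      kEnc: Enc[K],
      tEnc: Enc[T],
      uEnc: Enc[U],
      vEnc: Enc[V])

  object CoGroup {
    // Factory: the caller supplies only the function; the context bounds
    // make the compiler look up the four encoders implicitly.
    def apply[K: Enc, T: Enc, U: Enc, V: Enc](
        func: (K, Iterator[T], Iterator[U]) => Iterator[V]): CoGroup[K, T, U, V] =
      new CoGroup[K, T, U, V](
        func,
        implicitly[Enc[K]],
        implicitly[Enc[T]],
        implicitly[Enc[U]],
        implicitly[Enc[V]])
  }

  // Example function: summarizes each key by the sizes of its two groups.
  val summarize: (Int, Iterator[String], Iterator[String]) => Iterator[String] =
    (k, left, right) => Iterator(s"$k:${left.size}:${right.size}")

  // No encoders are passed explicitly here.
  val op: CoGroup[Int, String, String, String] = CoGroup(summarize)

  def main(args: Array[String]): Unit =
    println(Seq(op.kEnc.name, op.tEnc.name, op.uEnc.name, op.vEnc.name).mkString(", "))
}
```

Without the factory, every construction site would have to thread the four encoders through by hand.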
Also I'd probably just call it cogroup as we'll probably use a single variadic operator if we decide to do more than 2.
Looking pretty good so far! Let's wait to add cogroup for more than 2 datasets.
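For readers unfamiliar with the operation under review, here is a rough illustration of two-input cogroup semantics on plain Scala collections. It assumes the usual meaning of cogroup (the function is called once per key that appears on either side, with an iterator over each side's group) and does not use Spark or the API added in this PR; the `cogroup` helper and its parameter names are hypothetical.

```scala
object CoGroupSketch {
  // Groups both inputs by key, then calls f once for every key that occurs
  // in either input. A side with no elements for a key gets an empty iterator.
  def cogroup[K, T, U, V](left: Seq[T], right: Seq[U])(leftKey: T => K, rightKey: U => K)(
      f: (K, Iterator[T], Iterator[U]) => Iterator[V]): Seq[V] = {
    val leftGroups = left.groupBy(leftKey)
    val rightGroups = right.groupBy(rightKey)
    // Keys in order of first appearance, left input first.
    val keys = (left.map(leftKey) ++ right.map(rightKey)).distinct
    keys.flatMap { k =>
      f(
        k,
        leftGroups.getOrElse(k, Seq.empty).iterator,
        rightGroups.getOrElse(k, Seq.empty).iterator)
    }
  }

  val left: Seq[(Int, String)] = Seq(1 -> "a", 2 -> "b", 1 -> "c")
  val right: Seq[(Int, String)] = Seq(1 -> "x", 3 -> "y")

  // One output string per key: key, left group size, right group size.
  val result: Seq[String] =
    cogroup(left, right)(_._1, _._1) { (k, ls, rs) =>
      Iterator(s"$k:${ls.size}:${rs.size}")
    }

  def main(args: Array[String]): Unit =
    result.foreach(println) // prints 1:2:1, 2:1:0, 3:0:1 on separate lines
}
```

Key 2 exists only on the left and key 3 only on the right, so each still produces one call with an empty iterator for the missing side.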
Test build #44351 has finished for PR 9279 at commit
1c7f4c0 to 2f1cef0
Test build #44424 has finished for PR 9279 at commit
Test build #44425 has finished for PR 9279 at commit
Test build #44426 has finished for PR 9279 at commit
cc @marmbrus
retest this please.
Test build #44442 has finished for PR 9279 at commit
will open it again when we need to support cogroup on more than 2 datasets.
A simpler version of apache/spark#9279 that supports only 2 datasets.
Author: Wenchen Fan <wenchen@databricks.com>
Closes #9324 from cloud-fan/cogroup2.