SPARK-2895: Add mapPartitionsWithContext related support on Spark Java API. #2194
Can one of the admins verify this patch? |
*/
@DeveloperApi
def mapPartitionsWithContext[R](
f: JFunction2[TaskContext, java.util.Iterator[T], java.util.Iterator[R]],
Wrong indentation.
It might be good to add a test suite for this in |
@ChengXiangLi could you describe a bit more what the context is being used for? This is an unstable API so I'm a bit hesitant to expose this in its current form. It would be better to look at exactly what Hive needs from this interface and see if we can come up with a stable interface for it. |
Hi @pwendell, for several Hive features, such as HIVE-7843 and HIVE-7627, Hive uses the partition id to distinguish the files that different tasks write to HDFS. No other dependency on the task context has been found so far.
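To make the use case above concrete, here is a minimal Scala sketch of deriving a distinct output file name from a partition id. The `PartitionFiles` object, the `outputFileFor` and `assignFiles` helpers, and the `part-NNNNN` naming pattern are hypothetical and chosen only for illustration; they are not taken from Hive or Spark. In Spark itself the index would presumably come from something like `RDD.mapPartitionsWithIndex` rather than from a local `zipWithIndex`.

```scala
object PartitionFiles {
  // Hypothetical naming scheme: one file per partition id, zero-padded to five digits.
  def outputFileFor(dir: String, partitionId: Int): String =
    f"$dir/part-$partitionId%05d"

  // Stand-in for a partitioned dataset: each inner Seq plays the role of one partition.
  // Pairs every record with the file that its partition's task would write to.
  def assignFiles[T](dir: String, partitions: Seq[Seq[T]]): Seq[(String, T)] =
    partitions.zipWithIndex.flatMap { case (records, partitionId) =>
      val file = outputFileFor(dir, partitionId)
      records.map(record => (file, record))
    }
}
```

Because the file name depends only on the partition id, two tasks working on different partitions never target the same file, which is the only property this sketch is meant to show.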
* should be `false` unless this is a pair RDD and the input function doesn't modify the keys.
*/
@DeveloperApi
def mapPartitionsToDoubleWithContext(
can we remove this one? I don't think it is needed for Hive, is it?
@JoshRosen can you also take a look at this. It is pretty short, but it is about the java api. |
Jenkins, ok to test. |
This looks fine to me, especially since it adds Java tests. |
QA tests have started for PR 2194 at commit
QA tests have finished for PR 2194 at commit
We hit the binary incompatibility error here. I have already annotated the newly added methods as DeveloperApi; am I missing something?
@ScrapCodes, doesn't the MiMa check exclude developer APIs?
I am looking at this. The MiMa check should have excluded those methods.
Can one of the admins verify this patch? |
Jenkins, retest this please. |
QA tests have started for PR 2194 at commit
QA tests have finished for PR 2194 at commit
@DeveloperApi
def mapPartitionsWithContext[R](
f: JFunction2[TaskContext, java.util.Iterator[T], java.util.Iterator[R]],
preservesPartitioning: Boolean = false): JavaRDD[R] = {
you can't have default argument values in Java
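One common way to address this is to replace the default argument with two explicit overloads. The sketch below uses a hypothetical `PartitionedSeq` class and a hypothetical `mapPartitionsWithId` method as stand-ins for the Java-facing wrapper; this is not Spark's `JavaRDD`, and the `preservesPartitioning` flag is accepted but ignored because the toy class has no partitioner.

```scala
// Hypothetical stand-in for a Java-facing wrapper; each inner Seq is one partition.
class PartitionedSeq[T](val partitions: Seq[Seq[T]]) {

  // Overload without the flag. Java callers cannot omit a Scala default argument,
  // so the default value is supplied by delegation instead.
  def mapPartitionsWithId[R](f: (Int, Iterator[T]) => Iterator[R]): PartitionedSeq[R] =
    mapPartitionsWithId(f, false)

  // Full overload. The flag is ignored in this sketch (there is no partitioner to preserve).
  def mapPartitionsWithId[R](
      f: (Int, Iterator[T]) => Iterator[R],
      preservesPartitioning: Boolean): PartitionedSeq[R] =
    new PartitionedSeq[R](partitions.zipWithIndex.map { case (part, id) =>
      f(id, part.iterator).toList
    })
}
```

With this shape, both the one-argument and the two-argument form are ordinary methods in the compiled class, so neither depends on Scala's default-argument mechanism.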
We still hit the API incompatibility error because #2285 is not finished yet.
Jenkins, test this please. |
QA tests have started for PR 2194 at commit
I proposed a slightly different approach to this here: This would remove the need for special methods xWithContext. |
QA tests have finished for PR 2194 at commit
I pushed a commit to close this one in favor of #2425 |