
SPARK-2636: Expose job ID in JobWaiter API #2176

Closed
wants to merge 12 commits into from

Conversation

lirui-apache

This PR adds the async actions to the Java API. Users can call these async actions to get a FutureAction and use JobWaiter (for SimpleFutureAction) to retrieve the job ID.

@SparkQA

SparkQA commented Aug 28, 2014

QA tests have started for PR 2176 at commit fbf5744.

  • This patch merges cleanly.

@rxin
Contributor

rxin commented Aug 28, 2014

@lirui-intel - JobWaiter is an internal API that was never designed to be public. I would not expose it simply because you need the job id. There are lots of ways to get the job id ... e.g. you can add an interface to the future to get the job id.

Also this doesn't really make sense. Even with your change, JobWaiter is still private[spark], and only SimpleFutureAction contains a public field to it ...

@lirui-apache
Author

Hi @rxin, thanks for the review! I can add an interface to SimpleFutureAction to get the job ID if we shouldn't expose JobWaiter to users.
Hive on Spark currently only uses foreach to submit the job; we can change it to foreachAsync, which just returns a SimpleFutureAction.

I agree this doesn't seem to be a perfect way to solve the problem and we still have ComplexFutureAction to worry about. So please let me know if you have a better idea in mind or if you think we shouldn't expose the job ID at all.
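The workaround under discussion — switching from the blocking foreach to foreachAsync and reading the job ID off the returned SimpleFutureAction — would look roughly like this. This is a sketch against the Spark 1.x Scala API; the jobId accessor is exactly what this PR proposes, so it only exists after the patch:

```scala
import org.apache.spark.{SparkConf, SparkContext}
// In Spark 1.x, foreachAsync comes from AsyncRDDActions via this implicit.
import org.apache.spark.SparkContext.rddToAsyncRDDActions

val sc  = new SparkContext(new SparkConf().setAppName("job-id-demo").setMaster("local"))
val rdd = sc.parallelize(1 to 100)

// Instead of the blocking foreach, submit asynchronously and keep the future.
val future = rdd.foreachAsync(_ => ())

// The accessor proposed in this PR: the ID of the job backing this action,
// which a caller like Hive on Spark can use to track or cancel the job.
val id: Int = future.jobId

future.get() // block until the job completes, if desired
sc.stop()
```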

@rxin
Contributor

rxin commented Aug 28, 2014

If foreachAsync is the only one you need right now, why don't you just add foreachAsync (and remove the rest), and add jobId to SimpleFutureAction?

@lirui-apache
Author

I thought these async actions were missing in the Java API, so I added all of them from AsyncRDDActions. But sure, let me just add foreachAsync.

@SparkQA

SparkQA commented Aug 28, 2014

QA tests have finished for PR 2176 at commit fbf5744.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Aug 28, 2014

Yea let's not add all of them since they are highly experimental. I'm not even sure if those are the APIs we want to commit to in the long run.

@lirui-apache
Author

@rxin I've updated the patch.
Yes, I see these APIs are experimental. We can make Hive use it as a workaround and change it when we have a better solution.

@lirui-apache changed the title SPARK-2636: no where to get job identifier while submit spark job through spark API SPARK-2636: Expose job ID in JobWaiter API Aug 29, 2014
/**
* Get the corresponding job Id for this action
*/
def jobId(): Int = {
Contributor

maybe without the parenthesis, and lower case id?

/** Get the corresponding job id for this action. */
def jobId = jobWaiter.jobId

@rxin
Contributor

rxin commented Aug 29, 2014

Jenkins, ok to test.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have started for PR 2176 at commit af4f5d9.

  • This patch merges cleanly.

@lirui-apache
Author

Thanks @rxin . Updated the patch accordingly.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have started for PR 2176 at commit 3fa39f7.

  • This patch merges cleanly.

/**
* Get the corresponding job Id for this action
*/
def id: Int = {
Contributor

hey I meant jobId, not id ...

I was referring to the lowercase id in your javadoc... take a look at my previous comment

Author

Oops, sorry about that.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have finished for PR 2176 at commit af4f5d9.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have started for PR 2176 at commit 0ca320d.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have finished for PR 2176 at commit 3fa39f7.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Aug 29, 2014

You will need to add mima excludes.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have finished for PR 2176 at commit 0ca320d.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lirui-apache
Author

Hi @rxin, could you be more specific as to how to do that?

@rxin
Contributor

rxin commented Aug 29, 2014

Take a look at the failure message here: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19456/consoleFull

And look at the files in project/Mima*
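For context, those excludes live in project/MimaExcludes.scala, and the rule to add comes straight from the failure MiMa prints in the Jenkins log. A sketch of what such an exclude looks like — the problem class and the member name here are placeholders; the actual rule must be copied from the log:

```scala
// project/MimaExcludes.scala (sketch): silence a binary-compatibility
// warning for a member that was intentionally added or changed.
import com.typesafe.tools.mima.core._

// Hypothetical exclude; copy the real rule from the MiMa failure message.
ProblemFilters.exclude[MissingMethodProblem](
  "org.apache.spark.SimpleFutureAction.jobId")
```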

@lirui-apache
Author

Thanks @rxin . I updated the patch.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have started for PR 2176 at commit e2e01d5.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Aug 29, 2014

QA tests have finished for PR 2176 at commit e2e01d5.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* @param f the function to apply to all the elements of the RDD
* @return a FutureAction for the action
*/
def foreachAsync(f: VoidFunction[T]): FutureAction[Unit] = {
Contributor

Can we mark this as experimental in Java?

I don't know if we have a way to mark stuff as experimental, but maybe put this in the javadoc:

* THIS IS AN EXPERIMENTAL API THAT MIGHT CHANGE IN THE FUTURE.

@pwendell thoughts?

Contributor

In Scala-land the javadocs start like this:

 * :: Experimental ::

Perhaps just prepend that to the first line in the javadoc (so the summary contains both the short description and that it's experimental).

Contributor

@rxin @lirui-intel Make it consistent with the other code where we do this. There is no need to have a big sentence with caps. The ::Experimental:: string along with the @Experimental annotation is what we do. Look in other Java code that has this.
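The convention @pwendell describes — the ::Experimental:: marker in the doc comment plus the @Experimental annotation — looks like this elsewhere in the codebase. A sketch only: the doc wording and the delegating body shown here are assumptions about how the Java wrapper forwards to the Scala API, not the exact lines merged:

```scala
import org.apache.spark.annotation.Experimental

/**
 * :: Experimental ::
 * The asynchronous version of the `foreach` action, which applies a function f
 * to all the elements of this RDD.
 */
@Experimental
def foreachAsync(f: VoidFunction[T]): FutureAction[Unit] =
  rdd.foreachAsync(x => f.call(x))
```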

@vanzin
Contributor

vanzin commented Aug 29, 2014

+1 pending the experimental thing. This will be very useful. :-)

nchammas added a commit to nchammas/spark-pr-dashboard that referenced this pull request Aug 29, 2014
Current matching behavior misses JIRA issue keys in titles that don't have brackets, like in [this issue](apache/spark#2176).
@rxin
Contributor

rxin commented Aug 30, 2014

This patch looks good to me as long as we mark the new java api as experimental.

@SparkQA

SparkQA commented Aug 30, 2014

QA tests have started for PR 2176 at commit 5536d55.

  • This patch merges cleanly.

@lirui-apache
Author

Thanks @rxin, @vanzin for the review. I've added the experimental mark in the javadoc. I see that MiMa can automatically exclude DeveloperApi and Experimental classes, so I'm not sure whether the MiMa exclude is still necessary.

@SparkQA

SparkQA commented Aug 30, 2014

QA tests have finished for PR 2176 at commit 5536d55.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@pwendell
Contributor

I added a comment about the experimental formatting

@lirui-apache
Author

Thanks @pwendell, patch updated.

@SparkQA

SparkQA commented Sep 1, 2014

QA tests have started for PR 2176 at commit ccaafb7.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 1, 2014

QA tests have finished for PR 2176 at commit ccaafb7.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Sep 2, 2014

Thanks. I'm merging this in master.

@asfgit closed this in fbf2678 Sep 2, 2014
@lirui-apache deleted the SPARK-2636 branch September 3, 2014 06:09
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
This PR adds the async actions to the Java API. User can call these async actions to get the FutureAction and use JobWaiter (for SimpleFutureAction) to retrieve job Id.

Author: lirui <rui.li@intel.com>

Closes apache#2176 from lirui-intel/SPARK-2636 and squashes the following commits:

ccaafb7 [lirui] SPARK-2636: fix java doc
5536d55 [lirui] SPARK-2636: mark the async API as experimental
e2e01d5 [lirui] SPARK-2636: add mima exclude
0ca320d [lirui] SPARK-2636: fix method name & javadoc
3fa39f7 [lirui] SPARK-2636: refine the patch
af4f5d9 [lirui] SPARK-2636: remove unused imports
843276c [lirui] SPARK-2636: only keep foreachAsync in the java API
fbf5744 [lirui] SPARK-2636: add more async actions for java api
1b25abc [lirui] SPARK-2636: expose some fields in JobWaiter
d09f732 [lirui] SPARK-2636: fix build
eb1ee79 [lirui] SPARK-2636: change some parameters in SimpleFutureAction to member field
6e2b87b [lirui] SPARK-2636: add java API for async actions
5 participants