Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAMZA-1246: ApplicatonRunner.stats() should include exception in case of failure #154

Closed
wants to merge 1 commit into from

Conversation

xinyuiscool
Copy link
Contributor

@xinyuiscool xinyuiscool commented May 1, 2017

Current when ApplicationRunner.stats() only returns the enum representing the status. It also need to include the exception if the status is failure.


package org.apache.samza.runtime;

public class ApplicationStatus {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So after this PR and @navina change, we're going to have
JobStatus
ApplicationStatus
ContainerStatus
... others?

It'd be nice to have a clear purpose for each of these.

For example, I don't see why ApplicationStatus needs to handle exceptions, but JobStatus doesn't. If anything, I'd think the only difference between the two is that ApplicationStatus spans multiple jobs, so it might have a status to indicate PartialFinish for the case that a subset of the jobs are not running. Otherwise, I don't see why both are needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I reverted back the JobStatus and keep only one ApplicationStatus.

@@ -106,11 +106,11 @@ public ApplicationStatus status(StreamApplication app) {
}

if (unsuccessfulFinish) {
return ApplicationStatus.UnsuccessfulFinish;
return ApplicationStatus.unsuccessfulFinish(null); // no exception for remote execution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like this is a gap. Ideally, we should report an exception for the remote case too. Maybe, for example YarnJob should have the smarts to determine why an application failed using the Diagnostic Info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! IN this case, I created a SamzaException with the diagonostic info. Please take a look. Thanks.

@xinyuiscool
Copy link
Contributor Author

xinyuiscool commented May 3, 2017

btw, I also changed the status computation for RemoteApplicationRunner to be consistent with LocalApplicationRunner. So now it follows the following logic:

  • status is new if there are new jobs which are not started
  • status is running if all jobs are started and some are running
  • status is unsuccessfulFinished if all jobs are finished and some fail
  • status is successfulFinished if all jobs are finished successfully

Copy link
Contributor

@jmakes jmakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. One question below.

@@ -247,7 +247,6 @@ StreamProcessor createStreamProcessor(
latch.await();

if (throwable.get() != null) {
appStatus = ApplicationStatus.UnsuccessfulFinish;
throw throwable.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume appStatus is no longer set here because the next line throws an exception. But on line 136, appStatus is set under the same circumstances. Why is that? Shouldn't run() and waitForFinish() have the same semantics for exception and status handling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I removed this line is because it's redundant. As you see, the exception thrown here will be caught in line 136 and the status will be set again. The code is pretty messy right now. I am waiting for @navina 's change to finally clean these up in patch (#135). I will also add the waitForFinish() in that patch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@asfgit asfgit closed this in 46c25cf May 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants