GRIFFIN-197 Treat non-existing YARN app as FAILED #421

Closed
wants to merge 1 commit into from

chemikadze
Member

This avoids jobs becoming stuck in the UNKNOWN state on the Service side.
It also improves logging for YARN client errors.

e.getMessage(), e.getResponseBodyAsString());
if (e.getStatusCode() == HttpStatus.NOT_FOUND) {
// in sync with Livy behavior, see com.cloudera.livy.utils.SparkYarnApp
instance.setState(DEAD);
Contributor

Agreed, we need to handle this state.
But what if the error is caused by a network issue?
Should we double-check before concluding that the instance is dead?
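
One way to picture the "double-check" idea raised here is to require several consecutive not-found responses before treating the application as dead. The sketch below is only an illustration under that assumption: `NotFoundConfirmer`, its threshold parameter, and the use of a raw HTTP status code are hypothetical and not part of this PR or of Griffin.

```java
// Hypothetical sketch: names and threshold are assumptions, not Griffin code.
final class NotFoundConfirmer {
    private static final int HTTP_NOT_FOUND = 404;

    private final int requiredConsecutive;
    private int consecutiveNotFound = 0;

    NotFoundConfirmer(int requiredConsecutive) {
        if (requiredConsecutive < 1) {
            throw new IllegalArgumentException("requiredConsecutive must be >= 1");
        }
        this.requiredConsecutive = requiredConsecutive;
    }

    /**
     * Records the HTTP status of one lookup.
     * Returns true once enough consecutive 404s have been seen;
     * any other status resets the counter.
     */
    boolean record(int httpStatus) {
        if (httpStatus == HTTP_NOT_FOUND) {
            consecutiveNotFound++;
        } else {
            consecutiveNotFound = 0;
        }
        return consecutiveNotFound >= requiredConsecutive;
    }
}
```

With a threshold of 2, a single 404 followed by a different error would not mark the instance dead; two 404s in a row would.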

Member Author

Only a 404 is handled here, which should not be the result of a network issue.

It looks like any kind of error reported by the YARN client (after its internal retries) results in the job being marked DEAD on the Livy side: https://github.com/cloudera/livy/blob/master/server/src/main/scala/com/cloudera/livy/utils/SparkYarnApp.scala#L307
I'll need to double-check whether lookups of not-found applications are ever retried, to make sure the behavior is the same as on the Livy side. If they are not, then this is what Livy would do.
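
The distinction drawn in this reply (a 404 is terminal, other failures are not) can be sketched in isolation. This is a minimal illustration, not the PR's code: `JobState`, `YarnStatusMapper`, and `onLookupError` are hypothetical names, and the real change works with the HTTP client exception and `instance.setState(DEAD)` shown in the diff.

```java
// Hypothetical names throughout; this is not Griffin's actual code.
enum JobState { UNKNOWN, RUNNING, DEAD }

final class YarnStatusMapper {
    private static final int HTTP_NOT_FOUND = 404;

    private YarnStatusMapper() {
    }

    /**
     * Decides the job state after a failed YARN application lookup.
     * Only 404 (application not found) is treated as terminal;
     * any other status leaves the current state unchanged.
     */
    static JobState onLookupError(int httpStatus, JobState current) {
        if (httpStatus == HTTP_NOT_FOUND) {
            return JobState.DEAD;
        }
        return current;
    }
}
```

Under this sketch, a 503 from the ResourceManager would leave the job in its current state, while a 404 would move it to DEAD.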

Contributor

@guoyuepeng guoyuepeng left a comment

Commented in the single-line review.

@asfgit asfgit closed this in 3dda6b3 Sep 30, 2018
@chemikadze chemikadze deleted the GRIFFIN-197 branch October 8, 2018 16:44
2 participants