[FLINK-1179] Add button to JobManager web interface to request stack trace of a TaskManager #374

Closed
wants to merge 3 commits into from

chiwanpark
Member

This PR contains the following changes:

  • Add public constructors to org.apache.flink.runtime.instance.InstanceID so that an instance ID can be sent from the web interface to the job manager
  • Add a helper method, getRegisteredInstanceById(InstanceID), to org.apache.flink.runtime.instance.InstanceManager for finding the Akka actor that belongs to an instance ID
  • Add the Akka messages RequestStackTrace, SendStackTrace and StackTrace
  • Modify the task manager page in the job manager's web interface to request and show the stack trace of a task manager

The following image is a screenshot of the job manager's web interface.

[Image: screen shot 2015-02-08 at 3 49 51 pm]

@@ -349,6 +349,11 @@ Actor with ActorLogMessages with ActorLogging {
case Heartbeat(instanceID) =>
instanceManager.reportHeartBeat(instanceID)

case RequestStackTrace(instanceID) =>
val taskManager = instanceManager.getRegisteredInstanceById(instanceID).getTaskManager
val result = AkkaUtils.ask[StackTrace](taskManager, SendStackTrace)
Contributor

This is a blocking call within the actor thread, which we should avoid. You can simply forward the SendStackTrace message to the respective TaskManager: taskManager forward SendStackTrace
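
To illustrate the difference between blocking and forwarding, here is a minimal plain-Java analogy. It does not use Akka, and every name in it (ForwardSketch, JobManagerStub, TaskManagerStub, the reply string) is hypothetical. A CompletableFuture stands in for the original sender: the job manager stub hands the requester's reply handle on to the task manager stub and returns immediately, so its single message-processing thread never waits for the answer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogy of "forward instead of ask-and-block". Not Akka code;
// all class and method names here are hypothetical.
class ForwardSketch {

    // Stand-in for a task manager that answers on its own thread.
    static class TaskManagerStub {
        private final ExecutorService worker = Executors.newSingleThreadExecutor();

        void sendStackTrace(CompletableFuture<String> replyTo) {
            // The reply goes straight to the original requester.
            worker.submit(() -> replyTo.complete("stack trace of task manager"));
        }

        void shutdown() {
            worker.shutdown();
        }
    }

    // Stand-in for the job manager: one thread processes all messages.
    static class JobManagerStub {
        private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
        private final TaskManagerStub taskManager;

        JobManagerStub(TaskManagerStub taskManager) {
            this.taskManager = taskManager;
        }

        // Forwarding style: pass the requester's reply handle on and return
        // at once, so the mailbox thread stays free for other messages.
        void requestStackTrace(CompletableFuture<String> replyTo) {
            mailbox.submit(() -> taskManager.sendStackTrace(replyTo));
        }

        void shutdown() {
            mailbox.shutdown();
        }
    }

    static String demo() {
        TaskManagerStub tm = new TaskManagerStub();
        JobManagerStub jm = new JobManagerStub(tm);
        try {
            CompletableFuture<String> reply = new CompletableFuture<>();
            jm.requestStackTrace(reply);
            // Only the requester waits for the result, not the job manager.
            return reply.join();
        } finally {
            jm.shutdown();
            tm.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the sketch, the only place that waits is the requester in demo(). That mirrors the intent of the forward suggestion: the actor in the middle relays the message and is not involved in delivering the reply.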

@tillrohrmann
Contributor

Hi Chiwan, thanks for your work. It looks really good. I just had some minor remarks.

@chiwanpark
Member Author

@tillrohrmann Thanks for your advice. I will fix it!

@StephanEwen
Contributor

Very nice work. I have one comment inline, otherwise +1 to go!

PrintWriter w = resp.getWriter();
w.write(obj.toString());
}


private void writeStackTraceOfTaskManager(String instanceIdStr, HttpServletResponse resp) throws IOException {
Contributor

The RequestStackTrace message may fail if the task manager is not reachable.

I suggest surrounding this block with try / catch(Throwable) and forwarding the error message to the web client.

The response JSON may then have two fields: "errorMessage" and "stackTrace". If "errorMessage" is defined, display the message, otherwise print the stack trace.
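
A minimal sketch of that response shape, under stated assumptions: the class StackTraceResponseSketch, the buildResponse and escape helpers, and the use of a Callable to stand in for the stack trace request are all hypothetical, and the JSON is assembled by hand here rather than with whatever JSON library the servlet actually uses.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: produce a JSON object with either "stackTrace" or
// "errorMessage", depending on whether the request succeeded.
class StackTraceResponseSketch {

    // Minimal JSON string escaping for characters likely to occur in a stack trace.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // The Callable stands in for the request to the task manager (an assumption).
    static String buildResponse(Callable<String> request) {
        try {
            return "{\"stackTrace\":\"" + escape(request.call()) + "\"}";
        } catch (Throwable t) {
            // Fall back to the exception class name when there is no message.
            String msg = t.getMessage() != null ? t.getMessage() : t.getClass().getName();
            return "{\"errorMessage\":\"" + escape(msg) + "\"}";
        }
    }

    public static void main(String[] args) {
        System.out.println(buildResponse(() -> "\"main\" prio=5\n\tat java.lang.Thread.run"));
        System.out.println(buildResponse(() -> {
            throw new IllegalStateException("task manager not reachable");
        }));
    }
}
```

On the client side, the page would then check whether "errorMessage" is defined and display it, and otherwise print the "stackTrace" field.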

@rmetzger
Contributor

rmetzger commented Feb 9, 2015

I've tried it out locally. Looks very nice. Thank you.

+1 to merge.

@chiwanpark
Member Author

@StephanEwen Thanks for your advice! I fixed it.

@StephanEwen
Contributor

Looks good. There is a small conflict with #384, but we can try to fix this while merging.

+1

@rmetzger
Contributor

I'll probably merge this change tomorrow because I'm working on a bigger change on the web frontend.

@rmetzger
Contributor

.... Merging this now ... https://github.com/rmetzger/flink/compare/FLINK-1179

@asfgit asfgit closed this in da8c02b Feb 14, 2015
balidani pushed a commit to balidani/flink that referenced this pull request Feb 15, 2015
mbalassi pushed a commit to mbalassi/flink that referenced this pull request Feb 16, 2015
@chiwanpark chiwanpark deleted the FLINK-1179 branch May 7, 2015 14:26
marthavk pushed a commit to marthavk/flink that referenced this pull request Jun 9, 2015
4 participants