[SPARK-15857]Add caller context in Spark: invoke YARN/HDFS API to set… by weiqingy · Pull Request #14312 · apache/spark

weiqingy · 2016-07-22T01:33:17Z

What changes were proposed in this pull request?

Pass 'jobId' to Task.
Add a new function 'setCallerContext' in Utils. It calls the 'org.apache.hadoop.ipc.CallerContext' APIs to set up Spark caller contexts, which are written into the HDFS hdfs-audit.log or the Yarn resource manager log.
'setCallerContext' is called in the Yarn client, ApplicationMaster, and Task classes.

The Spark caller context written into the HDFS log will be "JobID_stageID_stageAttemptId_taskID_attemptNumber on Spark", and the Spark caller context written into the Yarn log will be "{spark.app.name} running on Spark".
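
For illustration, the two strings could be assembled roughly as follows. This is a minimal sketch, not the code in this patch: `TaskCallerContext`, `render`, and `yarnCallerContext` are hypothetical names, and the field labels are copied from the hdfs-audit.log sample in this description.

```scala
// Hypothetical helper for illustration only; not the implementation in this patch.
final case class TaskCallerContext(
    jobId: Int,
    stageId: Int,
    stageAttemptId: Int,
    taskId: Long,
    attemptNumber: Int) {

  // Field labels mirror the hdfs-audit.log sample in this description.
  def render: String =
    s"JobId_${jobId}_StageID_${stageId}_stageAttemptId_${stageAttemptId}" +
      s"_taskID_${taskId}_attemptNumber_${attemptNumber} on Spark"
}

object CallerContextStrings {
  // Context for the Yarn resource manager log, built from spark.app.name.
  def yarnCallerContext(appName: String): String = s"$appName running on Spark"

  def main(args: Array[String]): Unit = {
    println(TaskCallerContext(0, 0, 0, 1L, 0).render)
    println(yarnCallerContext("SparkKMeans"))
  }
}
```

Keeping the formatting in one place would make it easier to keep the context short and consistent across the Yarn client, ApplicationMaster, and Task call sites.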

How was this patch tested?

Manual tests against some Spark applications in Yarn client mode and cluster mode, checking that Spark caller contexts were written into the HDFS hdfs-audit.log and the Yarn resource manager log successfully.

For example, run SparkKMeans on Spark:
In the Yarn resource manager log, there will be a record with the Spark caller context.
...
2016-07-21 13:36:26,318 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=wyang IP=127.0.0.1 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1469125587135_0004 CALLERCONTEXT=SparkKMeans running on Spark
...

In the HDFS hdfs-audit.log, there will be records with Spark caller contexts.
...
2016-07-21 13:38:30,799 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/lr_big.txt/_spark_metadata dst=null perm=null proto=rpc callerContext=SparkKMeans running on Spark
...
2016-07-21 13:39:35,584 INFO FSNamesystem.audit: allowed=true ugi=wyang (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/lr_big.txt dst=null perm=null proto=rpc callerContext=JobId_0_StageID_0_stageAttemptId_0_taskID_1_attemptNumber_0 on Spark
...

If the Hadoop version on which Spark runs does not have the CallerContext APIs, no Spark caller context information will appear in those logs.
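
The fallback behavior can be sketched as below. This is a simplified, standalone illustration under stated assumptions rather than the exact code in the patch: `CallerContextSketch` and `trySetCallerContext` are hypothetical names, plain `Class.forName` stands in for `Utils.classForName`, and the `CallerContext$Builder` constructor taking a `String` is assumed from the Hadoop API.

```scala
import scala.util.control.NonFatal

// Hypothetical standalone sketch; the patch itself puts this logic in Utils.
object CallerContextSketch {

  /** Returns true if the context was set, false if the Hadoop API is missing or the call failed. */
  def trySetCallerContext(context: String): Boolean = {
    try {
      // Look the classes up reflectively so this also compiles and runs
      // against Hadoop versions that do not ship CallerContext.
      val callerContext = Class.forName("org.apache.hadoop.ipc.CallerContext")
      val builder = Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
      val builderInst =
        builder.getConstructor(classOf[String]).newInstance(context).asInstanceOf[AnyRef]
      val built = builder.getMethod("build").invoke(builderInst)
      // setCurrent is static, so the receiver passed to invoke is null.
      callerContext.getMethod("setCurrent", callerContext).invoke(null, built)
      true
    } catch {
      // Older Hadoop: the class is absent, so nothing is written to the logs.
      case _: ClassNotFoundException => false
      case NonFatal(_) => false
    }
  }
}
```

Without a Hadoop release that provides CallerContext on the classpath, `CallerContextSketch.trySetCallerContext("SparkKMeans running on Spark")` simply returns false and the application carries on.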


AmplabJenkins · 2016-07-22T01:37:14Z

Can one of the admins verify this patch?

jerryshao · 2016-07-26T06:52:12Z

core/src/main/scala/org/apache/spark/util/Utils.scala

+      val callerContext = Utils.classForName("org.apache.hadoop.ipc.CallerContext")
+      callerContext.getMethod("setCurrent", callerContext).invoke(null, ret)
+    }
+    catch {


nit: catch should follow the above }.

jerryshao · 2016-07-26T07:11:10Z

Spark caller context written into Yarn log will be "{spark.app.name} running on Spark".

This may not be so useful. I think we could get the app name from Yarn in many different ways, so simply printing a one-line log to the RM does not add much.

weiqingy · 2016-07-27T22:16:49Z

Thanks for the feedback, Jerry. I am going to update the patch.

[SPARK-15857]Add caller context in Spark: invoke YARN/HDFS API to set…

38c4f58

… up caller context

jerryshao reviewed Jul 26, 2016

weiqingy closed this Jul 28, 2016

weiqingy wants to merge 1 commit into apache:master from weiqingy:master
