[SPARK-21689][YARN] Download user jar from remote in case of getting hadoop token …#18901

Closed
caneGuy wants to merge 8 commits into apache:master from caneGuy:zhoukang/download-userjar

Conversation

@caneGuy
Contributor

@caneGuy caneGuy commented Aug 10, 2017

…failed

What changes were proposed in this pull request?

When using yarn cluster mode and the job needs to scan HBase, there is a case that does not work:
if we put the user jar on HDFS, the local classpath will not contain HBase, so fetching the HBase token fails. Later, when the job is submitted to YARN, it fails because it has no token to access the HBase table. I mocked three cases:
1: user jar is on the classpath and contains HBase
17/08/10 13:48:03 INFO security.HadoopFSDelegationTokenProvider: Renewal interval is 86400050 for token HDFS_DELEGATION_TOKEN
17/08/10 13:48:03 INFO security.HadoopDelegationTokenManager: Service hive
17/08/10 13:48:03 INFO security.HadoopDelegationTokenManager: Service hbase
17/08/10 13:48:05 INFO security.HBaseDelegationTokenProvider: Attempting to fetch HBase security token.

Logs show the token is fetched normally.

2: user jar is on HDFS
17/08/10 13:43:58 WARN security.HBaseDelegationTokenProvider: Class org.apache.hadoop.hbase.HBaseConfiguration not found.
17/08/10 13:43:58 INFO security.HBaseDelegationTokenProvider: Failed to get token from service hbase
java.lang.ClassNotFoundException: org.apache.hadoop.hbase.security.token.TokenUtil
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.deploy.security.HBaseDelegationTokenProvider.obtainDelegationTokens(HBaseDelegationTokenProvider.scala:41)
    at org.apache.spark.deploy.security.HadoopDelegationTokenManager$$anonfun$obtainDelegationTokens$2.apply(HadoopDelegationTokenManager.scala:112)
    at org.apache.spark.deploy.security.HadoopDelegationTokenManager$$anonfun$obtainDelegationTokens$2.apply(HadoopDelegationTokenManager.scala:109)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)

Logs show the token fetch fails with a ClassNotFoundException.
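The ClassNotFoundException above arises because the YARN client looks up the HBase token helper class reflectively; when the user jar (which bundles the HBase classes) lives on HDFS, those classes are not on the client's local classpath. A minimal sketch of that kind of reflective visibility check — the class name is taken from the log above, but the surrounding code is illustrative, not Spark's actual implementation:

```java
public class TokenProviderSketch {
    // Returns true if the HBase token helper class is visible on the classpath.
    static boolean hbaseOnClasspath(ClassLoader loader) {
        try {
            // Reflective lookup, like the token provider performs; the "false"
            // argument checks visibility without running static initializers.
            Class.forName("org.apache.hadoop.hbase.security.token.TokenUtil",
                    false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            // This is the failure in case 2: the user jar (with HBase inside)
            // sits on HDFS, so HBase never reached the local classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // In a plain JVM without the HBase jars this prints "false".
        System.out.println(hbaseOnClasspath(TokenProviderSketch.class.getClassLoader()));
    }
}
```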

If we download the user jar from remote storage first, things work correctly. So this patch downloads the user jar from remote storage when running in yarn cluster mode. Below is the log when the jar is downloaded first.

17/08/10 13:46:33 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Downloading hdfs://xxx//spark/spark-examples-hadoop.jar to /tmp/tmp2982380061657838182/spark-examples-hadoop.jar.
...
17/08/10 13:46:36 INFO security.HadoopDelegationTokenManager: Service hbase
17/08/10 13:46:38 INFO security.HBaseDelegationTokenProvider: Attempting to fetch HBase security token.

How was this patch tested?

Manually tested by executing spark-submit scripts with different user jars.
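For reference, the submissions exercised above can be sketched as follows; the class name and jar paths are hypothetical placeholders, not the ones actually used:

```shell
# Case 2 (token fetch fails): primary resource on HDFS, so the HBase
# classes bundled in it are not on the YARN client's local classpath.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.HBaseScanJob \
  hdfs:///user/apps/spark-examples-hadoop.jar

# Workaround (token fetch succeeds): also ship a local jar containing
# the HBase classes via --jars so they reach the client classpath.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.HBaseScanJob \
  --jars /opt/lib/hbase-client.jar \
  hdfs:///user/apps/spark-examples-hadoop.jar
```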

@caneGuy caneGuy changed the title [SPARK-21689][YARN] Download user jar from remote in case get hadoop token … [SPARK-21689][YARN] Download user jar from remote in case of getting hadoop token … Aug 10, 2017
Contributor


I think in your scenario we should also download jars specified with --jars to local and add them to the classpath.

Contributor Author

@caneGuy caneGuy Aug 10, 2017


I think the HBase classes will always be packaged into the user's jar if someone wants to scan an HBase table, so I do not download jars specified with --jars. Actually, using --jars with a local jar that contains the HBase classes is a workaround.

Contributor

@jerryshao jerryshao Aug 10, 2017


For feature completeness, we should allow users who specify remote jars with --jars in cluster mode to get the correct tokens. Your HBase use case is just one special case; a user could also specify the HBase dependencies with --jars, which is feasible. On the contrary, you can also specify a local primaryResource as a workaround.

Contributor Author


I agree with you, and have updated the logic.

@jerryshao
Contributor

Can you please add a unit test to verify the classpath with your changes?

Contributor


Can you please change to "In yarn cluster mode, download remote jars and primary resource to local, these jars will be added to classpath of YARN client to get tokens."?

Contributor Author


Done

Contributor


nit: 4 space indent.

Contributor


I think you can merge these two tests; you don't need to test them separately.

Contributor


I don't think it is necessary to test other unrelated things, they're already covered in other tests.

Contributor


I think the style is still not correct; usually it should be a 4-space indent:

def testFun(a: String,
    b: Int,
    c: Long): Unit = {xxx}

@jerryshao
Contributor

LGTM generally, CC @vanzin to review.

@caneGuy
Contributor Author

caneGuy commented Aug 11, 2017

@vanzin could you take a look at this PR? Thanks.

@vanzin
Contributor

vanzin commented Aug 11, 2017

ok to test

@tgravescs
Contributor

Generally it's expected that the host you launch your job from already has all the necessary jars, so all the jars would be available to the Client.
Here we download all the jars that the user explicitly put into HDFS. If that is a lot of jars, this could take some time. I see someone already put something in for this in client mode, but client mode explicitly needs those jars to be able to run. Here we are downloading them all just to get a token in one specific case. In the majority of cases this just adds overhead and takes up disk space. I'm not sure I agree with always doing this.

Is this going to download the file and then just re-upload it to the staging directory, without using the original file on HDFS?

@tgravescs
Contributor

Note I'll file a separate JIRA for this, but even in client mode it will download and then re-upload the file to HDFS for YARN mode, thus defeating the purpose of it being on HDFS.

@SparkQA

SparkQA commented Aug 11, 2017

Test build #80542 has finished for PR 18901 at commit 24c80d1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Aug 11, 2017

I am not a fan of this change. It makes submission unnecessarily more expensive for everybody to fix an edge case.

@caneGuy
Contributor Author

caneGuy commented Aug 12, 2017

@vanzin I have also thought about what you mentioned above. But since I do not have enough background knowledge, I cannot figure out how to check first whether the user needs the HBase classes. My original idea was to only download the primaryResource jar, since most users will package their HBase client into the primaryResource jar.

@caneGuy
Contributor Author

caneGuy commented Aug 12, 2017

But I think this case should be fixed, since many users of our internal branch have suffered from this problem.

@SparkQA

SparkQA commented Aug 12, 2017

Test build #80561 has finished for PR 18901 at commit ae785d9.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@caneGuy caneGuy force-pushed the zhoukang/download-userjar branch from ae785d9 to 3b07797 Compare August 12, 2017 00:51
@SparkQA

SparkQA commented Aug 12, 2017

Test build #80562 has finished for PR 18901 at commit 3b07797.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

If this is a common problem for your users, why not just install the HBase jars on the launcher box?

@caneGuy
Contributor Author

caneGuy commented Aug 12, 2017

Yes, I have a workaround, such as adding a local jar with --jars. But I do not think this is a very rare edge case.

@caneGuy
Contributor Author

caneGuy commented Aug 12, 2017

I executed mvn checkstyle:checkstyle locally and it succeeded, but I cannot find more detailed logs in Jenkins to tell which file failed the style check.

@vanzin
Contributor

vanzin commented Aug 12, 2017

As Tom suggested, if you add hbase to your gateway's Spark installation, you won't need to download it every time you submit an application.

This change, the way it is, is really not something that should go into Spark, for the reasons already mentioned.
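The suggested alternative can be sketched as follows, assuming a Spark 2.x layout where jars under $SPARK_HOME/jars end up on the client classpath; the HBase lib location is an illustrative assumption, not from this PR:

```shell
# On the gateway/launcher host: make the HBase client jars part of the
# local Spark installation so the YARN client can always load them.
cp /opt/hbase/lib/hbase-client-*.jar \
   /opt/hbase/lib/hbase-common-*.jar \
   /opt/hbase/lib/hbase-protocol-*.jar \
   "$SPARK_HOME/jars/"
```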

@caneGuy
Contributor Author

caneGuy commented Aug 12, 2017

All right, I will close this PR. Thanks for your time @vanzin @jerryshao @tgravescs.

@caneGuy caneGuy closed this Aug 12, 2017
@caneGuy caneGuy deleted the zhoukang/download-userjar branch August 12, 2017 02:30
@SparkQA

SparkQA commented Aug 12, 2017

Test build #80559 has finished for PR 18901 at commit 7bbd1ad.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.
