[SPARK-38516][BUILD] Add log4j-core and log4j-api to classpath if active hadoop-provided #35811
Conversation
</dependency>
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <version>${log4j.version}</version>
  <scope>${hadoop.deps.scope}</scope>
</dependency>
<dependency>
  <!-- API bridge between log4j 1 and 2 -->
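The ${hadoop.deps.scope} property above is resolved per Maven profile, so the same dependency entry can switch between compile and provided scope depending on how the build is invoked. A simplified sketch of a profile overriding such a property (illustrative only; see Spark's pom.xml for the actual hadoop-provided profile):

```xml
<!-- Simplified sketch: activating this profile flips the shared scope
     property, so dependencies using ${hadoop.deps.scope} become "provided"
     and are left out of the assembled distribution. -->
<profile>
  <id>hadoop-provided</id>
  <properties>
    <hadoop.deps.scope>provided</hadoop.deps.scope>
  </properties>
</profile>
```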
Then I think log4j-slf4j-impl is also needed?
Yes. log4j-slf4j-impl is also needed:
spark/core/src/main/scala/org/apache/spark/internal/Logging.scala
Lines 228 to 234 in 1431a4a

private def isLog4j2(): Boolean = {
  // This distinguishes the log4j 1.2 binding, currently
  // org.slf4j.impl.Log4jLoggerFactory, from the log4j 2.0 binding, currently
  // org.apache.logging.slf4j.Log4jLoggerFactory
  val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
  "org.apache.logging.slf4j.Log4jLoggerFactory".equals(binderClass)
}
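The Spark check above keys purely off the SLF4J binder's logger-factory class name. A minimal self-contained sketch of the same classification logic (the class names are copied from the snippet; the helper and class names here are hypothetical, not Spark's):

```java
// Mirrors the class-name comparison in Logging.isLog4j2: the SLF4J
// binder's logger-factory class name tells the log4j 1.2 binding
// (org.slf4j.impl.Log4jLoggerFactory) apart from the log4j 2.x
// binding (org.apache.logging.slf4j.Log4jLoggerFactory).
public class BindingCheck {
    // Hypothetical helper, not part of Spark: true for the log4j 2.x binding.
    static boolean isLog4j2Binding(String binderClass) {
        return "org.apache.logging.slf4j.Log4jLoggerFactory".equals(binderClass);
    }

    public static void main(String[] args) {
        System.out.println(isLog4j2Binding("org.apache.logging.slf4j.Log4jLoggerFactory")); // true
        System.out.println(isLog4j2Binding("org.slf4j.impl.Log4jLoggerFactory"));           // false
    }
}
```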
If we include log4j-slf4j-impl, will we then not end up with multiple SLF4J bindings on the classpath, since Hadoop already includes slf4j-log4j12?
Thank you, @eejbyfeldt. Removed it.
I built this branch with
I built this branch using ./dev/make-distribution.sh --tgz --name hadoop-provided-test -Phadoop-provided -Pyarn and ran the SparkPi example, which ran as expected.
I am a bit surprised that having both log4j2 and log4j12 on the classpath does not cause problems. I therefore expected that we should maybe also include log4j-1.2-api in the hadoop-provided build. But when doing that, running SparkPi failed with:
Exception in thread "Executor task launch worker-1" java.lang.NoClassDefFoundError: Could not initialize class org.slf4j.MDC
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$setMDCForTask(Executor.scala:751)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:441)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
So that does not seem like it will work.
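The failing class in the trace above, org.slf4j.MDC, is SLF4J's mapped diagnostic context, which Spark's setMDCForTask uses to attach per-task fields to log output. A minimal ThreadLocal-based sketch of the idea behind MDC (not the SLF4J implementation; the class and key names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the idea behind org.slf4j.MDC: a per-thread key/value map
// that logging layouts can read when formatting a line. This is an
// illustration only, not SLF4J's actual implementation.
public class MiniMdc {
    private static final ThreadLocal<Map<String, String>> ctx =
        ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { ctx.get().put(key, value); }
    static String get(String key) { return ctx.get().get(key); }
    static void clear() { ctx.get().clear(); }

    public static void main(String[] args) {
        // An executor thread could tag its log lines with task context.
        put("taskName", "task 0.0 in stage 0.0");
        System.out.println(get("taskName"));
        clear();
    }
}
```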
+1, LGTM. Thank you, @wangyum , @viirya , @HyukjinKwon , @eejbyfeldt .
Merged to master for Apache Spark 3.3.0.
What changes were proposed in this pull request?

Add log4j-core and log4j-api to the classpath if hadoop-provided is active.

Why are the changes needed?

log4j-core is needed:
log4j-api is needed:
log4j-slf4j-impl is not needed: #35811 (comment)

Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual test: