[SPARK-49732][CORE][K8S] Spark daemons should respect spark.log.structuredLogging.enabled conf (#48198)
Conversation
cc @dongjoon-hyun because this line involves a k8s change
dongjoon-hyun left a comment
This condition sounds like a corner case to me. Could you be more specific in the PR title to clarify the corner case you want to address, @pan3793 ?
The issue only happens when java.net.InetAddress.getLocalHost returns 127.0.0.1.
@dongjoon-hyun the previous description of the condition under which the issue happens was not accurate. I have updated the PR description with more general wording.
Shall we add a comment to explain why we uninitialize?
thanks for your advice, updated.
dongjoon-hyun left a comment
I'm not sure about this PR because the verification step seems misleading to me, @pan3793 .
For example,
## How was this patch tested?
Write spark.log.structuredLogging.enabled=false in spark-defaults.conf
When I remove spark-defaults.conf and run this PR, the result is the same.
$ ls conf/spark-*
conf/spark-defaults.conf.template conf/spark-env.sh.template
$ SPARK_NO_DAEMONIZE=1 sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /Users/dongjoon/APACHE/spark-merge/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out
Spark Command: /Users/dongjoon/.jenv/versions/21/bin/java -cp /Users/dongjoon/APACHE/spark-merge/conf/:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/slf4j-api-2.0.16.jar:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
{"ts":"2024-09-24T15:28:28.863Z","level":"WARN","msg":"Your hostname, M3-Max.local, resolves to a loopback address: 127.0.0.1; using 17.233.9.145 instead (on interface utun6)","context":{"host":"M3-Max.local","host_port":"127.0.0.1","host_port2":"17.233.9.145","network_if":"utun6"},"logger":"Utils"}
{"ts":"2024-09-24T15:28:28.864Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}
{"ts":"2024-09-24T15:28:28.887Z","level":"INFO","msg":"Started daemon with process name: 72363@M3-Max.local","logger":"Master"}
{"ts":"2024-09-24T15:28:28.888Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-24T15:28:28.889Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-24T15:28:28.889Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}
{"ts":"2024-09-24T15:28:28.992Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
24/09/24 08:28:29 INFO SecurityManager: Changing view acls to: dongjoon
24/09/24 08:28:29 INFO SecurityManager: Changing modify acls to: dongjoon
24/09/24 08:28:29 INFO SecurityManager: Changing view acls groups to: dongjoon
Even when I set it to true explicitly, like the following, it's ignored. I cannot get structured logging.
$ cat conf/spark-defaults.conf
spark.log.structuredLogging.enabled true
$ SPARK_NO_DAEMONIZE=1 sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /Users/dongjoon/APACHE/spark-merge/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out
Spark Command: /Users/dongjoon/.jenv/versions/21/bin/java -cp /Users/dongjoon/APACHE/spark-merge/conf/:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/slf4j-api-2.0.16.jar:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
{"ts":"2024-09-24T15:30:42.761Z","level":"WARN","msg":"Your hostname, M3-Max.local, resolves to a loopback address: 127.0.0.1; using 17.233.9.145 instead (on interface utun6)","context":{"host":"M3-Max.local","host_port":"127.0.0.1","host_port2":"17.233.9.145","network_if":"utun6"},"logger":"Utils"}
{"ts":"2024-09-24T15:30:42.763Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}
{"ts":"2024-09-24T15:30:42.785Z","level":"INFO","msg":"Started daemon with process name: 72822@M3-Max.local","logger":"Master"}
{"ts":"2024-09-24T15:30:42.786Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-24T15:30:42.787Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-24T15:30:42.787Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}
{"ts":"2024-09-24T15:30:42.895Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
24/09/24 08:30:42 INFO SecurityManager: Changing view acls to: dongjoon
24/09/24 08:30:42 INFO SecurityManager: Changing modify acls to: dongjoon
24/09/24 08:30:42 INFO SecurityManager: Changing view acls groups to: dongjoon
Could you show us how to verify this?
Gentle ping, @pan3793 . Please correct me if I'm wrong~
Ping, @pan3793 .
@dongjoon-hyun thanks for checking, you are right, I am fixing the issue. Converting to DRAFT now.
Force-pushed from 3b2ce8c to 17c156e
@dongjoon-hyun I fixed the issue and verified the following cases:
dongjoon-hyun left a comment
+1, LGTM. Thank you, @pan3793 , @gengliangwang , @LuciferYang .
I also verified manually via Spark Master.
Merged to master for Apache Spark 4.0.0.
What changes were proposed in this pull request?
Explicitly call Logging.uninitialize() after SparkConf loads spark-defaults.conf, so that the logging system is re-initialized with the effective configuration. A sketch of the pattern follows.
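For concreteness, here is a minimal Scala sketch of the pattern, assuming the internal Logging.uninitialize() helper named above; the package and object names are hypothetical, and the actual change wires this into the daemon entry points:

```scala
package org.apache.spark.sketch // hypothetical; Logging is private[spark]

import org.apache.spark.SparkConf
import org.apache.spark.internal.Logging

object DaemonBootstrapSketch {
  def main(args: Array[String]): Unit = {
    // A daemon may already have emitted a log line by this point, which
    // initializes the logging system with defaults (structured logging on).
    val conf = new SparkConf(loadDefaults = true) // merges spark.* properties

    // The fix: drop the prematurely-initialized logging state so the next
    // log call re-initializes and honors spark.log.structuredLogging.enabled.
    Logging.uninitialize()
  }
}
```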
Why are the changes needed?
SPARK-49015 fixes a similar issue that affects services started through SparkSubmit, but for other services like SHS there is still a chance that the logging system is initialized before the SparkConf is constructed, so spark.log.structuredLogging.enabled configured in spark-defaults.conf won't take effect.

The issue only happens when the logging system is initialized before SparkConf loads spark-defaults.conf.

Example 1: when java.net.InetAddress.getLocalHost returns 127.0.0.1, the logging system is initialized early (a standalone check is sketched below).
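For reference, the example-1 condition can be checked with plain JDK calls, independent of Spark; whether it triggers depends on the machine's /etc/hosts and DNS setup:

```scala
import java.net.InetAddress

// If the local hostname resolves to a loopback address, Spark warns and
// picks a non-loopback interface instead; that early warning is what
// initializes the logging system before spark-defaults.conf is read.
object LoopbackCheck {
  def main(args: Array[String]): Unit = {
    val local = InetAddress.getLocalHost
    println(s"${local.getHostName} -> ${local.getHostAddress} " +
      s"(loopback: ${local.isLoopbackAddress})")
  }
}
```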
Example 2: SHS calls Utils.initDaemon(log) before loading spark-defaults.conf (inside the construction of HistoryServerArguments):
spark/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
Lines 301 to 302 in d2e8c1c
and then loads spark-defaults.conf, ignoring spark.log.structuredLogging.enabled. The ordering is sketched below.
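A hedged sketch of that ordering; HistoryServerOrderSketch is a stand-in rather than the actual HistoryServer code, and the SparkConf line stands in for the spark-defaults.conf merge performed during HistoryServerArguments construction:

```scala
package org.apache.spark.sketch // hypothetical; needed for private[spark] access

import org.apache.spark.SparkConf
import org.apache.spark.internal.Logging

object HistoryServerOrderSketch extends Logging {
  def main(args: Array[String]): Unit = {
    logInfo("starting daemon")   // (1) first log call: logging initializes
                                 //     before the conf has been loaded
    val conf = new SparkConf()   // (2) stand-in for loading spark-defaults.conf
    Logging.uninitialize()       // (3) the fix: discard the premature state
    logInfo("re-initialized")    // (4) the next log call re-initializes and
                                 //     now honors the conf's logging flag
  }
}
```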
Does this PR introduce any user-facing change?
No, Spark structured logging is an unreleased feature.
How was this patch tested?
Write spark.log.structuredLogging.enabled=false in spark-defaults.conf, then compare the daemon startup logs of:
- 4.0.0-preview2
- This PR
Was this patch authored or co-authored using generative AI tooling?
No.