
Conversation

@pan3793 pan3793 commented Sep 22, 2024

What changes were proposed in this pull request?

Explicitly call Logging.uninitialize() after SparkConf loads spark-defaults.conf.
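
A minimal sketch of the idea (illustrative only, not the actual diff): the helper name and call site below are assumptions, while SparkConf, Utils.loadDefaultSparkProperties, and Logging.uninitialize() are existing Spark internals (Logging.uninitialize() sits in a private[spark] companion object, so real code has to live under the org.apache.spark package).

import org.apache.spark.SparkConf
import org.apache.spark.internal.Logging
import org.apache.spark.util.Utils

object ConfLoadSketch {
  // Hypothetical helper: merge the defaults file into the conf, then reset the
  // logging system so that the next log call re-initializes it and honors
  // spark.log.structuredLogging.enabled from spark-defaults.conf.
  def loadDefaultsThenResetLogging(conf: SparkConf, propertiesFile: String): Unit = {
    Utils.loadDefaultSparkProperties(conf, propertiesFile)
    // If logging was already initialized (e.g. by an early warning), undo it so
    // the structured-logging setting loaded above actually takes effect.
    Logging.uninitialize()
  }
}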

Why are the changes needed?

SPARK-49015 fixed a similar issue affecting services started through SparkSubmit, but for other services such as the SHS there is still a chance that the logging system is initialized before the SparkConf is constructed, so spark.log.structuredLogging.enabled configured in spark-defaults.conf does not take effect.

The issue only happens when the logging system is initialized before SparkConf loads spark-defaults.conf.

Example 1: when java.net.InetAddress.getLocalHost resolves to a loopback address (127.0.0.1),

scala> java.net.InetAddress.getLocalHost
res0: java.net.InetAddress = H27212-MAC-01.local/127.0.0.1

the logging system is initialized early by the loopback-address warning:

{"ts":"2024-09-22T12:50:37.082Z","level":"WARN","msg":"Your hostname, H27212-MAC-01.local, resolves to a loopback address: 127.0.0.1; using 192.168.32.130 instead (on interface en0)","context":{"host":"H27212-MAC-01.local","host_port":"127.0.0.1","host_port2":"192.168.32.130","network_if":"en0"},"logger":"Utils"}
{"ts":"2024-09-22T12:50:37.085Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}

Example 2: the SHS calls Utils.initDaemon(log) before loading spark-defaults.conf (the defaults are loaded inside the HistoryServerArguments constructor):

Utils.initDaemon(log)                        // initializes the logging system here
new HistoryServerArguments(conf, argStrings) // spark-defaults.conf is only loaded here

{"ts":"2024-09-22T13:20:31.978Z","level":"INFO","msg":"Started daemon with process name: 41505@H27212-MAC-01.local","logger":"HistoryServer"}
{"ts":"2024-09-22T13:20:31.980Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-22T13:20:31.981Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-22T13:20:31.981Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}

then it loads spark-defaults.conf, and spark.log.structuredLogging.enabled is ignored.
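
To make the ordering in both examples concrete, here is an illustrative sketch (hypothetical code, not Spark source; SparkConf, Utils.loadDefaultSparkProperties, and the Logging trait's logWarning are existing APIs, everything else is made up):

import org.apache.spark.SparkConf
import org.apache.spark.internal.Logging
import org.apache.spark.util.Utils

object EarlyInitSketch extends Logging {
  def main(args: Array[String]): Unit = {
    // The first log call (standing in for the loopback-address warning or
    // Utils.initDaemon) initializes Log4j with the built-in defaults.
    logWarning("emitted before spark-defaults.conf is read")
    // By the time the defaults are loaded, Log4j is already set up, so
    // spark.log.structuredLogging.enabled from spark-defaults.conf is ignored.
    val conf = new SparkConf()
    Utils.loadDefaultSparkProperties(conf)
  }
}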

Does this PR introduce any user-facing change?

No, Spark structured logging is an unreleased feature.

How was this patch tested?

Write spark.log.structuredLogging.enabled=false in spark-defaults.conf, then start the History Server with sbin/start-history-server.sh and compare the output of 4.0.0-preview2 with this PR:

4.0.0-preview2

$ SPARK_NO_DAEMONIZE=1 sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/chengpan/app/spark-4.0.0-preview2-bin-hadoop3/logs/spark-chengpan-org.apache.spark.deploy.history.HistoryServer-1-H27212-MAC-01.local.out
Spark Command: /Users/chengpan/.sdkman/candidates/java/current/bin/java -cp /Users/chengpan/app/spark-4.0.0-preview2-bin-hadoop3/conf/:/Users/chengpan/app/spark-4.0.0-preview2-bin-hadoop3/jars/slf4j-api-2.0.16.jar:/Users/chengpan/app/spark-4.0.0-preview2-bin-hadoop3/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer
========================================
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
{"ts":"2024-09-22T12:50:37.082Z","level":"WARN","msg":"Your hostname, H27212-MAC-01.local, resolves to a loopback address: 127.0.0.1; using 192.168.32.130 instead (on interface en0)","context":{"host":"H27212-MAC-01.local","host_port":"127.0.0.1","host_port2":"192.168.32.130","network_if":"en0"},"logger":"Utils"}
{"ts":"2024-09-22T12:50:37.085Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}
{"ts":"2024-09-22T12:50:37.109Z","level":"INFO","msg":"Started daemon with process name: 37764@H27212-MAC-01.local","logger":"HistoryServer"}
{"ts":"2024-09-22T12:50:37.112Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-22T12:50:37.112Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-22T12:50:37.112Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}
{"ts":"2024-09-22T12:50:37.258Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
{"ts":"2024-09-22T12:50:37.275Z","level":"INFO","msg":"Changing view acls to: chengpan","logger":"SecurityManager"}
{"ts":"2024-09-22T12:50:37.275Z","level":"INFO","msg":"Changing modify acls to: chengpan","logger":"SecurityManager"}
{"ts":"2024-09-22T12:50:37.276Z","level":"INFO","msg":"Changing view acls groups to: chengpan","logger":"SecurityManager"}
{"ts":"2024-09-22T12:50:37.276Z","level":"INFO","msg":"Changing modify acls groups to: chengpan","logger":"SecurityManager"}
{"ts":"2024-09-22T12:50:37.277Z","level":"INFO","msg":"SecurityManager: authentication disabled; ui acls disabled; users with view permissions: chengpan groups with view permissions: EMPTY; users with modify permissions: chengpan; groups with modify permissions: EMPTY; RPC SSL disabled","logger":"SecurityManager"}
{"ts":"2024-09-22T12:50:37.309Z","level":"INFO","msg":"History server ui acls disabled; users with admin permissions: ; groups with admin permissions: ","logger":"FsHistoryProvider"}
{"ts":"2024-09-22T12:50:37.409Z","level":"INFO","msg":"Start Jetty 0.0.0.0:18080 for HistoryServerUI","logger":"JettyUtils"}
{"ts":"2024-09-22T12:50:37.466Z","level":"INFO","msg":"Successfully started service 'HistoryServerUI' on port 18080.","logger":"Utils"}
{"ts":"2024-09-22T12:50:37.491Z","level":"INFO","msg":"Bound HistoryServer to 0.0.0.0, and started at http://192.168.32.130:18080","logger":"HistoryServer"}
...

This PR

$ SPARK_NO_DAEMONIZE=1 sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/chengpan/Projects/apache-spark/dist/logs/spark-chengpan-org.apache.spark.deploy.history.HistoryServer-1-H27212-MAC-01.local.out
Spark Command: /Users/chengpan/.sdkman/candidates/java/current/bin/java -cp /Users/chengpan/Projects/apache-spark/dist/conf/:/Users/chengpan/Projects/apache-spark/dist/jars/slf4j-api-2.0.16.jar:/Users/chengpan/Projects/apache-spark/dist/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer
========================================
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
{"ts":"2024-09-22T13:20:31.903Z","level":"WARN","msg":"Your hostname, H27212-MAC-01.local, resolves to a loopback address: 127.0.0.1; using 192.168.32.130 instead (on interface en0)","context":{"host":"H27212-MAC-01.local","host_port":"127.0.0.1","host_port2":"192.168.32.130","network_if":"en0"},"logger":"Utils"}
{"ts":"2024-09-22T13:20:31.905Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}
{"ts":"2024-09-22T13:20:31.978Z","level":"INFO","msg":"Started daemon with process name: 41505@H27212-MAC-01.local","logger":"HistoryServer"}
{"ts":"2024-09-22T13:20:31.980Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-22T13:20:31.981Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-22T13:20:31.981Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}
{"ts":"2024-09-22T13:20:32.136Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
24/09/22 21:20:32 INFO SecurityManager: Changing view acls to: chengpan
24/09/22 21:20:32 INFO SecurityManager: Changing modify acls to: chengpan
24/09/22 21:20:32 INFO SecurityManager: Changing view acls groups to: chengpan
24/09/22 21:20:32 INFO SecurityManager: Changing modify acls groups to: chengpan
24/09/22 21:20:32 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: chengpan groups with view permissions: EMPTY; users with modify permissions: chengpan; groups with modify permissions: EMPTY; RPC SSL disabled
24/09/22 21:20:32 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions:
24/09/22 21:20:32 INFO JettyUtils: Start Jetty 0.0.0.0:18080 for HistoryServerUI
24/09/22 21:20:32 INFO Utils: Successfully started service 'HistoryServerUI' on port 18080.
24/09/22 21:20:32 INFO HistoryServer: Bound HistoryServer to 0.0.0.0, and started at http://192.168.32.130:18080
...

Was this patch authored or co-authored using generative AI tooling?

No.


pan3793 commented Sep 22, 2024

cc @gengliangwang @LuciferYang

Contributor (review comment)

cc @dongjoon-hyun because this line involves a k8s change

@dongjoon-hyun dongjoon-hyun left a comment

This condition sounds like a corner case to me. Could you be more specific in the PR title to clarify the corner case you want to address, @pan3793?

the issue only happens when java.net.InetAddress.getLocalHost returns 127.0.0.1


pan3793 commented Sep 23, 2024

@dongjoon-hyun the previously described condition for when the issue happens was not accurate. I have updated the PR description with more general wording.

Member (review comment)

Shall we add comments to explain why we uninitialize?

Member Author (review comment)

Thanks for your advice, updated.

@dongjoon-hyun dongjoon-hyun left a comment

I'm not sure about this PR because the verification step seems misleading to me, @pan3793.

For example,

## How was this patch tested?

Write spark.log.structuredLogging.enabled=false in spark-defaults.conf

When I remove spark-defaults.conf and run this PR, the result is the same.

$ ls conf/spark-*
conf/spark-defaults.conf.template conf/spark-env.sh.template

$ SPARK_NO_DAEMONIZE=1 sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /Users/dongjoon/APACHE/spark-merge/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out
Spark Command: /Users/dongjoon/.jenv/versions/21/bin/java -cp /Users/dongjoon/APACHE/spark-merge/conf/:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/slf4j-api-2.0.16.jar:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
{"ts":"2024-09-24T15:28:28.863Z","level":"WARN","msg":"Your hostname, M3-Max.local, resolves to a loopback address: 127.0.0.1; using 17.233.9.145 instead (on interface utun6)","context":{"host":"M3-Max.local","host_port":"127.0.0.1","host_port2":"17.233.9.145","network_if":"utun6"},"logger":"Utils"}
{"ts":"2024-09-24T15:28:28.864Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}
{"ts":"2024-09-24T15:28:28.887Z","level":"INFO","msg":"Started daemon with process name: 72363@M3-Max.local","logger":"Master"}
{"ts":"2024-09-24T15:28:28.888Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-24T15:28:28.889Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-24T15:28:28.889Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}
{"ts":"2024-09-24T15:28:28.992Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
24/09/24 08:28:29 INFO SecurityManager: Changing view acls to: dongjoon
24/09/24 08:28:29 INFO SecurityManager: Changing modify acls to: dongjoon
24/09/24 08:28:29 INFO SecurityManager: Changing view acls groups to: dongjoon

Even if I set it to true explicitly like the following, it's ignored. I cannot get structured logging.

$ cat conf/spark-defaults.conf
spark.log.structuredLogging.enabled true

$ SPARK_NO_DAEMONIZE=1 sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /Users/dongjoon/APACHE/spark-merge/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out
Spark Command: /Users/dongjoon/.jenv/versions/21/bin/java -cp /Users/dongjoon/APACHE/spark-merge/conf/:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/slf4j-api-2.0.16.jar:/Users/dongjoon/APACHE/spark-merge/assembly/target/scala-2.13/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
{"ts":"2024-09-24T15:30:42.761Z","level":"WARN","msg":"Your hostname, M3-Max.local, resolves to a loopback address: 127.0.0.1; using 17.233.9.145 instead (on interface utun6)","context":{"host":"M3-Max.local","host_port":"127.0.0.1","host_port2":"17.233.9.145","network_if":"utun6"},"logger":"Utils"}
{"ts":"2024-09-24T15:30:42.763Z","level":"WARN","msg":"Set SPARK_LOCAL_IP if you need to bind to another address","logger":"Utils"}
{"ts":"2024-09-24T15:30:42.785Z","level":"INFO","msg":"Started daemon with process name: 72822@M3-Max.local","logger":"Master"}
{"ts":"2024-09-24T15:30:42.786Z","level":"INFO","msg":"Registering signal handler for TERM","logger":"SignalUtils"}
{"ts":"2024-09-24T15:30:42.787Z","level":"INFO","msg":"Registering signal handler for HUP","logger":"SignalUtils"}
{"ts":"2024-09-24T15:30:42.787Z","level":"INFO","msg":"Registering signal handler for INT","logger":"SignalUtils"}
{"ts":"2024-09-24T15:30:42.895Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
24/09/24 08:30:42 INFO SecurityManager: Changing view acls to: dongjoon
24/09/24 08:30:42 INFO SecurityManager: Changing modify acls to: dongjoon
24/09/24 08:30:42 INFO SecurityManager: Changing view acls groups to: dongjoon

Could you show us how to verify this?

@dongjoon-hyun

Gentle ping, @pan3793. Please correct me if I'm wrong~

@dongjoon-hyun

Ping, @pan3793.

@pan3793 pan3793 marked this pull request as draft September 29, 2024 04:38

pan3793 commented Sep 29, 2024

@dongjoon-hyun thanks for checking, you are right; I am fixing the issue. Converting to DRAFT now.

@pan3793 pan3793 marked this pull request as ready for review September 29, 2024 06:10

pan3793 commented Sep 29, 2024

@dongjoon-hyun I fixed the issue and verified the following cases:

  • configuring spark.log.structuredLogging.enabled in spark-defaults.conf with both true and false
  • running sbin/start-history-server.sh and sbin/start-master.sh and watching logs

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @pan3793 , @gengliangwang , @LuciferYang .
I also verified manually via Spark Master.

Merged to master for Apache Spark 4.0.0.
