
[SPARK-47579][CORE][PART1] Migrate logInfo with variables to structured logging framework #46362

Closed
wants to merge 10 commits

Conversation

zeotuan
Contributor

@zeotuan zeotuan commented May 3, 2024

This PR aims to migrate `logInfo` calls with variables in the Core module to the structured logging framework.

Why are the changes needed?

To enhance Apache Spark's logging system by implementing structured logging.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.
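The pattern this PR applies can be illustrated with a toy, self-contained sketch. It assumes nothing about Spark's internals beyond the `MDC` / `log"..."` surface visible in the diffs later in this thread (Spark's real classes live under `org.apache.spark.internal`); the key names and implementation below are stand-ins for illustration only.

```scala
object LogKeys { // stand-ins for the standardized key registry
  val PATH = "path"
  val SOURCE_PATH = "source_path"
  val DESTINATION_PATH = "destination_path"
}

// An MDC (mapped diagnostic context) ties a standardized key to a value.
final case class MDC(key: String, value: Any)

// A message carrying both the rendered text and its key/value context.
final case class MessageWithContext(message: String, context: Map[String, String]) {
  def +(other: MessageWithContext): MessageWithContext =
    MessageWithContext(message + other.message, context ++ other.context)
}

object StructuredLogDemo {
  implicit class LogInterpolator(private val sc: StringContext) extends AnyVal {
    // log"..." keeps literal parts as-is; each MDC argument contributes its
    // value to the text and a (key -> value) entry to the context.
    def log(args: MDC*): MessageWithContext = {
      val parts = sc.parts.iterator
      val text = new StringBuilder(parts.next())
      val ctx = Map.newBuilder[String, String]
      for (mdc <- args) {
        text.append(mdc.value.toString)
        ctx += (mdc.key -> mdc.value.toString)
        text.append(parts.next())
      }
      MessageWithContext(text.toString, ctx.result())
    }
  }

  def main(args: Array[String]): Unit = {
    val path = "/tmp/archive.zip"
    // Before the migration: s"Unpacking an archive $path" (text only).
    // After: the same text, plus machine-readable context.
    val msg = log"Unpacking an archive ${MDC(LogKeys.PATH, path)}"
    println(msg.message) // Unpacking an archive /tmp/archive.zip
    println(msg.context) // Map(path -> /tmp/archive.zip)
  }
}
```

The point of the migration is that downstream log consumers can query the context map instead of parsing free-form text.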

@@ -26,13 +26,16 @@ trait LogKey {
}

/**
* Various keys used for mapped diagnostic contexts(MDC) in logging.
* All structured logging keys should be defined here for standardization.
* Various keys used for mapped diagnostic contexts(MDC) in logging. All structured logging keys
Contributor Author

@zeotuan zeotuan May 6, 2024


Many of these formatting changes happened as the result of running dev/scalafmt. Should we keep them?
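The scaladoc in this hunk states the project's convention: every structured logging key is defined in one central registry. A hedged sketch of that shape follows; the derived-name logic is an assumption for illustration, not necessarily Spark's exact implementation.

```scala
// Sketch of the convention documented above: keys are case objects under a
// single registry object so that key names stay standardized across modules.
trait LogKey {
  // e.g. case object SOURCE_PATH yields the MDC key name "SOURCE_PATH".
  def name: String = this.getClass.getSimpleName.stripSuffix("$")
}

object LogKeys {
  case object PATH extends LogKey
  case object SOURCE_PATH extends LogKey
  case object DESTINATION_PATH extends LogKey
}
```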

@zeotuan zeotuan marked this pull request as ready for review May 6, 2024 02:17
@zeotuan
Contributor Author

zeotuan commented May 6, 2024

@gengliangwang Hi, could you please help review?

* all the requests for a group of `barrier()` calls are received. If the coordinator is unable to
* collect enough global sync requests within a configured time, fail all the requests and return
* an Exception with timeout message.
* request is generated by `BarrierTaskContext.barrier()`, and identified by stageId +
Member

Please remove unnecessary changes

@gengliangwang
Member

@zeotuan Thanks for the work. However, there are a lot of unnecessary changes in this PR, which are probably from scalafmt. This makes the PR review hard.
Usually, Spark PRs require cleaner code changes. Could you revert to c9e0209? You will have to fix the errors from ./dev/lint-scala manually.

@zeotuan
Contributor Author

zeotuan commented May 6, 2024

> @zeotuan Thanks for the work. However, there are a lot of unnecessary changes in this PR, which are probably from scalafmt. This makes the PR review hard. Usually, Spark PRs require cleaner code changes. Could you revert to c9e0209? You will have to fix the errors from ./dev/lint-scala manually.

Sure, I will revert it. I was just wondering whether there is any plan to fix/apply all these scalafmt changes, or why it would produce such a different result.

@gengliangwang
Member

> I was just wondering if there is any plan to fix/apply all these scalafmt changes or why would it produce such different result.

There have been such discussions. The major concern with enforcing such a format tool across the codebase is that the bulk reformatting would make `git blame` harder to use.
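As an aside not raised in the thread: git has a built-in mitigation for exactly this concern, available since git 2.23. Bulk-reformat commits can be listed in a file that `git blame` skips, attributing lines to their real authors (`<scalafmt-commit-sha>` below is a placeholder).

```
# Record the formatting commit in an ignore file.
echo "<scalafmt-commit-sha>" >> .git-blame-ignore-revs

# Per invocation:
git blame --ignore-revs-file .git-blame-ignore-revs path/to/File.scala

# Or once for the whole repo:
git config blame.ignoreRevsFile .git-blame-ignore-revs
```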

@@ -489,21 +488,24 @@ private[spark] class MapOutputTrackerMasterEndpoint(
extends RpcEndpoint with Logging {

logDebug("init") // force eager creation of logger
private def logInfoMsg(msg: MessageWithContext, shuffleId: Int, context: RpcCallContext): Unit = {
Member

nit: add a blank line

@@ -1855,7 +1859,9 @@ class SparkContext(config: SparkConf) extends Logging {
root,
if (uri.getFragment != null) uri.getFragment else source.getName)
logInfo(
s"Unpacking an archive $path from ${source.getAbsolutePath} to ${dest.getAbsolutePath}")
log"Unpacking an archive ${MDC(LogKeys.PATH, path)}" +
log" from ${MDC(LogKeys.SOURCE_ABSOLUTE_PATH, source.getAbsolutePath)}" +
Member

nit: how about simply SOURCE_PATH

s"Unpacking an archive $path from ${source.getAbsolutePath} to ${dest.getAbsolutePath}")
log"Unpacking an archive ${MDC(LogKeys.PATH, path)}" +
log" from ${MDC(LogKeys.SOURCE_ABSOLUTE_PATH, source.getAbsolutePath)}" +
log" to ${MDC(LogKeys.DESTINATION_ABSOLUTE_PATH, dest.getAbsolutePath)}")
Member

nit: how about simply DESTINATION_PATH
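For reference, with both nits applied the migrated call would presumably read as follows. This is a fragment in the context of `SparkContext`, shown only to connect the two suggestions; `SOURCE_PATH` and `DESTINATION_PATH` are the reviewer's suggested names, not keys confirmed by the diff.

```scala
// Fragment: the reviewer's simpler key names substituted into the hunk above.
logInfo(log"Unpacking an archive ${MDC(LogKeys.PATH, path)}" +
  log" from ${MDC(LogKeys.SOURCE_PATH, source.getAbsolutePath)}" +
  log" to ${MDC(LogKeys.DESTINATION_PATH, dest.getAbsolutePath)}")
```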

@@ -26,6 +26,8 @@ import scala.concurrent.duration._

import org.apache.spark.ProcessTestUtils.ProcessOutputCapturer
import org.apache.spark.SparkFunSuite
import org.apache.spark.internal.{MDC, MessageWithContext}
Member

We don't need to migrate the log entries from tests. Let's revert the changes in this file to avoid introducing new log keys.

@@ -187,7 +188,7 @@ private object FaultToleranceTest extends App with Logging {
fn
numPassed += 1
logInfo("==============================================")
logInfo("Passed: " + name)
logInfo(log"Passed: ${MDC(TEST_NAME, name)}")
Member

We don't need to migrate the log entries from tests. Let's revert the changes in this file to avoid introducing new log keys.

@gengliangwang
Member

LGTM otherwise. Thanks for the good work!

@gengliangwang
Member

@zeotuan FYI I pushed a commit b45b340 to reduce a log key. Please help review it and let me know what you think of it.

@zeotuan
Contributor Author

zeotuan commented May 7, 2024

@gengliangwang Thank you, LGTM!

@gengliangwang
Member

Thanks, merging to master

@gengliangwang gengliangwang changed the title [SPARK-47240][CORE][PART1] Migrate logInfo with variables to structured logging framework [SPARK-47579][CORE][PART1] Migrate logInfo with variables to structured logging framework May 8, 2024
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
…ed logging framework

The PR aims to migrate `logInfo` in Core module with variables to structured logging framework.

### Why are the changes needed?

To enhance Apache Spark's logging system by implementing structured logging.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

- Pass GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46362 from zeotuan/coreInfo.

Lead-authored-by: Tuan Pham <Tuan.Pham@wisetechglobal.com>
Co-authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>