Skip to content

Comments

[SPARK-47603][KUBERNETES][YARN] Resource managers: Migrate logWarn with variables to structured logging framework#45957

Closed
panbingkun wants to merge 8 commits intoapache:masterfrom
panbingkun:SPARK-47603
Closed

[SPARK-47603][KUBERNETES][YARN] Resource managers: Migrate logWarn with variables to structured logging framework#45957
panbingkun wants to merge 8 commits intoapache:masterfrom
panbingkun:SPARK-47603

Conversation

@panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Apr 9, 2024

What changes were proposed in this pull request?

The pr aims to migrate logWanrning in module Resource managers with variables to structured logging framework.

Why are the changes needed?

To enhance Apache Spark's logging system by implementing structured logging.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.

@panbingkun panbingkun changed the title [SPARK-47603][CORE] Resource managers: Migrate logWarn with variables to structured logging framework [SPARK-47603][KUBERNETES][YARN] Resource managers: Migrate logWarn with variables to structured logging framework Apr 9, 2024
import org.apache.spark.deploy.k8s.submit._
import org.apache.spark.internal.Logging
import org.apache.spark.internal.{Logging, MDC}
import org.apache.spark.internal.LogKey.{CONFIG, CONFIG2, CONFIG3, CONFIG4, CONFIG5}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this looks a bit ugly, there seems to be no better way.

} else {
logError(log"Driver terminated with exit code ${MDC(EXIT_CODE, exitCode)}! " +
log"Shutting down. ${MDC(REMOTE_ADDRESS, remoteAddress)}")
log"Shutting down. ${MDC(RPC_ADDRESS, remoteAddress)}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to this PR, but I found that the LogKey REMOTE_ADDRESS and RPC_ADDRESS have the same meaning, so I removed REMOTE_ADDRESS from LogKey and only kept RPC_ADDRESS

val REDUCE_ID = Value
val RELATION_NAME = Value
val REMAINING_PARTITIONS = Value
val REMOTE_ADDRESS = Value
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that the LogKey REMOTE_ADDRESS and RPC_ADDRESS have the same meaning, so I removed REMOTE_ADDRESS from LogKey and only kept RPC_ADDRESS

@panbingkun panbingkun marked this pull request as ready for review April 12, 2024 01:18
@panbingkun
Copy link
Contributor Author

cc @gengliangwang


args.foreach { mdc =>
sb.append(mdc.value.toString)
val value = if (mdc.value != null) mdc.value.toString else null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's rebase the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

s" Kubernetes cluster after $podCreationTimeout ms despite the fact that a previous" +
" allocation attempt tried to create them. The executors may have been deleted but the" +
" application missed the deletion event.")
logWarning(log"Executors with ids ${MDC(RESOURCE_NAME, timedOut.mkString(","))}} were not " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be EXECUTOR_IDS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, I misread the position of the tuple, updated

val newFailedExecutorIds = currentFailedExecutorIds.diff(failedExecutorIds)
if (newFailedExecutorIds.nonEmpty) {
logWarning(s"${newFailedExecutorIds.size} new failed executors.")
logWarning(log"${MDC(RESOURCE_COUNT, newFailedExecutorIds.size)} new failed executors.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make it simply "COUNT"

@gengliangwang
Copy link
Member

@panbingkun Thanks for the work. LGTM except for two comments.

@panbingkun
Copy link
Contributor Author

@panbingkun Thanks for the work. LGTM except for two comments.

Updated, done.

@gengliangwang
Copy link
Member

Thanks, merging to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants