New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-31970][CORE] Make MDC configuration step be consistent between setLocalProperty and log4j.properties #28801
Conversation
cc @cloud-fan @igreenfield Please take a look, thanks! |
@@ -965,4 +962,5 @@ private[spark] object Executor { | |||
// task is fully deserialized. When possible, the TaskContext.getLocalProperty call should be | |||
// used instead. | |||
val taskDeserializationProps: ThreadLocal[Properties] = new ThreadLocal[Properties] | |||
val MDC = "mdc." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is too short... I'm ok to just hardcode it in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fine.
I don't have a strong preference of it, seems OK. cc @jiangxb1987 |
seems ok to me |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hi, All. Sorry, but how can we disable this MDC feature and enforce to generate the same result like Spark 2.4.6? |
Test build #123858 has finished for PR 28801 at commit
|
As for end-user, they will not see the MDC properties in the log if they use default log4j.peroperties (since MDC requires extra pattern configuration). So it's still the same compares to Spark 2.4.6. But internally, yes, Spark will always add at least one MDC property (taskName) to the MDC, even if nowhere else will actually use it. But considering the data is small, so probably it's fine to do it without a controllable flag. |
yea the log4j property file is kind of the config to control this MDC feature, and it's off by default. |
Test build #123884 has finished for PR 28801 at commit
|
retest this please |
Test build #123916 has finished for PR 28801 at commit
|
retest this please |
Test build #123930 has finished for PR 28801 at commit
|
retest this please. |
Test build #123939 has finished for PR 28801 at commit
|
retest this please |
Test build #123947 has finished for PR 28801 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @Ngone51 , @cloud-fan , @jiangxb1987 .
Merged to master.
thanks all!! |
What changes were proposed in this pull request?
This PR proposes to use "mdc.XXX" as the consistent key for both
sc.setLocalProperty
andlog4j.properties
when setting up configurations for MDC.Why are the changes needed?
It's weird that we use "mdc.XXX" as key to set MDC value via
sc.setLocalProperty
while we use "XXX" as key to set MDC pattern in log4j.properties. It could also bring extra burden to the user.Does this PR introduce any user-facing change?
No, as MDC feature is added in version 3.1, which hasn't been released.
How was this patch tested?
Tested manually.