
KAFKA-3816: Add MDC logging to Connect runtime #5743

Merged
merged 9 commits into apache:trunk from rhauch:kafka-3816 on May 16, 2019

Conversation

@rhauch
Contributor

commented Oct 4, 2018

See https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs

Added LoggingContext as a simple mechanism to set and unset Mapped Diagnostic Contexts (MDC) in the loggers, providing each thread with useful parameters that can be used within the logging configuration. MDC avoids having to modify lots of log statements, since the parameters are available to all log statements issued by the thread, no matter which class makes those calls.

The design intentionally minimizes the number of changes to any existing classes, and does not use Java 8 features so it can be easily backported if desired, although per this KIP it will be applied initially only in AK 2.3 and later and must be enabled via the Log4J configuration.
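
For context on the mechanism itself, below is a minimal sketch of how MDC works with plain SLF4J and a Log4j pattern layout (the parameter name and values are illustrative; this is not the PR's LoggingContext class): values put into the MDC are attached to the current thread, and a pattern layout can render them with %X{...} without touching individual log statements.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class MdcSketch {
    private static final Logger log = LoggerFactory.getLogger(MdcSketch.class);

    public static void main(String[] args) {
        // Any parameter set here is visible to every log statement issued by this thread,
        // regardless of which class writes the message.
        MDC.put("connector.name", "my-connector");   // illustrative parameter name and value
        try {
            log.info("Starting connector");           // a %X{connector.name} pattern would render "my-connector"
        } finally {
            MDC.remove("connector.name");             // clean up so the value cannot leak if the thread is reused
        }
    }
}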

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)
@mageshn
Contributor

left a comment

Thanks, @rhauch. I think this will be a great and long awaited addition to Connect. I took a high-level pass and had some questions.

@rhauch rhauch changed the title from KAFKA-3816: Add MDC logging to Connect runtime to [WIP] KAFKA-3816: Add MDC logging to Connect runtime on Oct 10, 2018

@rhauch

Contributor Author

commented Oct 10, 2018

@ewencp, @kkonstantine: any feedback?

@rhauch rhauch force-pushed the rhauch:kafka-3816 branch 2 times, most recently from c571027 to d1901e0 Oct 10, 2018

@rhauch

Contributor Author

commented Oct 10, 2018

@mageshn, I rebased to the latest from trunk and changed the connector.type parameter name to connector.class, since, as you mention, we use "connector type" elsewhere to refer to whether the connector is a "source" or a "sink".

I also updated the connect-log4j.properties file to include a commented-out format statement that uses these parameters. Note that users will still have to explicitly enable this in their logging configuration in order to even see them in their log statements, so this change is backward compatible.

@rhauch

Contributor Author

commented Oct 15, 2018

@mageshn, I took your advice and simplified the LoggingContext class to remove any instance-based functionality. As before, the static methods are used in several key places within Connect, but no longer return a LoggingContext instance that can be used to clean up the context. Instead, a new clean() static method is introduced to allow callers to remove the Connect-specific MDC parameters. Note that Connect uses a pool of threads, and cleaning the MDC parameters after leaving a context is still warranted to ensure those parameters do not leak should that thread be used in a different context.

I'd appreciate another review.

@mageshn
Contributor

left a comment

Thanks, @rhauch. Overall, this LGTM. Should we possibly add some documentation about the MDC keys that are being used by the framework? This could potentially help connectors avoid overriding them. I can't think of a way to completely prevent something like this.

@rhauch

Contributor Author

commented Oct 22, 2018

@mageshn I'm not sure where we would document those MDC context parameters. The https://kafka.apache.org/ site doesn't appear to document any of the logging configurations. I did add them as comments in the connect-log4j.properties file, which, given the lack of logging documentation, seems sufficient IMO.

@mageshn
Contributor

left a comment

That sounds good to me @rhauch. I think this would be a super useful feature. LGTM

@rhauch

Contributor Author

commented Oct 22, 2018

@ewencp or @hachikuji, would you mind reviewing and merging if acceptable?

@rhauch rhauch force-pushed the rhauch:kafka-3816 branch from a766b6e to 06a834d Dec 17, 2018

@rhauch

Contributor Author

commented Jan 8, 2019

@ewencp or @hachikuji, reminder to review, please.

@rhauch rhauch force-pushed the rhauch:kafka-3816 branch from 06a834d to 7d031c5 Feb 6, 2019

@rhauch

Contributor Author

commented Feb 6, 2019

The JDK 8 / Scala 2.11 failure is unrelated:

kafka.api.AdminClientIntegrationTest.testMinimumRequestTimeouts

@@ -79,7 +80,12 @@ public void schedule(final ConnectorTaskId id, final WorkerSourceTask workerTask
    ScheduledFuture<?> commitFuture = commitExecutorService.scheduleWithFixedDelay(new Runnable() {
        @Override
        public void run() {
            commit(workerTask);
            try {
                LoggingContext.forOffsets(id);

@ewencp

ewencp Feb 8, 2019

Contributor

I see from previous comments that there used to be LoggingContext instances. Given the current structure of the code, I can see how getting rid of instances and using static methods makes sense, but should we consider still using instances and changing clear() to close() so we can just use try-with-resources? That seems less error-prone than this setup, where it'd be easy to forget to put in a clear call.

@rhauch

rhauch Feb 11, 2019

Author Contributor

Agreed. I think my previous simplification went too far, and AutoCloseable does indeed make it much less error prone. Another benefit of using a try-with-resources and AutoCloseable is that the instance can capture and restore the complete MDC parameter state, and this makes nesting logging contexts that much easier and less error-prone.
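
As a rough sketch of the shape being agreed on here (LoggingContext.forConnector and connName appear elsewhere in this PR; treat the exact signature and surrounding calls as illustrative), a try-with-resources block would look something like this, with close() restoring whatever MDC state existed before the context was created:

try (LoggingContext loggingContext = LoggingContext.forConnector(connName)) {
    // Everything logged on this thread inside the block, including by connector code,
    // carries the connector's MDC parameters.
    log.info("Creating connector {}", connName);
}   // close() runs even on exceptions and restores the previous MDC state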

@@ -220,6 +221,7 @@ public boolean startConnector(
        final WorkerConnector workerConnector;
        ClassLoader savedLoader = plugins.currentThreadLoader();
        try {
            LoggingContext.forConnector(connName);

@ewencp

ewencp Feb 8, 2019

Contributor

How did we decide what to change logging context for vs not? For example, connectorTaskConfigs later in this file doesn't do this, but it will make a call into the connector which could potentially do logging without providing context as to which instance is logging. Similarly, stopConnector doesn't seem to switch logging context. (in particular, I found those by checking for where we're doing plugin swapping, which seems a likely indicator we want to change contexts)

@rhauch

rhauch Feb 11, 2019

Author Contributor

Good catch. I've tried not to change the herder, which is where some of the plugin swapping is done, but connectorTaskConfigs, stopConnector, and stopTask should all switch logging contexts.

I also adjusted where the logging context is created in some of the other methods, to be more inclusive of the whole Worker method, rather than reusing an existing try-block just to minimize changes.

@rhauch

rhauch Feb 11, 2019

Author Contributor

@ewencp, ah, I remember why I moved away from AutoCloseable: in a try-with-resources statement, any catch or finally block is run after the declared resources have been closed.

I still think it's cleaner, but it may require an outer try-with-resources block outside of an existing try-catch-finally block to ensure that all log messages within the existing catch and finally blocks are logged in the same context.
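
A hedged sketch of that pattern (the surrounding calls are illustrative, not the PR's actual diff): the logging context is opened in an outer try-with-resources block so that the inner catch and finally blocks still run while the context is active.

// An inner try-with-resources would close the context before catch/finally run,
// so the context is opened one level further out instead.
try (LoggingContext loggingContext = LoggingContext.forConnector(connName)) {
    try {
        startTheConnector(connName);   // hypothetical call standing in for the existing logic
    } catch (Throwable t) {
        log.error("Failed to start connector {}", connName, t);   // still tagged with the connector context
    } finally {
        log.info("Finished startup attempt for connector {}", connName);   // also still tagged
    }
}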

WORKER("worker"),

/**
* The {@link Parameters#CONNECTOR_SCOPE scope} value for {@link org.apache.kafka.connect.connector.Task}

@ewencp

ewencp Feb 8, 2019

Contributor

it looks like the plain connector context is implicit, any reason not to just make it explicit?

    public static void forConnector(String connectorName) {
        MDC.put(Parameters.CONNECTOR_NAME, connectorName);
        MDC.remove(Parameters.CONNECTOR_TASK);
        MDC.put(Parameters.CONNECTOR_SCOPE, Scope.WORKER.value());

@ewencp

ewencp Feb 8, 2019

Contributor

Actually, I guess I see here that for the connector itself we'd use WORKER. Why is that?

I know we swap out the context such that some logs would be coming from the worker (e.g. the log messages indicating we are going to instantiate the connector when we start it), but that's also true for tasks, where we scope to task.

When reading through this initially and I saw worker, I assumed anything logged under it would indicate something that might not invoke the connector/task at all but is done on behalf of them (initially I was thinking offset commit for this, but that would invoke the task; possibly something having to do with configs or statuses?), or where there's not a connector name yet (e.g. validations).

@rhauch

rhauch Feb 11, 2019

Author Contributor

I guess because the WorkerTask is doing more than the connector is, but it is doing it for a particular named connector. So when viewing the log, I'd want to see all of the worker-related logic (whether or not it's Connect code or Connector code) in the "worker" context, and all task-related logic (whether it's Connect's WorkerTask or connector task) in the "task" context. The class names in each log message tell me exactly where that code comes from.

Based upon your previous question, is your main question why we use "worker" and not "connector" for the scope? I think I had used "connector" at some point, and think my primary reason was that "connector" scope was used for much more than just what the individual Connector instance was doing (even how a user might think of it). Plus "worker" was shorter, which was a small benefit.

*
* @param id the connector task ID; may not be null
*/
public static void forOffsets(ConnectorTaskId id) {

@ewencp

ewencp Feb 8, 2019

Contributor

Is this distinction going to be useful to users or us? I know it operates in a separate thread, but so do some other things (e.g. stopping a source connector expects multi-threaded use).

@rhauch

rhauch Feb 11, 2019

Author Contributor

At least for us I think it does add clarity because these are called by the offset commit thread, not the connector's worker thread. Sometimes I've wanted to just find the messages that were written by all of the code running on the offset commit thread for a particular source connector, and it's not easy to do.

I'm less convinced this will be really obvious to users who are less familiar with the behaviors of Connect. Users do often see log messages related to offset commits and they're not sure whether that affects their source connector and the producer. However, I don't think a separate scope for offsets is detrimental.

@wicknicks
Contributor

left a comment

thanks, @rhauch! this is super useful.

connect/runtime/src/test/resources/log4j.properties (Outdated)
    public static void clear() {
        MDC.remove(Parameters.CONNECTOR_NAME);
        MDC.remove(Parameters.CONNECTOR_TASK);
        MDC.remove(Parameters.CONNECTOR_SCOPE);

@wicknicks

wicknicks Feb 11, 2019

Contributor

Just wondering why we need this method? AFAICT, we don't reuse threads in Connect. If a thread is dedicated to a particular task (as a task or offset committer), it is dedicated to that task throughout its lifetime. If we could get away without needing this method, it would simplify the changes. WDYT?

@rhauch

rhauch Feb 11, 2019

Author Contributor

You're right that it's not often necessary. Where we do not switch threads, the extra overhead of resetting the MDC parameters is negligible. But the offset commit thread does switch logging contexts, and IMO it is simply a good practice to encourage restoring resources.

Per @ewencp's suggestion, I went back to AutoCloseable so that try-with-resources works and reduces the risk of coding errors.

@wicknicks

wicknicks Feb 12, 2019

Contributor

The heavy lifting in SourceTaskOffsetCommitter is delegated back to the WorkerSourceTask, though, which doesn't help us with MDC. But it does raise the question of whether we want to use LogContext instead of MDC, for consistency with AK. Other components in AK (Kafka, Streams, and Clients) seem to use LogContext, which adds a prefix to every SLF4J logger. This is already being used by some parts of Connect (for example, in WorkerCoordinator). It would avoid having to micromanage thread context, and also keeps the platform uniform.

@hachikuji

hachikuji Feb 13, 2019

Contributor

Yes, I had the same thought as @wicknicks. We preferred LogContext over MDC because our needs were basic and the API was simpler.

@rhauch

rhauch Feb 13, 2019

Author Contributor

@wicknicks, @hachikuji: indeed LogContext is simpler, but:

  1. its prefix is fixed; and
  2. it only affects log messages written by Logger instances instantiated with the LogContext.

So for a Kafka client, this works great because the client has a single ID and the AK code that creates the Logger can be changed to use LogContext.

However, with Connect the connector implementations will not change how they create their Logger instances. And, we do have some code where a single thread does switch contexts from one connector/task to another. Given these, I don't see how LogContext will work.

This PR uses MDC, and this will add the additional context (via MDC parameters) to all log messages written by all components -- including connector implementations. MDC works this way because it is thread-based, not based upon the Logger instance that is being used, so switching contexts within a single thread is easy to do.
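
For comparison, here is a small sketch of the two mechanisms (org.apache.kafka.common.utils.LogContext and SLF4J's MDC; the prefix, parameter name, and values are illustrative): LogContext only affects loggers created through it, while an MDC parameter travels with the thread and is therefore also available to log statements from connector plugins that create their own loggers.

import org.apache.kafka.common.utils.LogContext;
import org.slf4j.Logger;
import org.slf4j.MDC;

public class ContextComparison {
    public static void main(String[] args) {
        // LogContext: the fixed prefix applies only to loggers obtained from this LogContext.
        LogContext logContext = new LogContext("[my-connector|task-0] ");   // illustrative prefix
        Logger prefixed = logContext.logger(ContextComparison.class);
        prefixed.info("only loggers created via this LogContext get the prefix");

        // MDC: the parameter applies to every log statement issued by this thread, so a
        // %X{connector.context} pattern also tags messages from loggers the framework never created.
        MDC.put("connector.context", "[my-connector|task-0] ");            // illustrative parameter
        prefixed.info("any logger on this thread can be tagged via the pattern layout");
        MDC.remove("connector.context");
    }
}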

@wicknicks

wicknicks Feb 20, 2019

Contributor

alright cool -- yeah, LogContext won't work if we want to tag messages from plugins.

@rhauch rhauch force-pushed the rhauch:kafka-3816 branch from 6b23aff to 9f1908c Feb 11, 2019

@rhauch

Contributor Author

commented Feb 11, 2019

@ewencp, @wicknicks: thanks for the review. I answered most questions/comments inline above and updated the proposed changes based upon your feedback; I'd appreciate another review. (Rebased on trunk and force-pushed to the PR.)

@rhauch

Contributor Author

commented Feb 15, 2019

@ewencp, @wicknicks: would you mind reviewing the latest changes, and approving if you have no other concerns?

# Use %X{connector.scope} for the scope within a connector (e.g., "worker", "task", or "offsets")
# Use %X{connector.task} for the 1-based number of the task, or blank if not within a connector task
#
#log4j.appender.stdout.layout.ConversionPattern=[%d] %p %X{connector.name}|%X{connector.scope}%X{connector.task} %m (%c:%L)%n

@rhauch

rhauch Feb 15, 2019

Author Contributor

Should we enable this by default in the provided connect-log4j.properties?

@ewencp

ewencp Feb 15, 2019

Contributor

The log format is a public interface; if we do something like this, we'd need to consider the compatibility implications. (We're already probably more lax about log message changes than we really should be...)

@rhauch

rhauch Feb 15, 2019

Author Contributor

Ok, that's why I had only included it as a comment.

@wicknicks

wicknicks Feb 20, 2019

Contributor

In this example, should we add a | between scope and task?

@wicknicks
Contributor

left a comment

took another pass at this. some comments.

    LoggingContext context = new LoggingContext();
    MDC.put(Parameters.CONNECTOR_NAME, connectorName);
    MDC.remove(Parameters.CONNECTOR_TASK);
    MDC.put(Parameters.CONNECTOR_SCOPE, Scope.WORKER.toString());

@wicknicks

wicknicks Feb 20, 2019

Contributor

is there a reason to retain these different components? I think it might be easier if we only declared a single %X{connect.location}, and internally formatted it as we see fit. That way we can omit metadata when we think it is appropriate. A couple of other advantages I can think of (see the sketch after this list):

  • easier for users to reason about
  • when we add the formatting bits (| in the examples here), these show up in the logs no matter what the content of the MDC map is. For example: [2019-02-19 16:49:47,580] (main) INFO || Scanning for plugin classes..
  • optional variables will be easier to manage (right now, task numbers are not present in some cases).
  • In the future, if we decide to add more variables, then those will show up in the logs without users having to muck around with their log4j.properties files.
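
A hedged sketch of that suggestion (the connect.location key comes from the comment above; the composition and formatting are illustrative): the framework composes one value internally and exposes only a single MDC parameter to the pattern layout.

import org.slf4j.MDC;

public class SingleParameterSketch {
    // Compose one MDC value from the pieces so that log4j patterns only ever reference %X{connect.location}.
    static void setLocation(String connectorName, String scope, Integer taskNumber) {
        StringBuilder value = new StringBuilder("[").append(connectorName).append('|').append(scope);
        if (taskNumber != null) {
            value.append('-').append(taskNumber);   // omit the task number entirely when it does not apply
        }
        MDC.put("connect.location", value.append("] ").toString());
    }

    public static void main(String[] args) {
        setLocation("my-connector", "task", 0);        // yields "[my-connector|task-0] "
        setLocation("my-connector", "worker", null);   // yields "[my-connector|worker] "
    }
}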

    public void close() {
        MDC.clear();
        if (previous != null && !previous.isEmpty()) {
            MDC.setContextMap(previous);

@wicknicks

wicknicks Feb 20, 2019

Contributor

I think this would break if a thread directly sets MDC keys (maybe in a Connector) in between a LoggingContext.for..() and its close() call. For example:

log.info("blah 1");  // outputs "blah 1"
try (LoggingContext ignored = LoggingContext.forConnector("jello")) {
    MDC.put("hello", "world");
    log.info("blah 2"); // outputs "world jello|worker| blah 2"
}
log.info("blah-3"); // outputs *only* "blah-3", the "hello", "world" entry is removed from MDC map.

Maybe we want to just remove the keys we set to be safer overall?

@rhauch

rhauch Feb 20, 2019

Author Contributor

That was considered in an earlier version, and @mageshn suggested we do this.

@mageshn

mageshn Feb 20, 2019

Contributor

The only time a LoggingContext should be cleared is when a thread gets repurposed for a different reason. IIRC, the LoggingContext was not Closeable when I provided the original comment. If the general agreement is that it should be Closeable, then we've got to find a way to manage the state. IMO, it could become complicated trying to manage the state when MDC is already doing it in a ThreadLocal. Another interesting idea could be to invoke close() somehow as part of the plugin class loader swap; while not intuitive, it would ensure that we are clearing the context every time it's necessary. Thoughts?

@wicknicks

wicknicks Mar 11, 2019

Contributor

I think the source of the confusion arises from LoggingContext.close() calling MDC.clear. Removing specific keys instead would solve the problem.

@rhauch

rhauch Apr 3, 2019

Author Contributor

Actually, the purpose of LoggingContext is essentially to capture the MDC state so that it can be restored at a later time when LoggingContext.close() is called. The purpose is not just to manage the single MDC parameter in which we're now storing the Connect context. The close() method does call MDC.clear(), but it then sets the MDC state back to the same parameters that were found when the LoggingContext constructor was called.

Consider when this is used to set up a context under which a connector implementation method is called, and that connector implementation sets an MDC parameter. With the current implementation, when the logging context is closed, any changes to the MDC state will have been cleaned up properly, and the MDC state restored to exactly what it was when the context was first created.

@C0urante

C0urante Apr 15, 2019

Contributor

@wicknicks @rhauch @mageshn I think this behavior is acceptable. We may want to document this logging behavior somewhere so that nobody tries to set an MDC key in, e.g., Connector.start() and then refer back to it in Connector.stop(). However, at least from my POV it seems unreasonable and unintuitive to expect that to work in the first place.

@rhauch I like the shouldAllowNestedLoggingContexts test you've written. As long as we agree that the behavior we're discussing here is acceptable, could you either add a few extra MDC keys inside one of the nested contexts and verify that they're cleared once the context is closed, or add a new test similar to shouldAllowNestedLoggingContexts that deals specifically with the wiping of extra MDC keys once a context is closed?

@rhauch

rhauch Apr 15, 2019

Author Contributor

I think I see @wicknicks's point, though, and I think the current implementation might pose a problem. Consider an implementation of SourceTask that sets up an MDC parameter in the start(...) method and removes it in the stop(...) method, but wants any log messages written by the task class to include that MDC parameter. See this real-world example.

Connect currently runs each task in a separate thread via the Worker.executorService, and all of the task methods are called from within that thread. So I think it's safe to not restore the MDC state to exactly what it was when the LoggingContext was created, as suggested by @wicknicks.

However, when the task ends, it's possible the executor service will reuse the thread, so we need to clean that up in WorkerTask.run(), just in case the thread's MDC parameters contain state from a prior task.
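
A hedged sketch of that clean-up point (WorkerTask and LoggingContext.clear() are discussed in this PR, but this body is illustrative and not the actual diff): because the executor service may later hand the same pooled thread to a different task, any Connect MDC state left by a prior task is removed when run() begins.

public void run() {
    // The executor service may reuse this thread for a different task later, so begin by
    // removing any Connect-specific MDC parameters that a previous task may have left behind.
    LoggingContext.clear();
    // ... then establish this task's own logging context and run the task as usual ...
}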

@rhauch rhauch force-pushed the rhauch:kafka-3816 branch from 9f1908c to 1040aa8 Apr 2, 2019

@rhauch

Contributor Author

commented Apr 2, 2019

Rebased, and updated to reflect the updated proposal also described in KIP-449.

@rhauch

Contributor Author

commented Apr 12, 2019

Incorporated a few changes as suggested by @kkonstantine in the KIP-449 discussion.

@C0urante
Contributor

left a comment

Thanks for the PR, @rhauch! I've left a few comments but nothing major, feel free to address at your discretion.


@rhauch
Contributor Author

left a comment

Thanks, @C0urante. Left a few responses to your comments. I've also updated the PR to address the potential bug that @wicknicks brought up, where an MDC parameter set by a connector's task in the start() method (after a LoggingContext instance is created in the framework) will be removed by the LoggingContext.close() before the task's put(...) or poll(...) method is called.


workerMetricsGroup.recordConnectorStartupSuccess();
log.info("Finished creating connector {}", connName);
workerMetricsGroup.recordConnectorStartupSuccess();
}

@rhauch

rhauch May 14, 2019

Author Contributor

The bulk of the changes in this class are indentation due to wrapping in a new try-catch.

@guozhangwang
Contributor

left a comment

LGTM! (I like the way MDC is used).

@rhauch Could you rebase?

KAFKA-3816: Add MDC logging to Connect runtime.
Added LoggingContext as a simple mechanism to set and unset Mapped Diagnostic Contexts (MDC) in the loggers to provide for each thread useful parameters that can be used within the logging configuration. MDC avoids having to modify lots of log statements, since the parameters are available to all log statements issued by the thread, no matter what class makes those calls.

The design intentionally minimizes the number of changes to any existing classes, and does not use Java 8 features so it can be easily backported if desired.

rhauch added some commits Oct 10, 2018

KAFKA-3816: Minor changes to the connector kind MDC parameter
Renamed the `connector.type` MDC parameter to `connector.class`, since the value represents not “source” or “sink” (which we refer to elsewhere as the _connector type_) but rather a shortened version of the class name.
KAFKA-3816: Simplified the LoggingContext functionality
Redesigned the new LoggingContext functionality to remove all local state and to simplify the usage and implementation. Added a `LoggingContext.clear()` method that will remove all Connect-specific MDC parameters from the current thread’s MDC context, and used this in several places to ensure that the parameters don’t leak to other uses of the same thread.

Also removed the connector class MDC parameter, since this doesn’t add much value compared with the connector name.

@rhauch rhauch force-pushed the rhauch:kafka-3816 branch from 12f2bdd to 13a9a46 May 16, 2019

rhauch added some commits Feb 11, 2019

KAFKA-3816: Changed to use try-with-resources blocks for logging contexts
Make LoggingContext implement AutoCloseable so that instances can be created as resources within try-with-resources blocks and automatically cleaned up at the end of the block. However, because resources are cleaned up before the `catch` and `finally` blocks, it was often necessary to _add_ a new try-with-resources block rather than reuse an existing try-catch-finally block, so that log messages within the `catch` and `finally` blocks are still logged in the same context.
KAFKA-3816: Changed to use a single MDC parameter
A single MDC parameter is used for Connect, with the value being generated when the LoggingContext is created from various inputs. This retains the same information, but exposes only a single parameter to the user. Note that this does not allow the user to control the format of the additional context strings, but that’s probably an advantage since we can standardize the format and the content.
KAFKA-3816: No longer clear the MDC state when LoggingContext is closed
The previous approach had a potential bug: if a connector’s task were to set MDC parameters in its `start()` method (after a `LoggingContext` instance is created in the framework), then the `LoggingContext.close()` call within the framework would remove the connector’s MDC parameters before the framework creates a new `LoggingContext` and calls the task's `put(...)` or `poll(...)` method. The change is to only clear out the MDC parameters when the task starts, at which point any MDC parameters set by a previous connector’s task are removed. Note that tasks are executed with an ExecutorService, which means it’s possible (depending upon the implementation) that some threads are reused.
@rhauch

Contributor Author

commented May 16, 2019

@guozhangwang thanks for the review! Rebased on latest.

@guozhangwang guozhangwang merged commit b395ef4 into apache:trunk May 16, 2019

0 of 2 checks passed

JDK 11 and Scala 2.12 Build started for merge commit.
JDK 8 and Scala 2.11 Build started for merge commit.

Pengxiaolong added a commit to Pengxiaolong/kafka that referenced this pull request Jun 14, 2019

KAFKA-3816: Add MDC logging to Connect runtime (apache#5743)
See https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs

Added LoggingContext as a simple mechanism to set and unset Mapped Diagnostic Contexts (MDC) in the loggers to provide for each thread useful parameters that can be used within the logging configuration. MDC avoids having to modify lots of log statements, since the parameters are available to all log statements issued by the thread, no matter what class makes those calls.

The design intentionally minimizes the number of changes to any existing classes, and does not use Java 8 features so it can be easily backported if desired, although per this KIP it will be applied initially only in AK 2.3 and later and must be enabled via the Log4J configuration.

Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

jsancio pushed a commit to jsancio/kafka that referenced this pull request Aug 6, 2019

Merge remote-tracking branch 'confluent/master' into sync-upstream-16-may

* confluent/master:
  Add '?expand' query param for additional info on '/connectors'. (apache#6658)
  KAFKA-8376; Least loaded node should consider connections which are being prepared (apache#6746)
  KAFKA-8256; Replace Heartbeat request/response with automated protocol (apache#6691)
  KAFAK-3522: Add TopologyTestDriver unit tests (apache#6179)
  KAFKA-8347: Choose next record to process by timestamp (apache#6719)
  MINOR: Added missing method parameter to `performAssignment` javadoc (apache#6744)
  KAFKA-8220; Avoid kicking out static group members through rebalance timeout (apache#6666)
  KAFKA-3816: Add MDC logging to Connect runtime (apache#5743)