Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: ChannelLogger #5024

Merged
merged 11 commits into from Nov 7, 2018

Conversation

@zhangkun83
Copy link
Contributor

commented Nov 2, 2018

Introduce ChannelLogger, which is a utility provided to LoadBalancer implementations (potentially NameResolvers too) for recording events to channel trace. This is immediately required by client-side health checking (#4932, https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md) to record an error about disabling health checking. It is also useful for any LoadBalancer implementations to record important information.

ChannelLogger implementation is backed by the internal ChannelTracer/Channelz. Because Channelz limits the number of retained events, and events are lost once the process ends, I have expanded it to also log Java logger. This would provide a "last resort" in cases where there are too many events or off-line investigation is needed. All logs are prefixed with logId so that they can be easily associated with the involved Channel/Subchannel.

To prevent log spamming, the logs are all at FINE level or below so that they are not visible by default. They are logged to ChannelLogger's logger, so that user can have precise control.

There are also more verbose information that may not fit in ChannelTracer, but can be useful for debugging. It's desirable that these logs are associated with logId, but they currently manually include the logId, which is cumbersome and may result in inconsistency. For this use case, I added the DEBUG level for ChannelLogger, which formats the log in the same way as other levels, while not recorded to Channelz.

I have converted most logging and channel tracer recording in the Channel implementation and LoadBalancers.

zhangkun83 added 6 commits Nov 1, 2018
* A Channel-specific logger provided by GRPC library to {@link LoadBalancer} implementations.
* Information logged here goes to <string>Channelz</strong>, and to Java logger as well.
*/
public abstract class ChannelLogger {

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 2, 2018

Contributor

is this threadsafe?

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

Yes. Added annotation.

* +---------------------+-------------------+-------------------+
* | ChannelLogger Level | Channelz Severity | Java Logger Level |
* +---------------------+-------------------+-------------------+
* | DEBUG | N/A | FINEST |

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 2, 2018

Contributor

This table should be encoded as arguments to the enum valuies:

public enum Level {
  DEBUG(null, "CT_INFO", Level.FINEST),
}

That way the mapping can be without a switch statement below.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

I actually need to reference the CT_INFO enum to make this useful and robust, but it's internal API and I don't see how I can do it without exposing it.

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 5, 2018

Contributor

It's suffiicient to make this a String instead of a hard ref, and then have a test that assert they all work.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

For these fields to be accessible from ChannelLoggerImpl, they would need to be public, which technically exposes these CT_ enums even not referencing them directly. Doesn't seem enough benefit from doing that.

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 5, 2018

Contributor

These could be exposed via an internal accessor.

The benefit is that we can use hard references to see exactly how they map, rather than relying on switch statements.

}
};
ChannelTracer(
InternalLogId logId, final int maxEvents, long channelCreationTimeNanos, String description) {

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 2, 2018

Contributor

You should document the behavior change of maxEvents

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

Done.

return super.add(event);
}
};
ChannelTracer(

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 2, 2018

Contributor

Can description be null?

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

No one is passing null, so I just added the null check.

* +---------------------+-------------------+-------------------+
* </pre>
*/
public enum Level {

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 2, 2018

Contributor

This probably deserves a name like ChannelLevel, since it behaves differently than other "Levels".

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

Renamed to ChannelLogLevel.

@@ -32,12 +32,13 @@
import com.google.errorprone.annotations.ForOverride;
import io.grpc.Attributes;
import io.grpc.CallOptions;
import io.grpc.ChannelLogger;
import io.grpc.ChannelLogger.Level;

This comment has been minimized.

Copy link
@ejona86

ejona86 Nov 2, 2018

Member

I'd suggest not importing this and always referencing it via ChannelLogger.Level. Otherwise it is pretty confusing and you can't import java.util.logging.Level.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

As suggested by @carl-mastrangelo, I have renamed Level to ChannelLogLevel so there is no collision.

@@ -320,8 +314,7 @@ void shutdown() {
}

void propagateError(Status status) {
logger.log(Level.FINE, "[{0}] Had an error: {1}; dropList={2}; backendList={3}",
new Object[] {logId, status, dropList, backendList});
logger.log(Level.DEBUG, "Error: " + status);

This comment has been minimized.

Copy link
@ejona86

ejona86 Nov 2, 2018

Member

We really want to know whether or not this log message will be used, to avoid the string handling when useless.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 2, 2018

Author Contributor

Since ChannelLogger is not on data path and is called pretty infrequently, I am not sure it's worth the code clutter to wrap it with a check.

This comment has been minimized.

Copy link
@ejona86

ejona86 Nov 2, 2018

Member

Things like logger.log(Level.DEBUG, "Got an LB response: " + response); could still be large and run on the network thread. Also, some LBs may do RPCs on cache misses, which could want to log to DEBUG.

I'm a bit worried about how much is getting pulled into this API. There's a lot of data being logged. Like every time the picker changes with logger.log(Level.INFO, state + ": picks=" + picker.pickList + ", drops=" + picker.dropList);. I know that isn't on the critical path, but that still happens frequently. And most of the time nobody will see it.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 2, 2018

Author Contributor

What about log(Level level, String template, Object... args)?. For now I can check the loggable level in the implementation and that would save the string generation for DEBUG. We could optimize ChannelTracer even later to generate strings only when Channelz is requested.

This comment has been minimized.

Copy link
@ejona86

ejona86 Nov 5, 2018

Member

log(Level, String, Object...) seems fine to me. It'd probably be an overload (so log(Level, String) would still exist) to avoid allocating the Object[] when unnecessary.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 5, 2018

Author Contributor

Done.

zhangkun83 added 2 commits Nov 5, 2018
* Logs a message, using a message template and a list of arguments used to generate the log
* message with {@link String#format}.
*/
public abstract void log(ChannelLogLevel level, String template, Object... args);

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Nov 5, 2018

Contributor

Isn't this format different than what normal loggers use? (i.e. {}) ?

I think this should be something that fits in with other, similar log statements. (I.e. it should do what SimpleFormatter does)

This comment has been minimized.

Copy link
@ejona86

ejona86 Nov 6, 2018

Member

It should use MessageFormat.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 6, 2018

Author Contributor

Used MessageFormat. PTAL.

@ejona86
ejona86 approved these changes Nov 6, 2018
tracer.logOnly(Level.FINEST, MessageFormat.format(messageFormat, args));
}
} else {
reportToTracer(level, MessageFormat.format(messageFormat, args));

This comment has been minimized.

Copy link
@ejona86

ejona86 Nov 6, 2018

Member

This will create the event and format the message even when tracing is disabled.

This comment has been minimized.

Copy link
@zhangkun83

zhangkun83 Nov 7, 2018

Author Contributor

Done.

@zhangkun83 zhangkun83 merged commit 6b48eb4 into grpc:master Nov 7, 2018

13 checks passed

Android Kokoro build finished
Details
Bazel Kokoro build finished
Details
GAE Interop Kokoro build finished
Details
Linux artifacts Kokoro build finished
Details
Macos Kokoro build finished
Details
Windows Kokoro build finished
Details
android/apk_diff New APK size in bytes: 1,013,865 (delta: 744)
android/dex_diff New DEX reference count: 12,828 (delta: 28)
cla/linuxfoundation zhangkun83 authorized
Details
codecov/patch 94.61% of diff hit (target 80%)
Details
codecov/project 83.6% (+0.14%) compared to f8f86da
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.1%) to 87.766%
Details

@zhangkun83 zhangkun83 deleted the zhangkun83:lb_helper_record_trace branch Nov 7, 2018

@lock lock bot locked as resolved and limited conversation to collaborators Feb 5, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants
You can’t perform that action at this time.