Write data structures to your logs from clojure
Clojure Shell
Permalink
Failed to load latest commit information.
jenkins
src/puppetlabs/structured_logging
test/puppetlabs/structured_logging
.gitignore Initial revision Sep 29, 2015
.travis.yml
LICENSE
README.md (maint) README: really fix i18n link Nov 15, 2016
project.clj Version 0.2.1-SNAPSHOT Nov 15, 2016

README.md

puppetlabs/structured-logging Clojars Project Build Status

structured-logging is a library that helps you to:

  • write arbitrary JSON to your logs, making it easier to interface with log analysis tools.
  • write messages to arbitrarily named loggers, instead of just the one named after the namespace in which the log statement appears.

It is built on clojure.tools.logging, but it only works with logback.

Usage

maplog

Structured log entries are created by calling maplog like this:

(maplog :warn {:user-service "https://...", :status 503, :elapsed 27}
        #(format "Failed to query user-service %s. Response: status %d"
                 (:user-service %) (:status %)))

The second parameter is the structured data you want to log, and any clojure map is ok, as long as Cheshire can handle it.

The log message itself is generated by passing the structured data to the function provided as the final argument. This provides a general way to support arbitrary formatters, including some like those in clj-i18n, which require that their invocations appear literally, at each translation call site, i.e.:

(maplog :info {:temperature t :stamp (now)}
        #(i18n/trs "Temperature has exceeded {0} at {1}"
                   (:temperature %) (:stamp %)))

Named loggers

When you would pass a log-level parameter, you may instead supply a vector of [:custom-logger :level]. This will log a message to the logger named "custom-logger". Then you can specifically address that logger in your logback configuration (to redirect it to a different log file, for example).

Common fields

If you use the recommended logger configuration, as described below, you will see the following fields in each JSON log message:

  • @timestamp
  • message
  • logger_name
  • thread_name
  • level
  • level_value (numeric, suitable for sorting)
  • stack_trace

Additional relevant fields may be added in any given message, but this base set will always be present.

Configuration

JSON text

If you want to log JSON data where you would otherwise log regular text, replace the encoder element in your logback.xml with this one:

<encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
  <providers>
    <timestamp/>
    <message/>
    <loggerName/>
    <threadName/>
    <logLevel/>
    <logLevelValue/>
    <stackTrace/>
    <logstashMarkers/>
  </providers>
</encoder>

Even though this says 'logstash' on it, it works completely independently from any log aggregation system. The final <logstashMarkers/> element is important - that inserts our custom properties into each json message.

Logstash integration

You can also log directly to logstash with an appender configured like this:

<appender name="stash" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
  <remoteHost>my.great.logstash.server.com</remoteHost>
  <port>4560</port>

  <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
    <providers>
      <timestamp/>
      <message/>
      <loggerName/>
      <threadName/>
      <logLevel/>
      <logLevelValue/>
      <stackTrace/>
      <logstashMarkers/>
    </providers>
  </encoder>
</appender>

You will also need to add a reference to the appender from the <root> element:

<root>
  ... 
  <appender-ref ref="stash" />
</root>

Log Output

We are using a json encoder that comes with the logstash-logback integration library. (https://github.com/logstash/logstash-logback-encoder) This can be used in conjunction with the logstash tcp appender, or in conjunction with a typical file appender. In either case, the json for the above log message looks something like this:

{
  "@timestamp": "...",
  "level": "WARN",
  "user-service": "https://...",
  "status": 503,
  "elapsed": 27
}

The exact fields that show up depend on how you've configured the encoder; '@timestamp' is a logstash convention.

Answers to expected questions

Can I embed arbitrary structures in there?

You can log anything that cheshire can serialize, but you may wish to stick to simple key-value formats to keep your logs easy to analyze.

What happens when I log a map with :level as the key?

Don't do that. It's not yet clear how this case should be handled. For now, avoid such keys. Namespacing your map keys may be prudent if you're worried about collisions.

Does this library require logstash?

No. It does depend on the json encoder that comes with the logback-logstash integration library, but you can point the json it spits out at any old logback appender. Critically, it has a feature that allows us to edit the json before it gets written, which is where we add our custom information. We could write such an encoder ourselves, but they already wrote it.

Why didn't you use one of the other logback-json encoders?

Because those just encode the normal log event fields as json; the useful feature here is the ability to include arbitrary data as part of the json.

Tips

Structured logging is a slippery slope and should be applied where it has clear benefits. While we've streamlined it a lot, this is heavier than regular logging, and it's still non-application code that you have to scatter throughout your program.