Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explore use of SLF4J in Logstash and default to Log4j #4548

Closed
suyograo opened this issue Jan 21, 2016 · 26 comments
Closed

Explore use of SLF4J in Logstash and default to Log4j #4548

suyograo opened this issue Jan 21, 2016 · 26 comments

Comments

@suyograo
Copy link
Contributor

The current logging library in Logstash is Cabin, created by @jordansissel to primarily overcome the lack of structured logging capabilities in ruby logger. More information on cabin features can be found in its readme.

There are some additional requirements for logging in LS that needs to be addressed:

We could add all these features to Cabin, but most of these requirements are already solved by libraries like log4j2 which can be run on JRuby. Better yet, we can introduce SLF4J which is an abstraction layer over logging libraries and we can ship a default Log4J implementation. SLF4J and Log4J both support structured logging: http://www.slf4j.org/api/org/slf4j/MDC.html so this migration should be easy.

@ph
Copy link
Contributor

ph commented Jan 21, 2016

Dynamic update of log level: Introduce an API to control log levels dynamically, just like Elasticsearch allows users to.

❤️

@jordansissel
Copy link
Contributor

We could add all these features to Cabin

I agree, and that would be lovely. However, with some of Logstash being implemented in Java now and in the future, we'd have to maintain a Cabin->Java bridge that maybe isn't great. It might be easier to move to log4j or something similar.

SLF4J's MDC isn't as simple as Cabin's structured logging, but I believe we can implement a wrapper for SLF4J (or log4j) that makes contextual logging calls take fewer lines of code.

@colinsurprenant
Copy link
Contributor

+1 on the initiative. This would also potentially provide similarities/consistence with ES in terms of logging configuration?

+1 on a thin sugar JRuby wrapper on top of SLF4J

@jordansissel
Copy link
Contributor

As an implementation/api detail, I think Cabin's main fault is the required creation of hashes every time you pass context.

I think this could be solved by doing a varargs method instead of requiring a map:

# Today:
logger.info("message", { some => hash })

# Future?
logger.info("message", key1, value1, key2, value2, key3, value3, ...)

The above would (hopefully) not require creation of a hash every time the method is invoked. I haven't done any performance tests to see the impact of this (memory and speed), but wanted to document this idea here. This is especially important for logging from Java because there's no easy "map literal" syntax in Java, iirc.

@colinsurprenant
Copy link
Contributor

as an fyi there is a json formatter for logback (+ slf4j) https://github.com/qos-ch/logback-contrib/tree/master/json

wonder if we could provide a varargs Ruby api as @jordansissel mentioned and offer pluggable slf4j logging with preloaded formatters for a) current log format support and b) json output using logback+json ?

@gmoskovicz
Copy link
Contributor

+1 for having log rotation configuration (and defaults) as we do in Elasticsearch!

+1 for using Log4j or any other tool that can do this automatically!

@driverpt
Copy link

driverpt commented May 17, 2016

@suyograo , i recommend using rjack slf4j

After all, what we want is an abstraction layer over the Logging Layer.

@guyboertje
Copy link
Contributor

I am +1 for logback + cabin features

I feel, as we develop more java sections to LS, we are going to have a real problem logging in the same structured way as we do now in Ruby land. I am thinking expressly of @ph rewrite of the beats input internals in Java - how the hell is he going to log structurally from Java?

@driverpt
Copy link

driverpt commented May 17, 2016

@guyboertje, i think Cabin should be abandoned in favour of a Logging Facade in Logstash. I suggest overloading methods to JSON Serialize custom objects or even apply a codec to do it.

Imho there should be a bridge between Ruby and Java. Of course the log layout will have to change in order to do this but it will become unified and clean.

@guyboertje
Copy link
Contributor

@driverpt - I am suggesting "Cabin like" features in a Java wrapper/facade.

@driverpt
Copy link

@guyboertje why not "Ruby Logger" like features ?

http://ruby-doc.org/stdlib-2.1.0/libdoc/logger/rdoc/Logger.html. It's not that hard to do. Only difference is to add alias_method from fatal to error.

@jordansissel
Copy link
Contributor

Some brief research into seeing what libraries can support us:

  • log4j2: context available via ThreadContext is <String,String> mapping only. Values cannot be complex objects, even if they are serializable.
  • slf4j: http://logback.qos.ch/manual/mdc.html -- <String,String> just like log4j.

@talevy and I are noodling about what kind of interface we can provide (ignoring the implementation details) and he likes a Builder approach:

logger.builder(Logger.INFO).set("key", value).set("key2", value2).log("Hello world")

I initially tried a simpler model of

logger.info("Hello world", "key", value, "key2", value2, ...)

However, we lose type checking and other compile-time validation, so that's no good. I like @talevy's builder approach. Still, we need to figure out if any of the existing log libraries even will work for us.

To do:

  • See if we need collections as values for logging. For example, would we want to log the entire Event object? Or an array of things?

@jordansissel
Copy link
Contributor

More research:

We may be able to move forward with only using <string,string> map for context, but if we think we need more complex objects, I doubt any existing java logging library will help us.

@jordansissel
Copy link
Contributor

More research --

log4j2's LogEvent is an Serializable interface , so we could possibly implement our own Logger that emits custom LogEvents that serializes well and forget whatever <String,String> limitations the default LogEvent imposes.

@jordansissel
Copy link
Contributor

log4j2 is quite the labyrinth, haha. Ok, so, @talevy's idea right now is to ship an Object via log4j2's ObjectEvent, and we provide a custom serializer (to json) via a custom Layout implementation. This would let users choose their preferred Appender strategies, let us deliver JSON logs, and hopefully let us log complex object types (collections, etc).

More news soon.

@driverpt
Copy link

driverpt commented May 17, 2016

@jordansissel , why complicate? Baby steps, first iteration do something similar to Logstash-core-event-java.

  • Create sub-project logstash-core-logging (Logger.java) with slf4j bridge to logback or log4j2
  • Create logger.rb bridging to Logger.java (SLF4J)
  • Override the methods in logger.rb to convert Ruby logging to Java Logging

@jordansissel
Copy link
Contributor

I am not sure what you mean. The main concern is being able to log
structured data, and log4j, slf4j, and log back MDC are all strings only.

On Tuesday, May 17, 2016, Luís Duarte notifications@github.com wrote:

@jordansissel https://github.com/jordansissel , why complicate? Baby
steps, first iteration do something similar to Logstash-core-event-java.

  • Create logger.rb and Logger.java
  • Create sub-project logstash-core-logging with slf4j bridge to
    logback or log4j2
  • Override the methods in logger.rb to convert Ruby logging to Java
    Logging


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#4548 (comment)

@driverpt
Copy link

driverpt commented May 17, 2016

Thats why you should keep the Strings and just serialize the Ruby Objects in the ruby bridge when calling for example @logger.log(some_hash, some_hash_2)

@jordansissel
Copy link
Contributor

I am not sure I am making the problem clear. Structured data logging is not
a ruby problem. We will have it in Java also.

On Tuesday, May 17, 2016, Luís Duarte notifications@github.com wrote:

Thats why you should keep the Strings and just serialize the Ruby Objects
in the ruby bridge.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#4548 (comment)

@guyboertje
Copy link
Contributor

guyboertje commented May 18, 2016

@talevy @jordansissel - Y'all have seen this, https://github.com/logstash/logstash-logback-encoder/tree/master, right?

Not proposing to use it - just analysis or inspiration

@jordansissel
Copy link
Contributor

@suyograo
Copy link
Contributor Author

suyograo commented May 24, 2016

Rough notes from a meeting with @talevy @colinsurprenant @guyboertje and @ph

  • Agreement to use Log4j2 to be consistent with ES, rather than logback.
  • Use logging.yml config like ES

Use Cabin like interface so we don’t have to update plugins of existing usage

Proposal to use Cabin4j: A new wrapper with Cabin like interface written in Java. This can be used in both JRuby and Java land. None of the existing plugins need to change. Reimplement internally use Log4j. Pure Java and JRuby should use the same logger object. Look into using https://github.com/lenny/log4jruby and if the package registration magic is simple enough, we wouldn't need this extra dependency on log4jruby.

Support Structured Logging

Cabin4j.getLogger().info("some message", some, context, here)

In theory, the indirection level should look like this:

  • From existing Ruby
Cabin4j.getLogger()
    |
    Log4jruby (Jruby bridge)
           |
           Log4j2 
  • From pure Java classes:
Cabin4j.getLogger()
     |
     Log4j2

Next steps:

@driverpt
Copy link

driverpt commented May 25, 2016

@suyograo , i still believe that you should use SLF4J Facade instead of building your own. It's pretty easy to create Bridges in SLF4J.

What i would do is:

  1. Create a logger.rb that acts as a Wrapper for Logger
  2. Create Java <> Ruby bridge to LogstashLogger.java (Similar to existent from event.rb to Event.java)
  3. Create a SLF4JBridgeHandler that Bridges LogstashLogger -> SLF4J Facade
  4. Create an Adaptation Layer that JSON Serializes the Objects that are passed as parameters before passing the String to the Logging Framework.
  5. Use Log4J2 as Underlying Logging Framework

See this

In my honest and humble opinion, Cabin4J is a nice effort, good solution, but i believe that using SLF4J that is a de-facto standard in the Java World would make it easier to manage the logging.

You might even give the opportunity to the Developers to use their own Logging Framework, e.g.: Loggly, Logsene, etcd....

@fbaligand
Copy link
Contributor

Several things about slf4j / log4j2 :

  • MDC doesn't aim to help making structured logging. It aims to help adding contextual data (such as current connected user, http request id, ...) to current thread so that it is added on every log line written by current thread. One important thing : all data put in MDC will stay attached to the thread until an explicit call to remove it from MDC.
  • to log some structured data, you can use a MapMessage or a StructuredDataMessage. The Logger methods (info, warn, error, ...) can take it as an argument.
  • These 2 structures allow only Map<String, String> input data. If you want something more custom, you can create your own class implementing Message interface. Making that, you can manage a specific logstash input object and specify how it is rendered in log, implementing getFormattedMessage method.
  • Even if I like very much SLF4J, SLF4J allows only to log a "String" message.
  • Finally, if you want logs to be rendered in json, you can configure JSONLayout.

@talevy
Copy link
Contributor

talevy commented May 31, 2016

@fbaligand thanks for the info!

Since we want to allow for more structure than just a Map<String, String> I have explored two options.

  1. implement custom Message class and use a custom json layout similar to JSONLayout
  2. implement custom LogEvent, as well as a custom LogEventFactory. This way we do not have to re-invent the wheel around the JSONLayout and we have control over how the LogEvent is serialized using this layout. The downside of this strategy is that you have to set a system property to get this going

something like this:

System.setProperty("Log4jLogEventFactory", CustomLogEventFactory.class.getName());

I wish things like JSONLayout were not final so that there could be more code-reuse. Overall, I do not think it is too bad to extend Log4j2 for more structured logging in json.

@suyograo
Copy link
Contributor Author

suyograo commented Jul 19, 2016

Closing in favor of #5650 Implementation work is happening there..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants