
Logging Rethink #397

Closed
mcdonnelldean opened this issue Apr 12, 2016 · 28 comments


@mcdonnelldean
Contributor

mcdonnelldean commented Apr 12, 2016

Logging needs a rethink. My thoughts: the current logs

  • Contain a lot of unneeded information
  • Don't capture highly valuable internal information and stats
  • Are not JSON based
  • Have truncation errors, meaning they are lossy
  • Aren't easily pipeable

In terms of logging, I would love to get a week of @mcollina's time to bring in pino. Pino is wicked fast and has sub loggers, which dovetails perfectly with our plugin structure. I'd love to see a spike with three key goals:

  • Bring in pino, and sub logging for plugins
  • Rework core stats and information output
  • Provide a proxy for old logging so we don't break the universe

A spike could at least give us something to discuss. Nodezoo is a great system to play around with logging since you can run each service in isolated mode and manipulate it via curl or postman.
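To make the sub-logger idea concrete, here is a minimal plain-Node sketch of what per-plugin child loggers look like over newline-delimited JSON. This mirrors the concept behind pino's `logger.child()`, but it is not pino's implementation; `makeLogger` and the field names are illustrative only.

```javascript
'use strict'

// Minimal sketch: each plugin gets a child logger whose bindings
// (e.g. the plugin name) are merged into every JSON log line.
function makeLogger (bindings, write) {
  write = write || function (line) { process.stdout.write(line + '\n') }
  return {
    info: function (msg, extra) {
      var entry = Object.assign({ level: 'info', time: Date.now(), msg: msg }, bindings, extra)
      write(JSON.stringify(entry))
    },
    // A child logger inherits the parent's bindings and adds its own.
    child: function (extraBindings) {
      return makeLogger(Object.assign({}, bindings, extraBindings), write)
    }
  }
}

var root = makeLogger({ app: 'seneca' })
var pluginLog = root.child({ plugin: 'mesh' })
pluginLog.info('plugin loaded')
```

Every line the plugin logger emits carries both the app-level and plugin-level bindings, so logs stay greppable per plugin without each plugin configuring its own logger.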

Contributors and users, please feel free to add pain points and constructive feedback.

Agreed Detail

Feel free to correct in a comment below

  • Logging will be object based
  • Loggers will need to be API compatible with bunyan / pino
  • An adapter will need to be provided for libs like Winston, etc.
@mcdonnelldean mcdonnelldean changed the title Logging and Error handling Logging Rethink Apr 12, 2016
@mcollina
Contributor

Here are my priorities about logging in Seneca:

  1. logs need to be in newline-delimited JSON, because that is a portable format that is easy to process
  2. most of the smarts about logging will happen outside of Seneca through post-processing
  3. I want to be able to extract from a set of logs all lines that are relevant to a specific transaction (=== a flow of acts) via a post-processing tool. This is extremely useful for reconstructing production situations. This should also work across transports.
  4. I would like to use a child logger assigned to every transaction (if possible), and make that available to users.
  5. Old logging is so useless we can just remove it with a major version bump. Backward compatibility here is not a priority for me.
  6. I would use pino, because it's wicked fast.

Let me know what you think of these.
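The post-processing idea in point 3 can be sketched in a few lines: given newline-delimited JSON logs, pull out every line belonging to one transaction. Note the `tx` field name here is an assumption for illustration, not an agreed Seneca field.

```javascript
'use strict'

// Sketch: filter NDJSON log text down to the entries for one transaction id.
function extractTransaction (ndjson, txId) {
  return ndjson
    .split('\n')
    .filter(Boolean)
    .map(function (line) { return JSON.parse(line) })
    .filter(function (entry) { return entry.tx === txId })
}

var logs = [
  '{"tx":"abc","msg":"role:user,cmd:login"}',
  '{"tx":"def","msg":"role:mail,cmd:send"}',
  '{"tx":"abc","msg":"role:user,cmd:login,ok"}'
].join('\n')

console.log(extractTransaction(logs, 'abc').length) // 2
```

Because the logs are plain NDJSON, the same filtering works whether the lines came from one process or were aggregated across transports.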

@AdrianRossouw
Contributor

I especially agree with point 5 that Matteo made. I seriously doubt we have anything that is busy consuming the existing Seneca logging, so the likelihood of us breaking anything by just replacing it outright is minuscule.

@mcdonnelldean
Contributor Author

@mcollina @AdrianRossouw Understand, though, that this will require changes to plugins before they become usable. I'm happy with either call, but we need to assess the impact on plugins.

@mcollina
Contributor

So, I think there is a point in having a child logger for each plugin, but we still need to log each message id.

Can we get rid of --seneca.log=level:info,type:plugin,handler:print and support that behavior via a transform applied by some other process (maybe we can even add some of this behavior to the pino CLI)?

What is the current support for tracing the correlation between messages across services?

@mcdonnelldean
Contributor Author

Yeah, so feedback on the ground seems to be that people expect logging to work like other loggers: you connect to it and have various ways to mulch the data. That's pretty much how all loggers do it, and I would strongly consider standardising on that like you say @mcollina

Bear in mind we have two concerns: that of the framework and service, and that of the business logic. If I can pull logs out for both of these in all sorts of ways and put them in various places, I think it will be for the better.

Also, the whole point of Eraro is to enable better logging through a little bit of config. Does pino support any sort of mapping? This is usually a business-logic-level log concern; nevertheless we need some way to express errors to the logs in an easy format.

My understanding is correlation is done via nested contexts. If you use this instead of a root instance of seneca, the ids are correlated. It may be more robust than that, but that's the basic gist of it.

In terms of tracing support, we have msgstats built in, but it is not at the same sort of level as transforming logs.

@mcollina
Contributor

> Yeah, so feedback on the ground seems to be that people expect logging to work like other loggers: you connect to it and have various ways to mulch the data. That's pretty much how all loggers do it, and I would strongly consider standardising on that like you say @mcollina

I'd say something slightly different. I do not want a plethora of transports/rotation/etc. to be reinvented and used, which is what the other loggers do. Pino is built on the assumption that all of this post-processing will happen outside of the main Node process. Node just logs to stdout or to a file. This fits a cloud/microservice deployment very well, which is our main target.

> Bear in mind we have two concerns: that of the framework and service, and that of the business logic. If I can pull logs out for both of these in all sorts of ways and put them in various places, I think it will be for the better.

Can you please make an example for this?

> Also, the whole point of Eraro is to enable better logging through a little bit of config. Does pino support any sort of mapping? This is usually a business-logic-level log concern; nevertheless we need some way to express errors to the logs in an easy format.

Isn't Eraro for errors? We can customize the error stacks by changing the err serializer: https://github.com/mcollina/pino#errSerializer. The same mapping can be applied to any property we want to add to our log line. Basically, we can easily represent some state.
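As a sketch of the serializer idea: pino lets you replace how an error object is turned into JSON, so a custom serializer could fold Eraro-style extra fields into the log line. The `code` and `details` field names below are assumptions about Eraro's shape, for illustration only.

```javascript
'use strict'

// Hypothetical err serializer: turns an Error (possibly decorated by
// Eraro with extra fields) into a plain object for a JSON log line.
function errSerializer (err) {
  return {
    type: err.constructor.name,
    message: err.message,
    stack: err.stack,
    code: err.code,       // assumed Eraro-style error code, if present
    details: err.details  // assumed Eraro-style details object, if present
  }
}

// With pino this would be wired up roughly as:
//   var logger = require('pino')({ serializers: { err: errSerializer } })
var e = new Error('seneca: act failed')
e.code = 'act_failed'
console.log(JSON.stringify(errSerializer(e)).slice(0, 40))
```

The same pattern applies to any property we want to normalize on the way into the log line, not just errors.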

> My understanding is correlation is done via nested contexts. If you use this instead of a root instance of seneca, the ids are correlated. It may be more robust than that, but that's the basic gist of it.
> In terms of tracing support, we have msgstats built in, but it is not at the same sort of level as transforming logs.

Can you please explain nested contexts a bit? Can we make them easier to use?
I think we should have a better way of supporting tracing, maybe even with a breaking change; it's highly important and would simplify adoption.

@mcdonnelldean
Contributor Author

@mcollina @davidmarkclements This is the logging thread. Can you add your additions here? Matteo, it might be an idea to put down the pros and cons you mentioned on the call.

Right now I would prefer something that gives me the easiest access to analysis; otherwise logs are not terribly useful (so I agree with you). Let's get them here for clarity.

@mcollina
Contributor

Here it is @mcdonnelldean, folks, please provide your input before we start pursuing one direction or another :).

We have some goals:

  1. JSON logging by default
  2. some sort of logging standard for tools to process the logs
  3. extremely lightweight

So there are two options:

  1. adopt another logger API
  2. develop a simple wrapper around loggers

Adopt another logger API

The JSON loggers are bunyan, bole and pino. Bunyan and Pino have the same API, so they are easily swappable. Bole uses globals, so I would discard it straight away.

Pro:

  • we can have tooling (maybe on top of ELK, Logentries or Cloudwatch, or just bash)
  • we can have swappable loggers

Con:

  • we can't support Winston
  • we are stuck with JSON logging

Develop a simple wrapper around loggers

As an example, Hapi offers a logging facility which just emits a 'log' event on the server object. This introduces overhead.

Pro:

  • you can write your own logger, hurray!
  • adapters of adapters of adapters solve issues in enterprise contexts

Cons:

  • adapters of adapters of adapters are slow
  • we still need to provide a default, which will be used by 95% of the users
  • configuring loggers is usually awful
  • less flexibility (child loggers...)

@mcdonnelldean
Contributor Author

Adopting a logger is my choice. Logging is a huge area of maintenance that I don't want to deal with if possible. I'd rather have a powerful, easy-to-consume logger than a swappable one. @rjrodger thoughts?

@davidmarkclements
Contributor

I'm not convinced the event emitter wrapper is a good idea. It will degrade performance, and the additional indirection will make tracing the origin of a log message more complicated.

We probably do need some kind of interface though, but purely for instantiating a logger

e.g.

seneca.set('logger', ([stream], [opts]) => ({child, level, fatal, error, warn, info, debug, trace}))

This would support pino and bunyan immediately, e.g. seneca.set('logger', require('pino')),
and other loggers in the family (as it were) could have simple adapters written.

I'm gonna +1 the JSON format, but using the above means loggers can be swapped out for another preferred format (e.g. tabulated). Whilst backwards compat isn't a concern, this would at least create a path for it.

Everything beyond this point could be internal implementation.
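To show what "simple adapters" might look like, here is a sketch of an adapter exposing the proposed interface ({child, level, fatal, error, warn, info, debug, trace}) on top of a plain write function. The names and shape are illustrative, not a final API.

```javascript
'use strict'

// Sketch: adapt any line-oriented sink to the proposed logger interface.
function adapt (write, bindings) {
  bindings = bindings || {}
  var logger = { level: 'info' }
  var levels = ['fatal', 'error', 'warn', 'info', 'debug', 'trace']
  levels.forEach(function (level) {
    logger[level] = function (msg) {
      write(JSON.stringify(Object.assign({ level: level, msg: msg }, bindings)))
    }
  })
  // child() returns a new adapter with merged bindings, as pino/bunyan do.
  logger.child = function (extra) {
    return adapt(write, Object.assign({}, bindings, extra))
  }
  return logger
}

var log = adapt(function (line) { console.log(line) })
log.info('hello')
log.child({ plugin: 'web' }).warn('slow response')
```

An adapter like this is what would let a non-pino logger satisfy the interface without Seneca knowing which logger is behind it.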

@mcollina
Contributor

I think we can even go with seneca.set('logger', pino()). That will be simpler to configure, given that we won't have to do it in Seneca. To reiterate, I think we should not bother with instantiation (apart from the default one).

@mcdonnelldean
Contributor Author

@davidmarkclements @mcollina I'm happy with the spirit of this. Seneca doesn't have a .set method (that I am aware of) at the moment. This would be the first time Seneca has an internal 'thing' that can be set externally and doesn't interface with messages (like a traditional plugin does). So basically we need the set functionality too.

I agree with @mcollina on just accepting the preconfigured instance, although I assume this is what you are doing above, Dave, but fancier :D

@davidmarkclements
Contributor

davidmarkclements commented Apr 20, 2016

@mcdonnelldean no, there's a difference.

I'm suggesting set('logger', fn), where fn is a function that returns an object with the appropriate logger methods/props. @mcollina is saying set('logger', obj), where obj is the logger object (e.g. a pino instance, bunyan instance, etc.).

My approach delegates logger configuration to Seneca; Matteo's delegates logger configuration to the instantiator of the logger. It depends on the approach we want to take. Matteo's decoupled approach is in line with the prevailing composition-centric philosophy, whereas my original suggestion caters to the "framework" style of Seneca. It really depends on which approach is suitable; I like both.

@davidmarkclements
Contributor

btw it doesn't have to be set; we could use the current approach:

require('seneca')({logger: require('pino')()})

@marcopiraccini

+1 for seneca.set('logger', obj).
However, from my point of view, the real issue is correlation. We must agree on a model with which it's simple to reconstruct (with tools) what is, well, correlated :). I'm not sure I understood the "correlation through nested contexts" above.

@mcdonnelldean
Contributor Author

@marcopiraccini You might be interested in #399 There is a doc there that is being discussed. This will form the basis of correlation too I think. Would be great to have comments on this and the other issue.

@davidmarkclements Yeah I thought that. And you bring up an interesting issue...

Let's all talk about the elephant in the room. We can't break old logging in a single version. Old logging is EVERYWHERE. We would have the furies of hell upon us if we broke old logging in a single version. Our org alone is going to take 3 versions to purge all traces of old logging, and that's with the reduced plugin list being worked on.

We need this solution to exist side by side with the old one. I think using a different load mechanism and a decoration to get the logger is the best option.

function plugin (options) {
  var seneca = this
  var logger = seneca.getLogger() // not final api
}

@mcollina
Contributor

mcollina commented Apr 21, 2016

@mcdonnelldean define breaking, as it means two things: 1. breaking the log format (which we agreed on) and 2. breaking the API.

I propose that we redo the logging API, wrap the new logger in the old API, and gradually phase this out using a config variable or a deprecation warning. I think this should happen over 2 releases (not 3), just because old code needs to go as fast as possible.

So, my preference goes for:

require('seneca')({logger: require('pino')()}) // pass any logger that matches the interface here, really

This is compatible with the current API, so it should be easier to implement.

@mcollina
Contributor

mcollina commented May 4, 2016

I've put some time into thinking about logging in Seneca, and I think I've come up with a good solution; tell me what you think.

Here are my hypotheses:

  1. people are currently fine with no logging at all
  2. everybody else wants JSON logging

So, I propose that Seneca does not log by default. It will provide an internal function trace(obj), attached to the seneca instance and its delegates. A plugin might override that function and do something with it.
The current log functionality can then be routed directly to the new trace API.

A plugin can also expose a new logging system (if they like), so that a seneca instance (or a delegate) gets a full-blown child logger.
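The "no logging by default" proposal can be sketched in a few lines: a no-op trace(obj) hook that a plugin may override. The names (trace, use) mimic the proposal above; this is not the real Seneca API.

```javascript
'use strict'

// Sketch of the proposal: a seneca-like object with a default no-op trace
// hook. Logging only happens if a plugin installs a handler.
var seneca = {
  trace: function (obj) {}, // default: log nothing
  use: function (plugin) { plugin.call(this) }
}

// A logging plugin routes trace events wherever it likes,
// e.g. newline-delimited JSON on stdout.
seneca.use(function loggingPlugin () {
  this.trace = function (obj) {
    process.stdout.write(JSON.stringify(obj) + '\n')
  }
})

seneca.trace({ kind: 'act', pattern: 'role:user,cmd:login' })
```

With this shape, the old log calls could be routed into trace(), and a full child-logger system becomes one possible plugin rather than a core concern.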

@mcdonnelldean
Contributor Author

@mcollina People are using the logging (icky as it may be). I'd also be someone who would prefer it to log by default. Really I just want some JSON logs; that's pretty much it. I'm just a little worried we are overthinking this for now.

The use case for this has to be simple; if it's not, people won't use it. This is why the old logs are not great: they require a lot of config, and yours would be the same.

What the community is telling me is: just JSON logs by default, with a few switches / options to control level. Bog-standard stuff.

@mcollina
Contributor

@mcdonnelldean can you please work with @rjrodger to sum up the goals we want to achieve with the new logging system? I fear we are not in agreement. Or I'll just build something, and then we can discuss from there.

@mcdonnelldean
Contributor Author

@mcollina Yeah that's a good point. Let me come back to this Monday with a scope for you.

@fkrauthan

Just wondering: I know at one point you guys were talking about traceability of transactions in log messages, but I haven't read anything about that in the last posts. Is that still a consideration for the logging system, or was that requirement dropped? I think that would be the most important requirement, as it allows me as a developer/DevOps person to trace a certain user transaction back through all my microservices for debugging purposes.

@mcdonnelldean
Contributor Author

@fkrauthan The logger is now fully pluggable. There are changes to transport happening soon that will enable better traceability. It hasn't gone away, and is definitely on the list.

@djensen47

djensen47 commented Oct 27, 2016

Where can I find the documentation for the pluggable logger?

P.S., I have looked. Repeatedly, like I do for all the other missing Seneca docs. 😕

@dgonzalez
Member

@djensen47 here you go:

Those are examples of how to build a pluggable logger. Please come back to me if you have any questions.

@djensen47

Thanks. I've seen these but it is helpful to know that these are examples of the logging framework.

Also, the README files for all three of these never mention how you actually log a message. Looking at the test dir, I can guess that I need to use seneca.log.

How do I set the log level? What about the payload?

@dgonzalez
Member

The payload comes from Seneca, and the log level is set by Seneca (via the log level flag). There is another project called seneca-log-filter (https://github.com/senecajs/seneca-log-filter) that does the filtering by level.
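As a sketch of what level filtering involves: keep entries at or above a threshold in the usual level ordering. This illustrates the concept only; it is not seneca-log-filter's actual API.

```javascript
'use strict'

// Conventional level ordering (trace lowest, fatal highest).
var ORDER = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 }

// Keep only entries at or above the requested minimum level.
function filterByLevel (entries, minLevel) {
  return entries.filter(function (entry) {
    return ORDER[entry.level] >= ORDER[minLevel]
  })
}

var entries = [
  { level: 'debug', msg: 'act in' },
  { level: 'info', msg: 'plugin loaded' },
  { level: 'error', msg: 'act failed' }
]

console.log(filterByLevel(entries, 'info').length) // 2
```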

@djensen47

djensen47 commented Oct 28, 2016

I kinda feel like you answered a different question but maybe I asked my question incorrectly so I'll ask in a different way.

How do I log something in seneca? Let's say that I have set up one of the pluggable loggers, like pino, and now I want to use the logger in a plugin. Now what?

I'm asking because I literally cannot find this documented anywhere. I can guess, maybe it's seneca.log('info', {key:value}) or maybe it's seneca.log.info({key:value}) or maybe it's something else.

Is this still relevant in 3.0? http://senecajs.org/docs/tutorials/logging-with-seneca.html

🤕 😩 😵
