
Improve logging system #2148

Closed
shuse2 opened this issue Jun 21, 2018 · 17 comments · Fixed by #3416

Comments

@shuse2
Member

shuse2 commented Jun 21, 2018

Description

Currently, logging has no standard format and is output as plain text.
This makes it hard to use for KPI analysis.

Expected

  • Console log in text format for readability
  • File log in JSON format for extensible usage
  • Standardised format and keys for better searchability

Which version(s) does this affect? (Environment, OS, etc...)

1.0

@shuse2
Member Author

shuse2 commented Jun 21, 2018

Logger choices

@willclarktech
Contributor

Another popular option: https://github.com/visionmedia/debug

@nazarhussain
Contributor

@willclarktech this debug package is something different from logging. BTW, we are already using this debugging technique in core: the whole HTTP layer is configured with it to help you debug every middleware in the pipeline.

@shuse2
Member Author

shuse2 commented Jun 22, 2018

Yes, debug is more of a pure debugging tool, whereas what we want here is logging.
I'm currently considering winston, which has a profiler, really useful for taking KPIs.

@nazarhussain
Contributor

Pino is heavily inspired by Bunyan, so architecturally both are the same. If you look at the benchmarks claimed by Pino, it is only fast in two cases:

  1. Plain strings
  2. Plain objects

When it comes to deeply nested objects, Bunyan is the fastest even in their own benchmarks. I believe our logging will grow to deeply nested objects, so we should prefer Bunyan over Pino.

With Winston I don't have extensive experience; a quick look suggested that its profiler feature is pretty basic, so not useful for us. The rest is quite similar to Bunyan, though the naming conventions and initialisation approach are different.

Bunyan has the concept of a child logger, which I believe can be very useful in our case: each module can have its own child logger adding more data attributes to the main logger.
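The child-logger concept can be sketched without any library: a child logger binds extra fields (e.g. the module name) and merges them into every record it emits. This is a minimal illustration of the idea, not Bunyan's actual implementation:

```javascript
// Minimal sketch of the child-logger concept: a child merges its bound
// fields into every record, so callers never repeat module metadata.
function createLogger(fields, write) {
  return {
    info(msg, meta) {
      // Bound fields from the parent chain are merged into each record
      write(Object.assign({ level: 'info', msg }, fields, meta || {}));
    },
    // child() returns a new logger with additional bound fields
    child(extraFields) {
      return createLogger(Object.assign({}, fields, extraFields), write);
    },
  };
}

const records = [];
const root = createLogger({ hostname: 'node-1' }, (rec) => records.push(rec));
const loaderLog = root.child({ module: 'loader' });

loaderLog.info('Found common block', { blockId: '123' });
console.log(JSON.stringify(records[0]));
```

Every record emitted by the child carries `hostname` and `module` automatically, which is why this pattern removes boilerplate from per-module logging calls.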

@shuse2
Member Author

shuse2 commented Jun 22, 2018

Yeah, Pino is fast in simple cases, but Bunyan is better in the others.
For logging, though, I think we should keep the logged objects very plain anyway.

Pino and Bunyan both have the child logger system, which I also like.

I was thinking we just need a pretty basic profiler, but do you have any more complicated use case in mind?
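A "pretty basic profiler" of the kind discussed here amounts to starting a timer and logging the elapsed time when done. This sketch shows that shape (it is not winston's actual profiler API):

```javascript
// Basic profiler sketch: start a timer, log elapsed time on done().
function startProfile(logger, label) {
  const start = Date.now();
  return {
    done(meta) {
      logger.info(
        `${label} completed`,
        Object.assign({ durationMs: Date.now() - start }, meta || {})
      );
    },
  };
}

// capture log lines in memory for demonstration
const lines = [];
const logger = { info: (msg, meta) => lines.push({ msg, meta }) };

const profile = startProfile(logger, 'loadBlocksFromNetwork');
profile.done({ blocks: 42 });
console.log(JSON.stringify(lines[0]));
```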

@shuse2
Member Author

shuse2 commented Jun 25, 2018

Log levels

  • fatal
    Error after which the application must be shut down
  • error
    Error, but the process can still proceed
  • warn
    Warning, but the process can still proceed
  • info
    Useful for all users
  • debug
    Only useful for developers
  • trace
    Never useful for any user except for local purposes
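The six levels above form an ordered severity scale, so a logger can filter by a single threshold. A minimal sketch of that ordering (trace lowest, fatal highest):

```javascript
// Proposed levels in ascending severity; a threshold of "info"
// drops debug and trace entries.
const LEVELS = ['trace', 'debug', 'info', 'warn', 'error', 'fatal'];

function shouldLog(threshold, level) {
  return LEVELS.indexOf(level) >= LEVELS.indexOf(threshold);
}

console.log(shouldLog('info', 'warn'));  // true: warn is above info
console.log(shouldLog('info', 'debug')); // false: debug is below info
```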

Format

Required

  • module
    Defines which module the log is coming from
  • action
    Defines a unique logging action

Example

library.logger.info('Found common block',
	{
		module: 'loader',
		action: 'FOUND_COMMON_BLOCK',
		meta: {
			blockId: commonBlock.id,
			peer: peer.string,
		},
	}
);

Result

Console output

info: 	Found common block {"module":"loader","action":"FOUND_COMMON_BLOCK","meta":{"blockId":"2832375840683614883","peer":"138.197.191.107:5000"},"timestamp":"2018-06-25T11:38:05.012Z"}

File output

{"module":"loader","action":"FOUND_COMMON_BLOCK","meta":{"blockId":"2832375840683614883","peer":"138.197.191.107:5000"},"level":"info","message":"\tFound common block","timestamp":"2018-06-25T11:38:05.012Z"}
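The two outputs above are one structured record rendered two ways. A minimal sketch of that split, assuming no particular library (values copied from the example above):

```javascript
// One structured record, two renderings: text for the console,
// JSON for the file.
function renderConsole(rec) {
  const { level, message, ...rest } = rec;
  return `${level}: \t${message} ${JSON.stringify(rest)}`;
}

function renderFile(rec) {
  return JSON.stringify(rec);
}

const rec = {
  module: 'loader',
  action: 'FOUND_COMMON_BLOCK',
  meta: { blockId: '2832375840683614883', peer: '138.197.191.107:5000' },
  level: 'info',
  message: 'Found common block',
  timestamp: '2018-06-25T11:38:05.012Z',
};

console.log(renderConsole(rec));
console.log(renderFile(rec));
```

Keeping one record and choosing the rendering per transport is what makes the console format configurable without touching the call sites.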

@4miners
Contributor

4miners commented Jun 25, 2018

I don't really like the format.

@shuse2
Member Author

shuse2 commented Jun 25, 2018

@4miners can you elaborate?
Input or output format? Any suggestions? 🤕

@4miners
Contributor

4miners commented Jun 25, 2018

While it makes automatic parsing of logs simpler (for KPIs or whatever), it makes reading the logs more difficult. For example, it will be harder to spot the actual timestamp when the meta object is big. Also, action looks weird to me; I prefer to log the function the log is coming from instead. We're currently using some conventions that are not bad, like:
Logic/Transaction->apply (scope/module->function)
Blocks->Chain (module->submodule) // not really good
Peers->updatePeerStatus (module->function)
Also, it should be possible for the logger to discover the module automatically.

@nazarhussain
Contributor

@4miners When we have raw log data in a structured format, we can decide how to print it in the console to make it look pretty.

@shuse2
Member Author

shuse2 commented Jun 25, 2018

As @nazarhussain says, output for the console is configurable, so I can fix it pretty quickly.

For the action or module format, I think (module->function) works too, but my concern is that when we have multiple logs within one function, it would be hard to distinguish them.
How about module/function:internalIdentifier instead of action?

library.logger.info('Found common block',
	{
		module: 'loader/loadBlocksFromNetwork:getCommonBlock', // module/function:internalIdentifier
		meta: {
			blockId: commonBlock.id,
			peer: peer.string,
		},
	}
);

Also, some kind of auto-injection exists in bunyan and pino, but they don't have a good formatting tool or profiler 😢

@nazarhussain
Contributor

nazarhussain commented Jun 25, 2018

I prefer to have a detailed log record. It's up to us how much detail we show in the console, but we must have all those details in the file. This will give us flexibility when we want to do some analysis on top of the logs.

{
  // in case of multiple log files
  "name": "log_name",

  // This is a must-have when putting big log files into some log analysis
  // tool and searching for specific behaviour on a certain node
  "hostname": "banana.local",

  // log level
  "level": 3,

  // A string-based message if a string was provided to log, otherwise an empty string
  "msg": "start request",

  // Time of log entry in UTC format
  "time": "2012-02-03T19:02:57.534Z",

  // A snapshot of blockchain information will be useful
  // when we want to get all logs for a particular block height
  "blockchain": {
    "height": "12345",
    "blockId": "7899584854854"
  },

  // The particular module which triggered this log entry
  "module": "peers",

  // A JSON object for additional data related to that particular log entry
  "meta": { }
}

If you look closely, only meta and msg are the keys provided by the user. All other information can be auto-injected into the logs. This will enrich our logs and give us the capability to query them in more meaningful ways, e.g.

Show all transaction logs between block height 123 and 130

Block height and block id are the key data points we always use for debugging. We can enrich our logs further, but for the first iteration, converting to JSON with the above structure and passing existing log entries as msg is quite enough. Also, this implementation approach will not require changing anything in module code for logging.

@4miners @shuse2 @MaciejBaj

@shuse2
Member Author

shuse2 commented Jun 25, 2018

I agree overall.
I think msg will default to message, which is the first input to the logger, so I guess that's covered.
I found a way the module part can be covered automatically when the logger is initialised per module.

@shuse2
Member Author

shuse2 commented Jun 25, 2018

new Logger creates the default logger and the container.
The blockchain key is not included yet because it would introduce a complex dependency right now.

library.logger = new Logger({xxx}).child('loader');
library.logger.info('Found common block',
	{
		blockId: commonBlock.id,
		peer: peer.string,
	}
);

Will produce
console

[info] 2018-06-25T20:18:20.094Z <loader>: Found common block
	 {"blockId":"10993844314999483414","peer":"178.128.173.16:5000"}

JSON

{"meta":{"blockId":"10993844314999483414","peer":"138.197.184.25:5000"},"level":"info","message":"Found common block","timestamp":"2018-06-25T19:25:30.669Z","module":"loader","hostname":"shuse2-local"}

@nazarhussain
Contributor

@shuse2 I did some benchmarks based on our requirements and our implementation.

Here are the logging calls used:

// String
logger.debug('Block was forged with the given length.')

// Simple Object
logger.debug('Given transaction was invalid in the block', {blockId: '11111111', transactionId: '22222222222222', height: 2});

// Deep Object
logger.debug('Request you just made had invalid properties', {
		blockId: '11111111', transactionId: '22222222222222',
		req: {
			remoteAddress: '1289.988.908',
			method: 'GET',
			url: '/api/transactions',
			queryString: {
				transactionId: '11334455555',
				sort: 'rank:asc'
			},
			headers: {'X-Forwarded-For': '123.144.449.9988'},
			etag: '111888888999999999'}});

Here are the results:

WinstonStringLogger*5000: 313.806ms
BunyanStringLogger*5000: 135.352ms
PinoStringLogger*5000: 94.083ms
WinstonSimpleObjectLogger*5000: 269.177ms
BunyanSimpleObjectLogger*5000: 120.502ms
PinoSimpleObjectLogger*5000: 78.336ms
WinstonDeepObjectLogger*5000: 317.363ms
BunyanDeepObjectLogger*5000: 147.940ms
PinoDeepObjectLogger*5000: 113.202ms

From the benchmarks I notice Bunyan is around 2-3 times faster than Winston, while the difference between Pino and Bunyan is not that noticeable (especially in the case of deep objects). So I suggest using Bunyan or Pino. If I had to choose, I would pick Bunyan based on community and production use cases.

NOTE: These benchmarks are only for file-based logging, not the console.
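The thread does not show the harness itself, so this is only a sketch of the shape such a benchmark could take; the actual Winston/Bunyan/Pino setups are assumptions, and a stand-in logger writing JSON lines to an in-memory sink is used here:

```javascript
// Benchmark harness sketch: time N identical debug() calls and print
// a line in the same "Label*N: Xms" shape as the results above.
function benchmark(label, logger, makeArgs, iterations = 5000) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i += 1) {
    logger.debug(...makeArgs());
  }
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}*${iterations}: ${ms.toFixed(3)}ms`);
  return ms;
}

// stand-in for a file-based JSON logger (real runs would use the
// Winston/Bunyan/Pino file transports)
const sink = [];
const logger = {
  debug(msg, meta) {
    sink.push(JSON.stringify({ level: 'debug', msg, meta }));
  },
};

benchmark('StandInStringLogger', logger, () => ['Block was forged.']);
benchmark('StandInSimpleObjectLogger', logger, () => [
  'Given transaction was invalid in the block',
  { blockId: '11111111', transactionId: '22222222222222', height: 2 },
]);
```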

@shuse2
Member Author

shuse2 commented Jun 27, 2018

Thanks @nazarhussain, good experiment!
I think winston is pretty slow on deeply nested objects. Let's use Bunyan.

@MaciejBaj MaciejBaj removed this from the Version 1.1.0 milestone Jul 4, 2018
@diego-G diego-G added this to the Version 1.1.0 milestone Jul 25, 2018
@MaciejBaj MaciejBaj removed this from the Version 1.1.0 milestone Jul 25, 2018