Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP API Logger logs request path and method too #1644

Conversation

ketanbhatt
Copy link
Contributor

Fixes #1546

@codecov-io
Copy link

codecov-io commented Jul 9, 2017

Codecov Report

Merging #1644 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1644      +/-   ##
==========================================
+ Coverage   98.46%   98.46%   +<.01%     
==========================================
  Files          59       59              
  Lines        2862     2864       +2     
==========================================
+ Hits         2818     2820       +2     
  Misses         44       44

@ttmc
Copy link
Contributor

ttmc commented Jul 10, 2017

Hi @ketanbhatt - Thanks! Please see my comment on your other pull request regarding our Contributor Licensing Agreement (CLA): #1643 (comment)

Dev Team: Please don't merge this PR yet; we have to clear the CLA process first.

@ttmc ttmc requested a review from TimDaub July 10, 2017 18:25
@ttmc
Copy link
Contributor

ttmc commented Jul 10, 2017

@TimDaub I assigned you to review this PR because you created the original issue. @ketanbhatt has now agreed to the CLA, so that's not a blocker any more.

Copy link
Contributor

@TimDaub TimDaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this. Works great. Thanks @ketanbhatt!

Will add @sbellem for having this reviewed.

@TimDaub TimDaub requested a review from sbellem July 11, 2017 09:31
@ketanbhatt
Copy link
Contributor Author

@sbellem anything pending on my side?

@ttmc
Copy link
Contributor

ttmc commented Jul 25, 2017

@ketanbhatt - This week, @sbellem is away on vacation. Sorry for the delay.

@ketanbhatt
Copy link
Contributor Author

@ttmc @TimDaub actually I had a doubt.

Earlier the format for the errors was: HTTP API error: %(status)s - %(message)s which has now been changed to HTTP API error: %(status)s - %(method)s:%(path)s - %(message)s.

For some people, who are parsing these logs for indexing/monitoring, won't this be a breaking change? Do you think this is a valid concern? If yes, should we mention in docs somewhere?

Thanks

@TimDaub
Copy link
Contributor

TimDaub commented Jul 25, 2017

@ketanbhatt very good point. How about we append %(method)s:%(path)s to the log message then? /cc @r-marques

@ketanbhatt
Copy link
Contributor Author

ketanbhatt commented Jul 25, 2017

I am not sure if that would be completely foolproof as well. From what I know, appending it would still mean that a lot of custom patterns will stop recognising the logs, given how they work (example: http://grokdebug.herokuapp.com/patterns#)

I have the following ideas:

  1. We start logging twice. Once in the old format, once in the new format, with a warning that the old format would be deprecated. And in the next version (or according to BigchainDB's deprecation policy) we remove the old format.
  2. Let the breaking change happen, but warn in the release docs that this would break. Or maybe we wait for this change to merge until the next major release ( 😢 )

What do you think @TimDaub

@vrde
Copy link
Contributor

vrde commented Jul 25, 2017

For some people, who are parsing these logs for indexing/monitoring, won't this be a breaking change? Do you think this is a valid concern? If yes, should we mention in docs somewhere?

Interesting questions! I'm thinking of workarounds on how to merge this and not break things.
First, let's compare the two outputs.

Behavior in v1.0

[2017-07-25 16:06:02] [ERROR] (bigchaindb.web.views.base) HTTP API error: 404 - Not found (webapi     - pid: 14634)

Behavior in this branch

[2017-07-25 16:07:39] [ERROR] (bigchaindb.web.views.base) HTTP API error: 404 - GET:/api/v1/transactions/d3rp - Not found (webapi     - pid: 15900)

Suggested behavior

What if we slightly modify the error and unify the two things like:

[2017-07-25 16:06:02] [ERROR] (bigchaindb.web.views.base) HTTP API error: 404 - Not found GET:/api/v1/transactions/d3rp (webapi     - pid: 14634)

The string would be:

'HTTP API error: %(status)s - %(message)s %(method)s:%(path)s'

This should't break log parsers because message is a "generic string with a - in front".

@krish7919
Copy link
Contributor

Valid points, @ketanbhatt.

As an example, a current log message looks like:
[2017-07-13 14:45:22] [ERROR] (bigchaindb.web.views.base) HTTP API error: 404 - Not found (webapi - pid: 62)

I wouldn't mind changing it to something like
[2017-07-13 14:45:22] [ERROR] (bigchaindb.web.views.base) "HTTP API, 404, GET, /xyz, Not found, webapi-pid: 62 as per:

  1. Format the log with space as delimiter:
%(status)s %(method)s %(path)s "%(message)s"

or bunch the message in quotes and delimit using commas:

"%(status)s, %(method)s, %(path)s, %(message)s"

Log analysis usually becomes easy with space or comma delimited tokens supported implicitly in many tools (and trivial to code if needed to be done from scratch).
Also, the second option groups message after the module together in double quotes.

I am not a big fan of custom hyphens/separators with spaces around them which will need custom parsing logic. I like nicely tokenize-able strings which won't be an overhead to process.

@TimDaub I wouldn't worry too much about breaking any existing log analytics as I do not think anyone else currently parses the log (IPDB parses logs, but that won't break because of the above changes).

@ketanbhatt
Copy link
Contributor Author

ketanbhatt commented Jul 26, 2017

@vrde good suggestion. Although I would like if there was a separation between the message and the new info we will add (as a future developer who starts consuming these logs with no knowledge about how this format came to be, I will just think that this isn't a good way to output logs).

@krish7919 I was thinking so too, that the change will not break a lot of things, and nothing too critical. But quoting just the message or using , will still break the patterns and be ambiguous respectively. If possible, we will just want to do the best without breaking things, like @vrde suggested. (Although if this isnt the case, if the team decides so, then we can change the format completely too)

I think to keep all your above points in mind, and to not go too far away from the current format (by changing the format to use double quotes etc.), we can simply append - %(method)s - %(path)s to the old format (Like how @TimDaub suggested). This will have the following advantages:

  1. Format is consistent with the old format
  2. Will not break the majority of the parsers as they will catch message - method - path as just message, worst case.
  3. The separation method (using -) is consistent

My only qualm with this is that the message should ideally be the last thing in the log. Thats how all other standard software logs output as well.
Keeping this in mind, I suggest the following final format:
HTTP API error: %(status)s - %(method)s:%(path)s:%(message)s

This has all the benefits of the previous method (of just appending) and also keeps the log format more consistent.

Sorry for the long comment, just thought I should also document the thought process for future reference. Let me know what you guys think and I will quickly make the desired changes.

Thanks

@vrde vrde mentioned this pull request Jul 26, 2017
@vrde
Copy link
Contributor

vrde commented Jul 26, 2017

Sorry for the long comment, just thought I should also document the thought process for future reference. Let me know what you guys think and I will quickly make the desired changes.

This PR generated a really interesting question. Let's wait a bit to see how the team reacts to my changes for the CHANGELOG.md.

@krish7919
Copy link
Contributor

krish7919 commented Jul 27, 2017

The problem with using custom delimiters (like - or : above) is that they cause a lot of problems when they appear in the message field.
The quotes and csv suggestions are just what I have seen to be widely used. It also makes it easy to write parser and/or regex for log messages.

I will approve of any of the solution described above as long as we know the pros and cons and are not flummoxed by unexpected behavior during log parsing (might be of interest to @ttmc when configuring Azure log analytics for IPDB).

@ttmc
Copy link
Contributor

ttmc commented Aug 1, 2017

@vrde and @krish7919 --- I haven't been following this PR very closely. Is merging it still blocked by the decision about how to handle the changes to the logging message format?

@krish7919
Copy link
Contributor

@vrde and @krish7919 --- I haven't been following this PR very closely. Is merging it still blocked by the decision about how to handle the changes to the logging message format?

It isn't. We need to remember that parsing will fail in case we have any more - in the path or message fields in the format specified in the code:
logger.error('HTTP API error: %(status)s - %(method)s:%(path)s - %(message)s', request_info).

Other than being cognizant of that fact for now, I don't think anything's blocking me from this change.

@ttmc
Copy link
Contributor

ttmc commented Aug 1, 2017

@vrde are you okay with merging this pull request? If so, please feel free to merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants