Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added uuidv4 and now() in RFC3339 UTC Format #1645

Merged
merged 44 commits into from Oct 20, 2017
Merged

Conversation

@spagno
Copy link
Contributor

@spagno spagno commented Oct 17, 2017

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    feature

  • What is the related issue number (starting with #)

  • What is the current behavior? (You can also link to an open issue here)
    There isn't the possibility to configure acces_log_format with uuidv4 and timestamp in rfc3339 format

  • What is the new behavior (if this is a feature change)?
    You can use "%(i)s" to have a unique request_id in the log
    You can use "%(z)s" to have a UTC rfc3339 timestamp in the log

  • Other information:
    first PR ever, if something is wrong please don't be mean 🗡️

@spagno spagno closed this Oct 17, 2017
@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 17, 2017

Hi @spagno,

What did you try to do here? What went wrong? Do you need any assistance with this?

@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 17, 2017

If you're concerned about broken linters, you just need to fix this:
https://travis-ci.org/cherrypy/cherrypy/jobs/289121270#L715

And then pushing new commit to the branch should fix that. You'll need to reopen this PR in such case.

@spagno
Copy link
Contributor Author

@spagno spagno commented Oct 17, 2017

thank you very much, I'm trying to understand why I broke that

@spagno
Copy link
Contributor Author

@spagno spagno commented Oct 17, 2017

Understood, I'll reopen the PR

@spagno spagno reopened this Oct 17, 2017
@codecov
Copy link

@codecov codecov bot commented Oct 17, 2017

Codecov Report

Merging #1645 into master will increase coverage by 0.57%.
The diff coverage is 89.28%.

@@            Coverage Diff             @@
##           master    #1645      +/-   ##
==========================================
+ Coverage   76.99%   77.57%   +0.57%     
==========================================
  Files         106      106              
  Lines       14313    14365      +52     
==========================================
+ Hits        11021    11143     +122     
+ Misses       3292     3222      -70
@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 17, 2017

Oh, just follow the link and you'll see:

Fix double quoted strings................................................Failed
hookid: double-quote-string-fixer
Files were modified by this hook. Additional output:
Fixing strings in cherrypy/_cplogging.py

Which means that you used double quotes for strings, while the convention is to use single quotes.

You also can pip install pre-commit locally + do pre-commit run --all-files manually if needed :)

@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 17, 2017

@spagno may I expect any tests for this additionally?

@spagno
Copy link
Contributor Author

@spagno spagno commented Oct 17, 2017

@webknjaz yes, I'm working on it. Just don't know if test all parameters (creating a new testing def) or just the new ones

@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 17, 2017

@spagno both cases are fine.
I'd be thankful for testing just your changes, but if you're able to cover more — that would be fantastic :)

@webknjaz webknjaz requested review from jaraco and webknjaz Oct 17, 2017
@spagno
Copy link
Contributor Author

@spagno spagno commented Oct 17, 2017

@webknjaz well I'll try!

@spagno
Copy link
Contributor Author

@spagno spagno commented Oct 18, 2017

seems like python in windows manages log differently from linux; can't run appropriate tests on windows, I have to delete the tests (for now) and reimplement them

@cherrypy cherrypy deleted a comment Oct 18, 2017
@cherrypy cherrypy deleted a comment Oct 18, 2017
@cherrypy cherrypy deleted a comment Oct 18, 2017
@cherrypy cherrypy deleted a comment Oct 18, 2017
@cherrypy cherrypy deleted a comment Oct 18, 2017
@cherrypy cherrypy deleted a comment Oct 18, 2017

import cherrypy
import six

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

Don't move it here, as it's a third-party. It must remain between stdlib imports and the project imports.

This comment has been minimized.

@spagno

spagno Oct 18, 2017
Author Contributor

sorry, it was the visualstudio code import sorter, I'll not use it anymore

@@ -216,7 +215,8 @@ def error(self, msg='', context='', severity=logging.INFO,
if traceback:
exc_info = _cperror._exc_info()

self.error_log.log(severity, ' '.join((self.time(), context, msg)), exc_info=exc_info)
self.error_log.log(severity, ' '.join(

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

I think, it's nicer to split lines between arguments of log() method, not inside argument expressions.
Also please not that we follow 120-chars line restrictions (there's no 80-char terminals nowadays), so you don't really have to split it.

@@ -262,6 +262,8 @@ def access(self):
'f': dict.get(inheaders, 'Referer', ''),
'a': dict.get(inheaders, 'User-Agent', ''),
'o': dict.get(inheaders, 'Host', '-'),
'i': str(uuid.uuid4()),

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

Why exactly do you need a new uuid for each access call shouldn't it be per-request?

This comment has been minimized.

@spagno

spagno Oct 18, 2017
Author Contributor

because if you want to set different logformat with cherrypy._cplogging.LogManager.access_log_format(format)
if you want to have unique uuid for each request I thougth that could be the right place to set the uuid.
Use case:
I'm developing a simple semaphore dashboard for our CI; I want that the dashboard logs would be compliant with our best practices: in my case the configuration would be:
cherrypy._cplogging.LogManager.access_log_format('{"request_id":"%(i)s", "user-agent":"%(a)s", "timestamp": "%(z)s", "referrer":"%(f)s", "status":%(s)s, "remote_ip":"%(h)s"}')

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

I mean that it would call uuid.uuid4() every time access() is called if I remember correctly, meaning that there can be several different UUIDs generated per request if you'd call access() more then once. Right?

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

I've explored prior art and found out that @openstack uses %(uuid)s format string for this, so I'd like to follow the same convention.
Do you know any project using %(i)s for this?

This comment has been minimized.

@spagno

spagno Oct 18, 2017
Author Contributor

Nope, uuid is generally custom (for example nginx) and in some cases it isn't supported and you have to do workaround for it (in haproxy you have to use a lua script for that). I decided to use "i" because there other atoms are single char, to mantain a sort of "convention"

This comment has been minimized.

@webknjaz

webknjaz Oct 19, 2017
Member

(applies to both %(i)s and %(z)s)

This comment has been minimized.

@spagno

spagno Oct 19, 2017
Author Contributor

Agree, for that reason I added only the atoms without touching the default access_log_format which is in standard Apache format

This comment has been minimized.

@webknjaz

webknjaz Oct 19, 2017
Member

But they are still not lazy, meaning we waste CPU time calculating them.

This comment has been minimized.

@spagno

spagno Oct 19, 2017
Author Contributor

got it, I'll fix it, thanks for the tips and sorry for the poor start quality: like I said it's my first public PR and I'm a sysadmin not a developer :D

This comment has been minimized.

@webknjaz

webknjaz Oct 19, 2017
Member

No worries @spagno, I'm not judging. I'm here to help and walk you through the process :) You're doing great!

@@ -262,6 +262,8 @@ def access(self):
'f': dict.get(inheaders, 'Referer', ''),
'a': dict.get(inheaders, 'User-Agent', ''),
'o': dict.get(inheaders, 'Host', '-'),
'i': str(uuid.uuid4()),
'z': self.time_z(),

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

What about exposing these atoms to error log?

This comment has been minimized.

@spagno

spagno Oct 19, 2017
Author Contributor

you are right, I'll check how to do it

This comment has been minimized.

@spagno

spagno Oct 19, 2017
Author Contributor

I read the code and I'm in a little in trouble: in the error_log the atoms aren't used at all, is that right? If it's right I think that adding this support could be out of scope of this PR and I can work on that with a different PR if there's the interest about that

This comment has been minimized.

@webknjaz

webknjaz Oct 19, 2017
Member

well, supporting these atoms we allow users use them if they want to change that error_log format template. Okay, you can work on this separately.

@@ -4,7 +4,6 @@
import time

import six

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

no, there must be separation between stdlib and project imports

@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 18, 2017

@spagno please don't remove tests w/o review as it's harder for me to find them in history and nearly impossible to comment in a nice way.

I've looked at the failed test and it turns out that your assertValidUUIDv4 helper has this data var:

data = [b'127.0.0.1 - - [18/Oct/2017:01:46:50] "GET /as_yield HTTP/1.1" 200 7 "" ""\r\n', b"b'test suite marker: '1508291210....RER" "USERAGENT" HOST\r\n', b"b'test suite marker: '1508291210.6715672\n", b'21afb35b-518c-4c20-a745-c6c071ba82c8\r\n']

And then at the first iteration of the loop you throw an error, while the UUID is the 3rd element.

If what I see in log is correct, you could improve it via smth like this:

...
data = [chunk.decode('utf-8').rstrip('\n').rstrip('\r') for chunk in data]
for log_chunk in data:
    try:
        uuid_log = data[-1]
        uuid_obj = UUID(uuid_log, version=4)
    except (TypeError, ValueError):
        msg = '%r not a valid UUIDv4' % uuid_log
        self._handleLogError(msg, data, marker, lines)
    else:
        if str(uuid_obj) != uuid_log:
            msg = '%r not a valid UUIDv4' % uuid_log
            self._handleLogError(msg, data, marker, lines)
@@ -312,6 +314,11 @@ def time(self):
return ('[%02d/%s/%04d:%02d:%02d:%02d]' %
(now.day, month, now.year, now.hour, now.minute, now.second))

def time_z(self):

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

@spagno plz turn this into @staticmethod

@spagno spagno force-pushed the spagno:master branch 3 times, most recently from 1cc3046 to d60b61f Oct 18, 2017

import cherrypy

This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

remove this line

from cherrypy import _cperror
from cherrypy._cpcompat import ntob


This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

restore this line


import cherrypy
from cherrypy import _cperror
from cherrypy._cpcompat import ntob


This comment has been minimized.

@webknjaz

webknjaz Oct 18, 2017
Member

restore this line

spagno and others added 11 commits Oct 19, 2017
@webknjaz webknjaz force-pushed the spagno:master branch from 61ac10d to 6c508d6 Oct 20, 2017
webknjaz added 11 commits Oct 20, 2017
@webknjaz webknjaz force-pushed the spagno:master branch from e49cc7e to 7a499f0 Oct 20, 2017
@webknjaz webknjaz force-pushed the spagno:master branch from 7a499f0 to 4743879 Oct 20, 2017
@webknjaz
Copy link
Member

@webknjaz webknjaz commented Oct 20, 2017

@spagno I've pushed improvements and everything is likely to work well for now. Still, I'd like to better test time_z (maybe use spy object for that).

@webknjaz webknjaz merged commit 122a484 into cherrypy:master Oct 20, 2017
4 checks passed
4 checks passed
codecov/patch 89.28% of diff hit (target 76.99%)
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
lgtm analysis: Python No alert changes
Details
webknjaz added a commit that referenced this pull request Oct 20, 2017
PR #1645 by @spagno:

* ``i`` holds a per-request UUID4
* ``z`` outputs UTC time in format of RFC 3339
* ``cherrypy._cprequest.Request.unique_id.uuid4`` now has lazily
  invocable UUID4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.