Added uuidv4 and now() in RFC3339 UTC Format #1645
|
Hi @spagno, what did you try to do here? What went wrong? Do you need any assistance with this?
|
If you're concerned about the broken linters, you just need to fix this: And then pushing a new commit to the branch should fix that. You'll need to reopen this PR in that case.
|
thank you very much, I'm trying to understand why I broke that |
|
Understood, I'll reopen the PR |
Codecov Report
@@            Coverage Diff             @@
##           master    #1645      +/-   ##
==========================================
+ Coverage   76.99%   77.57%   +0.57%
==========================================
  Files         106      106
  Lines       14313    14365      +52
==========================================
+ Hits        11021    11143     +122
+ Misses       3292     3222      -70
|
Oh, just follow the link and you'll see: Which means that you used double quotes for strings, while the convention is to use single quotes. You also can |
|
@spagno may I expect any tests for this additionally? |
|
@webknjaz yes, I'm working on it. I just don't know whether to test all parameters (creating a new test function) or just the new ones
|
@spagno both cases are fine. |
|
@webknjaz well I'll try! |
|
It seems Python on Windows handles logs differently from Linux; I can't run the appropriate tests on Windows, so I have to delete the tests (for now) and reimplement them
import uuid

import cherrypy
import six
Don't move it here, as it's a third-party. It must remain between stdlib imports and the project imports.
Sorry, it was the Visual Studio Code import sorter; I won't use it anymore
exc_info = _cperror._exc_info()

self.error_log.log(severity, ' '.join((self.time(), context, msg)), exc_info=exc_info)
self.error_log.log(severity, ' '.join(
I think it's nicer to split lines between the arguments of the log() method, not inside argument expressions.
Also, please note that we follow a 120-character line limit (there are no 80-char terminals nowadays), so you don't really have to split it.
'f': dict.get(inheaders, 'Referer', ''),
'a': dict.get(inheaders, 'User-Agent', ''),
'o': dict.get(inheaders, 'Host', '-'),
'i': str(uuid.uuid4()),
Why exactly do you need a new UUID for each access call? Shouldn't it be per-request?
Because you may want to set a different log format with cherrypy._cplogging.LogManager.access_log_format(format). If you want a unique UUID for each request, I thought that could be the right place to set it.
Use case:
I'm developing a simple semaphore dashboard for our CI, and I want the dashboard logs to comply with our best practices. In my case the configuration would be:
cherrypy._cplogging.LogManager.access_log_format('{"request_id":"%(i)s", "user-agent":"%(a)s", "timestamp": "%(z)s", "referrer":"%(f)s", "status":%(s)s, "remote_ip":"%(h)s"}')
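For illustration, here is a minimal sketch of how a format string like the one above expands into a JSON log line through plain `%`-style mapping substitution. It does not use CherryPy at all: the `atoms` dictionary and its sample values are assumptions made up for this example, and only the atom names (`i`, `a`, `z`, `f`, `s`, `h`) come from the comment above.

```python
import json
import uuid

# Format string copied from the use case above.
LOG_FORMAT = (
    '{"request_id":"%(i)s", "user-agent":"%(a)s", "timestamp": "%(z)s", '
    '"referrer":"%(f)s", "status":%(s)s, "remote_ip":"%(h)s"}'
)

# Hypothetical atom values; a real server would fill these per request.
atoms = {
    'i': str(uuid.uuid4()),
    'a': 'curl/7.55.1',
    'z': '2017-10-18T01:46:50Z',
    'f': '',
    's': 200,
    'h': '127.0.0.1',
}

# Mapping-style %-formatting looks up only the keys named in the template.
line = LOG_FORMAT % atoms

# The result parses as JSON as long as no value contains a double quote
# or a backslash; this sketch does not escape such values.
record = json.loads(line)
print(record['request_id'], record['status'])
```

Note that header-derived values (user agent, referrer) are attacker-controlled in practice, so a production setup would need escaping before the line can be trusted as valid JSON.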
I mean that it would call uuid.uuid4() every time access() is called, if I remember correctly, meaning that several different UUIDs can be generated per request if you call access() more than once. Right?
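The concern can be reproduced with a small sketch. The `Request` class and both helper functions below are hypothetical stand-ins written for this example, not CherryPy code; they only contrast generating the ID inside the logging call with storing it once on the request.

```python
import uuid


class Request:
    """Hypothetical request object that owns one ID for its lifetime."""

    def __init__(self):
        self.unique_id = str(uuid.uuid4())


def atoms_per_call():
    # A fresh UUID on every call, as in the diff under review.
    return {'i': str(uuid.uuid4())}


def atoms_per_request(request):
    # Reuses the ID generated when the request object was created.
    return {'i': request.unique_id}


request = Request()
per_call = [atoms_per_call()['i'] for _ in range(2)]
per_request = [atoms_per_request(request)['i'] for _ in range(2)]

print(per_call[0] == per_call[1])        # two log calls, two different IDs
print(per_request[0] == per_request[1])  # two log calls, same ID
```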
I've explored prior art and found out that @openstack uses %(uuid)s format string for this, so I'd like to follow the same convention.
Do you know any project using %(i)s for this?
Nope, the UUID is generally custom (for example in nginx), and in some cases it isn't supported and you have to work around it (in haproxy you have to use a Lua script for that). I decided to use "i" because the other atoms are single characters, to maintain a sort of "convention"
(applies to both %(i)s and %(z)s)
Agreed. For that reason I only added the atoms, without touching the default access_log_format, which is in the standard Apache format
But they are still not lazy, meaning we waste CPU time calculating them.
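One possible way to make an atom lazy is to put an object in the atoms mapping whose `__str__` does the work, so nothing is computed unless the format string actually references it. The `LazyUUID4` class below is a sketch written for this example and is not claimed to be what the PR ended up with.

```python
import uuid


class LazyUUID4:
    """Generate a UUIDv4 only when first converted to a string."""

    def __init__(self):
        self._value = None

    def __str__(self):
        # Computed on first use, then cached so repeated formatting
        # of the same object yields the same ID.
        if self._value is None:
            self._value = str(uuid.uuid4())
        return self._value


lazy_id = LazyUUID4()
atoms = {'h': '127.0.0.1', 'i': lazy_id}

# This template never mentions %(i)s, so str() is never called on lazy_id.
plain_line = '%(h)s' % atoms
print(lazy_id._value is None)

# This template does reference it, which triggers generation.
id_line = '%(h)s %(i)s' % atoms
print(id_line)
```

This relies on mapping-style `%` formatting looking up and converting only the keys that appear in the template.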
Got it, I'll fix it. Thanks for the tips, and sorry for the poor initial quality: like I said, it's my first public PR and I'm a sysadmin, not a developer :D
No worries @spagno, I'm not judging. I'm here to help and walk you through the process :) You're doing great!
'a': dict.get(inheaders, 'User-Agent', ''),
'o': dict.get(inheaders, 'Host', '-'),
'i': str(uuid.uuid4()),
'z': self.time_z(),
What about exposing these atoms to the error log?
you are right, I'll check how to do it
I read the code and I'm a little stuck: in the error_log the atoms aren't used at all, is that right? If so, I think adding this support could be out of scope for this PR, and I can work on it in a different PR if there's interest
Well, by supporting these atoms we allow users to use them if they want to change the error_log format template. Okay, you can work on this separately.
import time

import six
No, there must be a separation between stdlib and project imports
|
@spagno please don't remove tests without review, as that makes it harder for me to find them in history and nearly impossible to comment on them in a nice way. I've looked at the failed test and it turns out that your data is:
data = [b'127.0.0.1 - - [18/Oct/2017:01:46:50] "GET /as_yield HTTP/1.1" 200 7 "" ""\r\n', b"b'test suite marker: '1508291210....RER" "USERAGENT" HOST\r\n', b"b'test suite marker: '1508291210.6715672\n", b'21afb35b-518c-4c20-a745-c6c071ba82c8\r\n']
And then at the first iteration of the loop you throw an error, while the UUID is the 3rd element. If what I see in the log is correct, you could improve it via something like this:
...
data = [chunk.decode('utf-8').rstrip('\n').rstrip('\r') for chunk in data]
for log_chunk in data:
    try:
        uuid_log = data[-1]
        uuid_obj = UUID(uuid_log, version=4)
    except (TypeError, ValueError):
        msg = '%r not a valid UUIDv4' % uuid_log
        self._handleLogError(msg, data, marker, lines)
    else:
        if str(uuid_obj) != uuid_log:
            msg = '%r not a valid UUIDv4' % uuid_log
            self._handleLogError(msg, data, marker, lines)
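The validation idea in the snippet above can be isolated into a standalone helper. This is a sketch for illustration only: the name `is_uuid4` is made up here and is not part of the test suite. It relies on `uuid.UUID(text, version=4)` forcing the version bits to 4, so a round trip through `str()` only reproduces the input when the input already was a canonical lowercase UUIDv4.

```python
from uuid import UUID, uuid1, uuid4


def is_uuid4(text):
    """Return True if text is a canonical (lowercase, hyphenated) UUIDv4."""
    try:
        parsed = UUID(text, version=4)
    except (TypeError, ValueError, AttributeError):
        # Raised for None, non-string input, or malformed hex.
        return False
    # version=4 overwrites the version and variant bits, so any other
    # UUID version no longer matches its own string form.
    return str(parsed) == text


print(is_uuid4(str(uuid4())))   # canonical v4 string
print(is_uuid4(str(uuid1())))   # v1 string, version bits differ
print(is_uuid4('not-a-uuid'))   # malformed input
```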
return ('[%02d/%s/%04d:%02d:%02d:%02d]' %
        (now.day, month, now.year, now.hour, now.minute, now.second))

def time_z(self):
1cc3046 to d60b61f
import uuid

import cherrypy
from cherrypy import _cperror
from cherrypy._cpcompat import ntob
@spagno I've pushed improvements and everything is likely to work well for now. Still, I'd like to test time_z better (maybe use a spy object for that).
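As a rough idea of what a spy-based test could look like, the sketch below injects the clock as a callable and replaces it with a `unittest.mock.Mock`. The `TimestampSource` class is a hypothetical stand-in written for this example; it is not the log manager from this PR, and the real `time_z` may obtain the current time differently.

```python
import datetime
from unittest import mock


class TimestampSource:
    """Hypothetical object exposing a time_z()-like method."""

    def __init__(self, clock):
        # clock: callable returning a timezone-aware UTC datetime.
        self._clock = clock

    def time_z(self):
        return self._clock().strftime('%Y-%m-%dT%H:%M:%SZ')


fixed = datetime.datetime(
    2017, 10, 18, 1, 46, 50, tzinfo=datetime.timezone.utc,
)
# The Mock acts as a spy: it returns a known value and records its calls.
spy_clock = mock.Mock(return_value=fixed)

stamp = TimestampSource(spy_clock).time_z()

spy_clock.assert_called_once_with()
print(stamp)
```

Pinning the clock makes the expected string deterministic, and the spy confirms the clock was consulted exactly once per call.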
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
feature
What is the related issue number (starting with #)
What is the current behavior? (You can also link to an open issue here)
It isn't possible to configure access_log_format with a UUIDv4 or a timestamp in RFC 3339 format
What is the new behavior (if this is a feature change)?
You can use "%(i)s" to have a unique request_id in the log
You can use "%(z)s" to have a UTC rfc3339 timestamp in the log
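For readers unfamiliar with the format, this is a minimal sketch of producing a UTC RFC 3339 timestamp with the standard library. The helper name `rfc3339_utc` is an assumption for this example and is not the PR's `time_z` implementation.

```python
import datetime


def rfc3339_utc(moment=None):
    """Format a timezone-aware datetime as UTC RFC 3339, second precision."""
    if moment is None:
        moment = datetime.datetime.now(datetime.timezone.utc)
    # Convert any aware datetime to UTC before formatting. A naive
    # datetime would be interpreted as local time by astimezone().
    moment = moment.astimezone(datetime.timezone.utc)
    return moment.strftime('%Y-%m-%dT%H:%M:%SZ')


# A moment expressed in UTC+02:00 is normalized to UTC.
plus_two = datetime.timezone(datetime.timedelta(hours=2))
local = datetime.datetime(2017, 10, 18, 3, 46, 50, tzinfo=plus_two)
print(rfc3339_utc(local))
```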
Other information:
first PR ever, if something is wrong please don't be mean 🗡️