Replace all uses of twisted.python.log with twisted.logger. #29

ldanielburr · 2019-06-19T16:57:14Z

Just getting this PR started so that I don't lose motivation; all tests passing.

ldanielburr · 2019-06-19T17:00:08Z

Sorry, I should mention that this PR address #22.

glyph

Hi @ldanielburr ! Thanks so much for the contribution; it would be great to get some structure in these log messages. I do have a bit of feedback here; mainly, it would be great to simultaneously eliminate the anti-pattern of emitting multiple messages just to get some multi-line formatting on the message. Also, any thought to maybe adding a test or two to extract some structure?

Please ping me with an at-mention when you've dealt with it so I can notice and respond.

glyph · 2019-07-02T07:21:21Z

ampoule/pool.py

-        log.msg('\trecycle after: %r' % (self.recycleAfter,))
-        log.msg('\tProcessStarter:')
-        log.msg('\t\t%r' % (self.starter,))
+        log.info(u'ProcessPool stats:')


Would you mind putting this together into one big log message, rather than spreading it out across a whole bunch of different ones? Conceptually, what's happening here is a dump of stats at a particular moment in time; they should all be associated with each other. The repeated logging calls were just there trying to get the message to format in a particular way, which shouldn't be the log emitter's concern.

glyph · 2019-07-02T07:22:05Z

ampoule/pool.py

@@ -135,17 +141,17 @@ def _addProcess(self, child, finished):
        Adds the newly created child process to the pool.
        """
        def fatal(reason, child):
-            log.msg("FATAL: Process exited %s" % (reason,))
+            log.error(u'FATAL: Process exited.')


This should be a single log message as well.

glyph · 2019-07-02T07:23:43Z

ampoule/child.py

-        log.msg("Shutdown message received, goodbye.")
+        log.info(
+            u'Shutdown message received, goodbye.',
+            log_system=self.__class__.__name__


If you put the log object onto AMPChild, and then do self.log.info, this type of namespacing comes along for free. Is this a feature you knew about but didn't want to use, or just something that wasn't clearly documented enough?

Iog_system itself is undocumented at https://twistedmatrix.com/documents/current/core/howto/logger.html; I found it while digging around in the code looking for the equivalent of twisted.python.log.msg's "system" kwarg. twisted.logger looks as though this kind of thing should be handled by the "namespace" kwarg passed to Logger(); in my experiments, setting the namespace explicitly, e.g., "log = logger.Logger(namespace=u'foo')" doesn't always work. I didn't get to the bottom of that, as I found the "log_system" kwarg and it did exactly what I wanted.

Ideally, I think a logger should be constructed with the namespace argument, as the preferred way to do instantiate it, as opposed to embedding it as a class member and relying on automagical namespace configuration, but that's likely to be out of scope for this PR.

Update: I tried the "self.log" approach, and it emits a more correct, but also more verbose namespace for messages, which I found very noisy.

self.log.info(u'Shutdown message received, goodbye.') [ampoule.test.test_process.TempDirChild#info] Shutdown message received, goodbye.'

versus

log.info( u'Shutdown message received, goodbye.', log_system=self.__class__.__name__ ) [TempDirChild] Shutdown message received, goodbye.'

@glyph any thoughts on this? I personally like the log_system approach, because it offers more control over what shows up in the log message's namespace, and also because it allows me to have exactly one instance of Logger for the module, whereas the self.log approach creates a new Logger per instance of the containing class. Maybe that's not a big deal, but if you are making a lot of instances, that's a lot of identical loggers logging exactly the same information and seems wasteful to me.

I definitely don't like the log_system approach 😄. As you've observed, it's not documented! It's the way the log system implements the namespacing feature, not a parameter that's supposed to be passed in directly.

TempDirChild is not a qualified name, so there's no way for someone to look at that message and figure out what module or source file it's come from. The design of the log system is that you should rely on automagical namespace configuration, unless you have a really good reason not to. Why is it that you want to avoid it? I can see wanting to avoid too much Logger construction, since objects are not cheap as they should be.

I want to avoid self.log because of the potentially excessive object creation. Also, I am a fan of "log" being a module-level object that can serve the needs of any class within that module, rather than each class having its own log attribute. The module-level usage of a logger fits fairly well with the idioms employed in other Python projects, and I don't see a lot of value in being different here.

OK. Module-global is definitely OK by me, and Python is definitely slower than I'd like about creating objects. And in practice (i.e.: module-level) this is how I've mostly used it; I was mainly suggesting instance-level just because you were deriving the namespace from the class rather than the module.

Regarding the notion that "you should rely on automagical namespace configuration", I'd rather have no namespace at all, than one that I can't control. People who create logging messages in their code know what they want to log, so letting them control the namespace is a win, imo. Using log_system or something equivalent, lets a single logger serve multiple needs instead of having a logger-per-namespace approach. I'd be happy with any mechanism other than an automagical one :)

For an application, you might be right in some cases. (However, at work we just completed a migration from manual namespace specification to automagical ) But ampoule isn't an application, and the purpose of a namespace is not for our reading pleasure but for people operating applications to filter on. So having a hierarchical identifier that separates it from the application is key.

I think I understand the desire to make logging something more structured, searchable, etc., but anecdotally, I can't find anybody who wants to deal with that.

Surprise! You found one. I run Ampoule in production and consume structured logs via Cloudwatch in AWS :). (In a previous job, I consumed structured logs in ElasticSearch. And I'm aware of at least 2 other Twisted deployments doing this with an ELK stack.)

People want to read the same kind of logfile output that Apache and Nginx have been putting out for years; they don't want structured stuff that gets dumped into ElasticSearch and exposed via Kibana. Given that other logging systems tend to adhere to this admittedly primitive format, I don't see the value in doing anything else.

I agree! There's a huge audience for plaintext logging. This is why Twisted exposes a format string; to preserve this traditional log-grepping experience where you just get a string out and get mostly ad-hoc formatted text files you can scrape with regexes.

I totally disagree! In addition to my personal extensive usage of structured logging on multiple occasions, I think that the reason that people don't "don't want structured stuff" is that the tooling is largely garbage and the setup and configuration (particularly for ELK stacks) is so complicated and intricate that it's not worth it to set it up at the beginning of an app's development cycle. And then later, it's not worth setting up because you don't have structured output, so there's not much point in investing heavily in a backend that can do structured queries. Breaking this cycle, I believe, starts with libraries and frameworks defaulting to emitting structured logs, so that when you do have the time to invest in the backend to manage it, the source data is already there.

Sorry for the wall of text, this discussion is way outside of the scope of this PR. Please let me know what you want me to do in this specific instance, and I'll implement it. The larger discussion about twisted.logger is probably better for the mailing list.

One good wall deserves another :). But I think for the purpose of this code review, just "let's use a module-level thing" is a fine start.

@glyph thanks for the thoughtful response, I appreciate it. So, module-level logger instance for this PR, what about the use of the log_system kwarg? I ask because a module-level logger will not include the classname in its namespace.

Also, I decided to figure out how I got ahold of the log_system kwarg in the first place and it was in the docstring for formatEventAsClassicLogText:

C{system} is the event's C{"log_system"} value, if set, otherwise, the C{"log_namespace"} and C{"log_level"}, joined by a C{u"#"}. Each defaults to C{u"-"} is not set.

...which reads to me like it is something I could meaningfully expect to use.

Further, eventAsText documents its use as well, using a copy of the same docstring, but still it is an expected value.

If using log_system is acceptable, then a single, module-level Logger instance can still provide class-level logging, e.g.,

# assumes self._loggingName = log.namespace + u'.' + self.__class__.__name__ log.info( u'Shutdown message received, goodbye.', log_system=self._loggingName + u'#info' ) 2019-07-11 17:40:54-0500 [-] FROM 0: b'2019-07-11T17:40:54-0500 [ampoule.child.TempDirChild#info] Shutdown message received, goodbye.'

Thoughts?

I don't think we really need to include the class name in the namespace at all, frankly, especially for this first cut; I'd just skip passing it entirely. I definitely wouldn't want to custom-construct one like this which mashes the log level in there as a string like that. So just

log.info(u'Shutdown message received, goodbye.')

Except maybe structured-format like a PID in there or something so we can identify who received it.

@glyph I've updated to PR to remove the log_system kwarg and go with plain module-level logging for now.

I did not find any clear path forward to adding a PID to that shutdown message, so I skipped that task.

glyph · 2019-07-02T07:24:58Z

ampoule/main.py

+    logLevelPredicate = logger.LogLevelFilterPredicate(
+        defaultLogLevel=logger.LogLevel.info
+    )
+    filteringObserver = logger.FilteringLogObserver(


Thanks for changing the bootstrapping as well! One interesting question might be how to configure the filters on this predicate without having to put in a whole custom BOOTSTRAP. For a future change, perhaps.

That's a good thought. I kind of hate that all this code lives in a big-old wad of text. I can vaguely imagine some hand-wavey thing where we take command-line arguments, or an ampoule.cfg file, and generate the bootstrap code from a template. Definitely bigger than the scope of this work, but it would have several benefits, including configuration of log levels and log formatters.

I kind of hate that all this code lives in a big-old wad of text

same

an ampoule.cfg file, and generate the bootstrap code from a template

or you know, parse a datastructure passed in argv ?

ldanielburr · 2019-07-02T17:28:37Z

@glyph I've made one of the requested changes, and solicited feedback on the other requested change. Regarding your comment about "adding a test or two to extract some structure": can you elucidate a bit? I know Logger is all about emitting events that can be formatted as text or json or what-have-you, but I'm very foggy on how to go about testing "structure". Are you referring to the raw events being emitted, or the formatted representation of those events?

glyph · 2019-07-11T09:20:05Z

Are you referring to the raw events being emitted, or the formatted representation of those events?

The event dictionaries being emitted. Such a test would add a log tap for the duration of the test, and assert about some stuff in the event dicts logged; hopefully this is a lightweight thing to do, but if it's at all hard let's skip it and come back to it later - there's no direct tests for this as it stands and I don't want the perfect to be the enemy of the good here.

… name from stats.

ldanielburr · 2019-07-12T20:26:05Z

@glyph regarding unit-tests for the new logger: I can't get my head around it, sorry. It seems to me that twisted.logger's unit-tests must already be testing the emission of events so what additional surety would be provided here?

glyph

I think we've got plenty of stuff to go on for the future here, but le mieux est l'ennemi du bien, let's land this and build on it 😄 .

glyph · 2019-08-04T22:28:06Z

Hmm, we probably need to do a release for this...

ldanielburr · 2019-08-04T23:35:44Z

@glyph, please note https://twistedmatrix.com/trac/ticket/9680, which is the reason for the large number of logger instances created when making a logger a class member. If this twisted.logger ticket could be resolved, class-based loggers might become a better option than what I ended up doing here.

glyph · 2019-08-05T00:22:46Z

Thanks for the reference. I'm not actually sure that module-level loggers are all that bad of an idea, honestly… the main reason to use instance-level ones is for convenience logging fields off the instance; module-level namespace granularity seems sufficient for most cases.

Replace all uses of twisted.python.log with twisted.logger.

268e4dc

ldanielburr mentioned this pull request Jun 24, 2019

modernize logging #22

Closed

glyph reviewed Jul 2, 2019

View reviewed changes

Collapse multiple log statements down to just 2, per code review.

99ca5b2

Remove log_system from shutdown log message, remove duplicate package…

c78141d

… name from stats.

glyph approved these changes Jul 12, 2019

View reviewed changes

glyph merged commit ae528e8 into twisted:master Jul 12, 2019

ldanielburr mentioned this pull request Jul 12, 2019

Ampoule logging is exceedingly verbose #9

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace all uses of twisted.python.log with twisted.logger. #29

Replace all uses of twisted.python.log with twisted.logger. #29

ldanielburr commented Jun 19, 2019 •

edited

ldanielburr commented Jun 19, 2019

glyph left a comment

glyph Jul 2, 2019

glyph Jul 2, 2019

glyph Jul 2, 2019

ldanielburr Jul 2, 2019

ldanielburr Jul 2, 2019

ldanielburr Jul 2, 2019

glyph Jul 9, 2019

glyph Jul 11, 2019

ldanielburr Jul 11, 2019 •

edited

glyph Jul 12, 2019

ldanielburr Jul 12, 2019

ldanielburr Jul 12, 2019

glyph Jul 2, 2019

ldanielburr Jul 2, 2019

glyph Jul 9, 2019

ldanielburr commented Jul 2, 2019

glyph commented Jul 11, 2019

ldanielburr commented Jul 12, 2019

glyph left a comment

glyph commented Aug 4, 2019

ldanielburr commented Aug 4, 2019

glyph commented Aug 5, 2019

Replace all uses of twisted.python.log with twisted.logger. #29

Replace all uses of twisted.python.log with twisted.logger. #29

Conversation

ldanielburr commented Jun 19, 2019 • edited

ldanielburr commented Jun 19, 2019

glyph left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ldanielburr Jul 11, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ldanielburr commented Jul 2, 2019

glyph commented Jul 11, 2019

ldanielburr commented Jul 12, 2019

glyph left a comment

Choose a reason for hiding this comment

glyph commented Aug 4, 2019

ldanielburr commented Aug 4, 2019

glyph commented Aug 5, 2019

ldanielburr commented Jun 19, 2019 •

edited

ldanielburr Jul 11, 2019 •

edited