Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace all uses of twisted.python.log with twisted.logger. #29

Merged
merged 3 commits into from Jul 12, 2019

Conversation

ldanielburr
Copy link
Contributor

@ldanielburr ldanielburr commented Jun 19, 2019

Just getting this PR started so that I don't lose motivation; all tests passing.

@ldanielburr
Copy link
Contributor Author

Sorry, I should mention that this PR address #22.

@ldanielburr ldanielburr mentioned this pull request Jun 24, 2019
Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ldanielburr ! Thanks so much for the contribution; it would be great to get some structure in these log messages. I do have a bit of feedback here; mainly, it would be great to simultaneously eliminate the anti-pattern of emitting multiple messages just to get some multi-line formatting on the message. Also, any thought to maybe adding a test or two to extract some structure?

Please ping me with an at-mention when you've dealt with it so I can notice and respond.

ampoule/pool.py Outdated
log.msg('\trecycle after: %r' % (self.recycleAfter,))
log.msg('\tProcessStarter:')
log.msg('\t\t%r' % (self.starter,))
log.info(u'ProcessPool stats:')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind putting this together into one big log message, rather than spreading it out across a whole bunch of different ones? Conceptually, what's happening here is a dump of stats at a particular moment in time; they should all be associated with each other. The repeated logging calls were just there trying to get the message to format in a particular way, which shouldn't be the log emitter's concern.

ampoule/pool.py Outdated
@@ -135,17 +141,17 @@ def _addProcess(self, child, finished):
Adds the newly created child process to the pool.
"""
def fatal(reason, child):
log.msg("FATAL: Process exited %s" % (reason,))
log.error(u'FATAL: Process exited.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a single log message as well.

ampoule/child.py Outdated
log.msg("Shutdown message received, goodbye.")
log.info(
u'Shutdown message received, goodbye.',
log_system=self.__class__.__name__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you put the log object onto AMPChild, and then do self.log.info, this type of namespacing comes along for free. Is this a feature you knew about but didn't want to use, or just something that wasn't clearly documented enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iog_system itself is undocumented at https://twistedmatrix.com/documents/current/core/howto/logger.html; I found it while digging around in the code looking for the equivalent of twisted.python.log.msg's "system" kwarg. twisted.logger looks as though this kind of thing should be handled by the "namespace" kwarg passed to Logger(); in my experiments, setting the namespace explicitly, e.g., "log = logger.Logger(namespace=u'foo')" doesn't always work. I didn't get to the bottom of that, as I found the "log_system" kwarg and it did exactly what I wanted.

Ideally, I think a logger should be constructed with the namespace argument, as the preferred way to do instantiate it, as opposed to embedding it as a class member and relying on automagical namespace configuration, but that's likely to be out of scope for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I tried the "self.log" approach, and it emits a more correct, but also more verbose namespace for messages, which I found very noisy.

self.log.info(u'Shutdown message received, goodbye.')
[ampoule.test.test_process.TempDirChild#info] Shutdown message received, goodbye.'

versus

log.info(
    u'Shutdown message received, goodbye.',
    log_system=self.__class__.__name__
)
[TempDirChild] Shutdown message received, goodbye.'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glyph any thoughts on this? I personally like the log_system approach, because it offers more control over what shows up in the log message's namespace, and also because it allows me to have exactly one instance of Logger for the module, whereas the self.log approach creates a new Logger per instance of the containing class. Maybe that's not a big deal, but if you are making a lot of instances, that's a lot of identical loggers logging exactly the same information and seems wasteful to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely don't like the log_system approach 😄. As you've observed, it's not documented! It's the way the log system implements the namespacing feature, not a parameter that's supposed to be passed in directly.

TempDirChild is not a qualified name, so there's no way for someone to look at that message and figure out what module or source file it's come from. The design of the log system is that you should rely on automagical namespace configuration, unless you have a really good reason not to. Why is it that you want to avoid it? I can see wanting to avoid too much Logger construction, since objects are not cheap as they should be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to avoid self.log because of the potentially excessive object creation. Also, I am a fan of "log" being a module-level object that can serve the needs of any class within that module, rather than each class having its own log attribute. The module-level usage of a logger fits fairly well with the idioms employed in other Python projects, and I don't see a lot of value in being different here.

OK. Module-global is definitely OK by me, and Python is definitely slower than I'd like about creating objects. And in practice (i.e.: module-level) this is how I've mostly used it; I was mainly suggesting instance-level just because you were deriving the namespace from the class rather than the module.

Regarding the notion that "you should rely on automagical namespace configuration", I'd rather have no namespace at all, than one that I can't control. People who create logging messages in their code know what they want to log, so letting them control the namespace is a win, imo. Using log_system or something equivalent, lets a single logger serve multiple needs instead of having a logger-per-namespace approach. I'd be happy with any mechanism other than an automagical one :)

For an application, you might be right in some cases. (However, at work we just completed a migration from manual namespace specification to automagical ) But ampoule isn't an application, and the purpose of a namespace is not for our reading pleasure but for people operating applications to filter on. So having a hierarchical identifier that separates it from the application is key.

I think I understand the desire to make logging something more structured, searchable, etc., but anecdotally, I can't find anybody who wants to deal with that.

Surprise! You found one. I run Ampoule in production and consume structured logs via Cloudwatch in AWS :). (In a previous job, I consumed structured logs in ElasticSearch. And I'm aware of at least 2 other Twisted deployments doing this with an ELK stack.)

People want to read the same kind of logfile output that Apache and Nginx have been putting out for years; they don't want structured stuff that gets dumped into ElasticSearch and exposed via Kibana. Given that other logging systems tend to adhere to this admittedly primitive format, I don't see the value in doing anything else.

  1. I agree! There's a huge audience for plaintext logging. This is why Twisted exposes a format string; to preserve this traditional log-grepping experience where you just get a string out and get mostly ad-hoc formatted text files you can scrape with regexes.
  2. I totally disagree! In addition to my personal extensive usage of structured logging on multiple occasions, I think that the reason that people don't "don't want structured stuff" is that the tooling is largely garbage and the setup and configuration (particularly for ELK stacks) is so complicated and intricate that it's not worth it to set it up at the beginning of an app's development cycle. And then later, it's not worth setting up because you don't have structured output, so there's not much point in investing heavily in a backend that can do structured queries. Breaking this cycle, I believe, starts with libraries and frameworks defaulting to emitting structured logs, so that when you do have the time to invest in the backend to manage it, the source data is already there.

Sorry for the wall of text, this discussion is way outside of the scope of this PR. Please let me know what you want me to do in this specific instance, and I'll implement it. The larger discussion about twisted.logger is probably better for the mailing list.

One good wall deserves another :). But I think for the purpose of this code review, just "let's use a module-level thing" is a fine start.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glyph thanks for the thoughtful response, I appreciate it. So, module-level logger instance for this PR, what about the use of the log_system kwarg? I ask because a module-level logger will not include the classname in its namespace.

Also, I decided to figure out how I got ahold of the log_system kwarg in the first place and it was in the docstring for formatEventAsClassicLogText:

      C{system} is the event's C{"log_system"} value, if set, otherwise,
      the C{"log_namespace"} and C{"log_level"}, joined by a C{u"#"}.  Each
      defaults to C{u"-"} is not set.

...which reads to me like it is something I could meaningfully expect to use.

Further, eventAsText documents its use as well, using a copy of the same docstring, but still it is an expected value.

If using log_system is acceptable, then a single, module-level Logger instance can still provide class-level logging, e.g.,

# assumes self._loggingName = log.namespace + u'.' + self.__class__.__name__
log.info(
    u'Shutdown message received, goodbye.',
    log_system=self._loggingName + u'#info'
)

2019-07-11 17:40:54-0500 [-] FROM 0: b'2019-07-11T17:40:54-0500 [ampoule.child.TempDirChild#info] Shutdown message received, goodbye.'

Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we really need to include the class name in the namespace at all, frankly, especially for this first cut; I'd just skip passing it entirely. I definitely wouldn't want to custom-construct one like this which mashes the log level in there as a string like that. So just

log.info(u'Shutdown message received, goodbye.')

Except maybe structured-format like a PID in there or something so we can identify who received it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glyph I've updated to PR to remove the log_system kwarg and go with plain module-level logging for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find any clear path forward to adding a PID to that shutdown message, so I skipped that task.

logLevelPredicate = logger.LogLevelFilterPredicate(
defaultLogLevel=logger.LogLevel.info
)
filteringObserver = logger.FilteringLogObserver(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for changing the bootstrapping as well! One interesting question might be how to configure the filters on this predicate without having to put in a whole custom BOOTSTRAP. For a future change, perhaps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good thought. I kind of hate that all this code lives in a big-old wad of text. I can vaguely imagine some hand-wavey thing where we take command-line arguments, or an ampoule.cfg file, and generate the bootstrap code from a template. Definitely bigger than the scope of this work, but it would have several benefits, including configuration of log levels and log formatters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of hate that all this code lives in a big-old wad of text

same

an ampoule.cfg file, and generate the bootstrap code from a template

or you know, parse a datastructure passed in argv ?

@ldanielburr
Copy link
Contributor Author

@glyph I've made one of the requested changes, and solicited feedback on the other requested change. Regarding your comment about "adding a test or two to extract some structure": can you elucidate a bit? I know Logger is all about emitting events that can be formatted as text or json or what-have-you, but I'm very foggy on how to go about testing "structure". Are you referring to the raw events being emitted, or the formatted representation of those events?

@glyph
Copy link
Member

glyph commented Jul 11, 2019

Are you referring to the raw events being emitted, or the formatted representation of those events?

The event dictionaries being emitted. Such a test would add a log tap for the duration of the test, and assert about some stuff in the event dicts logged; hopefully this is a lightweight thing to do, but if it's at all hard let's skip it and come back to it later - there's no direct tests for this as it stands and I don't want the perfect to be the enemy of the good here.

@ldanielburr
Copy link
Contributor Author

@glyph regarding unit-tests for the new logger: I can't get my head around it, sorry. It seems to me that twisted.logger's unit-tests must already be testing the emission of events so what additional surety would be provided here?

Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've got plenty of stuff to go on for the future here, but le mieux est l'ennemi du bien, let's land this and build on it 😄 .

@glyph glyph merged commit ae528e8 into twisted:master Jul 12, 2019
@glyph
Copy link
Member

glyph commented Aug 4, 2019

Hmm, we probably need to do a release for this...

@ldanielburr
Copy link
Contributor Author

@glyph, please note https://twistedmatrix.com/trac/ticket/9680, which is the reason for the large number of logger instances created when making a logger a class member. If this twisted.logger ticket could be resolved, class-based loggers might become a better option than what I ended up doing here.

@glyph
Copy link
Member

glyph commented Aug 5, 2019

Thanks for the reference. I'm not actually sure that module-level loggers are all that bad of an idea, honestly… the main reason to use instance-level ones is for convenience logging fields off the instance; module-level namespace granularity seems sufficient for most cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants