Skip to content
This repository has been archived by the owner on Mar 19, 2021. It is now read-only.

Remove logging.root basicConfig on pystan import #685

Closed
wants to merge 2 commits into from

Conversation

cdeil
Copy link

@cdeil cdeil commented Feb 16, 2020

Summary:

This PR removes the logging.basicConfig call from pystan/__init__.py, to avoid messing with the root logger on import pystan.

This is following the standard advice for Python libraries (see https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library). The same PR was made before (see #466) and rejected, partly with the argument that PyStan is mostly used as an application, not as a library. For us, this is not the case, in our machine learning pipeline we import fbprophet and import pystan for one time series model, and we import and try several other time series modelling libraries - for us pystan is just one library one of our dependencies uses, and we want to configure logging as we like. See also https://github.com/stan-dev/pystan/issues/465 and stan-dev/pystan#97 and related facebook/prophet#1336 which I opened just now.

I understand your want to print messages to the user by default. I think that is the case with the standard config as proposed here already starting with Python 3.2, since logging.lastResort was added (see https://docs.python.org/3/library/logging.html#logging.lastResort)?

If you still support Python 2.7, you could attach a StreamHandler to your logger as in the example here - and remove it once you drop Python 2 support.

Intended Effect:

import pystan doesn't affect logging.root.

How to Verify:

Use logging_tree.printout() before and after import pystan to see the logging config (see https://pypi.org/project/logging_tree/), or try out simple example scripts to see how logging behaves.

Side Effects:

This PR removes the side effects of import pystan on the Python standard library logging config.

Documentation:

I did not update the documentation yet. Please re-consider this question, and if you're willing to make a change, I'd be happy to update this PR and docs accordingly.

Reviewer Suggestions:

Try out this branch, see if output by default is OK. Read this and https://rhodesmill.org/brandon/2012/logging_tree/.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):

Me.

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@riddell-stan
Copy link
Contributor

riddell-stan commented Feb 16, 2020

Thanks. We should look into this.

@cdeil do you have any opinion on what pymc3 does? They do this in their __init__.py:

import logging

_log = logging.getLogger("pymc3")
if not logging.root.handlers:
    _log.setLevel(logging.INFO)
    if len(_log.handlers) == 0:
        handler = logging.StreamHandler()
        _log.addHandler(handler)

@ahartikainen
Copy link
Collaborator

Are you using pystan in a module?

This should silence the logger for pystan

logging.getLogger("pystan").propagate=False

Or do you want to manually set the logging properties for pystan? This should work if you setup logger before importing pystan.

@cdeil
Copy link
Author

cdeil commented Feb 17, 2020

opinion on what pymc3 does?

I think this is much better than what you currently have. What they do is mostly equivalent to what e.g. scikit-learn does (see here), they set the level to INFO and attach a StreamHandler to their logger. They do not touch the root logger.

Would you be OK to change to what PyMC3 has, or to what scikit-learn has?

The difference is just that PyMC3 has an if statement and only conditionally sets the level and attaches the handler to their logger on import -- I'm not sure if this is better or worse than what scikit-learn does. It tries to do the right thing depending on what the user is doing, but it's probably also confusing to users to sometimes behave one way, sometimes the other.

Are you using pystan in a module?

We don't import pystan at all, but we from our pipeline import fbprophet and they import pystan.


Note that logging.basicConfig only works on first call, due do the check here. Users expect it to work. But if you already call it, then if they call it after importing your library, currently it will silently have no effect:

Try this:

>>> import pystan
>>> import logging
>>> logging.basicConfig(level=logging.ERROR)
>>> logging.root
<RootLogger root (INFO)>

The logging.basicConfig call by the user is a no-op, because you already called it on import pystan before.

@riddell-stan
Copy link
Contributor

I believe the intention was always to follow the scikit-learn and pymc convention. I would be happy switching to that if @ahartikainen has no objections.

Thanks as well for the tip about logging.lastResort. I didn't know about that.

@ahartikainen
Copy link
Collaborator

Following sklearn sounds good.

So does this mean the following:

import pystan --> uses logger as indented

Manually handling loggers -->

import pystan --> follows logger settings set by the user?

Or something else?

This is following what scikit-learn does.
@cdeil
Copy link
Author

cdeil commented Apr 12, 2020

I've added a StreamHandler to the pystan logger in 7361507, to follow what scikit-learn does, as a comprimise in the hope that you will accept this change.

The general recommendation is to not add handlers in libraries (see e.g. here or here), but to leave all logging configuration to the users, meaning the people that import pystan, not necessarily end-user, e.g. we have a machine learning pipeline where we import fbprophet which does import pystan, and as a result of your __init__.py logging shenanigans, we have to "fix up" your changes to the root logger after we call import fbprophet, and copy that fix-up code to all pipelines we write to avoid having double-log messages from the root logger.

But I can see why you want a default handler, because a large fraction of your users are end-users that just run import pystan and you directly want INFO level logging messages to show up for them. I'm pretty sure that's also the reason scikit-learn added the stream handler. I've suggested they remove it at scikit-learn/scikit-learn#16451, no comment so far. As mentioned there, on Python 3 this is not needed any more, because there's the last resort handler.

But the more important change here is to not mess with the root logger config in libraries. pystan is the only Python library I know of that does that. It's really confusing for users that after import pystan does this.

@ahartikainen - I don't understand the questions in your last comment, sorry. For me, reading this helped to understand and see how logging in Python works, and generally the key points are that

(a) logging in Python is co-operative. It only works nicely if libraries play by the rules mentioned here.
(b) logging configuration should be left to the end-users. Each library (like pystan) has different users that want different things, and thus trying to configure logging (especially add handlers to the root logger) for the user on import pystan is impossible.

@ahartikainen
Copy link
Collaborator

@cdeil thanks for the explanation.

I was thinking if there is a way to check for pre-existing loggers to the behavior would follow the wanted logic.

If no loggers are set up, then create streamlogger so user can see the messages, otherwise follow the default logger?

@cdeil
Copy link
Author

cdeil commented Apr 12, 2020

@ahartikainen - I think what you are suggesting is what PyMC does?
See https://github.com/stan-dev/pystan/pull/685#issuecomment-586764087 above.

As I mentioned in https://github.com/stan-dev/pystan/pull/685#issuecomment-586862336 I would not recommend that. I think that's confusing / a bit hard to understand for users, to do one thing or another depending on the state of logging config in import pystan. Try explaining this in the docs exactly what the conditions are when you attach a StreamHandler, taking into consideration that most won't know how logging handlers work or what the root logger is.

But IMO that's a minor point, in one case a StreamHandler is always attached, in the other case it sometimes happens. If you (and the other pystan devs) have a preference for that, let me know and I can make the change here.

@riddell-stan
Copy link
Contributor

@cdeil sorry this has taken so long. Thanks for bringing the issue to our attention. Before merging this I want to verify this isn't going to change behavior for people that are relying on the current behavior.

And I do want to shift to the recommended pattern. It was a mistake to use the logger to send messages to the user. PyStan 3 and httpstan will follow the recommended pattern in the Logging HOWTO.

@ahartikainen
Copy link
Collaborator

@riddell-stan I have been thinking about this, should we move to use print for user output?

@riddell-stan
Copy link
Contributor

I haven't had a chance to test this. I don't want to break backwards compatibility.

If it doesn't change behavior, feel free to merge it.

@cdeil
Copy link
Author

cdeil commented May 21, 2020

@riddell-stan @ahartikainen - it's been three months with lots of discussion going in circles. Seems like a waste of time for you and me. Closing this PR.

@cdeil cdeil closed this May 21, 2020
@cdeil cdeil deleted the root-log branch May 21, 2020 21:27
@ahartikainen
Copy link
Collaborator

Sorry for the wasted time. Lately, PR time-durations have been long for PyStan2

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants