Use Python logging to report on convergence progress at level info for long running tasks #78

Open
ogrisel opened this issue Feb 12, 2011 · 14 comments

Member

@ogrisel ogrisel commented Feb 12, 2011

This is a proposal to use Python's logging module instead of stdout and verbose flags in the models API.

Using the logging module would make it easier for the user to control the verbosity of the scikit through a single, well-documented configuration interface and logging API.

http://docs.python.org/library/logging.html
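
As a rough illustration of the proposal, here is a minimal sketch of a long-running loop that reports progress through the logging module, with the user opting in via the standard API. The logger name `sketch.convergence` and the function `fit_with_logging` are hypothetical and are not part of scikit-learn.

```python
import io
import logging

# Hypothetical logger name; the thread does not specify one.
logger = logging.getLogger("sketch.convergence")


def fit_with_logging(n_iter=3):
    """Toy stand-in for a long-running fit loop (not a scikit-learn API)."""
    for i in range(n_iter):
        # Library side: always emit at INFO and let the user decide what to show.
        logger.info("iteration %d of %d", i + 1, n_iter)


# User side: opt in to progress messages through the standard logging API.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

fit_with_logging()

logger.removeHandler(handler)
captured = stream.getvalue()
```

With no handler and level configured, the INFO messages are silently dropped, which is the main difference from a verbose flag that prints directly.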

Member

@GaelVaroquaux GaelVaroquaux commented Dec 8, 2013

Work has started on this in https://github.com/GaelVaroquaux/scikit-learn/tree/progress_logger

What remains to be done is most probably fairly mechanical work.

Member

@amueller amueller commented Jan 7, 2014

There is also work in the new Gradient Boosting module.

Member

@larsmans larsmans commented May 10, 2014

Logging actually isn't that easy to use at all, in my experience, so -1 on this.

@aadilh aadilh commented Oct 19, 2015

Is anyone working on this?

Member

@amueller amueller commented Oct 13, 2016

How about we add a logger that by default prints to STDOUT? That should be fairly simple, right?
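
A minimal sketch of what such a default could look like, assuming a hypothetical helper `get_default_logger` (not an existing scikit-learn function) that attaches a stdout handler only when the logger has none:

```python
import logging
import sys


def get_default_logger(name="sketch.default"):
    """Return a logger that writes INFO messages to stdout unless already configured.

    Hypothetical helper for illustration; the name and behavior are assumptions.
    """
    logger = logging.getLogger(name)
    if not logger.handlers:
        # Only install the stdout handler once, so repeated calls do not
        # duplicate output and user-installed handlers are left alone.
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Note that this has the library configure a handler itself, which is the point questioned later in the thread.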

@domoritz domoritz commented May 30, 2018

This issue has been open since 2011, so I wonder whether it is going to be fixed. I've run into it with RFECV (class RFECV(RFE, MetaEstimatorMixin)). I wanted to print the progress, but the default verbose printing prints too many messages. I didn't want to monkey patch sys.stdout to make this work; overriding the logger would be the simple and clean solution.

There are other issues in sklearn, such as #8105 and #10973, that would benefit from real logging. Overall, I think logging would be a great addition to sklearn.

Member

@jnothman jnothman commented May 30, 2018

@domoritz domoritz commented May 30, 2018

I'm a bit busy right now, but I support customizable logging in sklearn in whatever form (although I prefer standard Python logging).

Contributor

@TomAugspurger TomAugspurger commented Nov 18, 2019

Has there been any discussion about what verbose=True will mean when scikit-learn starts using logging? We're dealing with this a bit in dask-ml: dask/dask-ml#528.

Given that libraries aren't supposed to do logging configuration, it's up to the user to configure their "application" (which may just be a script or interactive session) to log things appropriately. This isn't always easy to do correctly.

My proposal in dask/dask-ml#528 is for verbose=True to mean "temporarily configure logging for me". You can use a context manager to configure logging, and scikit-learn would want to ensure that INFO level messages are printed to stdout to match the current behavior.
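
One way to picture this is the sketch below. It is not the dask-ml implementation; the names `temporary_logging`, `fit`, and the logger name `sketch.estimator` are hypothetical, and the `stream` parameter exists only so the output can be redirected.

```python
import contextlib
import logging
import sys


@contextlib.contextmanager
def temporary_logging(logger_name, level=logging.INFO, stream=None):
    """Attach a stream handler for the duration of the block, then undo it."""
    logger = logging.getLogger(logger_name)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setLevel(level)
    old_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore the previous state so the user's own configuration is untouched.
        logger.removeHandler(handler)
        logger.setLevel(old_level)


def fit(verbose=False, stream=None):
    """Toy estimator method: always logs, verbose only controls configuration."""
    logger = logging.getLogger("sketch.estimator")
    if verbose:
        ctx = temporary_logging("sketch.estimator", stream=stream)
    else:
        ctx = contextlib.nullcontext()
    with ctx:
        logger.info("converged after %d iterations", 10)
```

Calling `fit(verbose=True)` prints the message to stdout; calling `fit()` leaves it to whatever logging configuration the user has set up.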

Member

@jnothman jnothman commented Nov 18, 2019

Member

@thomasjpfan thomasjpfan commented Nov 18, 2019

My proposal in dask/dask-ml#528 is for verbose=True to mean "temporarily configure logging for me".

This seems like a good balance. Using the logging module is not that user friendly. Another "hack" would be to log at info by default, but elevate the logs to warning when a user sets verbose=True. This would work because warnings are displayed by default.

Member

@jnothman jnothman commented Nov 18, 2019

Contributor

@TomAugspurger TomAugspurger commented Nov 19, 2019

@jnothman’s comment matches my thoughts. scikit-learn would always emit the message, and the verbose keyword controls the logger level and handlers.

Member

@thomasjpfan thomasjpfan commented Nov 19, 2019

But the local handler could change from warning to info to debug level on stream as verbose increases

Okay, let's go with this. Currently the logging levels are https://docs.python.org/3/library/logging.html#logging-levels. By default, we can use INFO, which is not emitted by default. When verbose=1, the handler changes info -> warning and debug -> info. When verbose>=2, we still have info -> warning but also debug -> warning, and the estimator can interpret verbose>=2 to mean "emit more debug messages as verbose increases". This meaning can differ between estimators.
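
The mapping described here could be sketched as a small helper. The function `promoted_level` is hypothetical, and treating verbose=0 as "leave levels unchanged" is an assumption; the comment only spells out verbose=1 and verbose>=2.

```python
import logging


def promoted_level(level, verbose):
    """Map a message's nominal level to the level it is emitted at.

    verbose=0: unchanged (assumed).
    verbose=1: INFO -> WARNING, DEBUG -> INFO.
    verbose>=2: INFO -> WARNING, DEBUG -> WARNING.
    Other levels are passed through untouched.
    """
    if verbose >= 1 and level == logging.INFO:
        return logging.WARNING
    if level == logging.DEBUG:
        if verbose >= 2:
            return logging.WARNING
        if verbose == 1:
            return logging.INFO
    return level
```

An estimator could then call something like `logger.log(promoted_level(logging.INFO, self.verbose), "...")`, relying on WARNING being displayed by default.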

What do you think?
