Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

document initLogging requirement #621

Open
bertsky opened this issue Oct 2, 2020 · 9 comments
Open

document initLogging requirement #621

bertsky opened this issue Oct 2, 2020 · 9 comments

Comments

@bertsky
Copy link
Collaborator

bertsky commented Oct 2, 2020

Using any of core's API (even just ocrd_utils) now requires calling initLogging at some point prior to usage, or the respective subpackage's getLogger will complain with CRITICAL.

So not only is it not correct anymore to do ocrd_utils.getLogger at module level, any program or library using core must also do ocrd_utils.initLogging during its initialization (but not necessarily on the module level) now.

This must be reflected somewhere in the documentation prominently.

@M3ssman
Copy link
Contributor

M3ssman commented Oct 18, 2020

Three thumbs up!
Without this call any logging records in any other application that uses stuff from ocrd-core gets dropped. Why become all loggers silently PropagationShyLoggers? All our logs vanished after updating from 2.16 (behavior changed from v2.16 to v2.17, as I realized just by now).

@bertsky
Copy link
Collaborator Author

bertsky commented Oct 18, 2020

Without this call any logging records in any other application that uses stuff from ocrd-core gets dropped. Why become all loggers silently PropagationShyLoggers?

Sorry to hear this hit you in the back.

I don't understand what you say there exactly, though. So you have a program that imports (some of) core's packages, but does not use the Processor CLI decorator (which brings ocrd_utils.initLogging), and that program does not give you any log messages at all anymore, or just not the OCR-D log messages?

Also, what's a PropagationShyLogger?

@M3ssman
Copy link
Contributor

M3ssman commented Oct 19, 2020

Within our main migration workflow scripts I do use some functionalities of OCR-D core for workspace preparation and so on (e.g. : https://github.com/OCR-D/ocrd-website/wiki/How-to-create-searchable-Fulltext-Data-for-DFG-Viewer).

PropagationShyLogger is introduced from core/blob/master/ocrd_utils/ocrd_utils/logging.py, on the line with logging.setLoggerClass(PropagationShyLogger). This also turns any other logger from an app that uses ocrd_core-library into this type. If you stay in ocrd-realms, you might not even bother. I wonder if this was intended ... but it seems not to be that harmfull anyway, since it was already introduced back in april when we first reviewed the logging module.

This one 4ce207ee8 ocrd_utils/ocrd_utils/logging.py (Konstantin Baierer 2020-09-11 17:06:03 +0200 165) disableLogging() effectively turns off any logger in any application that imports ocrd-stuff. This leads to situations where empty log-files are being written. We really need to keep in mind this is actually breaking behavior.

The reason to introduce this was to have a short-cut to silence logging from all 3rd-party-libs ocrd uses ... ?

@kba
Copy link
Member

kba commented Oct 19, 2020

The reason to introduce this was to have a short-cut to silence logging from all 3rd-party-libs ocrd uses ... ?

The reason for doing this was to prevent libraries like tensorflow non-processing-related logging in processors from interfering with sys.stdout/sys.stderr, so as not to break output like --dump-json.

This also turns any other logger from an app that uses ocrd_core-library into this type.

That was intentional and this behavior, that PropagationShyLogger is set as the default logger, has been around a while. What is new is that in the absence of initialization, logging is completely disabled. I can see how this can be a problem, sorry for the inconvenience.

Where should I document this?

@M3ssman
Copy link
Contributor

M3ssman commented Oct 19, 2020

If the main reason to turn off logging totally is the behavior of tensorflow, question: It is possible to integrate tensowflow into the default logging conf file? If so, I'd prefer this, since this too documents that there's a problem with tensorflow logging if left alone.

@kba
Copy link
Member

kba commented Oct 19, 2020

It is possible to integrate tensowflow into the default logging conf file?

We should probably mirror the builtin defaults to the logging.conf, i.e.

        logging.getLogger('').setLevel(logging.INFO)
        #  logging.getLogger('ocrd.resolver').setLevel(logging.INFO)
        #  logging.getLogger('ocrd.resolver.download_to_directory').setLevel(logging.INFO)
        #  logging.getLogger('ocrd.resolver.add_files_to_mets').setLevel(logging.INFO)
        logging.getLogger('PIL').setLevel(logging.INFO)
        # To cut back on the `Self-intersection at or near point` INFO messages
        logging.getLogger('shapely.geos').setLevel(logging.ERROR)
        logging.getLogger('tensorflow').setLevel(logging.ERROR)

Maybe we should just move to using the builtin logging.conf if no logging.conf is found in the filesystem?

@bertsky
Copy link
Collaborator Author

bertsky commented Oct 19, 2020

If the main reason to turn off logging totally is the behavior of tensorflow, question: It is possible to integrate tensowflow into the default logging conf file?

No, not just TF. Any library that is imported by processors and already logs things at module init time is problematic for CLIs that want to control exactly what appears on stdout. We've also seen pyplot do this and some others (don't recall).

Maybe we should just move to using the builtin logging.conf if no logging.conf is found in the filesystem?

I don't really see a connection to the issue at hand, and I think it's useful to not rely on filesystem access for the defaults.

Maybe we should first try to pinpoint exactly what @M3ssman needs to import from core without breaking other logging components. For our processors, we have initLogging via CLI decorator. For the ocrd CLI we do it explicitly. In these contexts killing unwanted logging is sometimes necessary.

Maybe the problem is that we require initLogging when using getLogger?

@M3ssman
Copy link
Contributor

M3ssman commented Oct 19, 2020

@kba
So we should extend the default configuration with, say

[loggers]
keys=root,ocrd_tensorflow

...

[logger_ocrd_tensorflow]
level=ERROR
handlers=consoleHandler
qualname=tensorflow

and so on?

@kba
Copy link
Member

kba commented Mar 29, 2023

image

@bertsky:

Should be removed now IMO. (If you need to experiment on this, just configure the logger's level it in your local ocrd_logging.conf.)

@MehmedGIT:

That's what I tried but without success. When I use the ocrd_logging.conf file then it overwrites and suppresses all potential logs coming from loggers not mentioned in that configuration file. For example, uvicorn server logs. That's why we are using python logging here. @kba is working on improving the ocrd logging. Then we can remove python logging completely.

@bertsky

You can always configure default loggers (no qualname) with suitable level and handlers. And specific loggers, too. For "rogue" libraries that spam the root logger, there are workarounds. But for Uvicorn in particular, have a look at the log_config kwarg of its run method.


Posting here, since this is related, I really need to clean up the logging setup and re-align the standard-logging of #974 with the rest of the code base.

@kba kba mentioned this issue Mar 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants