
Filtering chatty libraries #148

Closed
DanCardin opened this issue Sep 30, 2019 · 17 comments
Labels
feature Request for adding a new feature

@DanCardin

In general, debug output is what I'm looking for, and I'd like to get it from all dependent libraries. Using the snippet for integrating with the logging library, all is mostly well.

However, if left unchecked, boto3 and requests are good examples of libraries which are too chatty and log a lot of records that are generally not useful. But it's not obvious to me how to add loguru-side sinks in such a way that I can specify that packages x, y, and z ought to be filtered to INFO, while otherwise unspecified packages continue logging at DEBUG.

Would the canonical method be adding a filter param to every add call and doing a separate level comparison at the point where I'm comparing against the logger?

@Delgan
Owner

Delgan commented Sep 30, 2019

Hi.

Could you do this on the standard logging side, by calling something like logging.getLogger("chatty.sub.module").setLevel("INFO")?

There is a way to do this with a Loguru filter, but as you said, that requires adding it to every configured handler, whereas you can silence the loggers just once on the standard logging side, I think.

@DanCardin
Author

I suppose I could, though I was originally hoping I'd be able to have my goal of a preconfigured importable logger, and the cake of a declarative configure equivalent that'd be the whole API I'd need.

What are the odds you wouldn't be opposed to (the equivalent of) InterceptHandler being an enablable option in configure, and/or configure growing the ability to more precisely (and concisely) specify logger levels?

My ideal interface would be (modulo names)

logger.configure(
    intercept_standard_logging=True,
    log_level_overrides={
        'foo': 'INFO',
        'foo.bar': 'DEBUG',
    }
)
logger.info('off to the races!')
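
(For reference, a minimal sketch of how this proposed interface could be approximated today by combining the documented InterceptHandler recipe with standard-logging level overrides; configure_logging and the module names here are hypothetical:)

import logging
from loguru import logger

class InterceptHandler(logging.Handler):
    """Forward standard-logging records to loguru (adapted from the docs recipe)."""

    def emit(self, record):
        # Map the stdlib level name onto loguru's, falling back to the numeric level.
        try:
            level = logger.level(record.levelname).name
        except ValueError:
            level = record.levelno
        # Walk up the stack so loguru reports the original caller, not this handler.
        frame, depth = logging.currentframe(), 2
        while frame.f_code.co_filename == logging.__file__:
            frame = frame.f_back
            depth += 1
        logger.opt(depth=depth, exception=record.exc_info).log(level, record.getMessage())

def configure_logging(intercept_standard_logging=True, log_level_overrides=None):
    # Hypothetical helper approximating the interface sketched above.
    if intercept_standard_logging:
        logging.basicConfig(handlers=[InterceptHandler()], level=0)
    for module, level in (log_level_overrides or {}).items():
        # Raise the threshold of chatty stdlib loggers before interception.
        logging.getLogger(module).setLevel(level)

configure_logging(log_level_overrides={"foo": "INFO", "foo.bar": "DEBUG"})
logger.info("off to the races!")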

@Delgan
Owner

Delgan commented Sep 30, 2019

Well, I considered making loguru fully interoperable with standard logging in the past, but I finally discarded the idea. This is why I came up with the InterceptHandler() and PropagateHandler() workarounds.

I decided that standard logging was not my business. I'm fine with that as long as "there is a way" to implement interoperability, but I don't want this to be a Loguru "builtin". This is both a practical (keep the codebase and maintenance minimal) and opinionated (keep the API and burden minimal) decision.

Sorry, I'm quite conservative on this. How did you plan to use your "preconfigured importable logger"? Using a dict? By copy-pasting code? Maybe we can find a way to make it less cumbersome for you.

@DanCardin
Author

DanCardin commented Sep 30, 2019

sorry, "preconfigured importable logger" a good think you have! I meant from loguru import logger; logger.info('foo') vs the getLogger(__name__) rigamarole. I've previously hacked together things that allow me to do the equivalent with native logging, but none of the other logging libraries appear to have such a simple api.

does log_level_overrides (or equivalent) go against the library's ideals or design?

@Delgan
Owner

Delgan commented Oct 1, 2019

sorry, "preconfigured importable logger" a good think you have! I meant from loguru import logger; logger.info('foo') vs the getLogger(name) rigamarole. I've previously hacked together things that allow me to do the equivalent with native logging, but none of the other logging libraries appear to have such a simple api.

Oh, yes, ok I see. :)

Does log_level_overrides (or equivalent) go against the library's ideals or design?

Yeah, as discussed, this parameter would be entirely related to standard logging, so it should not be part of Loguru's API in my opinion.

I know this is less convenient than using .configure(), but it does not require many lines of code:

import logging

log_level_overrides = {
    'foo': 'INFO',
    'foo.bar': 'DEBUG',
}

for module, level in log_level_overrides.items():
    logging.getLogger(module).setLevel(level)

@DanCardin
Author

DanCardin commented Oct 2, 2019

I can see how the log interceptor would be related to standard logging, but the log level portion seems like it'd be a more general feature: basically an extension to activation, if I'm understanding it correctly, where, instead of a hard True/False, if a level is specified you'd also check the log level.

In fact, given the way you set status, I could imagine a dirt-simple code change to enable this (if I'm reading the code correctly).

If one sets activation=[('foo', 'INFO'), ('foo.bar', 'DEBUG')], and you changed https://github.com/Delgan/loguru/blob/master/loguru/_logger.py#L1567 to something roughly like:

        try:
            enabled_status = Logger._enabled[name]
        except KeyError:
            ... # existing logic

        if not enabled_status or enabled_status == level_id:
            return

Would that not give me the same behavior I'm looking for?

@Delgan
Owner

Delgan commented Oct 6, 2019

Hi @DanCardin. I thought a little about your suggestion. This is not a bad idea per se, but there are several things that bother me. I know it's not what you want to hear, but I prefer not to add such an option, sorry. :/

You see, the activation argument is equivalent to the enable() and disable() methods. If activation were extended to support logging levels, enable() and disable() would have to be too. There are ambiguities to resolve between .enable("foo", "INFO") and .disable("foo.bar", "DEBUG"): it is not intuitive which levels end up enabled or disabled, and it might require adding a new function for clarification.

Basically, I like the current simplicity of enable() and disable(): libraries using loguru should disable their logs by default to avoid bothering the user, but the user is able to re-enable them if needed. From that point, the user is responsible for configuring their sinks and filters based on the incoming log messages, as if the library logs were part of their own application.

The principal problem with this proposition is that it moves the level configuration away from the sinks setup. For example, logger.enable("foo", "INFO") definitely looks very much like logging.getLogger("foo").setLevel("INFO"). I think one of the main strengths of loguru is that it is not cluttered with configuring one level per logger and one level per handler. By extending the API to support per-module levels, I could no longer guarantee this simplicity.

There is also the possibility to extend the level argument of the .add() method so that it supports a list of (module, level) pairs, but this does not look right, it would still require the parameter to be repeated for each sink, and it is anyway already possible using the filter argument.
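
For what it's worth, per-module minimum levels are already expressible with a plain filter function; a minimal sketch, assuming the hypothetical module names foo and foo.bar:

import sys
from loguru import logger

module_levels = {"foo": "INFO", "foo.bar": "DEBUG"}

def module_level_filter(record):
    # Walk up from the record's module name to the closest configured parent.
    name = record["name"] or ""
    while name:
        if name in module_levels:
            return record["level"].no >= logger.level(module_levels[name]).no
        name = name.rpartition(".")[0]
    return True  # modules without an override pass through untouched

logger.add(sys.stderr, filter=module_level_filter, level="DEBUG")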

@DanCardin
Author

DanCardin commented Oct 7, 2019

I would personally consider .enable()/.disable() to be runtime methods for direct use of the activation feature with values of True or False. I personally wouldn't see any reason to extend them to accept a level, because they're already shortcuts (e.g. you don't have .set_activation('foo', True/False)) for setting a value of True/False. So if you wanted the ability to use the API I'm suggesting outside .configure, I think it'd make sense to create a method for it like .enable_up_to_level('foo', 'INFO') or something... but I'd honestly just skip it. To me, it seems like a strict superset of the current behavior, and the "complexity" would be entirely opt-in.

The main issue I have with it not being part of the API is that I can choose my own code's logging calls' levels, and I can choose the logging level for everything. But I can't choose the logging levels, or the location and relative frequency, of logging calls external to my code. Ironically, with your suggestion, I'd see better ergonomics from a library written with std logging than from one written with loguru.

The principal problem with this proposition is that it moves the level configuration away from the sinks setup. For example, logger.enable("foo", "INFO") definitely looks very much like logging.getLogger("foo").setLevel("INFO").

This paragraph is the one I find the most convincing. It's definitely true that this would introduce the ability to change logging levels outside of sinks. But, given the ease with which I (think I) could make this change, your activation feature is already effectively doing the same thing, just at a much less granular level. The activation feature as-is seems like a bit of a bandaid for the otherwise missing ability to control dependent code's logging calls.

I can imagine why straight-up disabling/enabling whole top-level packages could be useful by itself, but you already provide the ability to do that by letting me specify foo.bar.baz for enablement/disablement. The only reason I can think that would have been useful in the first place was to deal with the problem of overly chatty libraries. But it seems like a really blunt tool, given that another person's choice of code organization (one big module vs. a bunch of organized/nested subpackages/modules) currently forces me to choose between two different and wildly more/less ergonomic strategies for controlling what logs I see from their code.

I think one of the main strengths of loguru is that it is not cluttered with configuring one level per logger and one level per handler. By extending the API to support per-module levels, I could no longer guarantee this simplicity.

FWIW, I saw the primary advantages as being (in priority order):

  • A single logger means you can just import it and it will infer __name__, which is what everyone wants basically all the time, but which for some reason other libraries don't make easier.
  • I've never wanted to attach handlers to anything but the root logger, which is implicit in this API. But that is really a side-effect benefit of the first point. Again, other libraries ought to acknowledge that this is at least really common, even if not universal.
  • The declarative .configure method.
  • Various bells and whistles: new-style formatting, coloring, tracebacks, etc.

EDIT:
If none of that is convincing, I guess you can close the issue, because I've said everything I think I could say.

@Delgan added the feature label (Request for adding a new feature) on Oct 8, 2019
@Delgan
Owner

Delgan commented Nov 15, 2019

Hi @DanCardin.

First of all, I would like to apologize for the time it took me to answer you. I was not sure what to do with this issue, and I spent my time on other problems. You took the time to write a constructive answer and explain in detail your point of view, thanks for that! It was rude of me not to answer you more quickly...

So, I think we agree on these two points:

  • Extending the ability to assign a level to a "module logger", in addition to sinks, is undesirable
  • There is currently no convenient way to deal with sub-modules and configure their levels

Which leads me to the next suggestion... Why not just extend the filter argument so that it accepts a dict?

import sys
from loguru import logger

level_per_module = {
    "foo": "DEBUG",
    "foo.bar": "INFO",
}

logger.add(sys.stderr, filter=level_per_module)

I'm much more willing to implement this solution. Actually, I think this approach would be more powerful: suppose you want to configure some chatty library so it only prints warnings to the terminal but logs everything to a file. At that point, using configure() with activation would not suffice, and you would in the end need to rely on per-sink filtering anyway. So, moving the modules dict from .configure(activation=...) to .add(filter=...) makes perfect sense in my opinion.
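
For instance, a minimal sketch of that scenario with the proposed dict filter (this matches the behavior later released in v0.4.0, per the end of this thread), using "chatty" as a stand-in for the verbose library:

import sys
from loguru import logger

logger.remove()
# Terminal: only WARNING and above from the chatty library, DEBUG for everything else.
logger.add(sys.stderr, level="DEBUG", filter={"chatty": "WARNING"})
# File: everything, including the chatty library's DEBUG records.
logger.add("app.log", level="DEBUG")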

What are your thoughts on this solution? Do you think it would suit your use case?

@DanCardin
Author

DanCardin commented Nov 18, 2019

First of all, I would like to apologize

No problem! For the time being I've fully disabled those logs and that's alright. I don't expect Rome to be built in a day for free.

(While I still think your activation interface is pretty much an (existing) ideal spot for this, filters might be okay as well.) Hmm. Do you see this as being an alternative to the existing filter functions, or simply a more specific way of targeting filter functions?

E.g. could I do this?

import sys
from loguru import logger

def only_certain_logs(record):
    ...

logger.configure(
    handlers=[
        dict(
            sink=sys.stderr,
            filter={
                "foo": "INFO",
                "foo.bar": only_certain_logs,
            },
        ),
    ]
)

If so, I'd maybe recommend having falsey values evaluate to filtering it out unconditionally. At that point you could effectively deprecate activation because this would be a strict superset of its behavior on an existing feature. (migration would literally be s/activation/filter in all cases).

One other thing: I guess in order to maintain something like the existing filter interface, it might accept a None or '' key to count as a catch-all filter when none of the modules match?

If both of the above, the total ruleset of filter would be something like:

  • If filter is callable, call it (same as today)
  • If filter is a string, allow only that module (same as today)
  • If filter is a mapping:
    • Identify whether the log record's module is in the filter map
      • If the mapped value is False, filter out
      • If the mapped value is a log level, use that as the filter
      • If the mapped value is callable, call it
    • Else, if the catch-all/default key exists, apply the same logic as above
    • Else, apply no filter(?)

@Delgan
Owner

Delgan commented Nov 18, 2019

(While I still think your activation interface is pretty much an (existing) ideal spot for this, filters might be okay as well)

He he. Although I think you got a point last time when you said that .enable() / .disable() is basically the same as filter but with less granularity, I'm still uncomfortable with the idea of level-based filtering handled by the logger. So, if you are fine with this being configured by the filter argument only, then I will choose that option. 🙂

Do you see this as being an alternative to the existing filter functions, or simply a more specific way of targeting filter functions?

Hum... I was planning to keep it very simple: a dictionary of sub-modules associated with their minimum levels. I prefer not to add functions as valid values. If the handling of sub-modules is so complex that a function is needed, it should manage the whole filter logic and be passed directly as the argument.

If so, I'd maybe recommend having falsey values evaluate to filtering it out unconditionally.

I hadn't thought of that, but it makes sense. We need a way to filter out, so we need False (and True) to be valid values.

At that point you could effectively deprecate activation because this would be a strict superset of its behavior on an existing feature. (migration would literally be s/activation/filter in all cases).

Hum, yeah, using activation would in practice be equivalent to passing {"module": False} to every handler's filter. However, as I said in a previous post, .disable() was initially intended to silence logs while publishing a library. If the user wants to re-allow them, they have to use .enable(). So, activation should remain. Even so, as the purposes are not the same (global filtering vs handler filtering), the implementations are different, which permits some performance optimizations.

One other thing: I guess in order to maintain something like the existing filter interface, it might accept a None or '' key to count as a catch-all filter when none of the modules match?

Indeed, actually, "" is already a valid value to be passed to filter: it allows everything (because it's the "parent" of all modules). It makes sense to use this key as the default level if no other module matches.
None would also be valid, for another reason: there is a possibility that some module has record["name"] set to None (in some rare cases while using dask). So, one would use None to set the level of such a module.
Note also that the filter will not supersede the level argument. The latter is checked first, and the log needs a high enough severity before being re-checked by the filter argument. One can set level=0 anyway if the filter should handle all logs.

If both of the above, the total ruleset of filter would be something like: [...]

About the mapping, I would like to point out that it will check for the presence of a "parent" in the dict, not necessarily the exact same module name. I see this as an extension of the current filter="some.module", which allows "some.module" and all its children.

  • If filter is a mapping:
    • Search for the closest parent of the log record's module name
      • If the mapped value is False, filter out; if it's True, filter in
      • If the mapped value is a string or an int, use that as the minimum level
    • Else, apply no filter
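
Putting those rules together, a minimal sketch of the mapping form as described above (module names are hypothetical; "" acts as the catch-all parent of every module):

import sys
from loguru import logger

logger.remove()
logger.add(
    sys.stderr,
    level=0,  # let the filter dict decide, since level is checked before filter
    filter={
        "": "INFO",          # default level when no other key matches
        "foo": "DEBUG",      # applies to "foo" and all of its children
        "foo.noisy": False,  # silence this subtree entirely
    },
)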

@DanCardin
Author

DanCardin commented Nov 18, 2019

The main reason I suggested allowing callables as values in the dict is that if someone already using a filter for a preexisting reason wanted to make use of this feature, they'd need to drop their existing filter and switch to this new dict, or else write a function combining the two. I guess my thought was that it composes somewhat nicely back into itself.

Instead of any combination of two simple filters devolving into a more complex filter function which requires you to introspect attributes off a record that you wouldn't have needed to before, you could basically conjoin any two orthogonal filters on a given sink without having to switch the method you use.

In any case, I'm not using filters at all right now, so I'd be happy with just the straight up dict with values accepting logging levels/False.

disable() was initially intended to silence logs while publishing a library

This feels weird to me. I'd probably expect one to always need to add a sink, and that a library would simply be expected not to add its own sinks. But that's beside the point.

...parent modules...

Yeah, I was mentally incorporating that, but it's an important distinction.

@Delgan
Owner

Delgan commented Nov 19, 2019

I understand your point, but I think that if someone is already using a filter function, there is probably no need for them to modify a working solution to suit this new feature. Anyway, I don't think there is much need for per-module filter functions, so I will stay with the basic dict solution, as you and I are both happy with that. 🙂

This feels weird to me. I'd probably expect one to always need to add a sink, and that a library would simply be expected not to add its own sinks. But that's beside the point.

I think we need a way for libraries to explicitly disable their logs; standard logging uses the NullHandler for that. Otherwise, as soon as a user adds their own sink, the logs coming from the libraries will pop out, as there is nothing to filter them out. 😕
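
A minimal sketch of the two conventions being contrasted here, assuming a hypothetical library package named "mylib":

import logging
from loguru import logger

# Standard logging: the library attaches a NullHandler so importing it produces
# no output until the application configures its own handlers.
logging.getLogger("mylib").addHandler(logging.NullHandler())

# Loguru: the library calls logger.disable() at import time; the application can
# opt back in with logger.enable("mylib").
logger.disable("mylib")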

@DanCardin
Author

I mean, I think you simply wouldn't register a logging handler in the normal logging library, and it wouldn't generate logs by itself unless an end user added a handler. Having added a handler, I would expect all of a library's logs; that's not surprising to me. And if I find their logging practices to be obnoxious, then this feature would give me the ability to filter some subset out.

But my trying to convince you of anything about that feature/adding an implicit sink is all beside the point (of this feature at least). I think even the simpler filter dict option will do nicely for now.

@Delgan
Owner

Delgan commented Nov 19, 2019

Yeah, don't worry, I was just genuinely interested to know why you thought it was better to display all logs by default. I guess it entirely depends on the user. I think it might surprise some people to see logs which are not their own; this is why there is the possibility to .disable() them.

Anyway, I will implement the "dict-filtering" in the next few days and publish a new release shortly after that. 👍

@Delgan
Owner

Delgan commented Nov 22, 2019

Implemented! Thanks again for your patience and your feedback. It will be available in v0.4.0, coming soon. 👍

@nielsbom

nielsbom commented May 6, 2024

(For people who come across this.)
I wrapped a chatty library using the code below.
I actually needed to transform and display the "log messages"; they were not real log messages.

from loguru import logger

def log_sink(message):
    # Handle any loguru calls from the libraries we're using
    print("loguru log message:", message)

if __name__ == "__main__":
    logger.remove()
    logger_format = "{message}"
    logger.add(sink=log_sink, format=logger_format, level="INFO")
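
Note that the filter argument accepts the per-module mapping discussed above, so a custom sink like this can also be combined with it, e.g. logger.add(sink=log_sink, format=logger_format, level="INFO", filter={"chatty": "WARNING"}), where "chatty" stands in for whichever module needs quieting.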
