W1202 - logging-fstring-interpolation is not useful #2395

pzelnip · 2018-08-08T15:51:06Z

Is your feature request related to a problem? Please describe

In my experiments W1202 is misleading as f-strings performance is virtually equivalent to %-operator formatting in practice.

Describe the solution you'd like

W1202 be deprecated, as it can lead people to not use f-strings, which (if available) are the preferred mechanism for string interpolation.

Additional context

This is related to my inquiry around this warning: #2354

To verify my hypothesis that f-strings are virtually as performant as %-operator formatting I wrote the following snippet to explore timing related to log statements with the 3 approaches (f-strings, .format(), and %-operator). This was on a Django app I'm working on.

import timeit

def timeitex1(num_times):
    fstring = timeit.timeit(
        'logging.info(f"""this is a {thing}, and thing: {thing + "blah"}""")',
        setup="import logging ; thing='1234'",
        number=num_times,
    )

    formattim = timeit.timeit(
        'logging.info("this is a {}, and thing: {}".format(thing, thing + "blah"))',
        setup="import logging ; thing='1234'",
        number=num_times,
    )

    percentop = timeit.timeit(
        'logging.info("this is a %s, and thing: %s", thing, thing + "blah")',
        setup="import logging ; thing='1234'",
        number=num_times,
    )

    return (fstring, formattim, percentop)

With logging output disabled (ie in my Django settings I had: LOGGING["loggers"] = {}), %-formatting and f-strings are virtually identical.

num_times = 10,000:
(0.016536445124074817, 0.02152163698337972, 0.018616185057908297)

num_times = 100,000:
(0.16004435811191797, 0.20005284599028528, 0.1561291899997741)

num_times = 1,000,000:
(1.641325417906046, 2.0023047979921103, 1.6249939629342407)

Returned times are tuples of "time for f-strings", "time for the format() call", and "time for %-formatting".

That's a difference of ~1% (1.641325417906046 vs 1.6249939629342407 seconds) between f-strings & the %-operator on the largest run (.format calls being clearly slower).

With logging enabled, it's a total wash (technically f-strings was the winner in the larger runs but by such a small margin it's inconsequential). This is with log output going to standard out (so I only went up to 100,000 entries as I didn't want to dump 3 million log statements to my console):

num_times = 10,000:
(1.6245153648778796, 2.0137458329554647, 1.6029843359719962)

num_times = 100,000:
(4.39409149507992, 4.51585376705043, 4.532764581032097)

So basically unless you are emitting a ton of logging statements for a level that are never actually logged (say because the LOG_LEVEL is set high), then it doesn't matter. And if you're in that scenario, then really there's no noticeable difference between %-operator formatting or f-strings (just don't do .format() calls).

OTOH, if you are emitting the logging message, then it's likely that the f-string approach will be faster, so following W1202 will actually make your code (extremely slightly) slower.

The text was updated successfully, but these errors were encountered:

Steffan-Ennis · 2018-08-16T06:00:09Z

Have a read of #1788, fstrings are gone for good

FrankDMartinez · 2018-09-25T02:42:15Z

How about an option to exempt f-strings? By default, f-strings would be flagged; with the option, no.

mcallaghan-bsm · 2019-06-03T20:24:42Z

Do we have consensus on this? We too feel this should be deprecated.

As of Python 3.6 the recommended way to do this is with a formatted string literal. https://docs.python.org/3/reference/lexical_analysis.html#f-strings

Note that there are TWO warnings that could come ...

# python3 -m pylint blahblah.py 
************* Module blahblah
blahblah.py:8:13: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
blahblah.py:11:0: W1203: Use % formatting in logging functions and pass the % parameters as arguments (logging-fstring-interpolation)

This is from:

MY_VAR_BAR = 'ywah'
_LOG.warning('foo %s', MY_VAR_BAR)
_LOG.warning('foo {}'.format(MY_VAR_BAR))
_LOG.warning('foo {MY_VAR_BAR}')
_LOG.warning("foo {MY_VAR_BAR}")
_LOG.warning(f"foo {MY_VAR_BAR}")

go

# python3 blahblah.py
foo ywah
foo ywah
foo {MY_VAR_BAR}
foo {MY_VAR_BAR}
foo ywah

Of note, in our .pylintrc file, we did already bump it to new format for logging-format-style...

[LOGGING]

# Format style used to check logging format string. `old` means using %
# formatting, while `new` is for `{}` formatting.
logging-format-style=new

earonesty · 2019-07-11T18:39:03Z

This is a per-project issue. We're writing a filesystem. There are certain points in the code where we have TRACE level logging. These are hit millions of times and interpolation can kill performance. Even DEBUG level f-strings can slaughter the functioning of low level code.

On the other hand, the readability of f-strings should generally be preferred when the number of interpolations isn't significant. In this case, providing developer with linter tools and options that help them make smart decisions is the right choice... not arbitrarily deprecating or enabling an option that has, at least for me personally, rescued seriously broken code.

Many pylint settings fall into this nuanced trade-off range, where considerations of readability, code complexity, etc. need to be balanced against each other and there are not easy choices. These generally cause controversy, and repeated issue tickets (this one is the 3rd!):

#2354
#1788

This could be an indicator that pylint needs more nuanced options to allow configuration of the linter depending on the nature of the project.

See: #3000 for an example.

mcallaghan-bsm · 2019-07-11T18:45:36Z

In addition to @earonesty 's comment above, another consideration MAY be that if we add clarity and documentation to these such nuanced warnings, the projects and users can make better educated decisions on their own.

(In this example, the information available at the moment for the fstring-interpolation warning leads to confusion and at worse developers NOT using this for non-performant required code which hurts readability unnecessarily)

tristanbrown · 2019-07-16T21:11:58Z

@earonesty, I think the point that's missed here is that the purpose of pylint is not to make every project's algorithmic decisions for them, but to make code generally consistent for readability purposes. The cases where the performance of interpolation matters seem particularly niche, whereas the cases where f-strings enhance readability are broad and varied.

In that sense, -in general- f-strings should be favored by default by style-checkers. The few projects where f-strings cause performance problems can set non-default configurations.

chdsbd · 2019-07-16T21:22:11Z

Here’s a copy of a comment I made on #1788. #1788 (comment)

It seems like everyone here is talking about performance. There are other benefits to letting logging do the interpolation.

if interpolation fails, it will be logged instead of raising an exception
things like Sentry can aggregate log messages. If you interpolate before logging, Sentry would see it as one event 0

Closes #2395

earonesty · 2019-10-21T20:39:02Z

I've come around to the idea that f-strings are almost always a bad idea, certainly in logging. The run-time implications and side-effects of the f-string eval() simply don't belong in logging statements.

tristanbrown · 2019-10-21T20:49:20Z

I've come around to the idea that f-strings are almost always a bad idea, certainly in logging. The run-time implications and side-effects of the f-string eval() simply don't belong in logging statements.

@earonesty Can you elaborate on those side-effects?

earonesty · 2019-10-21T21:22:39Z

As @chdsbd said, string interpolation is happening at run-time in a way that can lead to exceptions. And while it's abundantly clear what the potential side effects of log.debug("%s", str(a.thing)), it's less clear that log.debug(f"{a.thing}") is the same thing.

mcallaghan-bsm · 2019-10-24T13:42:32Z

@earonesty I really don't agree with this as a general rule of thumb. It completely depends on the criticality of your application and frequency of logging. It is definitely possible that some applications may wish to avoid this for robustness and performance reasons, but MOST ~80% of Python programs need not worry.

earonesty · 2019-10-24T15:37:19Z

Sure, they don't have to worry. But linters are all about best practices. And in this case, you have a choice of logging lines that either can or cannot cause exceptions... and either can or cannot be aggregated. Worse... if you're writing a library... you're making this decision on behalf of your users. f-strings just don't improve readability that much in log lines to justify the possibility of all 3 issues: run-time failures, aggregation incompatibility, and performance issues.

tristanbrown · 2019-10-24T16:46:12Z

@earonesty it's generally best practice and more pythonic to explicitly catch your exceptions. Trusting the logger to implicitly catch exceptions, but only when the code is contained within the logging statement, seems like a rather odd and unreliable approach.

I disagree about the extent of readability. f-strings put an expression exactly where it should go in a string statement. That's a monumental step up from positional %s replacements, improving readability and reducing the potential for bugs:

Suppose you have a logging string requiring 5 different variable replacements. It's already difficult to see which element of the tuple maps to which %s instance. Now you want to rearrange the message a little bit, or add a variable to the message, or remove one. Now are you sure you've rearranged all of the variables in the tuple correctly? The potential for error here is quite high, especially if you're working with someone else's code.

Most people should be using f-strings where possible. I don't deny that there are edge cases where %s or even .format can work better, but f-strings should be the default.

earonesty · 2019-10-24T20:25:09Z

@tristanbrown

i don't think you understand what kinds of exceptions are handled. the exceptions are those that result from unintentional invalid string conversions. of course, relying on it is absurd. it is equally absurd for your log lines to introduce bugs at runtime. "%s" conversion fixes some, but not all of this. ideally, you could pass f-like-strings for deferred evaluation when used. that way log lines could never break a working program. but python doesn't have that feature yet.
i agree, f-strings are easier to work with when you have larger templates with many variables. not that this is typical for logging. i cloned the top 10 'trending' repos from github just now and not one of them used more than 2 in a log line.
many, many people think that logging with f-strings results in deferred evaluation. i've fought that fire enough times at enough companies to know never to allow it.

thormick · 2020-10-23T12:04:08Z

I do see that this is an old, closed issue, but it's the first hit when I Googled this pylint warning, but it's still being referenced here and on SO.

When I test it the way it's tested in the original it seems to impose overhead with getting the logger that muddles the results, more than 40% of the measured runtime being getting the logger over and over. I don't quite see why, logging with logger directly this way should just create a root logger once and get it on each subsequent run, but there's still a significant overhead. Maybe timeit interferes somehow, or there's more to it anyhow. I also wanted to check how formatting with the % operator performs, as it's used on a project I'm working on, and it's a tiny bit faster than using format. There also isn't code in the f-strings on the projects I'm currently on, so I skipped the + "blah" part.

My whole test then becomes:

import timeit


def timeitex2(num_times):
    fstring = timeit.timeit(
        'logging.info(f"""this is a {thing}, and thing: {thing}""")',
        setup="import logging ; thing='1234'",
        number=num_times,
    )
    formattim = timeit.timeit(
        'logging.info("this is a {}, and thing: {}".format(thing, thing))',
        setup="import logging ; thing='1234'",
        number=num_times,
    )
    percentop = timeit.timeit(
        'logging.info("this is a %s, and thing: %s", thing, thing)',
        setup="import logging ; thing='1234'",
        number=num_times,
    )
    formatpercentop = timeit.timeit(
        'logging.info("this is a %s, and thing: %s" % (thing, thing))',
        setup="import logging ; thing='1234'",
        number=num_times,
    )
    fstring2 = timeit.timeit(
        'logger.info(f"""this is a {thing}, and thing: {thing}""")',
        setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
        number=num_times,
    )
    formattim2 = timeit.timeit(
        'logger.info("this is a {}, and thing: {}".format(thing, thing))',
        setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
        number=num_times,
    )
    percentop2 = timeit.timeit(
        'logger.info("this is a %s, and thing: %s", thing, thing)',
        setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
        number=num_times,
    )
    formatpercentop2 = timeit.timeit(
        'logger.info("this is a %s, and thing: %s" % (thing, thing))',
        setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
        number=num_times,
    )
    return (fstring, formattim, percentop, formatpercentop, fstring2, formattim2, percentop2, formatpercentop2)

So the four first results are using logging.info, the last four use the same string formatting style but use a logger object.

I ran this several times until I got a run where every result was lower than for all previous runs:

>>> print(timeitex2(10000000))
(4.543375627021305, 5.602089578984305, 4.252049928996712, 5.226050494937226, 2.43524838401936, 3.362805937998928, 1.9745253099827096, 3.079829377005808)

So while f-strings are only 6% slower in this test when using logger directly:

>>> print( 4.543375627021305 / 4.252049928996712 )
1.0685141761948531

They're 23% slower when testing using a logger instance:

>>> print( 2.43524838401936 / 1.9745253099827096 )
1.2333335874234423

This is still not much if there isn't much logging per second, of course, so I'd say the discussion about correctness vs. readability trumps this in the general case.

To clarify, this is of course when the logs aren't being emitted, when they are this is moot anyhow, and f-strings seem a slight bit more performant, as originally mentioned.

FrankDMartinez · 2021-02-12T16:52:48Z

For the record, conversations like this, where people get bogged down in what others should do in their projects instead of giving them the tools to do what they determine they should do and project maintainers appear to not have any interest in supporting said tool (in this case, an option), are why I have stopped using so many tools, pylint being no exception.

earonesty · 2021-02-12T17:24:55Z

@thormick that's fine for integers. what about complex structures where str() is extremely expensive. i've had to debug code where str() was hitting an endpoint. i'm surprised the effect is so large just for ints.

@FrankDMartinez it's easy enough to disable. i disable a lot of pylint things. my pylintrc:

disable=broad-except,
        wrong-import-position,
        wrong-import-order,
        trailing-newlines,
        unidiomatic-typecheck,
        no-else-raise,
        bad-continuation,
        line-too-long,
        too-few-public-methods,
        no-else-return,
        missing-final-newline,
        logging-fstring-interpolation,
        trailing-whitespace,
        len-as-condition,               # change this at your peril!, len!=0 is not the same as truthy for all sequences!
        fixme,
        ungrouped-imports,
        invalid-name,
        consider-using-in,
        too-many-ancestors

FrankDMartinez · 2021-02-12T18:41:34Z

@earonesty Thanks. How long have you known about the disabling aspect?

earonesty · 2021-02-12T20:34:45Z

i just thought everyone knew about it :(. bad to assume. didn't notice your first comment was about how to disable it.

couling · 2021-04-19T15:10:32Z

I'm surprised people spend so long arguing the performance case for this and not mention the code-injection attack this check prevents. Personally I'd have thought the issue of causing errors by using f-strings was of greater importance.

If you pass unsanitized input to logging you can get errors if that input happens to contain %s. So a code block as below is prone to error just because some user typed %s.

value = get_user_input()
logger.info(f"User wrote {value}")

Now I know this linter check doesn't catch the trivial logger.info(value). But in my experience it does catch 99/100 of those bugs just because logging some string without describing it is pretty uncommon.

Pierre-Sassoulas · 2021-04-19T15:51:57Z

The configuration for fstring is avaialble since pylint 2.4 and was introduced in eeca6cf.

See, http://pylint.pycqa.org/en/latest/whatsnew/2.4.html?highlight=fstring%20interpolation

Allow the option of f-strings as a valid logging string formatting method.

logging-fstring--interpolation has been merged into logging-format-interpolation to allow the logging-format-style option to control which logging string format style is valid. To allow this, a new fstr value is valid for the logging-format-style option.

I'm going to prevent further discussion here.

This was referenced Aug 21, 2018

W1203 with F-strings #2354

Closed

logging-format-interpolation for f-strings #1788

Closed

giovannipizzi mentioned this issue Jul 11, 2019

Fix bug with never resolving calculation job update futures aiidateam/aiida-core#3155

Merged

AWhetter added a commit that referenced this issue Sep 8, 2019

Allow f-strings as a valid logging string formatting choice

06a33da

Closes #2395

AWhetter mentioned this issue Sep 8, 2019

Allow f-strings as a valid logging string formatting choice #3095

Closed

AWhetter added a commit that referenced this issue Sep 9, 2019

Allow f-strings as a valid logging string formatting choice

bec1b6d

Closes #2395

PCManticore closed this as completed in eeca6cf Sep 10, 2019

belm0 mentioned this issue Jan 24, 2020

regression in new logging-format-style scheme #3361

Closed

AWhetter mentioned this issue Apr 6, 2020

Brought back logging-fstring-interpolation #3475

Merged

4 tasks

ldconejo mentioned this issue May 22, 2020

aaaaaaaaaaaaaaaaaaaaaahhhhhhhhhh UWPCE-PythonCert-ClassRepos/SP_Python220B_2019#722

Merged

tetienne mentioned this issue Jun 2, 2020

Tahoma dimmable light home-assistant/core#36258

Closed

20 tasks

TheCleric mentioned this issue Oct 6, 2020

True unit testing refactor jldbc/pybaseball#142

Merged

drew2a mentioned this issue Oct 7, 2020

Crawl random nodes Tribler/tribler#5611

Merged

svaningelgem mentioned this issue Feb 22, 2021

Use f-strings svenkreiss/pysparkling#144

Merged

droserasprout mentioned this issue Mar 3, 2021

Replace print statements with logging ones baking-bad/pytezos#163

Merged

KapJI mentioned this issue Mar 10, 2021

Use lazy % formatting in logging functions leikoilja/ha-google-home#83

Merged

mikebannis mentioned this issue Mar 31, 2021

Albedo clean up NREL/nsrdb#7

Closed

pylint-dev locked as resolved and limited conversation to collaborators Apr 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

W1202 - logging-fstring-interpolation is not useful #2395

W1202 - logging-fstring-interpolation is not useful #2395

pzelnip commented Aug 8, 2018 •

edited

Loading

Steffan-Ennis commented Aug 16, 2018 •

edited

Loading

FrankDMartinez commented Sep 25, 2018

mcallaghan-bsm commented Jun 3, 2019 •

edited

Loading

earonesty commented Jul 11, 2019

mcallaghan-bsm commented Jul 11, 2019

tristanbrown commented Jul 16, 2019

chdsbd commented Jul 16, 2019

earonesty commented Oct 21, 2019

tristanbrown commented Oct 21, 2019

earonesty commented Oct 21, 2019 •

edited

Loading

mcallaghan-bsm commented Oct 24, 2019

earonesty commented Oct 24, 2019 •

edited

Loading

tristanbrown commented Oct 24, 2019

earonesty commented Oct 24, 2019 •

edited

Loading

thormick commented Oct 23, 2020

FrankDMartinez commented Feb 12, 2021

earonesty commented Feb 12, 2021 •

edited

Loading

FrankDMartinez commented Feb 12, 2021

earonesty commented Feb 12, 2021 •

edited

Loading

couling commented Apr 19, 2021

Pierre-Sassoulas commented Apr 19, 2021

W1202 - logging-fstring-interpolation is not useful #2395

W1202 - logging-fstring-interpolation is not useful #2395

Comments

pzelnip commented Aug 8, 2018 • edited Loading

Is your feature request related to a problem? Please describe

Describe the solution you'd like

Additional context

Steffan-Ennis commented Aug 16, 2018 • edited Loading

FrankDMartinez commented Sep 25, 2018

mcallaghan-bsm commented Jun 3, 2019 • edited Loading

earonesty commented Jul 11, 2019

mcallaghan-bsm commented Jul 11, 2019

tristanbrown commented Jul 16, 2019

chdsbd commented Jul 16, 2019

earonesty commented Oct 21, 2019

tristanbrown commented Oct 21, 2019

earonesty commented Oct 21, 2019 • edited Loading

mcallaghan-bsm commented Oct 24, 2019

earonesty commented Oct 24, 2019 • edited Loading

tristanbrown commented Oct 24, 2019

earonesty commented Oct 24, 2019 • edited Loading

thormick commented Oct 23, 2020

FrankDMartinez commented Feb 12, 2021

earonesty commented Feb 12, 2021 • edited Loading

FrankDMartinez commented Feb 12, 2021

earonesty commented Feb 12, 2021 • edited Loading

couling commented Apr 19, 2021

Pierre-Sassoulas commented Apr 19, 2021

pzelnip commented Aug 8, 2018 •

edited

Loading

Steffan-Ennis commented Aug 16, 2018 •

edited

Loading

mcallaghan-bsm commented Jun 3, 2019 •

edited

Loading

earonesty commented Oct 21, 2019 •

edited

Loading

earonesty commented Oct 24, 2019 •

edited

Loading

earonesty commented Oct 24, 2019 •

edited

Loading

earonesty commented Feb 12, 2021 •

edited

Loading

earonesty commented Feb 12, 2021 •

edited

Loading