Optimize textwrap.indent() #107369

Closed
methane opened this issue Jul 28, 2023 · 1 comment
Labels
performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

Comments


methane commented Jul 28, 2023

Current code:

def indent(text, prefix, predicate=None):
    """Adds 'prefix' to the beginning of selected lines in 'text'.

    If 'predicate' is provided, 'prefix' will only be added to the lines
    where 'predicate(line)' is True. If 'predicate' is not provided,
    it will default to adding 'prefix' to all non-empty lines that do not
    consist solely of whitespace characters.
    """
    if predicate is None:
        def predicate(line):
            return line.strip()

    def prefixed_lines():
        for line in text.splitlines(True):
            yield (prefix + line if predicate(line) else line)
    return ''.join(prefixed_lines())
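  • Using predicate = str.strip as the default is faster than defining a nested def predicate(line), because it avoids an extra Python-level function call for every line.
  • ''.join(x) converts its input iterable to a sequence anyway, so passing it a generator only adds overhead; building a list directly is cheaper.
  • Creating the temporary string prefix + line is avoidable: prefix and line can be passed to ''.join() as separate items.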
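
The three points above can be sketched together as follows. This is a minimal illustration under stated assumptions, not necessarily the change that was eventually applied to the standard library; `indent_sketch` is a hypothetical name chosen so it does not shadow `textwrap.indent`.

```python
import textwrap


def indent_sketch(text, prefix, predicate=None):
    """Hypothetical variant of textwrap.indent() applying the ideas above."""
    if predicate is None:
        # Use the C-implemented method directly instead of a nested def.
        predicate = str.strip

    # Build a list up front: ''.join() would turn a generator into one anyway.
    parts = []
    for line in text.splitlines(True):
        if predicate(line):
            # Append prefix and line separately instead of creating
            # a temporary 'prefix + line' string.
            parts.append(prefix)
        parts.append(line)
    return ''.join(parts)


sample = "def f():\n\n    return 1\n   \nx = f()"
# The sketch is meant to produce the same result as the current function.
assert indent_sketch(sample, "    ") == textwrap.indent(sample, "    ")
```

Whitespace-only lines still pass through unprefixed, because `str.strip` returns an empty (falsy) string for them, which matches the behaviour of the nested default predicate.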


methane commented Jul 28, 2023

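import timeit
import textwrap

# Input text: a large C source file, read relative to the root of a CPython checkout.
with open("Objects/unicodeobject.c") as f:
    text = f.read()

it = timeit.Timer(lambda: textwrap.indent(text, " "*4))
# Each repeat times 1000 calls, so the total in seconds equals msec per call.
result = it.repeat(number=1000)
result.sort()
print(f"indent {len(text.splitlines())} lines.")
# Report the fastest repeat.
print(f"{result[0]:.4f}msec")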

before: 2.8msec
after: 2.1msec (-25%)
