TextWrapper fails to split 'two-and-a-half-hour' correctly #69946

Closed
samwyse mannequin opened this issue Nov 29, 2015 · 2 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@samwyse
Mannequin

samwyse mannequin commented Nov 29, 2015

BPO 25760
Nosy @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2015-11-29.06:20:55.852>
created_at = <Date 2015-11-29.02:07:24.100>
labels = ['type-bug', 'library']
title = "TextWrapper fails to split 'two-and-a-half-hour' correctly"
updated_at = <Date 2015-11-29.06:20:55.834>
user = 'https://bugs.python.org/samwyse'

bugs.python.org fields:

activity = <Date 2015-11-29.06:20:55.834>
actor = 'serhiy.storchaka'
assignee = 'none'
closed = True
closed_date = <Date 2015-11-29.06:20:55.852>
closer = 'serhiy.storchaka'
components = ['Library (Lib)']
creation = <Date 2015-11-29.02:07:24.100>
creator = 'samwyse'
dependencies = []
files = []
hgrepos = []
issue_num = 25760
keywords = []
message_count = 2.0
messages = ['255558', '255561']
nosy_count = 2.0
nosy_names = ['samwyse', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue25760'
versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4']

@samwyse
Mannequin Author

samwyse mannequin commented Nov 29, 2015

Single-character words in a hyphenated phrase are not split correctly. The root issue is the wordsep_re class variable. To reproduce, run the following:

>>> import textwrap
>>> textwrap.TextWrapper.wordsep_re.split('two-and-a-half-hour')
['', 'two-', 'and-a', '-half-', 'hour']

It works if 'a' is replaced with two or more alphabetic characters.

>>> textwrap.TextWrapper.wordsep_re.split('two-and-aa-half-hour')
['', 'two-', '', 'and-', '', 'aa-', '', 'half-', 'hour']

The problem is in this part of the pattern, a lookahead that requires at least two word characters after the hyphen: (?=\w+[^0-9\W])

I confess that I don't understand what situation would require such a complicated pattern. Why wouldn't (?=\w) work?
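
One way to explore the question is to compare the two lookaheads side by side. The sketch below is only an illustration: STRICT and LOOSE are hypothetical, simplified patterns that keep just the hyphenated-word branch, not the full wordsep_re from any Python release, and chunks is a helper introduced here.

```python
import re

# Hypothetical, simplified stand-ins for the hyphenated-word branch only.
# STRICT uses the lookahead quoted in the report; LOOSE uses the proposed one.
STRICT = re.compile(r'([^\s\w]*\w+[^0-9\W]-(?=\w+[^0-9\W]))')
LOOSE = re.compile(r'([^\s\w]*\w+[^0-9\W]-(?=\w))')


def chunks(pattern, text):
    """Split text on pattern and drop the empty strings re.split leaves behind."""
    return [piece for piece in pattern.split(text) if piece]


for text in ('two-and-a-half-hour', 'pre-2015'):
    print(text)
    print('  strict:', chunks(STRICT, text))
    print('  loose: ', chunks(LOOSE, text))
```

With these simplified patterns, the looser lookahead does allow a break after 'and-', but 'a-' still does not become its own chunk, because the part of the pattern before the hyphen (\w+[^0-9\W]) also demands at least two characters. The looser lookahead also starts splitting in front of digits ('pre-' and '2015'), which the [^0-9\W] class appears designed to avoid.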

@samwyse samwyse mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Nov 29, 2015
@serhiy-storchaka
Member

Already fixed in bpo-22687.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022