Some input chars (i.e. '++') break re.match #66317

Closed
jpfisher mannequin opened this issue Aug 1, 2014 · 2 comments
Labels
topic-regex type-bug An unexpected behavior, bug, or error

Comments

@jpfisher
Mannequin

jpfisher mannequin commented Aug 1, 2014

BPO 22119
Nosy @ezio-melotti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2014-08-01.21:27:18.635>
created_at = <Date 2014-08-01.20:44:52.594>
labels = ['expert-regex', 'type-bug', 'invalid']
title = "Some input chars (i.e. '++') break re.match"
updated_at = <Date 2014-08-01.21:27:18.634>
user = 'https://bugs.python.org/jpfisher'

bugs.python.org fields:

activity = <Date 2014-08-01.21:27:18.634>
actor = 'ezio.melotti'
assignee = 'none'
closed = True
closed_date = <Date 2014-08-01.21:27:18.635>
closer = 'ezio.melotti'
components = ['Regular Expressions']
creation = <Date 2014-08-01.20:44:52.594>
creator = 'jpfisher'
dependencies = []
files = []
hgrepos = []
issue_num = 22119
keywords = []
message_count = 2.0
messages = ['224518', '224521']
nosy_count = 3.0
nosy_names = ['ezio.melotti', 'mrabarnett', 'jpfisher']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue22119'
versions = ['Python 2.7']

@jpfisher
Mannequin Author

jpfisher mannequin commented Aug 1, 2014

Some characters repeated in the pattern break re.match:

Linux python 2.7.6
###################################
# test.py
import re

#diffitem = "libstdc+"    # succeeds
#diffitem = "libstdc++"   # fails
#diffitem = "libstdc**"   # fails
#diffitem = "libstdc.."   # succeeds
diffitem = "libstdc+\+"   # succeeds
line = "time  1.7-23build1"

result = re.match(diffitem, line)
print result
###################################
$ python  test.py
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    result = re.match(diffitem, line)
  File "/usr/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.7/re.py", line 244, in _compile
    raise error, v # invalid expression
sre_constants.error: multiple repeat
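
The traceback shows the failure inside `_compile`, that is, while the pattern is being compiled and before `line` is examined at all. A minimal sketch that isolates this, using the `libstdc**` case from the script; the helper name `try_compile` is made up for illustration, and the snippet is written in Python 3 syntax rather than the 2.7 of the report:

```python
import re

def try_compile(pattern):
    """Return None if the pattern compiles, otherwise the error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        # re.error is raised for invalid patterns such as a repeated repeat
        return str(exc)
    return None

# Patterns taken from the commented-out lines in test.py
for candidate in ["libstdc+", "libstdc**", "libstdc.."]:
    print(candidate, "->", try_compile(candidate))
```

Checking a pattern this way separates "the pattern is invalid" from "the pattern is valid but does not match", which is the distinction the report runs into.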

@jpfisher (mannequin) added the build and topic-regex labels Aug 1, 2014
@mrabarnett
Mannequin

mrabarnett mannequin commented Aug 1, 2014

In a regex, '+' is a metacharacter meaning "repeated one or more times".

"libstdc+" will match "libstd" followed by "c" repeated one or more times.

"libstdc++" will match "libstd" followed by "c" repeated one or more times, but then there's another "+", which it takes to mean that you want the repeat to be repeated, hence the exception.

'*' is also a metacharacter; it means "repeated zero or more times".

In summary, not a bug.
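
If the goal is to match the string literally, the metacharacters have to be escaped; escaping the second `+` is why the `libstdc+\+` variant in the report compiles. A hedged sketch using `re.escape` from the standard library, which escapes every metacharacter in the string. The sample input `libstdc++6  4.8.2` is invented for illustration and does not come from the report:

```python
import re

diffitem = "libstdc++"
line = "libstdc++6  4.8.2"  # hypothetical input, not from the report

# re.escape backslash-escapes the '+' characters so they match literally
literal = re.escape(diffitem)
result = re.match(literal, line)

print(result.group(0) if result else None)
```

On Python 3.11 and later, `++` is accepted as a possessive quantifier, so the unescaped pattern may compile there without raising; it still would not match the literal plus signs, so escaping remains the safer choice when the text is meant literally.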

@ezio-melotti added the invalid and type-bug labels and removed the build label Aug 1, 2014
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022