A bug related to matching the empty string #54741

Closed
lanyjie mannequin opened this issue Nov 25, 2010 · 3 comments
Labels
topic-regex type-bug An unexpected behavior, bug, or error

Comments

@lanyjie
Mannequin

lanyjie mannequin commented Nov 25, 2010

BPO 10532
Nosy @ned-deily

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2010-11-26.20:31:54.710>
created_at = <Date 2010-11-25.16:39:12.326>
labels = ['expert-regex', 'type-bug', 'invalid']
title = 'A bug related to matching the empty string'
updated_at = <Date 2010-11-26.20:31:54.708>
user = 'https://bugs.python.org/lanyjie'

bugs.python.org fields:

activity = <Date 2010-11-26.20:31:54.708>
actor = 'ned.deily'
assignee = 'none'
closed = True
closed_date = <Date 2010-11-26.20:31:54.710>
closer = 'ned.deily'
components = ['Regular Expressions']
creation = <Date 2010-11-25.16:39:12.326>
creator = 'lanyjie'
dependencies = []
files = []
hgrepos = []
issue_num = 10532
keywords = []
message_count = 3.0
messages = ['122380', '122384', '122476']
nosy_count = 3.0
nosy_names = ['ned.deily', 'mrabarnett', 'lanyjie']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue10532'
versions = ['Python 3.2']

@lanyjie
Mannequin Author

lanyjie mannequin commented Nov 25, 2010

Here are some puzzling results I got (I am using Python 3; I suppose Python 2 gives similar results).

When I do the following, I get an exception:
>>> re.findall('(d*)*', 'adb')
>>> re.findall('((d)*)*', 'adb')

When I do this, it runs fine, but the result is wrong:
>>> re.findall('((.d.)*)*', 'adb')
[('', 'adb'), ('', '')]

Why is it wrong?

The first match of groups:
('', 'adb')
indicates the outer group ((.d.)*) captured
the empty string, while the inner group (.d.)
captured 'adb', so the outer group must have
captured the empty string at the end of the
provided string 'adb'.

Once we have matched the final empty string '',
there should be no more matches, but we got
another match ('', '')!!!

So, findall matched the empty string at
the end of the string twice!!!

Isn't this a bug?

Yingjie

@lanyjie lanyjie mannequin added topic-regex type-bug An unexpected behavior, bug, or error labels Nov 25, 2010
@mrabarnett
Mannequin

mrabarnett mannequin commented Nov 25, 2010

The spans say this:

>>> for m in re.finditer('((.d.)*)*', 'adb'):
    print(m.span())

(0, 3)
(3, 3)

There's a non-empty match, (0, 3), followed by an empty match at the end of the string, (3, 3).

IMHO, not a bug.
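
As a minimal sketch of the same check (assuming a current CPython `re` module; the helper name `match_spans` is hypothetical and not part of the original thread), the spans can be collected alongside the matched text and compared with a simpler pattern that can also match the empty string:

```python
import re

def match_spans(pattern, text):
    """Return (span, matched_text) for every match that finditer reports."""
    return [(m.span(), m.group(0)) for m in re.finditer(pattern, text)]

# The pattern from the report, applied to the same input string.
report = match_spans(r'((.d.)*)*', 'adb')

# A simpler pattern that is also allowed to match the empty string.
simple = match_spans(r'a*', 'a')

print(report)
print(simple)
```

In both cases the scan should report a non-empty match followed by a zero-width match at the end of the input, which would explain why `findall` returns two entries rather than one.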

@ned-deily
Member

Closing this issue since it appears to not be a bug.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022