Bad parsing of compiling regex with re.MULTILINE #47837

Closed
misha mannequin opened this issue Aug 18, 2008 · 2 comments
Labels
topic-regex type-bug An unexpected behavior, bug, or error

Comments

@misha
Mannequin

misha mannequin commented Aug 18, 2008

BPO 3587
Nosy @pitrou

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2008-08-18.13:25:09.890>
created_at = <Date 2008-08-18.12:02:22.428>
labels = ['expert-regex', 'type-bug', 'invalid']
title = 'Bad parsing of compiling regex with re.MULTILINE'
updated_at = <Date 2008-08-18.13:25:09.884>
user = 'https://bugs.python.org/misha'

bugs.python.org fields:

activity = <Date 2008-08-18.13:25:09.884>
actor = 'pitrou'
assignee = 'none'
closed = True
closed_date = <Date 2008-08-18.13:25:09.890>
closer = 'pitrou'
components = ['Regular Expressions']
creation = <Date 2008-08-18.12:02:22.428>
creator = 'misha'
dependencies = []
files = []
hgrepos = []
issue_num = 3587
keywords = []
message_count = 2.0
messages = ['71323', '71326']
nosy_count = 2.0
nosy_names = ['pitrou', 'misha']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue3587'
versions = ['Python 2.5', 'Python 2.4']

@misha
Mannequin Author

misha mannequin commented Aug 18, 2008

import re
regex = r"[\w]+"
# Normal behaviour:
>>> re.findall(regex, "hello world", re.M)
['hello', 'world']
>>> re.compile(regex).findall("hello world")
['hello', 'world']

# Bug behaviour:
>>> re.compile(regex).findall("hello world", re.M)
['rld']

@misha misha mannequin added topic-regex type-bug An unexpected behavior, bug, or error labels Aug 18, 2008
@pitrou
Member

pitrou commented Aug 18, 2008

The re.M flag is an attribute of the compiled pattern, and as such it
must be passed to compile(), not to findall().

These all work:

>>> re.compile(r"[a-z]+").findall("hello world")
['hello', 'world']
>>> re.compile(r"[a-z]+", re.M).findall("hello world")
['hello', 'world']
>>> re.compile(r"(?m)[a-z]+").findall("hello world")
['hello', 'world']

The second argument to the findall() method of compiled pattern objects
is the start position to search from (see
http://docs.python.org/lib/re-objects.html). This explains the behaviour
you are witnessing:

>>> re.M
8
>>> re.compile(r"[a-z]+").findall("hello world", 8)
['rld']
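
The explanation above can be sketched as a small script. This is only an illustration under stated assumptions: the variable names are made up for the example, and the last pattern (with a `^` anchor and a two-line string) is not from the report; it is added to show a case where re.M actually changes the result once it is passed to compile().

```python
import re

pattern = re.compile(r"[a-z]+")
text = "hello world"

# The second positional argument of a compiled pattern's findall() is the
# start position, so an integer flag value is read as a string index.
offset = int(re.M)  # re.M has the integer value 8
from_offset = pattern.findall(text, offset)

# For this pattern (no anchors, no lookbehind), starting at index 8 gives
# the same result as searching the slice text[8:], which is "rld".
from_slice = pattern.findall(text[offset:])

# Flags belong in compile(). Hypothetical example: with re.M, "^" also
# matches right after each newline.
multiline = re.compile(r"^[a-z]+", re.M)
line_starts = multiline.findall("hello world\nfoo bar")

print(from_offset)  # ['rld']
print(from_slice)   # ['rld']
print(line_starts)  # ['hello', 'foo']
```

If a search really needs to begin partway through the string, passing the offset by keyword, as in `pattern.findall(text, pos=8)`, makes the intent harder to confuse with a flag.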

@pitrou pitrou closed this as completed Aug 18, 2008
@pitrou pitrou added the invalid label Aug 18, 2008
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022