-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
re.groupindex is available for modification and continues to work, having incorrect data inside it #58468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
>>> import re
>>> p = re.compile(r'abc(?P<n>def)')
>>> p.sub(r'\g<n>', 'abcdef123abcdef')
'def123def'
>>> p.groupindex['n'] = 2
>>> p.sub(r'\g<n>', 'abcdef123abcdef')
'def123def'
>>> p.groupindex
{'n': 2}
>>> |
The re module creates the dict purely for the benefit of the user, and as it's a normal dict, it's mutable. An alternative would to use an immutable dict or dict-like object, but Python doesn't have such a class, and it's probably not worth writing one just for this use-case. |
Matthew Barnett wrote:
this dict affects on regex.sub() >>> import re
>>> p = re.compile(r'abc(?P<n>def)')
>>> p.groupindex
{'n': 1}
>>> p.groupindex['n'] = 2
>>> p.sub(r'\g<n>', 'abcdef')
Traceback (most recent call last):
File "/usr/local/lib/python3.2/sre_parse.py", line 811, in expand_template
literals[index] = s = g(group)
IndexError: no such group
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.2/re.py", line 286, in filter
return sre_parse.expand_template(template, match)
File "/usr/local/lib/python3.2/sre_parse.py", line 815, in expand_template
raise error("invalid group reference")
sre_constants.error: invalid group reference
>>> |
It appears I was wrong. :-( The simplest solution in that case is for it to return a _copy_ of the dict. |
But regex.sub is affected only if you manually muck with the dict, right? If so, then it looks like a case of “it hurts when I do this” (the doctor’s reply: “Don’t do this.”) |
the first message shows how it can work with a broken dict Éric Araujo wrote:
I can get code from anywhere |
|
I take someone's code now it is depending on the cache, when this exception is raised Éric Araujo wrote:
when people do something wrong, python should raise an exception |
Looks like a case for a read-only dict/dictproxy :) |
I fully agree with Éric. Just don't do this. |
I'm not so sure. If dicts or classes are used for configuration An example of the first is the decimal context, where it was possible The mistake here is maybe less likely, but I agree with Georg that |
I propose using a MappingProxy type in 3.4 and add an example to the docs for stable versions. |
Copy or proxy may affect performance. We will need to make benchmarks to see how much. |
Here are two patches which implement two alternative solutions. They are based on regex code. Dict copying patch matches current regex behavior and needs modifying other code to avoid small slowdown. Artificial example: $ ./python -m timeit -s 'import re; n = 100; m = re.match("".join("(?P<g%d>.)" % g for g in range(n)), "x" * n); t = ",".join(r"\g<g%d>" % g for g in range(n))' -- 'm.expand(t)' Without patch: 7.48 msec per loop While stdlib code can be modified, this patch can cause small slowdown of some third-party code. Dict proxying patch has no performance effect, but it is slightly less compatible. Some code can accept dict but not dict-like object. |
Ping. |
What approach looks better, a copy or a read-only proxy? |
@serhiy Storchaka
ISTM, your proxy patch is better, because it expects an exception rather than silence. |
New changeset 4d5826fa77a1 by Serhiy Storchaka in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: