-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate the use of flags not at the start of regular expression #66683
Comments
The meaning of inline flags not at the start of regular expression is ambiguous. Current re implementation and regex in the V0 mode enlarge the scope to all expression. In V1 mode in regex they affect only the end of the expression. I propose to deprecate (and then forbid in 3.7) the use of inline flags not at the start of regular expression. This will help to change the meaning of inline flags in the middle of the expression in future (in 3.8 or later). |
Here is alternative, much simpler, patch. It deprecates only flags in nested subpatterns. No changes needed in tests and other stdlib modules. It is very unlikely that it is used in third party code. |
That sounds a bit random. It wouldn't totally address the discrepancy with regex, would it? |
I think the simplest and clearest approach from the user's point of view is just to deprecate inline flags that are not at the start of the pattern. In practice, they almost invariably occur at the start anyway, although I do remember once seeing a pattern in which the inline flag was at the end! |
No, it will not totally address the discrepancy with regex, but at least it will allow as to change the behavior of flags in subpatterns. And we always can convert a pattern to a subpattern (surround by "(?:" and ")"). For now Python re module is only one regular expression implementation in which flags in the middle of the expression affect all expression. [1] |
Updated patch that deprecates flags not at the start. fnmatch.translate() now uses scoped flags (bpo-433028). |
New changeset 31f8af1c3567 by Serhiy Storchaka in branch 'default': |
Could we include the offending pattern in the deprecation message? I'm attaching a proposed patch. With that patch I can more easily find the offending pattern, whereas before I had no idea: django/django/urls/resolvers.py:101: DeprecationWarning: Flags not at the start of the expression ^(?i)test/2/?$ |
And on further investigation, I'm not sure how to fix the deprecation warnings in Django. We have a urlpattern like this: url(r'^(?i)CaseInsensitive/(\w+)', empty_view, name="insensitive"), The regex string r'^(?i)CaseInsensitive/(\w+)' is later substituted in this line in Django's URL resolver as the if re.search('^%s%s' % (re.escape(_prefix), pattern), candidate_pat % candidate_subs, re.UNICODE): It seems Django would need to extract any flags from |
@tim: Why are you using re.search with '^'? Does the pattern that's passed in ever contain '(?m)'? If not, re.match without '^' is better. |
Looks like we could remove the '^', but it doesn't resolve the deprecation warnings. The inline flags in |
Thanks Tim, this is great idea! I consider this as usability bug and going to apply a fix to 3.6. But regular expression can be generated and be very long. I think it should be truncated before including in a warning message. As for Django, you can use (?i:CaseInsensitive) in 3.7 (unless _prefix is case sensitive, but you want to make it case sensitive if pattern is case sensitive). |
In tests you can either add re.escape(), or escape special characters manually (r'\(\?i\)'). What you prefer. |
Adding an updated patch. I guess the (?i:CaseInsensitive) syntax isn't merged yet? I tried it but it didn't work. It might be premature to proceed with this deprecation if that alternative isn't already present. Is there an issue for it? |
I downloaded Python 3.6.0b1 not long after it was released and it works for me: >>> re.match('(?i:CaseInsensitive)', 'caseinsensitive')
<_sre.SRE_Match object; span=(0, 15), match='caseinsensitive'> |
Yes, I found that Django needs an update to support that syntax in URLpatterns. Thanks. |
New changeset c35a528268fd by Serhiy Storchaka in branch '3.6': New changeset 9d0f4da4d531 by Serhiy Storchaka in branch 'default': |
Patch LGTM (but I changed tests a little). Thanks Tim! |
New changeset c04a56b3a4f2 by Serhiy Storchaka in branch '3.6': New changeset ded9a3c3bbb6 by Serhiy Storchaka in branch 'default': |
Maybe there should be introduced some method to merge patterns as just piping patterns will not work: |
See also https://bugs.python.org/issue33658 |
Misc/NEWS
so that it is managed by towncrier #552Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: