-
-
Notifications
You must be signed in to change notification settings - Fork 31.6k
Wrong keyword parameter name in regex pattern methods #64482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Documented (in docstring and in ReST documentation) signatures of the match, search and (since 3.4) fullmatch methods of regex pattern object are: match(string[, pos[, endpos]])
search(string[, pos[, endpos]])
fullmatch(string[, pos[, endpos]]) However in implementation the first keyword argument by mistake named "pattern". This looks as nonsense. The pattern is object itself, and first argument is a string. First arguments in other methods (split, findall, etc) named "string", and module-level functions have both "pattern" and "string" parameters: match(pattern, string, flags=0)
search(pattern, string, flags=0) I think we should fix this mistake. The "pattern" name is obviously wrong and is not match the documentation. |
How nasty. I agree that this is a code bug. Unfortunately in this case, the C code does keyword matching of arguments and 'corrects' the doc for anyone who tries 'string='. >>> pat.search(string='xabc', pos=1)
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
pat.search(string='xabc', pos=1)
TypeError: Required argument 'pattern' (pos 1) not found
>>> pat.search(pattern='xabc', pos=1)
<_sre.SRE_Match object; span=(1, 4), match='abc'> I think we should only change this in 3.4 (and should do so in 3.4). |
Actually, several other methods also have wrong parameter name, "source" instead of "string". |
If no one else pipes up here, perhaps ask on pydef about changing C names to match documented names. |
Here is patch for 3.3 which adds alternative parameter name. Now both keyword names are allowed, but deprecation warning is emitted if old keyword name is used. >>> import re
>>> p = re.compile('')
>>> p.match()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Required argument 'string' (pos 1) not found
>>> p.match('')
<_sre.SRE_Match object at 0xb705c598>
>>> p.match(string='')
<_sre.SRE_Match object at 0xb705c720>
>>> p.match(pattern='')
__main__:1: DeprecationWarning: The 'pattern' keyword parameter name is deprecated. Use 'string' instead.
<_sre.SRE_Match object at 0xb705c758>
>>> p.match('', string='')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Argument given by name ('string') and position (1)
>>> p.match('', pattern='')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Argument given by name ('pattern') and position (1) |
Great. Old and new both in at least one release, when possible, is best. I should have thought of asking if that would be possible. In this case, I think the (undocumented) old should disappear in 3.5. Since the mistaken 'pattern' name is not documented now, I would not add anything to the doc. I would augment the the warning The patch did not upload correctly. I just see "Modules/_sre.c | 64 �[32m+++++++++++++++++++++++++++++++++++++++�[0;39m�[36m!!!!!!!!!!!!!!!!!!�[0;39m |
Oh, sorry. Here is correct patch. I propose to apply "soft" patch (which preserves support for old keyword parameter name) to 2.7 and 3.3, and apply "hard" patch (which just renames keyword parameter name) to 3.4. Or we can just apply "hard" patch (it's much simpler) to all versions. |
For 3.3 I prefer the "soft" patch. |
Georg: you're accepting this patch into 3.3? I'm surprised. I would only want the "soft" approach. But I haven't said "yes" yet. I want to discuss it a little more. (Hey, it's python core dev. Discussing things endlessly is our job.) |
If you want the "soft" approach, then you should revert your changes to |
You can do it, if I accept the patch for 3.4. There's no point in doing it in two stages. |
Alternatively, we could use this cheap hack: /*[python input] [python start generated code]*/ /*[clinic input] _sre.SRE_Pattern.match as pattern_match
... |
Larry, so what is your decision?
|
Use #3. |
"pattern" should be keyword-only, and if used the function should generate a DeprecationWarning. |
Here is a patch with the show_in_signature hack for 3.4. |
The patch sre_deprecate_pattern_keyword-3.4.patch looks good to me. I *think* that Larry has pre-approved it for 3.4. If it is applied, and if people still think that 2.7 and 3.3 need to be changed, the release-critical status should be removed from the issue. |
The disadvantage of sre_deprecate_pattern_keyword-3.4.patch is that it creates |
Why can't you remove the "= NULL" from the Clinic input for "string"? |
Because this will prohibit the use of "pattern" as keyword argument. |
We are close to Python 3.4 final, so what is the status of this issue? I don't see any commit and nothing to cherry-pick in Larry's 3.4.0 repository. |
Since there is no consensus on how to resolve this issue, I'm dropping the release-critical status for it; people should now consider whether a future agreed-upon solution could apply to 3.4.1 or just to 3.5. |
Serhiy: the patch is incomplete; it lacks test cases. |
Here is a test. |
New changeset 52743dc788e6 by Serhiy Storchaka in branch '3.3': New changeset f4d7abcf8080 by Serhiy Storchaka in branch 'default': |
New changeset 52256a5861fa by Serhiy Storchaka in branch '2.7': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: