make bytes/bytearray translate's delete a keyword argument #71693
Comments
I wrote a patch to make the delete argument of bytes.translate and bytearray.translate usable as a keyword argument. This doesn't break backwards compatibility and makes the method more flexible to use. In addition, at the C level, it stops using Argument Clinic's legacy optional group feature and removes the unnecessary group_right_1 parameter. |
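As a minimal sketch of the call style the patch is meant to allow, assuming an interpreter that already includes the change (CPython 3.6 or later, to my knowledge), delete can be passed by position or by keyword on both types. The sample data is made up for illustration.

```python
# Illustrative data; not taken from the patch or its tests.
data = b"read this short text"

# Positional form, which already worked before the patch.
positional = data.translate(None, b"aeiou")

# Keyword form, which is what the patch adds.
keyword = data.translate(None, delete=b"aeiou")

# bytearray gets the same treatment and returns a bytearray.
mutable = bytearray(data).translate(None, delete=b"aeiou")

print(positional, keyword, mutable)
```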
Hmm, David, that may not be quite right. Users who only read the docs would never know the name is deletechars rather than delete. The docs have always said delete, even though that conflicts with __doc__:
>>> print(bytes.translate.__doc__)
B.translate(table[, deletechars]) -> bytes
...
I deliberately changed deletechars to delete to stay consistent with the docs. But I actually think using deletechars wouldn't break backwards compatibility either. |
Ah, I was looking at the 2.7 docs. |
Please review the new version. It makes two changes compared with the last one.
|
Instead of allowing delete=None (which is not in the RST documentation), perhaps it is possible to change the doc string. I can’t remember the details, but I think Argument Clinic allows a virtual Python-level default, something like “object(py_default=b"") = NULL”. Also, I think I like the change. What do you think about making the first argument optional (default to None), allowing calls like x.translate(delete=b'aeiou')? |
Thanks for your comments, Martin. I'll apply them once we reach agreement on the behaviour. I have already used object = NULL; the C default is not necessary here, and I think it works the way you want. In patch version 1, b'abc'.translate(None, None) raises an exception as before. I changed that in patch version 2 because Argument Clinic generates the function signature as "($self, table, /, delete=None)", and I don't want users to be surprised when they pass None as the signature suggests but get an exception. Using None as a placeholder for a keyword argument is also normal in Python. But I'm OK with keeping the previous behaviour, and I actually prefer that. As for making the first argument optional, I don't quite like it, since the docs seem to encourage users to pass None explicitly. |
This patch is what I had in mind for setting the documented default as delete=b'', but using NULL internally. I also changed it to allow the table argument to be omitted. We can change the documentation accordingly. These are just suggestions; use either or both aspects as you please :) |
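The idea of a documented default of b'' can be modelled in pure Python. The sketch below is only an illustration: translate_model is a hypothetical helper, not the C implementation, and it assumes the behaviour discussed above, where table may be None and delete=None is rejected.

```python
def translate_model(data, table, /, delete=b""):
    """Hypothetical pure-Python model of bytes.translate (illustration only)."""
    if table is not None and len(table) != 256:
        raise ValueError("translation table must be 256 characters long")
    # frozenset(None) raises TypeError, so delete=None is rejected here,
    # mirroring the behaviour the thread prefers to keep.
    doomed = frozenset(delete)
    kept = (byte for byte in data if byte not in doomed)
    if table is None:
        # No table: only deletion happens.
        return bytes(kept)
    # Deletion first, then each remaining byte is mapped through the table.
    return bytes(table[byte] for byte in kept)


print(translate_model(b"hello", None, delete=b"l"))
```

With b"" as the visible default, omitting delete and passing an empty bytes object behave identically, so no None placeholder is needed.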
LGTM. Using b'' instead of None as the default value of *delete* looks better since it doesn't break backwards compatibility. As for whether the first argument should be optional, either way is okay. You have changed the doc accordingly. |
Serhiy, you assigned this to yourself. What do you think of my patch? |
PyArg_ParseTupleAndKeywords can be slower than PyArg_ParseTuple even for positional arguments. We need benchmark results (especially after a patch for bpo-27574 is committed). What is the purpose of supporting delete as a keyword argument? It looks to me that the only purpose is to allow specifying the delete argument without specifying the table argument. There are two alternative ways to achieve this: make translate() accept some special value (e.g. None) as the default value for the first argument:
or make translate() accept the delete argument as a keyword argument:
The patch does both things, but only one is needed. If we add support for delete as a keyword argument, I would prefer not to add support for None as the first argument, but to specify its default value as bytes(range(256)):
I don't know why an optional group was used here; the function could be implemented without it. |
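To make the bytes(range(256)) suggestion concrete: that value is an identity table, so passing it should give the same result as passing None for the table. A small sketch, with sample data and variable names chosen only for illustration:

```python
# bytes(range(256)) maps every byte value to itself.
IDENTITY_TABLE = bytes(range(256))

sample = b"hello world"

# Deleting b"l" with an explicit identity table...
with_identity = sample.translate(IDENTITY_TABLE, b"l")

# ...and with None as the table.
with_none = sample.translate(None, b"l")

print(with_identity == with_none)
```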
I agree it would be worth checking for a slowdown. As well as giving the option of omitting the table argument, it would make call sites easier to read. It would avoid suggesting that the first argument is translated to the second, as with maketrans():
data = data.translate(YENC_TABLE, delete=b"\r\n")
translate() already accepts None as the first argument; this is not new:
>>> b"hello".translate(None, b"l")
b'heo'
I guess the optional group was used as a way of making the second argument optional without a specific default value. |
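The readability point can be seen by putting maketrans() and translate() side by side. In this sketch the table and input are made up (the thread's YENC_TABLE is not defined here), and it assumes an interpreter where delete is accepted as a keyword.

```python
# maketrans(frm, to): the first argument really is mapped to the second.
table = bytes.maketrans(b"lo", b"01")

line = b"hello\r\n"

# translate(table, delete): the second argument is NOT a mapping target.
# Spelling it as a keyword makes that obvious at the call site.
cleaned = line.translate(table, delete=b"\r\n")

print(cleaned)
```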
So let's do a simple benchmark.
# without patch
./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete)'
# with patch
./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete)'
# keyword specified
./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete=delete)'
From my observation, the difference between PyArg_ParseTupleAndKeywords and PyArg_ParseTuple when parsing positional arguments is very small, so the patch won't slow down existing code by a large percentage. When the keyword argument is specified, there is some degradation, but I think that happens everywhere PyArg_ParseTupleAndKeywords is used. |
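The same comparison can be scripted with the timeit module instead of the command line. This is a rough sketch that reuses the setup from the commands above; best_time and the repeat counts are my own choices, and it only compares positional against keyword calls on a single interpreter, not patched against unpatched builds.

```python
import timeit

# Same setup as the command-line benchmark.
SETUP = (
    "string = bytes(range(256));"
    "table = bytes(range(255, -1, -1));"
    'delete = b"abcdefghijklmn"'
)


def best_time(stmt, number=50_000, repeat=5):
    """Return the best observed per-call time in seconds for stmt."""
    runs = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(runs) / number


positional = best_time("string.translate(table, delete)")
keyword = best_time("string.translate(table, delete=delete)")

print(f"positional: {positional * 1e9:.1f} ns per call")
print(f"keyword:    {keyword * 1e9:.1f} ns per call")
```

Taking the minimum over several repeats reduces noise from other processes; the absolute numbers will vary by machine and Python version.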
Technically the patch looks correct to me. I added just a few minor comments on Rietveld. I don't think there is a great need for keyword argument support, but since the overhead is small and somebody needs this, adding it does no harm. I'll leave it to you, Martin. |
I can look at enhancing the tests at some stage, but it isn’t a high priority for me. Regarding translate() with no arguments, it makes sense if you see it as a kind of degenerate case of neither using a translation table, nor any set of bytes to delete:
x.translate() == x.translate(None, b"")
I admit it reads strangely and probably isn’t useful. If people dislike it, it might be easiest to just add the keyword support and keep the first parameter as mandatory:
without_nulls = bytes_with_nulls.translate(None, delete=b"\x00") |
Martin, I wrote the v3 patch to address the comments. It keeps *table* mandatory and moves test_translate to BaseBytesTest to remove the duplicates. |
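A short sketch of what keeping table mandatory means in practice, assuming the v3 behaviour is what an interpreter with the change implements (CPython 3.6 or later, as far as I know). The call_fails helper and the sample data are hypothetical, introduced only for this example.

```python
data = b"a\x00b\x00c"

# table is still required, so None is passed explicitly.
without_nulls = data.translate(None, delete=b"\x00")


def call_fails(func, *args, **kwargs):
    """Hypothetical helper: report whether the call raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return True
    return False


# Omitting table entirely is expected to be rejected.
omitting_table_fails = call_fails(data.translate, delete=b"\x00")

print(without_nulls, omitting_table_fails)
```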
Looks pretty good thanks Xiang. There’s one English grammar problem in a comment (see review), but I can fix that when I commit. |
New changeset 6ab1b54245d5 by Martin Panter in branch 'default': |
Yay, thanks for your work, Martin. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.