Add bytes.empty_buffer and deprecate bytes(17) for the same purpose #65094
Comments
bytes(7) results in a bytes object containing that many zeroes. I propose that this behavior be deprecated for eventual removal, and a class method be created to take its place. |
Class method is not needed. This is just b'\0' * 7. |
I don't have a strong opinion on this, but I think you are going to have to articulate a good use/usability case for the deprecation. I'm sure this is used in the wild, and we don't just gratuitously break things :) |
I would think the argument for deprecation is that usually, people type bytes(7) or bytes(somesmallintvalue) expecting to create a length one bytes object using that value (happens by accident if you iterate a bytes object and forget it's an iterable of ints, not an iterable of len 1 bytes). It's really easy to forget to make it bytes([7]) or bytes((7,)) or what have you. If you make the same mistake with str, list, tuple, etc., you get an error, because they only accept iterables. But bytes silently behaves in a way that is inconsistent with the other sequence types. Given that b'\0' * 7 is usually faster in any event (by avoiding lookup costs to find the bytes constructor) and more intuitive to people familiar with the Python sequence idiom, I could definitely see this as a redundancy that does nothing but confuse. |
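The pitfall described in this comment can be reproduced in a few lines. This is a minimal sketch for Python 3; the variable names are illustrative only and do not come from the thread.

```python
data = b"\x07\x08"

# Iterating a bytes object yields ints, not length-1 bytes objects.
items = list(data)

# Passing one of those ints back to bytes() silently creates a
# zero-filled object of that length instead of a single byte.
accidental = bytes(items[0])   # seven NUL bytes
intended = bytes([items[0]])   # the single byte b"\x07"

# Other sequence constructors reject a bare int outright.
try:
    list(7)
except TypeError as exc:
    list_error = str(exc)

print(accidental, intended, list_error)
```

The accidental form raises no error, which is why the mistake tends to surface only later as wrong data.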
I agree with Serhiy that the method is not needed in any case. I was about to post the same missing rationale: people misunderstand 'bytes(7)' and write it expecting to get bytes([7]) == b'\x07', so it would be better to make bytes(7) raise instead of silently accepting a buggy usage. I was thinking that one rationale for bytes(n) might be that it is faster than b'\0' * n. Since Josh claimed the contrary, I tried to test with timeit.repeat (both console and Idle) and got an error message. I think this issue should be closed. Deprecation ideas should really be posted on python-ideas and ultimately pydev for discussion and approval. If Ethan wants to pursue the idea, he should research the design discussions for bytes() (probably on the py3k list) and whether Guido directly approved of bytes(n) or if someone else 'snuck' it in after the initial approval. |
Terry: You forgot to use a raw string for your timeit.repeat check, which is why it blew up. It was evaluating the \0 when you defined the statement string itself, not the contents. If you use r'b"\0" * 7' it works just fine by deferring backslash escape processing until the string is actually eval-ed, rather than when you create the string. For example, on my (admittedly underpowered) laptop (Win7 x64, Py 3.3.0 64-bit):
>>> min(timeit.repeat(r'b"\0" * 7'))
0.07514287752866267
>>> min(timeit.repeat(r'bytes(7)'))
0.7210309422814021
>>> min(timeit.repeat(r'b"\0" * 7000'))
0.8994351749659302
>>> min(timeit.repeat(r'bytes(7000)'))
2.06750710129117
For a short bytes, the difference is enormous (as I suspected, the lookup of bytes dominates the runtime). For much longer bytes, it's still winning by a lot, because the cost of having the short literal first, then multiplying it, is still trivial next to the lookup cost. P.S. I made a mistake: str does accept an int argument (obviously), but it has completely different meaning. |
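A sketch of how the comparison could be rerun today. One caveat that is an assumption about CPython rather than something stated in the thread: the compiler may fold a small constant expression such as b"\0" * 7 into a single constant at compile time, which would flatter that case. The sketch therefore also times variants whose length comes from a variable. The helper name best is hypothetical, and timings depend on the machine and Python version, so no expected numbers are given.

```python
import timeit

def best(stmt, setup="pass", number=100_000):
    """Return the fastest of three timing runs for stmt, in seconds."""
    return min(timeit.repeat(stmt, setup=setup, number=number, repeat=3))

results = {
    # Raw strings keep the \0 escape intact until timeit compiles the statement.
    "literal * 7": best(r'b"\0" * 7'),
    "bytes(7)": best("bytes(7)"),
    # Taking the length from a variable prevents compile-time folding.
    "literal * n": best(r'b"\0" * n', setup="n = 7"),
    "bytes(n)": best("bytes(n)", setup="n = 7"),
}

for label, seconds in results.items():
    print(f"{label:12s} {seconds:.4f}s")
```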
I'm inclined to leave it open while I do the suggested research. Thanks for the tips, Terry, and the numbers, Josh. |
AFAIK, bytes(int) is a remnant from times when bytes was mutable. Then bytes was split into immutable bytes and mutable bytearray, and this constructor was forgotten. I'm +0 for deprecation. |
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
--> bytes(5)
--> bytearray(5)
Creating a buffer of null bytes makes sense for bytearray, which is mutable; it does not make sense, and IMHO only causes confusion, to have bytes return an /immutable/ sequence of zero bytes. |
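The distinction drawn here can be illustrated in Python 3 (not the Python 2.7 session quoted above). This is a minimal sketch with illustrative variable names: a zero-filled bytearray can be written into afterwards, while the equivalent bytes object cannot.

```python
buf = bytearray(5)     # five zero bytes, ready to be filled in
buf[0] = 0x48          # item assignment works on bytearray
buf[1:3] = b"ey"       # so does slice assignment

frozen = bytes(5)      # five zero bytes that can never be changed
try:
    frozen[0] = 0x48
except TypeError as exc:
    error = str(exc)

print(buf, frozen, error)
```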
Bringing over Barry's suggestion from the current python-ideas thread [1]:
@classmethod
def fill(cls, length, value=0):
    # Creates a bytes of given length with given fill value
[1] https://mail.python.org/pipermail/python-ideas/2014-March/027305.html |
Why would we need bytes.fill(length, value)? Is b'\xVV' * length (or if value is a variable containing int, bytes((value,)) * length) unreasonable? Similarly, bytearray(b'\xVV') * length or bytearray((value,)) * length is both Pythonic and performant. Most sequences support multiplication so simple stuff like this can be done easily and consistently; why invent a new approach unique to bytes/bytearrays? |
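The multiplication idiom from this comment, written out as a sketch. The helper name filled and its mutable parameter are hypothetical and exist only for illustration; they are not a proposed or existing API.

```python
def filled(length, value=0, mutable=False):
    """Build a buffer holding `length` copies of the byte `value`."""
    unit = bytes((value,))   # raises ValueError if value is outside 0..255
    buf = unit * length      # ordinary sequence repetition
    return bytearray(buf) if mutable else buf

zeros = filled(4)
ones = filled(3, 0xFF)
scratch = filled(2, 0x41, mutable=True)
scratch[0] = 0x42            # only the bytearray variant can be modified

print(zeros, ones, scratch)
```

Because the helper is built from repetition alone, the same pattern carries over to other sequence types, which is the consistency argument made in the comment.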
Also, to me 'fill' implies something is being filled, not that something is being created. |
The fill() name makes more sense for the bytearray variant, it is just provided on bytes as well for consistency. As Serhiy notes above, the current behaviour is almost certainly just a holdover from the original "mutable bytes" design that didn't survive into the initial 3.0 release. |
Under the name "from_len", this is now part of a larger proposal to improve the consistency of the binary APIs: http://www.python.org/dev/peps/pep-0467/ |
May we close this as superseded by PEP-467? |
Superseded by PEP-467. |