index() and count() methods of bytes and bytearray should accept byte ints #56379
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = None closed_at = <Date 2011-10-20.21:58:46.441> created_at = <Date 2011-05-24.19:49:09.812> labels = ['interpreter-core', 'type-bug'] title = 'index() and count() methods of bytes and bytearray should accept byte ints' updated_at = <Date 2012-06-26.07:28:04.227> user = 'https://bugs.python.org/max-alleged'
activity = <Date 2012-06-26.07:28:04.227> actor = 'python-dev' assignee = 'none' closed = True closed_date = <Date 2011-10-20.21:58:46.441> closer = 'pitrou' components = ['Interpreter Core'] creation = <Date 2011-05-24.19:49:09.812> creator = 'max-alleged' dependencies =  files = ['22733', '23465'] hgrepos =  issue_num = 12170 keywords = ['patch', 'needs review'] message_count = 22.0 messages = ['136786', '136787', '136878', '136883', '140903', '141016', '141033', '141216', '141490', '141491', '145797', '145799', '145929', '145931', '146055', '146056', '148232', '148237', '148287', '149722', '149723', '164054'] nosy_count = 12.0 nosy_names = ['rhettinger', 'terry.reedy', 'jcea', 'pitrou', 'vstinner', 'ezio.melotti', 'eric.araujo', 'flox', 'xuanji', 'max-alleged', 'python-dev', 'petri.lehtinen'] pr_nums =  priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue12170' versions = ['Python 3.3']
Bytes objects yield integers when indexed, but many of their methods do not accept integers as arguments, which makes them inconsistent with other sequences.
Basic example:
>>> test = b'012'
>>> n = test[1]
>>> n
49
>>> n in test
True
>>> test.index(n)
TypeError: expected an object with the buffer interface.
It is certainly unusual for n to be in the sequence but impossible to find. I would expect the result to be 1. This set of commands works with lists, strings, and tuples, but not with bytes objects.
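For comparison, the same round trip can be sketched across the built-in sequence types. The helper name `round_trip` is hypothetical and not part of the report. On the interpreter described above, the bytes case raised TypeError; on versions that include the fix (3.3 and later, per the issue metadata), it should return 1 like the others.

```python
def round_trip(seq, i):
    """Take the item at position i, then look it up again with index()."""
    item = seq[i]
    assert item in seq  # membership always succeeds for an item taken from seq
    return seq.index(item)


results = {
    "list": round_trip([48, 49, 50], 1),
    "tuple": round_trip((48, 49, 50), 1),
    "str": round_trip("012", 1),
}

# Indexing bytes gives an int, so this is the case the report is about.
try:
    results["bytes"] = round_trip(b"012", 1)
except TypeError as exc:
    results["bytes"] = "TypeError: %s" % exc
```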
I suspect, from issue bpo-10616, that all the following functions would be affected:
It would make more sense to me if, instead of supporting only buffer interface objects, they also accepted a single integer and treated it as a length-1 bytes object.
The use case in which I came across this problem was something like this:
Given seq1 and seq2, sequences of the same type:
This works for strings, lists, tuples, but not bytes.
Agreed. Doc Lib: 4.6. Sequence Types — str, bytes, bytearray, list, tuple, range says '''
>>> test = b'0120'
>>> z = b'0'
>>> zo = ord(z)
>>> z in test
True
>>> zo in test
True
>>> test.index(z)
0
>>> test.index(zo)
...
TypeError: expected an object with the buffer interface
>>> test.count(z)
2
>>> test.count(zo)
...
TypeError: expected an object with the buffer interface
# longer subsequences like b'01' also work
So I think the code for 3.2+ bytes.count() and bytes.index() should do the same branching as the code for bytes.__contains__.
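The branching suggested here can be modelled in pure Python. This is only a rough sketch of the idea, not the C code in CPython or in the attached patch; the function name `index_like_contains` and the error messages are assumptions chosen for illustration.

```python
def index_like_contains(haystack, needle):
    """Hypothetical model: accept either a byte-valued int or a buffer object."""
    if isinstance(needle, int):
        # Integer branch: validate the value, then treat it as a length-1 bytes object.
        if not 0 <= needle < 256:
            raise ValueError("byte must be in range(0, 256)")
        needle = bytes([needle])
    else:
        # Buffer branch: memoryview() raises TypeError for non-buffer objects.
        needle = bytes(memoryview(needle))
    pos = haystack.find(needle)
    if pos < 0:
        raise ValueError("subsection not found")
    return pos


test = b"0120"
zo = ord(b"0")
by_int = index_like_contains(test, zo)
by_bytes = index_like_contains(test, b"2")
```

Both branches end in the same subsequence search, so an int and the equivalent length-1 bytes object give the same answer.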
The other functions you list, including .rindex, are not general sequence functions but string functions defined as taking subsequences as inputs. So they would never be used in generic code the way .count and .index can be.
I think it would make sense for the string methods to accept single ints as well, where possible:
For haystack and needles both strings:
For both bytes, it's a bit contortionist:
One ends up doing a lot of the [i:i+1] bending when using bytes functions.
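The contrast can be illustrated with a small sketch. The variable names and sample data below are made up for illustration and are not the snippets from the original comment. Iterating over a str yields length-1 strings that can be passed straight to find(), while iterating over bytes yields ints, so a length-1 slice is taken instead.

```python
# str: each item from iteration is itself a str, so it can be searched for directly.
text_haystack = "hello world"
text_needles = "ol"
text_positions = [text_haystack.find(ch) for ch in text_needles]

# bytes: iteration yields ints, so slice out a length-1 bytes object for each needle.
byte_haystack = b"hello world"
byte_needles = b"ol"
byte_positions = [
    byte_haystack.find(byte_needles[i:i + 1])
    for i in range(len(byte_needles))
]
```

The slice form works on any Python 3 version because `byte_needles[i:i + 1]` is always a bytes object, whereas `byte_needles[i]` is an int.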
Attached a patch with the following changes:
Allow an integer argument in range(0, 256) for the following bytes and
The bytes methods were changed to use the new buffer protocol instead
Tests for all the modified functions were expanded to cover the new
A paragraph describing the additional semantics of the five methods
The error messages of index and rindex were left untouched
The docstrings were also left unchanged, as I couldn't find a good
And finally, there's one thing that I'm unsure of:
When an integer out of range(0, 256) is passed as the first argument,
ValueError = Inappropriate argument value (of correct type).
TypeError = Inappropriate argument type.
Then the users should check if the value is in range(256) before passing it to (r)index.
That sounds reasonable. OverflowError would have been another choice, but I agree that consistency with __contains__ is sensible.
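A caller-side range check of the kind suggested above might look like the following sketch. The helper name `safe_index` and its -1 return convention are assumptions for illustration; the int argument to index() relies on the behaviour this issue added (Python 3.3 and later, per the metadata).

```python
def safe_index(haystack, value):
    """Return the position of a byte value, or -1 if it is out of range or absent."""
    if value not in range(256):
        # Out-of-range ints would raise ValueError, the same exception as "not found".
        return -1
    try:
        return haystack.index(value)
    except ValueError:
        return -1


data = b"abc"

# __contains__ already rejects out-of-range ints with ValueError.
try:
    256 in data
    contains_error = None
except ValueError as exc:
    contains_error = str(exc)
```

Checking the range first lets the caller tell an invalid byte value apart from a valid byte that simply is not present, since both cases otherwise surface as ValueError.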
Doc/library/stdtypes.rst needs a "versionadded" tag for the additional semantics.
Also, the patch doesn't compile cleanly on current default:
In file included from Objects/unicodeobject.c:487:0:
I'd say you need to either define your function as STRINGLIB(parse_args_finds_byte) (to avoid name collisions), or avoid defining it if STRINGLIB_IS_UNICODE.