Result type depends on order of operands for bytes and bytearray #57507

Open
ncoghlan opened this issue Oct 31, 2011 · 6 comments
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@ncoghlan
Contributor

BPO 13298
Nosy @terryjreedy, @ncoghlan, @pitrou, @merwok, @florentx, @meadori, @akheron, @serhiy-storchaka, @iritkatriel

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2011-10-31.00:14:37.194>
labels = ['interpreter-core', 'type-bug', '3.11']
title = 'Result type depends on order of operands for bytes and bytearray'
updated_at = <Date 2021-10-25.21:23:53.485>
user = 'https://github.com/ncoghlan'

bugs.python.org fields:

activity = <Date 2021-10-25.21:23:53.485>
actor = 'iritkatriel'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2011-10-31.00:14:37.194>
creator = 'ncoghlan'
dependencies = []
files = []
hgrepos = []
issue_num = 13298
keywords = []
message_count = 6.0
messages = ['146669', '146890', '146897', '146901', '261281', '405001']
nosy_count = 9.0
nosy_names = ['terry.reedy', 'ncoghlan', 'pitrou', 'eric.araujo', 'flox', 'meador.inge', 'petri.lehtinen', 'serhiy.storchaka', 'iritkatriel']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue13298'
versions = ['Python 3.11']

@ncoghlan
Contributor Author

In a recent python-ideas discussion of the differences between concatenation and augmented assignment on lists, I pointed out that the general guiding principle behind Python's binary operation semantics is that the result type of a binary operation should not depend on the order of the operands. That is, "X op Y" and "Y op X" should either consistently create results of the same type ("1 + 1.1", "1.1 + 1") or else throw an exception ("[] + ()", "() + []").

This principle is why list concatenation normally only works with other lists, but will accept arbitrary iterables for augmented assignment. collections.deque exhibits similar behaviour (i.e. strict on the binary operation, permissive on augmented assignment).
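
As a small illustration of the strict/permissive split described above, the following sketch compares binary `+` with `+=` for list and collections.deque. The helper name `try_concat` is made up for this example.

```python
from collections import deque


def try_concat(left, right):
    """Return the result type name of left + right, or 'TypeError' if rejected."""
    try:
        return type(left + right).__name__
    except TypeError:
        return "TypeError"


# Binary + is strict: mixed sequence types are rejected in both orders.
list_plus_tuple = try_concat([], ())
tuple_plus_list = try_concat((), [])
deque_plus_list = try_concat(deque(), [])
list_plus_deque = try_concat([], deque())

# Augmented assignment is permissive: any iterable is accepted.
items = [1]
items += (2, 3)
items += "ab"

queue = deque([1])
queue += (2, 3)

print(list_plus_tuple, tuple_plus_list, deque_plus_list, list_plus_deque)
print(items, queue)
```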

However, bytes and bytearray don't follow this principle - they accept anything that implements the buffer interface even in the binary operation, leading to the following asymmetries:

>>> b'' + bytearray()
b''
>>> b'' + memoryview(b'')
b''
>>> bytearray() + b''
bytearray(b'')
>>> bytearray() + memoryview(b'')
bytearray(b'')
>>> memoryview(b'') + b''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytes'
>>> memoryview(b'') + bytearray(b'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytearray'

Now, the latter two cases are due to a known problem where returning NotImplemented from sq_concat or sq_repeat doesn't work properly (so none of the relevant method implementations in the stdlib even try), but the bytes and bytearray interaction is exactly the kind of type asymmetry the operand order independence guideline is intended to prevent.
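
For readers unfamiliar with the NotImplemented mechanism mentioned above, here is a pure-Python analogue of how a left operand can defer to the right operand's reflected method. It is only a sketch: `Chunk` and `View` are hypothetical classes invented for this example, and it does not reproduce the C-level sq_concat slot behaviour that the known problem is about.

```python
class Chunk:
    """Hypothetical bytes wrapper, used only for illustration."""

    def __init__(self, data):
        self.data = bytes(data)

    def __add__(self, other):
        if isinstance(other, Chunk):
            return Chunk(self.data + other.data)
        # Decline, so the interpreter tries the right operand's __radd__.
        return NotImplemented


class View:
    """Hypothetical buffer-like type that only knows the reflected operation."""

    def __init__(self, data):
        self.data = bytes(data)

    def __radd__(self, other):
        if isinstance(other, Chunk):
            return Chunk(other.data + self.data)
        return NotImplemented


# Chunk.__add__ returns NotImplemented, so View.__radd__ produces the result.
combined = Chunk(b"ab") + View(b"cd")

# In the other order nothing handles the operation, so TypeError is raised.
try:
    View(b"cd") + Chunk(b"ab")
    reverse_outcome = "ok"
except TypeError:
    reverse_outcome = "TypeError"

print(type(combined).__name__, combined.data, reverse_outcome)
```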

My question is - do we care enough to try to change this? If we do, then it's necessary to decide on more appropriate semantics:

  1. The "list" solution, permitting only the same type in binary operations (high risk of breaking quite a lot of code)
  2. Don't allow arbitrary buffers, but do allow bytes/bytearray interoperability
    2a. always return bytes from mixed operations
    2b. always return bytearray from mixed operations
    2c. return the type of the first operand (à la set.__or__)

Or just accept that this really is more of a guideline than a rule and adjust the documentation accordingly.
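
To make the options concrete, the sketch below models each one as a policy in a hypothetical helper named `concat`. Nothing here is an existing or proposed API; it only shows what each option would return for a bytes/bytearray pair.

```python
def concat(left, right, policy):
    """Hypothetical model of options 1 and 2a-2c; not a real CPython API."""
    allowed = (bytes, bytearray)
    if not isinstance(left, allowed) or not isinstance(right, allowed):
        # Every option listed above rejects arbitrary buffer objects.
        raise TypeError("only bytes and bytearray are accepted")
    joined = bytes(left) + bytes(right)
    if policy == "1":
        # The "list" solution: both operands must have the same type.
        if type(left) is not type(right):
            raise TypeError("operands must have the same type")
        return type(left)(joined)
    if policy == "2a":
        return joined                # always bytes
    if policy == "2b":
        return bytearray(joined)     # always bytearray
    if policy == "2c":
        return type(left)(joined)    # type of the first operand
    raise ValueError(f"unknown policy: {policy!r}")


for policy in ("2a", "2b", "2c"):
    forward = type(concat(b"a", bytearray(b"b"), policy)).__name__
    backward = type(concat(bytearray(b"b"), b"a", policy)).__name__
    print(policy, forward, backward)
```

Only 2a and 2b make the result type independent of operand order; 2c matches what bytes and bytearray do today for each other.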

@ncoghlan ncoghlan added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Oct 31, 2011
@pitrou
Member

pitrou commented Nov 3, 2011

I think the current behaviour is fine, in that the alternatives are not better at all. In the absence of a type inherently "superior" to the others (as float can be to int, except for very large integers :-)), it makes sense to keep the type of the left-hand argument.
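
A quick check of the int/float aside, assuming the usual IEEE 754 double-precision float: mixed arithmetic yields float in either operand order, but integers above 2**53 cannot all be represented exactly.

```python
# Mixed int/float arithmetic returns float regardless of operand order.
mixed_a = 1 + 1.1
mixed_b = 1.1 + 1

# Above 2**53 a float can no longer hold every integer exactly.
big = 2**53 + 1
round_trip = int(float(big))

print(type(mixed_a).__name__, type(mixed_b).__name__)
print(big, round_trip, big == round_trip)
```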

Note that .join() has a slightly different behaviour:

>>> b"".join([bytearray(), b""])
b''
>>> bytearray().join([bytearray(), b""])
bytearray(b'')
>>> b"".join([bytearray(), memoryview(b"")])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 1: expected bytes, memoryview found

@akheron
Member

akheron commented Nov 3, 2011

> Note that .join() has a slightly different behaviour:
> 
> >>> b"".join([bytearray(), b""])
> b''
> >>> bytearray().join([bytearray(), b""])
> bytearray(b'')
> >>> b"".join([bytearray(), memoryview(b"")])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sequence item 1: expected bytes, memoryview found

I think this is worth fixing. Is there an issue already?

@ncoghlan
Contributor Author

ncoghlan commented Nov 3, 2011

We can just use this one - it was more in the nature of a question "is there anything we want to change about the status quo?" than a request for any specific change.

I'm actually OK with buffer API based interoperability, but if we're going to offer that, we should be consistent:

  1. bytes and bytearray should interoperate with anything supporting the buffer interface (which they already mostly do)
  2. When they encounter each other, LHS wins (as with set() and frozenset())
  3. We should fix the operand coercion bug for C level concatenation slot implementations (already covered by issue bpo-11477)
  4. Update the documentation as needed

Since we're tinkering with builtin behaviour, 1 & 2 should probably be brought up on python-dev once someone checks if there is anything other than .join() that needs updating.
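
The "LHS wins" rule in point 2 can be observed directly. This snippet compares set/frozenset with bytes/bytearray:

```python
# For set and frozenset, the left-hand operand decides the result type.
set_first = set() | frozenset()
frozen_first = frozenset() | set()

# bytes and bytearray follow the same rule when they meet each other.
bytes_first = b"" + bytearray()
bytearray_first = bytearray() + b""

for value in (set_first, frozen_first, bytes_first, bytearray_first):
    print(type(value).__name__)
```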

@serhiy-storchaka
Member

An issue with bytes.join() is already fixed (bpo-15958).
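
On an interpreter that includes that fix (assumed here to be any currently supported Python 3 release), join() accepts arbitrary buffer objects, and the result type follows the separator:

```python
parts = [bytearray(b"ab"), memoryview(b"cd"), b"ef"]

# The separator's type determines the result type.
joined_bytes = b"-".join(parts)
joined_bytearray = bytearray(b"-").join(parts)

print(joined_bytes)
print(joined_bytearray)
```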

@iritkatriel
Member

Reproduced on 3.11.

@iritkatriel iritkatriel added the 3.11 only security fixes label Oct 25, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022