This repository has been archived by the owner on Nov 23, 2017. It is now read-only.

Should we allow "bytearray" #27

Closed
GoogleCodeExporter opened this issue Apr 10, 2015 · 9 comments

Comments

@GoogleCodeExporter

We have to decide what to do with bytearray in
Transport.write() and sendto() methods.

Should we allow bytearray?

Original issue reported on code.google.com by fafhr...@gmail.com on 27 Mar 2013 at 11:41

@GoogleCodeExporter

I think we should support bytearray in all places where we support bytes, but 
in all cases where we hold on to it after returning from the function we should 
make a copy -- either extend another bytearray or append a bytes() copy to a 
list.
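A minimal sketch of that policy (a hypothetical buffering writer, not asyncio's actual transport code; the class and method names here are assumptions for illustration):

```python
class BufferedWriter:
    """Sketch: accept bytes or bytearray, but snapshot mutable input."""

    def __init__(self):
        self._buffer = []  # list of immutable bytes chunks

    def write(self, data):
        if isinstance(data, bytearray):
            # Copy before holding on to it, so a later mutation by the
            # caller cannot change what we eventually send.
            data = bytes(data)
        elif not isinstance(data, bytes):
            raise TypeError(f"expected bytes or bytearray, got {type(data).__name__}")
        self._buffer.append(data)

    def flush(self):
        out = b''.join(self._buffer)
        self._buffer.clear()
        return out
```

The copy in `write()` is exactly the "append a bytes() copy to a list" option; extending a single shared bytearray would copy just as implicitly.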

Original comment by gvanrossum@gmail.com on 27 Mar 2013 at 11:43

@GoogleCodeExporter

Original comment by gvanrossum@gmail.com on 16 Oct 2013 at 4:42

  • Added labels: Priority-Low
  • Removed labels: Priority-Medium

@GoogleCodeExporter

Issue 87 has been merged into this issue.

Original comment by gvanrossum@gmail.com on 21 Nov 2013 at 12:54

@GoogleCodeExporter

To explain the issue to the creator of issue 87: if we allowed bytearray or 
e.g. memoryview or arrays, we'd have to copy them before storing them in the 
buffer, because otherwise the bytearray's contents might change while we're 
waiting for the I/O to happen. Given that we have to copy anyway, the caller 
might as well do that -- and this restriction on the API then becomes a gentle 
hint that using bytearray may actually be *less* efficient than using bytes.

So I am actually thinking of closing this as won't fix, but I'll entertain some 
discussion first.

(Note that you may be able to avoid the copy if the send() happens inline, 
because in that case the data is never added to the buffer; but we only do 
that for TCP and UDP using selectors -- we always append to the buffer when 
using SSL/TLS, and the Windows Proactor also always appends to the buffer. So, 
again, the seeming optimization might backfire depending on your transport type.)
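The hazard described above takes only a few lines to demonstrate: if a transport stored the caller's bytearray without copying, a later mutation by the caller would silently change the queued data (plain Python, no asyncio involved):

```python
buffer = []                         # stand-in for a transport's write buffer
data = bytearray(b'GET / HTTP/1.1\r\n')

buffer.append(data)                 # stored without a copy: aliases the caller's object
data[:5] = b'POST '                 # caller reuses the bytearray before the I/O happens
assert buffer[0].startswith(b'POST')      # the queued data silently changed

buffer.clear()
buffer.append(bytes(data))          # copying decouples the queued data
data[:5] = b'GET /'                 # later mutation...
assert buffer[0].startswith(b'POST')      # ...no longer affects what was queued
```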

Original comment by gvanrossum@gmail.com on 21 Nov 2013 at 1:01

@GoogleCodeExporter

In my case, I use bytearray for performance reasons, so I wouldn't want an 
extra copy every time I send data. Would it make sense to also accept 
memoryview objects in all write methods? This would also avoid the slice copy 
when the socket only writes part of the bytearray. I would leave it to the 
user not to alter the data until the bytearray has been completely sent. A new 
method on Transport to wait for all writers on Transport._sock to be cleared 
would help. If the idea makes sense, I can submit a patch for it.
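The zero-copy property being argued for here -- and the aliasing risk that comes with it -- can both be seen directly:

```python
ba = bytearray(b'abcdefgh')
mv = memoryview(ba)

# Slicing a bytes object copies; slicing a memoryview does not.
rest = mv[3:]              # zero-copy view of the "unsent" tail
assert rest.obj is ba      # still backed by the original bytearray

# ...which is exactly why it is unsafe for a transport to keep it
# without copying: the caller can still mutate the underlying buffer.
ba[3] = ord('X')
assert bytes(rest[:1]) == b'X'
```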

Original comment by llpam...@gmail.com on 21 Nov 2013 at 1:35

@GoogleCodeExporter

Can you tell me how you create the bytearray? Do you know how much the copy 
would cost you? (It's easy enough to benchmark asyncio with the assert taken 
out if you want to. :-)

Leaving the user the responsibility to not modify the data is unsatisfactory -- 
plus, the user could modify the *size* of the data which could break 
invariants/assumptions of the transport.

Waiting until the buffer is completely flushed can be done using a combination 
of calling transport.set_write_buffer_limits(0) and defining a 
protocol.resume_writing() callback. (Check the definition of drain() in 
streams.py for a usage example.)
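The flow-control handshake behind that suggestion can be sketched as follows (this is a simplified illustration of the pause/resume pattern, not asyncio's actual streams.py code; `drain` here is a hand-rolled stand-in):

```python
import asyncio


class FlowControlProtocol(asyncio.Protocol):
    """Sketch of the pause/resume handshake behind a drain()-style wait."""

    def __init__(self):
        self._paused = False
        self._drain_waiter = None

    def pause_writing(self):
        # Called by the transport when its buffer exceeds the high-water mark.
        self._paused = True

    def resume_writing(self):
        # Called by the transport when the buffer drains below the low-water
        # mark; with set_write_buffer_limits(0) that means "buffer is empty".
        self._paused = False
        if self._drain_waiter is not None:
            self._drain_waiter.set_result(None)
            self._drain_waiter = None

    async def drain(self):
        if self._paused:
            self._drain_waiter = asyncio.get_running_loop().create_future()
            await self._drain_waiter


async def main():
    proto = FlowControlProtocol()
    proto.pause_writing()                         # transport reports a full buffer
    waiter = asyncio.ensure_future(proto.drain())
    await asyncio.sleep(0)                        # let drain() start waiting
    proto.resume_writing()                        # transport reports the buffer flushed
    await waiter                                  # drain() completes
    return proto._paused


asyncio.run(main())
```

Setting the buffer limits to zero makes `resume_writing()` fire only once the buffer is completely empty, which is what gives the caller a "fully sent" signal.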

Original comment by gvanrossum@gmail.com on 21 Nov 2013 at 4:48

@GoogleCodeExporter

I'll do the benchmark. 

In the meantime, I also noticed that you cannot write memoryview objects to a 
transport, due to the "b''.join()" line concatenating the bytes in the buffer. 
I solved that by patching the _write_ready method as follows:

    def _write_ready(self):
        while self._buffer:
            data = self._buffer.popleft()
            try:
                n = self._sock.send(data)
            except (BlockingIOError, InterruptedError):
                self._buffer.appendleft(data)  # Still need to write this.
                break
            except Exception as exc:
                self._loop.remove_writer(self._sock_fd)
                self._fatal_error(exc)
                return  # transport is broken; don't fall through to resume
            else:
                data = data[n:]
                if data:
                    self._buffer.appendleft(data)  # Still need to write this.
                    break

        self._maybe_resume_protocol()
        if not self._buffer:
            self._loop.remove_writer(self._sock_fd)
            if self._closing:
                self._call_connection_lost(None)
            elif self._eof:
                self._sock.shutdown(socket.SHUT_WR)

It avoids some extra copies when the buffer grows.
Does it break any internal compatibility?

Original comment by llpam...@gmail.com on 21 Nov 2013 at 6:57

@GoogleCodeExporter

There's movement here. We may end up changing the buffer to be a bytearray 
rather than a deque. See recent discussion on python-tulip list and this code 
review: https://codereview.appspot.com/33490043/
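The buffer change under discussion would look roughly like this (a sketch of the idea, not the actual patch from the code review):

```python
class ByteArrayBuffer:
    """Sketch: a write buffer kept as one bytearray instead of a deque of chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data):
        # Extending a bytearray copies `data` by construction, so the
        # caller's bytes/bytearray/memoryview is never aliased.
        self._buffer += data

    def consume(self, sent):
        # Drop the sent prefix in place -- no b''.join() over chunks,
        # and no per-chunk slice copies on partial sends.
        del self._buffer[:sent]

    def pending(self):
        return bytes(self._buffer)
```

This naturally accepts any bytes-like input (including memoryview) while keeping the copy-on-write guarantee, which is how a bytearray buffer addresses both halves of this issue at once.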

Original comment by gvanrossum@gmail.com on 27 Nov 2013 at 5:36

@GoogleCodeExporter

This issue was closed by revision c50e09feb51f.

Original comment by gvanrossum@gmail.com on 27 Nov 2013 at 10:13

  • Changed state: Fixed
