Allow for fetching entire `[]byte` slice as `bytes` #323

deuill · 2023-04-03T15:14:35Z

Currently, `[]byte` slices in Go are presented in Python as a custom type, `Slice_byte`, which allows iterating over and fetching individual byte values from the underlying slice. However, this is rather inefficient for large slices, as each iteration requires FFI calls between Python, C, and Go (i.e. against the `Slice_byte_elem` function).
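To make the cost concrete, here is a minimal sketch that counts simulated boundary crossings. `FakeSliceByte`, its `calls` counter, and `to_bytes` are hypothetical stand-ins written for illustration only; they are not gopy's generated code, and a real FFI call is far more expensive than the method call used here.

```python
class FakeSliceByte:
    """Hypothetical stand-in for a generated slice wrapper (not gopy's code)."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self.calls = 0  # number of simulated Python -> C -> Go crossings

    def __len__(self) -> int:
        self.calls += 1
        return len(self._data)

    def __getitem__(self, index: int) -> int:
        # Each element access stands in for one FFI round trip.
        self.calls += 1
        return self._data[index]

    def to_bytes(self) -> bytes:
        # Hypothetical bulk accessor: one crossing, one copy.
        self.calls += 1
        return bytes(self._data)


def copy_per_element(s: FakeSliceByte) -> bytes:
    """Build a bytes object one element at a time."""
    return bytes(s[i] for i in range(len(s)))


if __name__ == "__main__":
    data = b"\x00\xff" * 500  # 1000 bytes

    slow = FakeSliceByte(data)
    copy_per_element(slow)
    print("per-element crossings:", slow.calls)

    fast = FakeSliceByte(data)
    fast.to_bytes()
    print("bulk crossings:", fast.calls)
```

The number of crossings grows linearly with the slice length in the first case and stays constant in the second, which is the motivation for a bulk accessor.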

It might, therefore, be more efficient to allow returning the entire `[]byte` slice as a Python `bytes` object in a single FFI call, perhaps as a `C.CString`. Even if each (or the first) call creates a copy, the savings in latency and CPU time should outweigh the cost of that copy.
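One caveat worth keeping in mind if the data travels as a C string: C strings are NUL-terminated, so binary data containing a zero byte is cut short unless a length is passed alongside the pointer. The sketch below shows this using only Python's standard `ctypes` module; the buffer is a local stand-in for memory handed over by Go, not anything produced by gopy.

```python
import ctypes

# Binary payload with an embedded zero byte, as is common in non-text data.
payload = b"ab\x00cd"

# create_string_buffer appends a trailing NUL, much like a C string would have.
buf = ctypes.create_string_buffer(payload)
address = ctypes.addressof(buf)

# Without a length, string_at reads up to the first NUL and drops the rest.
as_c_string = ctypes.string_at(address)

# With an explicit length, every byte is copied into a new bytes object.
with_length = ctypes.string_at(address, len(payload))

print(as_c_string)
print(with_length)
```

Passing a pointer together with a length avoids the truncation and still needs only a single call.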


rcoreilly · 2023-04-03T15:19:47Z

This could be achieved by using a string type presumably? The semantics of the []byte would be lost if it was not directly writable on the python side.

deuill · 2023-04-03T16:46:41Z

My understanding is that a string type in Go would automatically map to a `str` type in Python, which holds Unicode text and therefore requires the Go string to be valid UTF-8. Indeed, trying something similar out seems to return an error on the autogenerated Get method:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
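The same error can be reproduced in plain Python, independent of gopy, by decoding arbitrary binary data as UTF-8. This is only a sketch of the failure mode; `decode_error_message` is a helper name made up for this example.

```python
from typing import Optional


def decode_error_message(raw: bytes) -> Optional[str]:
    """Return the UnicodeDecodeError text for raw, or None if it is valid UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# 0xff can never start a UTF-8 sequence, so decoding fails at position 0.
binary = bytes([0xFF, 0x00, 0x80])
print(decode_error_message(binary))

# A bytes object has no such restriction and keeps every value as-is.
print(list(binary))
```

Any byte sequence is a valid `bytes` value, which is why a binary payload fits `bytes` but not `str`.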

I suppose there's no way of forcing a Go string to be handled as a `bytes` type instead? My assumption here is that there isn't (or that any conversion here would take about the same effort as the original ask, in GoPy itself). That is also partly why I specified `bytes` rather than `bytearray`: I assume that mapping the mutating aspects of the Python `bytearray` back to the Go `[]byte` may not be feasible, hence it might simply be worth adding a function that returns an immutable copy of the entire byte slice instead.
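As a small illustration of the `bytes` versus `bytearray` distinction in plain Python: a `bytes` copy is an immutable snapshot, and a `bytearray` built from it is writable but disconnected from the original. In this sketch `source` merely stands in for Go-owned memory; nothing here involves gopy.

```python
# Stand-in for memory owned by the Go side (an assumption for illustration).
source = bytearray(b"\x01\x02\x03")

# An immutable snapshot: later changes to source do not show up in it.
snapshot = bytes(source)
source[0] = 0xFF

# bytes objects reject item assignment outright.
try:
    snapshot[0] = 0x00  # type: ignore[index]
    snapshot_is_writable = True
except TypeError:
    snapshot_is_writable = False

# A bytearray copy is writable, but edits stay on the Python side only.
editable = bytearray(snapshot)
editable[0] = 0x7F

print(snapshot, bytes(source), bytes(editable), snapshot_is_writable)
```

Because writes to a copied `bytearray` never reach the original, returning `bytes` makes the copy semantics explicit to the caller.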

rcoreilly · 2023-04-06T06:39:41Z

I'm not sufficiently up on the relevant standards in Python for how this all works, so I can't really judge, but it sounds like we might want to have it work in different ways depending on the use case. We do have the ability to flag things with some kind of comment directive, I believe, so that might be an option. I can't quite remember where this is used, but I believe it determines how an interface{} is treated or something to that effect.

rcoreilly · 2023-05-08T17:14:51Z

This is a great issue for someone to work on! The current model is that Go owns all the data structures exposed by gopy; they continue to be managed by Go's GC and are accessed exclusively through the (auto-generated) handle. To do something more efficient, gopy could expose a method that returns an unsafe pointer and a length into any slice's raw memory (&slice[0]), which then shows up as a Python bytes object. This would need the proper steps and warnings to ensure that the raw memory is copied into the Python side immediately and that the dangling pointer is not kept around after this initial call, since it becomes increasingly likely to be invalid as time passes. Presumably the Python wrapper just does the copy immediately in the course of calling this method.
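A rough sketch of what the Python side of such a pointer-plus-length approach might look like, using the standard `ctypes.string_at`, which copies the memory into a new `bytes` object. The function name `copy_from_pointer` and the simulated buffer are assumptions made for illustration; this is not the code that was eventually merged into gopy.

```python
import ctypes


def copy_from_pointer(address: int, length: int) -> bytes:
    """Copy length bytes starting at address into a new bytes object.

    The copy happens inside this call, so the caller never holds the raw
    pointer afterwards.
    """
    if length <= 0 or not address:
        # An empty slice has no first element to point at.
        return b""
    return ctypes.string_at(address, length)


# Simulated foreign buffer standing in for the Go slice's backing array.
foreign = (ctypes.c_ubyte * 4)(0xFF, 0x00, 0x10, 0x20)

copied = copy_from_pointer(ctypes.addressof(foreign), len(foreign))

# Later changes to the original memory do not affect the copy.
foreign[0] = 0x01

print(copied)
print(bytes(foreign))
```

The zero-length guard matters because taking `&slice[0]` on an empty Go slice panics, so the Go side would presumably need an equivalent check before handing out a pointer.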

rcoreilly · 2024-04-22T22:09:22Z

Fixed by #342

morgenroth mentioned this issue May 2, 2023

Best way to pass data chunks (bytes, bytearray)? #328

Closed

NoamK-CR mentioned this issue Nov 19, 2023

Efficient copies of bytes #342

Merged

rcoreilly closed this as completed Apr 22, 2024
