Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting a slice of memory without having to enumerate #81

Closed
dsyer opened this issue Feb 17, 2022 · 15 comments
Closed

Setting a slice of memory without having to enumerate #81

dsyer opened this issue Feb 17, 2022 · 15 comments

Comments

@dsyer
Copy link

dsyer commented Feb 17, 2022

It's nice that the WASM memory can be inspected with a slice operator:

from wasmtime import *
engine = Engine()
store = Store(engine)
module = Module.from_file(engine, "./hello.wasm")
instance = Instance(store, module, [])
exports = instance.exports(store)
memory = exports["memory"]

Then

>>> memory.data_ptr(store)[1024:1029]
[114, 97, 116, 115, 0]

but it's not possible to set it that way which seems like a bug, or at least a missing feature:

>>> data = memory.data_ptr(store)
>>> data[1024:1028] = [114, 97, 116, 116]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer, not 'slice'

You can work around it by iterating:

for i,v in enumerate("fats".encode('utf-8')): data[i] = v
@alexcrichton
Copy link
Member

Ah sorry I don't know much about python bulk array assignment, but if a different type should be returned by the methods here that's more idiomatic that seems like a reasonable thing to do!

@dsyer
Copy link
Author

dsyer commented Feb 17, 2022

Agree, but I have no idea how to achieve it. All I know is bulk assignment works with an actual list of bytes, and wasmtime._memory.Memory .data_ptr() is masquerading as a list of bytes. So it should be possible.

@dsyer
Copy link
Author

dsyer commented Feb 17, 2022

This seems to work:

>>> import ctypes
>>> data = (ctypes.c_ubyte*memory.data_len(store)).from_address(ctypes.addressof(memory.data_ptr(store).contents))
>>> data[0:10] = "helloworld".encode('utf-8')
>>> data[0:10]
[104, 101, 108, 108, 111, 119, 111, 114, 108, 100]
>>> memory.data_ptr(store)[0:10]
[104, 101, 108, 108, 111, 119, 111, 114, 108, 100]

@Tinche
Copy link

Tinche commented Dec 9, 2022

I've also been playing around with wasmtime, got a toy program running but I was reading and writing memory one byte at a time.

Glad to see there's a bulk read method available (it didn't occur to me to use a slice), a bulk write would also be very welcome!

@Tinche
Copy link

Tinche commented Dec 9, 2022

Playing around with this a little more, memory.data_ptr(store)[:res_length] returns an actual list of integer objects. You can call bytes on it, but it's very inefficient. Realistically no one's going to be treating this as a list of integers, so I'd suggest changing the API to just return a bytes or bytearray instance directly.

@muayyad-alsadi
Copy link
Contributor

this is very important as it's the most obvious bottleneck
python have many ways of doing it, for example
bytearray(b'helloworld') or array.array('B', b'helloworld')

I believe if set_item is passed a slice instead of index, it should iterate instead of raising exception

    ptr = memory.data_len(store)
    ptr[start:end]=bytearray(b'helloworld')
TypeError: sequence index must be integer, not 'slice'
>>> buff=bytearray(b'hello world')
>>> buff[0:5]=b'HELLO'
>>> for i in buff: print(i)
72
69
76
76
79
32
119
111
114
108
100

@muayyad-alsadi
Copy link
Contributor

muayyad-alsadi commented Mar 4, 2023

for comparison, on my example of slice setting of 15MB from images

  • wasm3 slice set took ~1ms
  • wasmtime it took ~900ms

this is 900x slower!!

@muayyad-alsadi
Copy link
Contributor

here they suggest to cast the C pointer to memoryview

memory_ptr = memory.data_ptr(store)
buff = memoryview(ctypes.cast(memory_ptr, ctypes.POINTER(ctypes.c_ubyte*size)))
buff_bytes = buff.bytes()

https://stackoverflow.com/questions/43810423/what-is-the-most-efficient-way-to-copy-an-externally-provided-buffer-to-bytes

but I got endianess problems

https://bugs.python.org/issue39610

NotImplementedError: memoryview: unsupported format &<B

any clue how to set a slice to bytearray or bytes or similar list of bytes?

@muayyad-alsadi
Copy link
Contributor

@dsyer the following line was too slow, it was even slower than setting the slice

>>> data = (ctypes.c_ubyte*memory.data_len(store)).from_address(ctypes.addressof(memory.data_ptr(store).contents))

how can we make it into memoryview without creating endianess problems?

@muayyad-alsadi
Copy link
Contributor

here is my implementation which is 300x times faster than @dsyer

def set_slice(val, start=0, end=None):
    size = memory.data_len(store)
    if end is None: end=start+len(val)
    val_size = len(val)
    if end-start>val_size or end>size:
        raise IndexError("out of memory size")
    src_ptr = (ctypes.c_ubyte * val_size).from_buffer(val)
    dst_ptr = ctypes.addressof((ctypes.c_ubyte*val_size).from_address(ctypes.addressof(memory_ptr.contents)+start))
    ctypes.memmove(dst_ptr, src_ptr, val_size)
    return

it seems that there was endianess difference thats why @dsyer was slower
and that's why memoryview was failing

muayyad-alsadi added a commit to muayyad-alsadi/wasmtime-py that referenced this issue Mar 5, 2023
muayyad-alsadi added a commit to muayyad-alsadi/wasmtime-py that referenced this issue Mar 5, 2023
muayyad-alsadi added a commit to muayyad-alsadi/wasmtime-py that referenced this issue Mar 5, 2023
@thewtex
Copy link
Contributor

thewtex commented Mar 11, 2023

Hi folks, a related note here: it would be very nice if wasmtime-py could expose the wasm module memory with Python's Buffer Protocol. This would enable efficient binary array access, slicing and more, via the standard numpy interface. wasmer-python has this implemented.

@j-blue-arz
Copy link

Hi folks, a related note here: it would be very nice if wasmtime-py could expose the wasm module memory with Python's Buffer Protocol. This would enable efficient binary array access, slicing and more, via the standard numpy interface. wasmer-python has this implemented.

While I agree with the above comment, I think it should not be in the scope of this issue here and the already in-progress PR. @thewtex Maybe you can create a new issue?

@muayyad-alsadi
Copy link
Contributor

wasmer-python has this implemented.

@thewtex no, it was worse in terms of performance, please check my article here

@thewtex
Copy link
Contributor

thewtex commented Mar 14, 2023

While I agree with the above comment, I think it should not be in the scope of this issue here and the already in-progress #134. @thewtex Maybe you can create a new issue?

@j-blue-arz yes, that's right. This would be in addition to @muayyad-alsadi nice setitem / slicing improvements. Issue created in #135 .

please check my article here

@muayyad-alsadi awesome article!!

Sorry if I was not clear, I meant the use of the buffer protocol with numpy instead of setitem / slicing. An example is here:

muayyad-alsadi/wasm-demos#1

The results:

Slicing
_______
img2bytearray took 9 ms
grow:  238
slice set took 352 ms
wasm took 10 ms
slice to array 78 ms
array to image 7 ms
all took 458 ms

img2bytearray took 7 ms
slice set took 355 ms
wasm took 9 ms
slice to array 77 ms
array to image 7 ms
all took 460 ms

Buffer Protocol
_______
img2bytearray took 9 ms
slice set took 0 ms
wasm took 9 ms
slice to array 0 ms
array to image 8 ms
all took 28 ms

img2bytearray took 5 ms
slice set took 0 ms
wasm took 9 ms
slice to array 0 ms
array to image 8 ms
all took 24 ms

@muayyad-alsadi
Copy link
Contributor

@thewtex thank you that was very useful

here is the wasmtime code for

memory = instance.exports(store)['memory']
ptr_type = ctypes.c_ubyte * (memory.data_len(store))
memory_ptr = (ptr_type).from_address(ctypes.addressof(memory.data_ptr(store).contents))
np_mem = np.frombuffer(memory_ptr, dtype=np.uint8, offset=hb)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants