Optimize server conversions for client-server array transfers #880

ronawho · 2021-07-12T15:01:30Z

Improve the performance of ak.array() and pdarray.to_ndarray() by
optimizing how the server converts between bytes and pdarrays.
Previously, the server wrote to and read from a big-endian
memory-mapped file to convert between bytes and arrays, which is fairly
slow. This change optimizes the conversion by directly interpreting the
underlying memory as the type we're converting to. In the ak.array()
case we create the local array with makeArrayFromPtr, which creates an
array from the existing bytes without any copies. In the to_ndarray()
case makeArrayFromPtr is also used on some local memory, which is then
reinterpreted as bytes with createBytesWithOwnedBuffer.
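
The difference between the two approaches can be sketched outside Chapel. The block below is a Python analogy using only the standard library, not the Arkouda server code. The names `decode_big_endian` and `reinterpret_native` are hypothetical, and it assumes int64 elements and, for the zero-copy path, a payload that is already in the host's native byte order.

```python
import struct
from array import array


def decode_big_endian(payload: bytes) -> list:
    """Element-wise path: unpack every 8-byte big-endian int64 into a new list."""
    count = len(payload) // 8
    return list(struct.unpack(f">{count}q", payload))


def reinterpret_native(payload: bytes) -> memoryview:
    """Zero-copy path: view the same buffer as native-endian int64 values."""
    return memoryview(payload).cast("q")


if __name__ == "__main__":
    values = [1, 2, 3, -4]

    # Old style: the payload is big-endian and each element is converted.
    big_endian_payload = struct.pack(f">{len(values)}q", *values)
    print(decode_big_endian(big_endian_payload))

    # New style: the payload is already in native order, so it is only re-typed.
    native_payload = array("q", values).tobytes()
    view = reinterpret_native(native_payload)
    print(list(view), view.obj is native_payload)
```

The second function never touches individual elements; it only changes how the existing buffer is typed. The trade-off in a scheme like this is that both ends have to agree on byte order and element size.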

Here's the performance improvement for 16-node-xc:

| config | to_ndarray | ak.array  |
| ------ | ---------: | --------: |
| before |   33 MiB/s |  50 MiB/s |
| after  |  410 MiB/s | 175 MiB/s |

ak.array() remains slower than to_ndarray() because it makes more
copies on both the server and the client side. Optimizing those
copies out is future work.
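
For a rough sense of scale, the speedup factors implied by the reported numbers can be computed directly. This is only arithmetic on the figures quoted in this description, not a new measurement.

```python
# Throughput figures (MiB/s) copied from the 16-node-xc results above.
before = {"to_ndarray": 33, "ak.array": 50}
after = {"to_ndarray": 410, "ak.array": 175}

# Ratio of new to old throughput for each operation.
speedup = {name: after[name] / before[name] for name in before}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x")
```

That works out to roughly 12.4x for to_ndarray() and 3.5x for ak.array().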

Part of #794

glitch

(Caveat: as far as I understand the Chapel side of this) this looks good.

reuster986

looks great!

ronawho requested review from reuster986 and mhmerrill July 12, 2021 15:01

glitch linked an issue Jul 14, 2021 that may be closed by this pull request

Investigate performance of client-server array transfers #794

Closed

glitch approved these changes Jul 14, 2021

reuster986 approved these changes Jul 14, 2021

reuster986 merged commit 2df9393 into Bears-R-Us:master Jul 14, 2021

ronawho deleted the opt-array-transfer-server branch July 14, 2021 14:00

This was referenced Aug 5, 2021

Should we have factory functions to create arrays? chapel-lang/chapel#18163

Open

Investigate performance of client-server array transfers #794

Closed
