Performance issue on Mac with M1 pro #224

Closed
HuakunShen opened this issue Apr 29, 2024 · 3 comments

Comments

@HuakunShen
Contributor

HuakunShen commented Apr 29, 2024

I tested with the Rust and Python clients, as well as the Golang implementation at https://github.com/psanford/wormhole-william/

To keep the speed measurement consistent, the Python client is always used on the receiving side, and it is the one that measures the speed.

All tests ran on my 1000 Mbps network.

| Sender Computer | Sender Client | Receiver Computer | Receiver Client | Speed |
| --- | --- | --- | --- | --- |
| M1 pro Mac | python | Ubuntu i7 13700K | python | 112 MB/s |
| M1 pro Mac | rust | Ubuntu i7 13700K | python | 73 MB/s |
| M1 pro Mac | golang | Ubuntu i7 13700K | python | 117 MB/s |
| Ubuntu i7 13700K | python | M1 pro Mac | python | 115 MB/s |
| Ubuntu i7 13700K | rust | M1 pro Mac | python | 116 MB/s |
| Ubuntu i7 13700K | golang | M1 pro Mac | python | 117 MB/s |
| Ubuntu i7 13700K | python | Kali VM (on Mac) | python | 119 MB/s |
| Kali VM (on Mac) | python | Ubuntu i7 13700K | python | 30 MB/s |
| Ubuntu i7 11800H | rust | Ubuntu i7 13700K | python | 116 MB/s |
| Ubuntu i7 13700K | rust | Ubuntu i7 11800H | python | 116 MB/s |
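
To put these numbers in context, here is a small sketch that converts a measured speed into a fraction of the nominal line rate. It assumes "MB/s" means 10**6 bytes per second and ignores protocol overhead, so the practical ceiling sits somewhat below 100%. The function name `link_utilization` is made up for this illustration.

```python
# Assumption: "MB/s" in the table means 10**6 bytes per second.
LINK_MBPS = 1000  # nominal link speed from the post, in megabits per second


def link_utilization(speed_mb_per_s: float, link_mbps: float = LINK_MBPS) -> float:
    """Return the fraction of the nominal line rate that a transfer used."""
    line_rate_mb_per_s = link_mbps / 8  # 1000 Mbps -> 125 MB/s
    return speed_mb_per_s / line_rate_mb_per_s


# Speeds measured when sending from the M1 pro Mac (first three table rows).
for client, speed in [("python", 112), ("rust", 73), ("golang", 117)]:
    print(f"{client}: {link_utilization(speed):.0%} of nominal line rate")
```

Under these assumptions the Python and Golang senders come close to saturating the link, while the Rust sender on the Mac uses a bit less than 60% of it.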

So only sending from the M1 Mac with the Rust wormhole client shows a performance issue.

I wonder what the bottleneck could be. Given that the Python and Golang versions can use the full bandwidth of my network, there must be something wrong with the Rust implementation.

My guess is that the Ubuntu machines don't have this problem because of their much higher single-core frequency.

@meejah
Member

meejah commented Apr 29, 2024

Cool, those numbers might be suitable to add to "the ecosystem doc" (https://github.com/magic-wormhole/magic-wormhole/blob/master/docs/ecosystem.rst) or somewhere in the general protocols repo.

@felinira
Collaborator

> Wonder what could be the bottleneck.

There could be many reasons why the Mac build is slower in your specific case. It's impossible to tell unless you profile it, and the result would be interesting to know. Just to make sure: are you compiling for aarch64, not for x86_64?

@HuakunShen
Contributor Author

HuakunShen commented Apr 30, 2024

I found a likely cause. While stepping through the Python and Golang versions with a debugger, I noticed that the Rust implementation uses a different chunk size for each send. The Python and Golang implementations both send 16KB at a time, while the Rust version sends 4KB at a time.

After setting the chunk size to 16KB, I get full speed.

Python

File sending starts here: https://github.com/magic-wormhole/magic-wormhole/blob/02407c4aa4cc3f8d8cd01d549fdc72a5f5d77010/src/wormhole/cli/cmd_send.py#L442

fs = basic.FileSender()

with self._timing.add("tx file"):
    with progress:
        if filesize:
            # don't send zero-length files
            yield fs.beginFileTransfer(
                self._fd_to_send,
                record_pipe,
                transform=_count_and_hash)

The chunk size is defined in the Twisted package, in twisted.protocols.basic.FileSender.

File chunks are read here: https://github.com/twisted/twisted/blob/02a2b658cd1ade5d7f41f97d898913686313e615/src/twisted/protocols/basic.py#L892

CHUNK_SIZE is a constant defined as CHUNK_SIZE = 2**14 (16384 bytes, i.e. 16KB) at https://github.com/twisted/twisted/blob/02a2b658cd1ade5d7f41f97d898913686313e615/src/twisted/protocols/basic.py#L857
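
For readers who don't want to open the Twisted source, the chunked-read pattern can be sketched roughly as below. This is not Twisted's actual code: `send_in_chunks`, `write`, and `count_and_hash` are hypothetical names for illustration, and only the `CHUNK_SIZE` value is taken from the linked file.

```python
import hashlib
import io

CHUNK_SIZE = 2 ** 14  # 16384 bytes, the value quoted from Twisted above


def send_in_chunks(source, write, chunk_size=CHUNK_SIZE, transform=None):
    """Read `source` in fixed-size chunks and hand each one to `write`.

    Returns the number of chunks written. `write` and `transform` are
    hypothetical callbacks, not part of any real library API.
    """
    chunks = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        if transform is not None:
            chunk = transform(chunk)
        write(chunk)
        chunks += 1
    return chunks


hasher = hashlib.sha256()
sent = []


def count_and_hash(data):
    # Hypothetical stand-in for the transform passed in cmd_send.py.
    hasher.update(data)
    return data


payload = b"x" * 40000
n = send_in_chunks(io.BytesIO(payload), sent.append, transform=count_and_hash)
print(n, [len(c) for c in sent])
```

Every chunk goes through one read, one transform call, and one write, so a smaller chunk size means proportionally more trips through this loop for the same file.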

Golang

The Golang implementation also sets the chunk size to 16KB (recordSize := (1 << 14)).

See https://github.com/psanford/wormhole-william/blob/68dc3447a8585b060fb1e6836a23847700ab9207/wormhole/send.go#L363

Rust

The Rust implementation uses 4KB:

let mut plaintext = Box::new([0u8; 4096]);

I changed 4096 to 16384 and built a release build, which gives me 117MB/s when the sender is the M1 pro Mac.

I think the Rust implementation could use 16KB as well.
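
As a rough back-of-the-envelope check, the sketch below works out how many records a transfer needs at each chunk size and how much time the sender has per record if it is to sustain 117MB/s. It assumes 1 MB = 10**6 bytes and an arbitrary 1 GB file; the helper names are made up for this illustration. It does not prove that per-record overhead is the cause, it only shows that the numbers are consistent with that explanation.

```python
def records_needed(file_size: int, record_size: int) -> int:
    """Number of records needed to send `file_size` bytes (ceiling division)."""
    return -(-file_size // record_size)


def per_record_budget_us(throughput_bytes_per_s: float, record_size: int) -> float:
    """Microseconds available per record to sustain the given throughput."""
    return record_size / throughput_bytes_per_s * 1e6


ONE_GB = 10 ** 9             # assumed example file size
TARGET = 117 * 10 ** 6       # 117 MB/s, the full speed reported above

for record_size in (4096, 16384):
    count = records_needed(ONE_GB, record_size)
    budget = per_record_budget_us(TARGET, record_size)
    print(f"{record_size} bytes: {count} records, {budget:.1f} us per record")
```

With 4KB records the sender has to process four times as many records as with 16KB, with a quarter of the time available for each, so any fixed per-record cost weighs four times as heavily.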
