Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hangs on async benchmarks #11

Closed
sbarral opened this issue Oct 17, 2022 · 14 comments
Closed

Hangs on async benchmarks #11

sbarral opened this issue Oct 17, 2022 · 14 comments

Comments

@sbarral
Copy link

sbarral commented Oct 17, 2022

kanal hangs for me 99% of the time in all async benchmarks, both with the official benchmarks and with Tachyobench.

I was initially suspecting a problem with async notifications, but since it uses 100% CPU on all threads when hanging, I start to think it might actually be due to the use of spin locks, though I don't know for sure.

@fereidani
Copy link
Owner

thank you for your report, whats your hardware specs?

@sbarral
Copy link
Author

sbarral commented Oct 17, 2022

thank you for your report, whats your hardware specs?

I only checked on my laptop so far, a i5-7200U (kaby lake, 2/4 cores).

@fereidani
Copy link
Owner

fereidani commented Oct 17, 2022

which rust version? is it hang on any specific test?

@sbarral
Copy link
Author

sbarral commented Oct 17, 2022

which rust version? is it hang on any specific test?

This is with rust 1.64.
The MPMC, MPSC and SPSC all have this issue. Not sure about the sequential test, i can check tomorrow (I'm on mobile now).

@fereidani
Copy link
Owner

ok, thanks, currently i can't reproduce it on my PC and laptop.

@sbarral
Copy link
Author

sbarral commented Oct 18, 2022

ok, thanks, currently i can't reproduce it on my PC and laptop.

Well, this is weird... Do you have intel hardware?

I tried it on an EC2 instance and it had the exact same problem, so the issue is probably widespread.
This was an EC2 t2.micro instance (exact hardware was Intel Xeon E5-2676 v3) so you can reproduce it using a free AWS starter account.

Investigating further I noticed that tests that reserve enough capacity for all messages, such as:

run_async!("bounded_mpsc(empty)", mpsc::<BenchEmpty>(Some(MESSAGES)));`

actually run.

In general, increasing the capacity seems to decrease the probability of hanging. Also, the SPSC test sometimes works even with small capacity.
I managed to sometimes get the tests running with capacities of 10 or 100, but then the perf of kanal-async is not that good compared to flume-async or async-channel, which also suggests some issues with the notification primitives.

EDIT: I see that the Cargo.toml of the benchmark does not pin the version of tokio (it probably should, to improve reproducibility). My Cargo.lock says I'm using tokio-1.21.2.

@fereidani
Copy link
Owner

Yes, there is UB somewhere in code and I'm investigating it, I think there is a problem with our stack borrow rules, I should fix that first, I'm gonna keep you updated to run your tests again then.
Are using nightly or stable Rust?

@sbarral
Copy link
Author

sbarral commented Oct 18, 2022

Yes, there is UB somewhere in code and I'm investigating it, I think there is a problem with our stack borrow rules, I should fix that first, I'm gonna keep you updated to run your tests again then. Are using nightly or stable Rust?

Using stable only.

@fereidani
Copy link
Owner

Could you please rerun your tests, with the latest benchmark repo and the latest Kanal repo?

@sbarral
Copy link
Author

sbarral commented Oct 22, 2022

I could only test on my laptop for now, but it seems to work fine with commit #774a05f :-)

Likewise on tachyobench, so I plan to merge support for kanal once you make a release.

@fereidani
Copy link
Owner

Thank you!

@sbarral
Copy link
Author

sbarral commented Oct 23, 2022

Sorry I write here, I could not find a way to PM you so I hope you won't mind I ask here.

I would like to update my own benchmarks for an MPSC channel I wrote, but there is no official release of Kanal that I can benchmark since the pre2 release has this issue with async. I don't want to misrepresent the potential of Kanal, so do you feel that the master branch of Kanal is suitable or would you prefer that I don't benchmark Kanal just yet?

@fereidani
Copy link
Owner

Thank you for your professionalism, Kanal async is not ready yet, I need to redesign some parts of it, if you like to benchmark the sync version, the master branch is ok. but if you like to run benchmarks for async, I think it will be ready in pre3.

@sbarral
Copy link
Author

sbarral commented Oct 23, 2022

No problem, I understand!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants