Hangs on async benchmarks #11

sbarral · 2022-10-17T16:53:38Z

kanal hangs for me 99% of the time in all async benchmarks, both with the official benchmarks and with Tachyobench.

I initially suspected a problem with async notifications, but since it uses 100% CPU on all threads when hanging, I am starting to think it might actually be due to the use of spin locks, though I don't know for sure.
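
To illustrate why a spin lock would show up as 100% CPU rather than as idle threads, here is a minimal sketch of a generic spin lock. It is not kanal's implementation: `SpinLock`, `contended_count` and the spin limit of 64 are hypothetical names and values chosen for the example. A waiter keeps its core busy until the holder releases the lock, so if the holder is descheduled (for instance on a 2-core machine running more threads than cores), every waiter burns CPU without making progress. The `yield_now` fallback is one common way to soften that.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical minimal spin lock, for illustration only (not kanal's lock).
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safety: all access to `value` is serialized through the `locked` flag.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub fn new(value: T) -> Self {
        Self {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Runs `f` with exclusive access to the protected value.
    /// Note: if `f` panics, the lock stays held (acceptable for a sketch).
    pub fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        let mut spins = 0u32;
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            if spins < 64 {
                // Busy-wait: this is the part that pins a core at 100%.
                std::hint::spin_loop();
                spins += 1;
            } else {
                // Assumed mitigation: let the OS run the lock holder instead.
                thread::yield_now();
            }
        }
        // Safety: the successful compare-exchange gives us exclusive access.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Increments a shared counter from `threads` threads, `per_thread` times each,
/// and returns the final value.
pub fn contended_count(threads: usize, per_thread: usize) -> usize {
    let lock = Arc::new(SpinLock::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    lock.with(|n| *n += 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    lock.with(|n| *n)
}
```

Calling `contended_count(4, 10_000)` on a machine with fewer than four cores is a simple way to observe the busy-waiting in a process monitor.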

fereidani · 2022-10-17T20:28:37Z

Thank you for your report. What are your hardware specs?

sbarral · 2022-10-17T20:33:37Z

> Thank you for your report. What are your hardware specs?

I only checked on my laptop so far, an i5-7200U (Kaby Lake, 2 cores/4 threads).

fereidani · 2022-10-17T20:46:52Z

Which Rust version? Does it hang on any specific test?

sbarral · 2022-10-17T20:51:28Z

> Which Rust version? Does it hang on any specific test?

This is with Rust 1.64.
The MPMC, MPSC and SPSC tests all have this issue. Not sure about the sequential test, I can check tomorrow (I'm on mobile now).

fereidani · 2022-10-17T21:05:17Z

OK, thanks. Currently I can't reproduce it on my PC or laptop.

sbarral · 2022-10-18T05:40:06Z

> OK, thanks. Currently I can't reproduce it on my PC or laptop.

Well, this is weird... Do you have Intel hardware?

I tried it on an EC2 instance and it had the exact same problem, so the issue is probably widespread.
This was an EC2 t2.micro instance (exact hardware was Intel Xeon E5-2676 v3) so you can reproduce it using a free AWS starter account.

Investigating further, I noticed that tests that reserve enough capacity for all messages, such as:

run_async!("bounded_mpsc(empty)", mpsc::<BenchEmpty>(Some(MESSAGES)));

actually run.

In general, increasing the capacity seems to decrease the probability of hanging. Also, the SPSC test sometimes works even with small capacity.
I managed to sometimes get the tests running with capacities of 10 or 100, but then the perf of kanal-async is not that good compared to flume-async or async-channel, which also suggests some issues with the notification primitives.
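
One way to see why capacity matters: when the buffer can hold every message, no send ever finds the channel full, so the code path where a sender has to wait and later be woken is never exercised. The sketch below counts how often that path would be hit. It uses the standard library's `sync_channel` as a stand-in rather than kanal, and `sends_that_would_block` is a hypothetical helper, so it only illustrates the capacity arithmetic, not kanal's behavior.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Counts how many of `messages` sends find the buffer full on a bounded
/// channel of the given `capacity` when no consumer is draining it.
/// Each such send is one that would have to wait and be woken up later.
pub fn sends_that_would_block(capacity: usize, messages: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected for the whole loop.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut blocked = 0;
    for i in 0..messages {
        match tx.try_send(i) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) => blocked += 1,
            Err(TrySendError::Disconnected(_)) => unreachable!("receiver is kept alive"),
        }
    }
    blocked
}
```

With `capacity` equal to `messages` the function returns 0, which matches the observation that tests created with `Some(MESSAGES)` run to completion. With a small capacity almost every send takes the waiting path, which is where a faulty notification or locking primitive would have the most opportunities to hang.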

EDIT: I see that the Cargo.toml of the benchmark does not pin the version of tokio (it probably should, to improve reproducibility). My Cargo.lock says I'm using tokio-1.21.2.
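
As a sketch of what pinning could look like, assuming the benchmark's Cargo.toml declares tokio with a plain version string (if it uses the table form with feature flags, only the `version` value would change):

```toml
[dependencies]
# "=" requests exactly this version instead of any semver-compatible 1.x release.
tokio = "=1.21.2"
```

Committing the Cargo.lock file alongside the manifest would have a similar effect for anyone building from the repository.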

fereidani · 2022-10-18T06:48:53Z

Yes, there is UB somewhere in the code and I'm investigating it. I think there is a problem with our stack borrow rules, which I should fix first. I'll keep you updated so you can run your tests again afterwards.
Are you using nightly or stable Rust?

sbarral · 2022-10-18T06:50:04Z

> Yes, there is UB somewhere in the code and I'm investigating it. I think there is a problem with our stack borrow rules, which I should fix first. I'll keep you updated so you can run your tests again afterwards. Are you using nightly or stable Rust?

Using stable only.

fereidani · 2022-10-22T20:11:35Z

Could you please rerun your tests, with the latest benchmark repo and the latest Kanal repo?

sbarral · 2022-10-22T20:50:07Z

I could only test on my laptop for now, but it seems to work fine with commit #774a05f :-)

Likewise on tachyobench, so I plan to merge support for kanal once you make a release.

fereidani · 2022-10-23T04:18:39Z

Thank you!

sbarral · 2022-10-23T09:17:04Z

Sorry for writing here. I could not find a way to PM you, so I hope you won't mind my asking in this thread.

I would like to update my own benchmarks for an MPSC channel I wrote, but there is no official release of Kanal that I can benchmark since the pre2 release has this issue with async. I don't want to misrepresent the potential of Kanal, so do you feel that the master branch of Kanal is suitable or would you prefer that I don't benchmark Kanal just yet?

fereidani · 2022-10-23T09:42:14Z

Thank you for your professionalism. Kanal async is not ready yet; I need to redesign some parts of it. If you would like to benchmark the sync version, the master branch is OK, but if you would like to run async benchmarks, I think it will be ready in pre3.

sbarral · 2022-10-23T10:00:14Z

No problem, I understand!

zstewar1 mentioned this issue Oct 19, 2022

Bounded Async Channel is 100x worse than kanal sync channel when system is loaded #12

Closed

fereidani closed this as completed Oct 23, 2022
