Use availableProcessors() worker threads #37

Closed

wjoel commented Dec 30, 2021

Basic RPS benchmark results from Ryzen 5900X:

This branch:
Fs2Netty: 305456 request/sec, 305456 response/sec

main:
Fs2Netty: 149224 request/sec, 149224 response/sec
Fs2IO: 160947 request/sec, 160947 response/sec
RawNetty: 302905 request/sec, 302905 response/sec

Closes #4
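
For readers unfamiliar with the idea in the title, here is a minimal sketch of sizing a worker pool from `Runtime.getRuntime().availableProcessors()`. It uses only the JDK's `ExecutorService` as a stand-in for Netty's worker event loop group, so it is not the actual change in this PR. The class `WorkerPool` and its methods are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of fs2-netty: sizes a worker pool by core count.
final class WorkerPool {
    // One worker per logical processor reported by the JVM (never fewer than 1).
    static int workerCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    // Fixed-size pool; a JDK stand-in for a Netty worker event loop group.
    static ExecutorService newWorkers() {
        return Executors.newFixedThreadPool(workerCount());
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService workers = newWorkers();
        AtomicInteger handled = new AtomicInteger();
        int tasks = 1000;
        // Spread trivial tasks across all workers.
        for (int i = 0; i < tasks; i++) {
            workers.execute(handled::incrementAndGet);
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(workerCount() + " workers handled " + handled.get() + " tasks");
    }
}
```

With a single worker thread, all connections share one core regardless of how many the machine has, which would be consistent with the roughly 2x gap between the two Fs2Netty numbers above.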

wjoel commented Jan 3, 2022

With a similar change applied to RawNetty, RawNetty is about 10x faster than Fs2IO, and still about 5x faster than Fs2Netty with this PR.

Using availableProcessors for workers in RawNetty, epoll if available:
RawNetty: 1520858 request/sec, 1520858 response/sec

(epoll didn't make much of a difference for this test, NIO gets about the same results)
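
As a quick check of the "10x" and "5x" figures, the ratios can be recomputed from the numbers reported in this thread. This is only a sketch of the arithmetic; the class name `Speedup` is hypothetical, and the constants are copied from the benchmark lines above.

```java
// Recomputes the speedup ratios from the throughput figures quoted in this thread.
final class Speedup {
    // Ratio of two throughput figures (requests/sec).
    static double ratio(long faster, long slower) {
        return (double) faster / slower;
    }

    public static void main(String[] args) {
        long rawNettyTuned = 1_520_858L; // RawNetty, availableProcessors workers
        long fs2NettyPr = 305_456L;      // Fs2Netty, this branch
        long fs2Io = 160_947L;           // Fs2IO, main

        System.out.printf("RawNetty vs Fs2IO: %.1fx%n", ratio(rawNettyTuned, fs2Io));
        System.out.printf("RawNetty vs Fs2Netty (this PR): %.1fx%n", ratio(rawNettyTuned, fs2NettyPr));
    }
}
```

The ratios come out to roughly 9.4x and 5.0x, in line with the rounded figures in the comment.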

wjoel commented Jan 3, 2022

I did a bunch of additional benchmarks with more recent versions of fs2 and cats-effect. Seems like there's a significant performance hit related to tracing in cats-effect, and possibly a smaller one in fs2. There's a decent amount of variance, so the first might not really be much slower than the second, but anyways... for future reference:

3.0.6 (tracing disabled, CE 3.3.3) Speed: 320236 request/sec, 320236 response/sec
3.2.4 (tracing disabled, CE 3.3.3) Speed: 264885 request/sec, 264885 response/sec
3.2-10-421c242 (main, tracing enabled, CE 3.3.3) Speed: 196105 request/sec, 196105 response/sec
3.2-10-421c242 (main, tracing disabled, CE 3.3.3) Speed: 285688 request/sec, 285688 response/sec
3.2-10-421c242 (main) Speed: 163657 request/sec, 163657 response/sec
3.2.4 Speed: 179674 request/sec, 179674 response/sec
3.2.3 Speed: 168397 request/sec, 168397 response/sec
3.2.2 Speed: 240868 request/sec, 240868 response/sec
3.2.1 Speed: 209951 request/sec, 209951 response/sec
3.1.6 Speed: 215442 request/sec, 215442 response/sec
3.1.0 Speed: 234656 request/sec, 234656 response/sec
3.0.6 Speed: 273681 request/sec, 273681 response/sec

wjoel commented Dec 18, 2023

@djspiewak Not sure if you missed this, or if it's just not interesting? I'll close the PR in January, as I'm getting tired of seeing it open in my PR list. ;)

Especially with the io_uring and epoll work underway in CE3, maybe fs2-netty isn't all that relevant anymore.

wjoel closed this Jan 2, 2024

Linked issue: Figure out why on earth RPS is so bad