-
-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvement on nonblocking sockets #838
Conversation
@misalcedo Thanks, I do hope to review this soon, but have been a bit behind with holiday travel and catching up on other things. So thanks for your patience also in the interim. |
No worries, take as much time as you need. |
@misalcedo, this is absolutely fantastic - thank you! I'll also try to take a read through and see what I can do to test this internally at Stripe, but I'm not sure I'll have time until closer to the holidays. Embarrassingly, we still have not bumped versions after having internal breakages from #796 (🤞 resolved by v0.95.0). I recently made some observations that the current code, while much lower for allocs, did negatively impact total allocated memory. The header string would be re-allocated as each line was processed, getting smaller and smaller. (My naive understanding is that the former is much worse than the latter) From a quick skim of your changes, I believe you may have also addressed that but I'll do a more thorough read later. |
The more eyes on it the better. I haven't tried to do any memory profiling, but that should be fairly simple since I setup the benchmark with Ruby-prof. As long as you are not using Ruby 3 that is. Ruby-prof segfaults for me whenever I try to do memory profiling on Ruby 3.2.2. I think this is because the garbage collector is different, but not entirely sure. I opened ruby-prof/ruby-prof#326 with ruby-prof |
Yep, I've faced similar segfault issues. When I last measured this, I used https://github.com/Shopify/memory_profiler. You can wrap a block with the profiler and get pretty granular information on (1) alloc count, (2) source of allocs, (3) size and specific strings allocated. This is how we've spotted both the single-char alloc issues and the large line-by-line allocs. When I get to testing this internally, I'll also spin up the memory profiler and check. It's been almost a year since I last walked through profiling Excon but I have a bunch of notes + hacked up PRs from the last time that I'll look back on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delays, I wanted to make sure I could find a time when I was fresh and focused to give this a thorough look. Overall it seems great, thanks again for your work and help. I had a couple minor suggestions about naming/cleanup that I'd like to discuss, but other than that I think it's very nearly there. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think the readable_bytes
change does help a lot in terms of quickly and easily understanding.
@misalcedo thanks again for the work you did here and your collaboration in polishing it. @lgo I'll be curious to hear what you find, and would welcome any notes/code you have that might help us profile in the future (it's not something I have much experience with either). Also, just let me know if there is anything you need so that you all can get bumped to a newer version. |
Absolutely, I'd be more than happy to share notes on the whole process and see what I can do to provide a simple/reproducible suite (maybe just tweaking @misalcedo's). @misalcedo - thanks again for these awesome changes as well as the benchmarks to keep this a standard! |
Contributing was actually pretty fun for me, so glad I could help. I played around with the memory profiler but struggled to process the vast amount of information it provided. Now I have something new to learn this weekend. |
Closes #836
The goal of this change is to improve the performance of excon while maintaining a low number of allocations.
This change is similar in nature to what
Net::HTTP
does in https://github.com/ruby/ruby/blob/v3_2_2/lib/net/protocol.rb#L166 .The following is a zipped archive of 3 ruby-prof profiles measuring allocations. From the profiles, it is evident the number of allocations in OpenSSL calls and Socket calls remains very similar (still significantly less than v0.71.0).
Excon.zip
Benchmark
Run on a 32-core Codespace with nothing else running. When run on a 4-core Intel MacBook Pro, Excon is fast enough to not be a bottleneck (the local docker container was my bottleneck).