Avoid growbuffer, take two #197
Add benchmarks for 1000-byte packets in addition to the existing benchmarks for 100-byte packets.
Codecov Report
@@            Coverage Diff             @@
##           master     #197     +/-    ##
==========================================
+ Coverage   74.97%   75.16%   +0.18%
==========================================
  Files          17       17
  Lines        1215     1224      +9
==========================================
+ Hits          911      920      +9
  Misses        208      208
  Partials       96       96
437cb98 to b35932d
Use a sync.Pool for buffers used by (*SessionRTP).Write and friends. The pool is global, so its cost can be amortised across multiple sessions.
@adriancable On an in-order ARMv7 with no hardware crypto (AM335x, Cortex-A8 at 1 GHz), the results are less dramatic (due to the time spent in crypto) but still significant.
I've been running these changes for a few days on ARMv7 and x86 with no issues. The benchmark improvements are impressive. Even a few percent shouldn't be left on the table.
Take two of #77. Because overhead has since been removed elsewhere, the results are much more dramatic, especially for GCM.
When I first submitted that, @Sean-Der noticed that this relies on nextConn.write not retaining a reference to the buffer. This remains true of this version. However, relying on write methods not retaining the buffer is standard practice in the Go world, as mentioned here: https://pkg.go.dev/io#Writer. I remain naturally open to other approaches, but I think that the performance improvement is too large to leave on the table, especially on embedded systems with slow memory interconnects (where the results are even more dramatic, as pointed out by @adriancable).
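To illustrate the contract being relied on: the `io.Writer` documentation states that implementations must not retain `p`. A writer that needs the data after `Write` returns therefore has to copy it. The sketch below uses a hypothetical `queueWriter` type (not part of this library) to show that, with such a copy in place, the caller can safely reuse its buffer.

```go
package main

import (
	"fmt"
	"io"
)

// queueWriter is a hypothetical writer that defers transmission by
// queueing packets. It exists only to illustrate the io.Writer contract.
type queueWriter struct {
	queue [][]byte
}

// Compile-time check that queueWriter satisfies io.Writer.
var _ io.Writer = (*queueWriter)(nil)

// Write copies p before queueing it, because io.Writer implementations
// must not retain p after returning.
func (q *queueWriter) Write(p []byte) (int, error) {
	c := make([]byte, len(p))
	copy(c, p)
	q.queue = append(q.queue, c)
	return len(p), nil
}

func main() {
	q := &queueWriter{}
	buf := []byte("first")
	if _, err := q.Write(buf); err != nil {
		panic(err)
	}
	copy(buf, "XXXXX") // simulate the caller reusing a pooled buffer
	fmt.Println(string(q.queue[0])) // prints "first"
}
```

A writer that appended `p` to its queue directly would see its queued packet overwritten by the caller's reuse, which is the failure mode a pooled buffer would expose.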
This depends on #196.