
disable buffer pooling in DotNetty transport #4252

Merged

Conversation

Aaronontheweb (Member)

close #3879
close #3273
close #4244

For reasons that are fundamentally structural, it's not safe for Akka.Remote to use any form of DotNetty buffer pooling: serialization and deserialization are handled outside of the ChannelPipeline itself, so the bytes returned from the channel aren't safe to release until after they've been successfully decoded by Akka.Remote's endpoint actors.
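To illustrate the hazard described above, here is a self-contained toy sketch (not DotNetty's actual API) of a reference-counted buffer pool. If the pipeline releases a pooled buffer as soon as the frame is handed off, the pool can recycle and overwrite it before the out-of-pipeline consumer (the endpoint actor) decodes it:

```java
import java.util.ArrayDeque;

// Toy reference-counted pool; names are illustrative, not DotNetty's.
class PooledBuffer {
    byte[] data = new byte[4];
    int refCnt = 1;
}

class Pool {
    private final ArrayDeque<PooledBuffer> free = new ArrayDeque<>();

    PooledBuffer acquire() {
        PooledBuffer b = free.isEmpty() ? new PooledBuffer() : free.pop();
        b.refCnt = 1;
        return b;
    }

    void release(PooledBuffer b) {
        if (--b.refCnt == 0) free.push(b); // now eligible for reuse
    }
}

public class PoolingHazard {
    public static void main(String[] args) {
        Pool pool = new Pool();

        // "Pipeline": reads a frame into a pooled buffer...
        PooledBuffer frame = pool.acquire();
        frame.data[0] = 42;

        // ...hands it to an out-of-pipeline consumer and releases it
        // immediately, as pooling would require at pipeline exit.
        pool.release(frame);

        // Another read acquires from the pool before the consumer runs:
        // it gets the very same buffer back and overwrites the frame.
        PooledBuffer other = pool.acquire();
        other.data[0] = 99;

        // The endpoint actor finally decodes the frame: data is corrupted.
        System.out.println("decoded byte: " + frame.data[0]); // prints 99, not 42
    }
}
```

This is why the bytes must be copied (or the buffer held unreleased) until decoding completes, which defeats the point of pooling given Akka.Remote's current structure.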

In order to take advantage of buffer pooling, Akka.Remote will need to be redesigned with a more integrated serialization pipeline in mind, something we've discussed in #2378.

I ran some local benchmarks on my development machine; no major changes in observed performance.

Before

ProcessorCount:                    8
ClockSpeed:                        0 MHZ
Messages sent/received per client: 200000  (2e5)

Num clients, Total [msg], Msgs/sec, Total [ms]
         1,  200000,     35144,    5691.34
         5, 1000000,     88614,   11285.31
        10, 2000000,     93485,   21394.37
        15, 3000000,     88789,   33788.05
        20, 4000000,     88633,   45130.35
        25, 5000000,     88562,   56458.64

After

ProcessorCount:                    8            
ClockSpeed:                        0 MHZ        
Actor Count:                       16           
Messages sent/received per client: 200000  (2e5)
Is Server GC:                      True         
                                                
Num clients, Total [msg], Msgs/sec, Total [ms]  
         1,  200000,      7882,   25375.09      
         5, 1000000,     89159,   11216.35      
        10, 2000000,     89957,   22233.77      
        15, 3000000,     89644,   33466.30      
        20, 4000000,     88131,   45387.95      
        25, 5000000,     87505,   57140.24      

@Aaronontheweb Aaronontheweb merged commit 94c15d4 into akkadotnet:dev Feb 26, 2020
@Aaronontheweb Aaronontheweb deleted the fix-3879-DotNetty-buffer-pooling branch February 26, 2020 00:11
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Feb 26, 2020
Deleted since akkadotnet#4252 eliminated need for this
@dnickless

I actually do see a major difference in the numbers... The first line for a single client appears to be off by some serious factor?

@Aaronontheweb (Member, Author)

@dnickless that's a byproduct of the batching system we added, which improved overall performance significantly: #4106

The first round of the benchmark puts very little pressure on either side of the wire, since it's a single actor emitting 20 messages at a time, beneath the batching system's default threshold of 30 messages per batch. In that scenario we're depending on the recurring 40ms timer to flush writes, which is why the first round of the benchmark is a big outlier: the operating system can't guarantee exactly 40ms each time. I added an article on how to performance-tune this new batching system to the Akka.NET documentation here: https://getakka.net/articles/remoting/performance.html
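For reference, those batch thresholds are exposed via HOCON. A sketch of the relevant settings, assuming the v1.4 setting names described in the linked performance article (verify the exact keys against your Akka.NET version's reference config):

```hocon
akka.remote.dot-netty.tcp.batching {
  enabled = true            # set to false to flush every write immediately
  max-pending-writes = 30   # flush after this many queued messages...
  max-pending-bytes = 16k   # ...or after this many bytes are queued...
  flush-interval = 40ms     # ...or when the recurring flush timer fires
}
```

Lowering `max-pending-writes` or `flush-interval` trades throughput for latency in low-traffic scenarios like the single-client round above.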

Just to give you some idea on the numbers:

Before: https://getakka.net/articles/remoting/performance.html#no-io-batching

Average performance: 82,539 msg/s.

Standard deviation: 46,827 msg/s.

After: https://getakka.net/articles/remoting/performance.html#with-io-batching

Average performance: 141,091 msg/s.

Standard deviation: 15,291 msg/s.

Those figures came from a different piece of hardware than the one I used for yesterday's benchmark (we haven't replaced our standard benchmarking setup since moving to Azure DevOps from our home-grown CI). The figures on the website came from a 2019-generation development laptop; the figures on this PR came from a 2012 laptop running much older hardware.

Just now, I re-ran the benchmark with these changes on a third machine (home office machine - AMD Ryzen setup built in 2017) which has overall better hardware.

λ  dotnet run -c Release --framework netcoreapp3.1
ProcessorCount:                    16
ClockSpeed:                        0 MHZ
Actor Count:                       32
Messages sent/received per client: 200000  (2e5)
Is Server GC:                      True

Num clients, Total [msg], Msgs/sec, Total [ms]
         1,  200000,     58755,    3404.19
         5, 1000000,    183790,    5441.41
        10, 2000000,    178779,   11187.50
        15, 3000000,    180452,   16625.69
        20, 4000000,    178277,   22437.16
        25, 5000000,    180291,   27733.56

You'll see the big drop in the first round of performance there too; it's the same root issue: falling beneath the batching threshold. All of those batch thresholds (time, message count, total bytes) can be customized in Akka.NET v1.4.0, as described in the article I linked earlier.

Aaronontheweb added a commit that referenced this pull request Feb 26, 2020
* disabled Ask_does_not_deadlock spec

* Delete Bug3370DotNettyLinuxBufferPoolSpec.cs

Deleted since #4252 eliminated need for this

* relaxing timeout on FSMTimingSpec
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Mar 9, 2020