Socket write queue fixes and improvements #4202

Merged

merged 7 commits into nanocurrency:develop on Apr 3, 2023

Conversation

@pwojcikdev (Contributor) commented Apr 2, 2023

The socket asynchronous write loop, first introduced in #1938, was broken by #2918. That PR attempted to streamline the logic, seemingly under the assumption that a strand alone is enough to guarantee the correct ordering of async writes, but this assumption is not correct. A strand only ensures that handlers are executed sequentially relative to one another, so an unfortunate call sequence of async_write_1 > async_write_2 > async_write_1_complete > async_write_2_complete ... is still possible. Here is a brief discussion from the Boost ASIO mailing list that mentions this problem: https://boost-users.boost.narkive.com/MrmFQST2/multiple-async-operations-on-a-socket

To correctly implement a multi-writer async write loop, both a strand and a write queue of some sort are required (a minimal sketch of this pattern is shown below). Failing to do so can result in data from multiple write requests being interleaved, which is exactly the issue observed during high-load local tests. This could also explain why a drop in connected nodes is observed each time live network activity spikes.
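
For illustration only, here is a minimal sketch of the strand-plus-write-queue pattern, not the actual nano node implementation; the class and member names (`queued_socket`, `entry`, `write_next`) are hypothetical:

```cpp
// Sketch: serialize async writes by queueing buffers and keeping at most one
// asio::async_write in flight at a time. All queue access happens on a strand.
#include <boost/asio.hpp>
#include <cstdint>
#include <deque>
#include <functional>
#include <memory>
#include <vector>

namespace asio = boost::asio;

class queued_socket : public std::enable_shared_from_this<queued_socket>
{
public:
	explicit queued_socket (asio::ip::tcp::socket socket) :
		socket_ (std::move (socket)),
		strand_ (socket_.get_executor ())
	{
	}

	// Enqueue a buffer; the callback fires once this buffer is fully written.
	void async_write (std::vector<uint8_t> buffer, std::function<void (boost::system::error_code, std::size_t)> callback)
	{
		asio::post (strand_, [this_l = shared_from_this (), buffer = std::move (buffer), callback = std::move (callback)] () mutable {
			this_l->queue_.push_back ({ std::move (buffer), std::move (callback) });
			// Only kick off the write loop if it is not already running,
			// so a second asio::async_write is never started concurrently.
			if (this_l->queue_.size () == 1)
			{
				this_l->write_next ();
			}
		});
	}

private:
	struct entry
	{
		std::vector<uint8_t> buffer;
		std::function<void (boost::system::error_code, std::size_t)> callback;
	};

	// Must be called from within the strand
	void write_next ()
	{
		auto & front = queue_.front ();
		asio::async_write (socket_, asio::buffer (front.buffer),
		asio::bind_executor (strand_, [this_l = shared_from_this ()] (boost::system::error_code ec, std::size_t size) {
			auto completed = std::move (this_l->queue_.front ());
			this_l->queue_.pop_front ();
			if (completed.callback)
			{
				completed.callback (ec, size);
			}
			// Drain the next entry only after the previous write fully
			// completed, so data can never interleave. (Error handling for
			// the remaining entries is omitted for brevity.)
			if (!ec && !this_l->queue_.empty ())
			{
				this_l->write_next ();
			}
		}));
	}

	asio::ip::tcp::socket socket_;
	asio::strand<asio::ip::tcp::socket::executor_type> strand_;
	std::deque<entry> queue_;
};
```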

A related improvement is that the newly introduced socket send queue is further split into multiple subqueues based on traffic type (see the sketch below). This allows for better prioritization of inter-node communication when network throughput is limited, e.g. ensuring that bootstrap traffic won't preempt voting when the network is experiencing a period of heightened activity. This should eliminate the need for aggressive throttling of the ascending bootstrapper rate, as both client and server should now automatically throttle their rate in response to network back pressure (the socket send queue filling up). This could be extended further to prioritize live voting, voting requests, voting responses and block publishing differently, which should give the network additional resiliency, but that is a TODO.
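
As a rough illustration of the per-traffic-type subqueues, here is a simplified sketch; the traffic type names, priority order and size limit are assumptions, not the node's actual configuration:

```cpp
// Sketch: a send queue split into per-traffic-type subqueues. Insertion
// signals back pressure when a subqueue is full; popping drains higher
// priority traffic first so bootstrap cannot starve live traffic.
#include <array>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

enum class traffic_type
{
	generic, // live traffic, drained first
	bootstrap, // lowest priority, throttled first under back pressure
};

class write_queue
{
public:
	explicit write_queue (std::size_t max_size) :
		max_size_ (max_size)
	{
	}

	// Returns false when the subqueue for this traffic type is full,
	// signalling back pressure to the caller (e.g. the bootstrap client).
	bool insert (std::vector<uint8_t> buffer, traffic_type type)
	{
		auto & queue = queues_[static_cast<std::size_t> (type)];
		if (queue.size () >= max_size_)
		{
			return false;
		}
		queue.push_back (std::move (buffer));
		return true;
	}

	// Pop from the highest priority non-empty subqueue.
	std::optional<std::vector<uint8_t>> pop ()
	{
		for (auto & queue : queues_)
		{
			if (!queue.empty ())
			{
				auto result = std::move (queue.front ());
				queue.pop_front ();
				return result;
			}
		}
		return std::nullopt;
	}

private:
	std::size_t max_size_;
	// Index order doubles as priority order: generic first, bootstrap last.
	std::array<std::deque<std::vector<uint8_t>>, 2> queues_;
};
```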

@clemahieu clemahieu merged commit 62bdaba into nanocurrency:develop Apr 3, 2023