bad executor exception using a timeout in the client #1599
We are using boost beast in a high performance manner, i.e. we need to be able to communicate >1000 transactions per second per connection. For our application it is crucial to use a timeout on the stream which is provided by your library.
We hit a critical exception when the client has a timeout policy enabled (so that the Beast library sends ping/pong messages) and we stress the async_write function (see the example client below) by sending messages continuously. The bug results in an unhandled exception:
It seems that the op_idle_ping implementation in the Beast library cannot find the strand on which to execute the ping operation, and an exception is raised (see websocket/impl/ping.hpp:179, though I am not sure). We believe the library should work correctly when async_write is stressed properly, as in our example.
Version of Beast
#define BOOST_BEAST_VERSION 248
Steps necessary to reproduce the problem
We adapted the async WebSocket client C++ example to reproduce the bug. Basically, we enabled the timeout option with pings and we keep sending the message on the stream in a proper way.
All relevant compiler information
We are using Ubuntu 19.04 and the GNU compiler.
The server can be anything; a quick server in Node.js works, for example (using the certificate and key from your example directory):
Thank you for the detailed report. Moved-from executors are supposed to still be usable (according to boostorg/asio@e830f97) but it seems that a moved-from polymorphic executor is no longer usable:
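For readers without the original snippet, here is a minimal standalone sketch of the failure mode described above. It is not asio's implementation: `poly_executor`, `bad_executor_error`, and `moved_from_post_throws` are hypothetical names, and the sketch assumes only that a type-erased executor keeps its target behind a pointer that the move constructor leaves null.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical error type, standing in for the "bad executor" exception.
struct bad_executor_error : std::runtime_error {
    bad_executor_error() : std::runtime_error("bad executor") {}
};

// Hypothetical type-erased executor; an illustration only, not asio's code.
class poly_executor {
    struct impl_base {
        virtual ~impl_base() = default;
        virtual void post(std::function<void()> f) = 0;
    };
    // Simplest possible target: run the function immediately.
    struct inline_impl : impl_base {
        void post(std::function<void()> f) override { f(); }
    };
    std::unique_ptr<impl_base> impl_;

public:
    poly_executor() : impl_(std::make_unique<inline_impl>()) {}
    // Defaulted move: the source's impl_ becomes null.
    poly_executor(poly_executor&&) noexcept = default;
    poly_executor& operator=(poly_executor&&) noexcept = default;

    void post(std::function<void()> f) {
        if (!impl_) throw bad_executor_error();  // moved-from: nothing to run on
        impl_->post(std::move(f));
    }
};

// Returns true if posting through a moved-from executor throws.
bool moved_from_post_throws() {
    poly_executor a;
    poly_executor b(std::move(a));  // a is now moved-from
    bool ran = false;
    b.post([&] { ran = true; });    // the move target still works
    assert(ran);
    try {
        a.post([] {});              // the moved-from source does not
    } catch (const bad_executor_error&) {
        return true;
    }
    return false;
}
```

Any operation that keeps using an executor after moving from it would hit the throwing path in a design like this one, which matches the symptom reported in this issue.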
If possible, please try doing both of these things:
Then run the program to determine if the issue is still present (I suspect it won't be).
I believe that either of these changes will resolve your issue. The information provided by attempting these changes will help support our case for a bug in asio.
Thanks for your quick reply!
I have commented out the move-constructor of net::executor inside my boost files, it seems to be working as I haven't received any error when running for 5 minutes. Did you already submit a bug report to asio?
Your second suggestion has my preference, since this is a fix in my own code instead of inside the boost asio header files. I do not always have write access to these files. Unfortunately when I change the stream template to use the strand executor as you suggest, it leads to a segfault (there is no segfault when the ping feature is disabled):
The stack trace shows that again the segfault is produced by the same ping code (ping.hpp:179) that tries to queue something on the strand.
Yes, boost version.hpp says:
No special flags:
or for debugging:
We are using:
./client localhost 8080 "test"
@djarek Indeed, for some unknown reason it does not throw the exception when using the websocket-async-server example out of the box. We do, however, randomly see this exception with our own Beast server implementation. In that implementation we have set limits on the stream, like the simple rate limiter, and we have set a timeout policy. I will try to isolate which parameters in our server code trigger this behaviour.
If you want to see the bug happen directly, try using the Node.js example that I posted. It uses a widely used WebSocket implementation that obeys the spec and passes all Autobahn tests. I have just run it on another machine and it immediately throws the error on the client.
Like I said, we do also see it happening with the beast server implementation but we have used many flags on the stream. I will investigate and report back.
I found out why the async-server example does not reproduce the exception on the client side: the server example echoes all messages directly back to the client, and therefore the idle_timeout timer on the client never triggers. The error can be easily reproduced by putting the server example in read-only mode.
The full async server example code that reproduces the bug on the client is:
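The reasoning above can be illustrated with a small tick-based model. This is a sketch under simplifying assumptions, not Beast's timer logic: the hypothetical `count_idle_pings` assumes that every incoming message resets the idle counter and that a ping is sent once `idle_timeout` ticks pass without one.

```cpp
#include <vector>

// Hypothetical model of an idle timer (not Beast's implementation).
// received_at_tick[i] is true if the client received a message during tick i.
// Returns how many keep-alive pings the client would send.
int count_idle_pings(const std::vector<bool>& received_at_tick, int idle_timeout) {
    int pings = 0;
    int idle = 0;  // consecutive ticks without incoming data
    for (bool received : received_at_tick) {
        if (received) {
            idle = 0;  // incoming data resets the idle timer
            continue;
        }
        if (++idle >= idle_timeout) {
            ++pings;   // idle for too long: a ping would be sent here
            idle = 0;
        }
    }
    return pings;
}
```

In this model an echo server, which answers on every tick, never lets the counter reach the timeout, so the ping path is never exercised. A read-only server leaves the client idle and the ping path runs repeatedly.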
The issue is here: https://github.com/boostorg/beast/blob/develop/include/boost/beast/websocket/impl/ping.hpp#L179
I believe that version 248-hf1 fixes it, please try it if possible. We added a unit test which reproduces the problem exactly. Thank you for your very detailed report which made this easy to track down and fix!