
Improve websocket performance #37

Merged — 3 commits merged into ign-launch1 on Jun 23, 2020
Conversation

@nkoenig (Contributor) commented Jun 23, 2020

The ::Run loop was not throttled. This would peg a CPU without accomplishing meaningful work.

This PR adds some logic to throttle the ::Run loop using a default wait time and notifications based on whether there is data to transmit.

Signed-off-by: Nate Koenig nate@openrobotics.org
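The throttling described above can be sketched as a self-contained condition-variable wait. This is an illustrative stand-in, not the actual WebsocketServer code: the `Throttle` struct and the `WaitOnce`/`PushMessage` names are hypothetical, though the member names mirror the snippet reviewed later in this thread.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

using namespace std::chrono_literals;

// Hypothetical stand-in for the throttled ::Run loop: each pass waits
// up to ~1/60 s, but wakes immediately when a message is queued or
// shutdown is requested.
struct Throttle {
  std::mutex runMutex;
  std::condition_variable runConditionVariable;
  bool run{true};
  int messageCount{0};

  // One pass of the wait; returns true if woken by the predicate,
  // false if the ~16 ms throttle interval simply elapsed.
  bool WaitOnce() {
    std::unique_lock<std::mutex> lock(this->runMutex);
    return this->runConditionVariable.wait_for(lock, 16ms,
        [&]{ return !this->run || this->messageCount > 0; });
  }

  // What a transmit path would do: queue work under the lock,
  // then notify the waiting loop so it wakes immediately.
  void PushMessage() {
    {
      std::lock_guard<std::mutex> lock(this->runMutex);
      ++this->messageCount;
    }
    this->runConditionVariable.notify_one();
  }
};
```

With no pending work, `WaitOnce` times out after the interval; after `PushMessage`, the predicate is already true and the wait returns immediately.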

Nate Koenig added 3 commits June 22, 2020 12:47, each signed off by Nate Koenig <nate@openrobotics.org>
@codecov (codecov bot) commented Jun 23, 2020

Codecov Report

Merging #37 into ign-launch1 will not change coverage.
The diff coverage is n/a.


@@             Coverage Diff              @@
##           ign-launch1      #37   +/-   ##
============================================
  Coverage        28.15%   28.15%           
============================================
  Files                3        3           
  Lines              721      721           
============================================
  Hits               203      203           
  Misses             518      518           

Last update 12f53ae...35fd79a.

@german-e-mas (Contributor) left a comment

It's working well locally. LGTM!

@nkoenig nkoenig changed the base branch from fix_empty_ssl to ign-launch1 June 23, 2020 22:18
@nkoenig nkoenig merged commit c96183b into ign-launch1 Jun 23, 2020
@nkoenig nkoenig deleted the websockets_performance branch June 23, 2020 23:10
```cpp
// Wait for (1/60) seconds or an event.
std::unique_lock<std::mutex> lock(this->runMutex);
this->runConditionVariable.wait_for(lock, 0.0166s,
    [&]{return !this->run || this->messageCount > 0;});
```
A contributor left a review comment on this snippet:
I see a potential issue that could be due to my poor understanding of the libwebsockets event loop.

From my brief search, if libwebsockets is implemented without an event loop like libev or libuv, message processing is done by active polling. Polling is started by explicitly calling lws_service.

Our event loop looks something like this.

WebsocketServer::Run → lws_service → rootCallback → switch(_reason) → lws_callback_on_writable(_wsi)
       (loop)                                                                   ↓
                                                                         specificCallback

lws_service is only called by WebsocketServer::Run, so websocket events are only processed each time WebsocketServer::Run loops. WebsocketServer::Run calls lws_service and then acquires runMutex to wait until it hits a timeout, the server is stopped, or there is a pending message in the queue (checked via messageCount).

The problem is that messageCount is only modified by websocket callbacks, and callbacks only run inside lws_service. While WebsocketServer::Run holds runMutex waiting, lws_service is not running, so no callback can increment messageCount until the wait times out and lws_service is called again. This will prevent the messageCount trigger from ever going off.

The way things are currently set up will increase connection latency, since websocket events are only processed a fixed number of times per second, but that may still be good enough for our use case, which does not need (soft) real-time data.

A potential solution for a future version is to use an event loop like libev or libuv. There is likely a simpler solution I'm not seeing as well.

I may be completely wrong about the event loop, but thought this was worth raising.
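The timeout-only behavior described above can be seen in isolation with a small model (illustrative only: `WaitWithNoNotifier` is a made-up name and no libwebsockets code is involved). If nothing else can touch messageCount during the wait, the predicate never becomes true, so wait_for always pays the full throttle interval and reports a timeout.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <utility>

using namespace std::chrono_literals;

// Model of one pass of the Run-loop wait when no callback can fire
// during it: the predicate never holds, so wait_for reports a timeout
// after running out the whole throttle interval.
// Returns {woken-by-predicate, waited-at-least-the-interval}.
std::pair<bool, bool> WaitWithNoNotifier() {
  std::mutex runMutex;
  std::condition_variable runConditionVariable;
  bool run = true;
  int messageCount = 0;  // only websocket callbacks would increment this

  auto start = std::chrono::steady_clock::now();
  std::unique_lock<std::mutex> lock(runMutex);
  bool woken = runConditionVariable.wait_for(lock, 16ms,
      [&]{ return !run || messageCount > 0; });
  auto elapsed = std::chrono::steady_clock::now() - start;

  // Compare against 15 ms rather than 16 ms to allow a little
  // scheduler/clock slack in the elapsed-time measurement.
  return {woken, elapsed >= 15ms};
}
```

In the real server the wake-up path would have to run on another thread (or inside lws_service between waits); with neither available during the wait, only the timeout or the stop flag can end it.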

On another note, if my event-loop model is wrong and messages are actually processed in the background, would it make sense to change the timeout duration from 0.0166s to this->publishPeriod?
