mg_broadcast not returning on ESP32 (Mongoose 6.11 and ESP-IDF 3.0-rc1) #899

Closed
avanbremen opened this issue Feb 17, 2018 · 6 comments

Comments

@avanbremen

I have created an example ESP32 project in order to help debug the issue (on GitHub here). The project was created using Mongoose 6.11 and ESP-IDF Pre-release 3.0-rc1. Besides creating an example project, I will be as thorough as I can be to help you better understand and troubleshoot the issue.

The issue
Unfortunately, function mg_broadcast never returns and thus locks up the calling task, timer_task in the example.

Example project
The project connects to Wi-Fi and runs a WebSocket server on port 8000. The Mongoose event handler runs in its own FreeRTOS task, mg_task. Whenever a WebSocket frame is received, the text "ws_frame_reply" is sent in response. A secondary FreeRTOS task, timer_task, runs every 10 seconds and calls mg_broadcast in order to demonstrate a multi-threaded solution. The mg_broadcast callback, on_work_complete, is called from within task mg_task and in turn sends the text "timer_task" over the WebSocket.

A more detailed description can be found in mg_test_main.c on GitHub here. Debug information is logged to the console (defaults to COM3).

Expected behaviour
Function mg_broadcast returns, so that task timer_task can call mg_broadcast (and thus initiate sending a message over the WebSocket) on every 10-second iteration.

Actual behaviour
mg_broadcast never returns, thus locking up task timer_task. The text "timer_task" is only sent once.

Analysis
Within function mg_broadcast, every function call returns up until the last one: dummy = MG_RECV_FUNC(mgr->ctl[0], (char *) &len, 1, 0);

Callback function on_work_complete, that gets passed into mg_broadcast, is even called (albeit only once), therefore we can conclude that mg_mgr_poll in task mg_task wakes up and invokes the callback function.

Besides defining MG_ENABLE_BROADCAST as 1 in the project Makefile, no changes were made to the Mongoose configuration. Therefore MG_RECV_FUNC defaults to recv.

Since no changes were made to the default ESP32 configuration, LwIP loopback support should be enabled. This was also confirmed by Espressif in the issue @nkolban raised here.

Testing
Also tested with older versions of Mongoose and ESP-IDF.

Side note
I noticed the mg_broadcast callback is called 1 + N times, where N is the number of user sockets. I assume the callback function is called for the internal loopback socket as well, hence the +1. In the example project I set nc->user_data to 1 on MG_EV_WEBSOCKET_HANDSHAKE_DONE event, in order to differentiate between the loopback socket and the user sockets.

Can you confirm that my assumption was right and the approach was correct?

Thank you and kind regards,
Arjan

@cpq
Member

cpq commented Feb 18, 2018

The fact that LwIP loopback is enabled does not by itself show that loopback transfer actually works.

The internal management socket pair uses two connected DGRAM (UDP) sockets. Rationale: a DGRAM socket pair guarantees that the receiver will not get a partial message. Your example shows that the UDP socket pair does not work as expected.
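To illustrate the message-boundary rationale, here is a minimal sketch, not taken from Mongoose. It assumes a POSIX system and uses a local AF_UNIX socketpair() instead of UDP, purely to keep the example short. The helper name first_recv_size is hypothetical. It sends two messages back to back and reports how many bytes the first recv() delivers.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends "abc" and then "defgh" over a local socket
 * pair of the given type (SOCK_DGRAM or SOCK_STREAM) and returns how many
 * bytes the first recv() call delivers, or -1 on error. */
static long first_recv_size(int type) {
  int sp[2];
  char buf[64];
  long n = -1;

  if (socketpair(AF_UNIX, type, 0, sp) != 0) return -1;

  /* Both messages are queued before the receiver reads anything. */
  if (send(sp[0], "abc", 3, 0) == 3 && send(sp[0], "defgh", 5, 0) == 5) {
    n = (long) recv(sp[1], buf, sizeof(buf), 0);
  }

  close(sp[0]);
  close(sp[1]);
  return n;
}
```

With SOCK_DGRAM the first recv() returns exactly the first message (3 bytes). With SOCK_STREAM the two messages may be merged or split, so a receiver has to do its own framing.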

As an experiment, change DGRAM to STREAM (UDP -> TCP) by patching mongoose.c:

void mg_socket_if_init(struct mg_iface *iface) {
  (void) iface;
  DBG(("%p using select()", iface->mgr));
#if MG_ENABLE_BROADCAST
  mg_socketpair(iface->mgr->ctl, SOCK_STREAM);  // <-- here! Use SOCK_STREAM.
#endif
}

Rebuild and retry, let us know whether mg_broadcast() hangs.

@cpq
Member

cpq commented Feb 18, 2018

I have debugged that further.
mg_broadcast() sends a message, then waits for 1 byte to come back in a blocking recv().
The Mongoose thread recv()s the message, then send()s 1 byte back, unblocking mg_broadcast().

Apparently, it was the send() of that 1 byte that was failing with EINVAL, which left mg_broadcast() blocked. The cause was that one socket of the pair had a local address of 0.0.0.0 instead of 127.0.0.1, so the send() to 0.0.0.0 failed with EINVAL.

A fix that changes the mg_socketpair() logic in Mongoose has been pushed. It makes the pair work for the POSIX, Windows and LwIP implementations.
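The handshake and the addressing issue described above can be sketched as follows. This is not the actual Mongoose fix, only an illustration assuming POSIX sockets. The helper names (bound_loopback_udp, udp_loopback_pair, broadcast_round_trip) are hypothetical. Both UDP sockets are bound explicitly to 127.0.0.1 before being connected to each other, so neither end has a 0.0.0.0 local address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a UDP socket bound to 127.0.0.1 on an ephemeral port and stores
 * the address actually assigned in *addr. Returns the fd, or -1 on error. */
static int bound_loopback_udp(struct sockaddr_in *addr) {
  socklen_t len = sizeof(*addr);
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return -1;
  memset(addr, 0, sizeof(*addr));
  addr->sin_family = AF_INET;
  addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* 127.0.0.1, not 0.0.0.0 */
  addr->sin_port = 0;                             /* let the stack pick a port */
  if (bind(fd, (struct sockaddr *) addr, sizeof(*addr)) != 0 ||
      getsockname(fd, (struct sockaddr *) addr, &len) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}

/* Builds a pair of UDP sockets connected to each other over loopback.
 * Returns 0 on success, -1 on failure. */
static int udp_loopback_pair(int sp[2]) {
  struct sockaddr_in a, b;
  sp[0] = bound_loopback_udp(&a);
  sp[1] = bound_loopback_udp(&b);
  if (sp[0] < 0 || sp[1] < 0 ||
      connect(sp[0], (struct sockaddr *) &b, sizeof(b)) != 0 ||
      connect(sp[1], (struct sockaddr *) &a, sizeof(a)) != 0) {
    if (sp[0] >= 0) close(sp[0]);
    if (sp[1] >= 0) close(sp[1]);
    return -1;
  }
  return 0;
}

/* One round of the handshake, run in a single thread for simplicity:
 * the requester sends a message, the worker receives it and replies with
 * 1 byte, and the requester receives that byte. Returns 0 if it arrived. */
static int broadcast_round_trip(const char *msg, size_t len) {
  int sp[2], ok = -1;
  char buf[128], ack = 0;

  if (len > sizeof(buf) || udp_loopback_pair(sp) != 0) return -1;

  if (send(sp[0], msg, len, 0) == (ssize_t) len &&         /* requester */
      recv(sp[1], buf, sizeof(buf), 0) == (ssize_t) len && /* worker */
      send(sp[1], "", 1, 0) == 1 &&                        /* 1-byte reply */
      recv(sp[0], &ack, 1, 0) == 1) {                      /* requester unblocks */
    ok = 0;
  }

  close(sp[0]);
  close(sp[1]);
  return ok;
}
```

If the 1-byte send() fails, as it did here with EINVAL, the final recv() on the requester side never completes, which matches the observed hang.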

@cpq
Member

cpq commented Feb 18, 2018

Take the HEAD mongoose.c from https://github.com/cesanta/mongoose/tree/dev and retry.

@cpq cpq closed this as completed Feb 18, 2018
@avanbremen
Author

@cpq Thank you so much for getting this fix out as soon as you did, much appreciated. I hope the info and example project were useful during debugging. I just tested the changes you made and everything works like a charm.

By the way, is the mg_broadcast callback called N + 1 times because it's getting called for every user socket (N) + 1 time for the internal loopback socket? If so, is setting nc->user_data on e.g. MG_EV_WEBSOCKET_HANDSHAKE_DONE and using that to detect user sockets the recommended approach?

Thanks again and my compliments for creating Mongoose. I will definitely be doing more in-depth testing soon.

Kind regards and all the best,
Arjan

@cpq
Member

cpq commented Feb 18, 2018

The internal loopback socket is not wrapped in an mg_connection, so no, the internal loopback is not the reason. The +1 that you see is probably the "listening" connection.

The broadcast callback could check all sorts of criteria and thus filter those connections that should not receive the message. For example, if you want to trigger all established websocket connections, check if (c->flags & MG_F_IS_WEBSOCKET) { ... }
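The filtering idea can be sketched without Mongoose as shown below. Everything in this block is a hypothetical stand-in: struct conn plays the role of struct mg_connection, and F_LISTENING and F_IS_WEBSOCKET play the role of the MG_F_* flags. Their bit values are illustrative and are not Mongoose's. The sketch only shows why a callback invoked for N WebSocket connections plus one listening connection should act on N of them.

```c
#include <stddef.h>

/* Hypothetical stand-ins so the sketch compiles without mongoose.h.
 * The bit values are illustrative, not the real MG_F_* values. */
#define F_LISTENING    (1UL << 0)
#define F_IS_WEBSOCKET (1UL << 1)

struct conn {
  unsigned long flags; /* stand-in for the connection's flags field */
  int messages_sent;   /* counts frames this sketch "sends" */
};

/* Plays the role of the broadcast callback: it is invoked once per
 * connection and acts only on established WebSocket connections. */
static void on_broadcast(struct conn *c) {
  if (c->flags & F_IS_WEBSOCKET) {
    c->messages_sent++; /* real code would send a WebSocket frame here */
  }
}

/* Invokes the callback for every connection and returns how many
 * connections actually received the message. */
static int broadcast_all(struct conn *conns, size_t n) {
  int delivered = 0;
  size_t i;
  for (i = 0; i < n; i++) {
    int before = conns[i].messages_sent;
    on_broadcast(&conns[i]);
    delivered += conns[i].messages_sent - before;
  }
  return delivered;
}
```

With one listening connection and two WebSocket clients, the callback runs three times but only two messages are sent, so no user_data marker is needed to skip the listener.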

@avanbremen
Author

Of course, the listening connection. Thanks.
