Skip to content

Commit

Permalink
Ensure connection lost callbacks are complte prior to disconnect end
Browse files Browse the repository at this point in the history
Encountered this valgrind error while testing some code the other day:

==2568== Memcheck, a memory error detector
==2568== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2568== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==2568== Command: ./test
==2568== Parent PID: 2560
==2568==
==2568== Thread 7:
==2568== Invalid read of size 8
==2568==    at 0x4CDA970: connectionLost_call (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x500AEA6: start_thread (pthread_create.c:477)
==2568==    by 0x5121A2E: clone (clone.S:95)
==2568==  Address 0x91e8238 is 40 bytes inside a block of size 176 free'd
==2568==    at 0x48399AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==2568==    by 0x4CF8B0F: myfree (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x4CF3947: ListUnlink (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x4CF3A16: ListRemove (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x4CDA612: MQTTClient_destroy (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x142BC0: comms_run (comms_connect.c:436)
==2568==    by 0x500AEA6: start_thread (pthread_create.c:477)
==2568==    by 0x5121A2E: clone (clone.S:95)
==2568==  Block was alloc'd at
==2568==    at 0x483877F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==2568==    by 0x4CF8725: mymalloc (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x4CD9F58: MQTTClient_createWithOptions (in /usr/lib/libpaho-mqtt3cs.so.1.3.9)
==2568==    by 0x1428FB: comms_run (comms_connect.c:404)
==2568==    by 0x500AEA6: start_thread (pthread_create.c:477)
==2568==    by 0x5121A2E: clone (clone.S:95)
==2568==

It indicates a use after free of the MQTTClient data structure in the
connectionLost_call.

A bit of digging revealed that the problem was that my code Called
MQTTClient_disconnect and MQTTClient_destroy in rapid succession.
Because MQTTClient_disconnect does this:

Thread_start(connectionLost_call, m);

Its entirely possible for connectionLost_call to be executed after the
return from MQTTClient_disconnect completes, at which time the Client
may be destroyed/freed, leading to the above valgrind error.

fix is pretty straightforward. We could use a mutex and condition
variable to serialize the two threads, but Windows doesn't have support
for condition variables currently, so just use one time semaphore to
ensure serialization

Signed-off-by: Neil Horman <nhorman@gmail.com>

convert to use of semaphore
  • Loading branch information
nhorman authored and icraggs committed Nov 28, 2022
1 parent be035e7 commit 911488a
Showing 1 changed file with 16 additions and 2 deletions.
18 changes: 16 additions & 2 deletions src/MQTTClient.c
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,11 @@
const char *client_timestamp_eye = "MQTTClientV3_Timestamp " BUILD_TIMESTAMP;
const char *client_version_eye = "MQTTClientV3_Version " CLIENT_VERSION;

struct conlost_sync_data {
sem_type sem;
void *m;
};

int MQTTClient_init(void);

void MQTTClient_global_init(MQTTClient_init_options* inits)
Expand Down Expand Up @@ -701,9 +706,12 @@ static int clientSockCompare(void* a, void* b)
*/
static thread_return_type WINAPI connectionLost_call(void* context)
{
MQTTClients* m = (MQTTClients*)context;
struct conlost_sync_data *data = (struct conlost_sync_data *)context;
MQTTClients* m = (MQTTClients *)data->m;

(*(m->cl))(m->context, NULL);

Thread_post_sem(data->sem);
return 0;
}

Expand Down Expand Up @@ -1915,6 +1923,9 @@ static int MQTTClient_disconnect1(MQTTClient handle, int timeout, int call_conne
START_TIME_TYPE start;
int rc = MQTTCLIENT_SUCCESS;
int was_connected = 0;
struct conlost_sync_data sync = {
NULL, m
};

FUNC_ENTRY;
if (m == NULL || m->c == NULL)
Expand Down Expand Up @@ -1944,8 +1955,11 @@ static int MQTTClient_disconnect1(MQTTClient handle, int timeout, int call_conne
MQTTClient_stop();
if (call_connection_lost && m->cl && was_connected)
{
sync.sem = Thread_create_sem(&rc);
Log(TRACE_MIN, -1, "Calling connectionLost for client %s", m->c->clientID);
Thread_start(connectionLost_call, m);
Thread_start(connectionLost_call, &sync);
Thread_wait_sem(sync.sem, 5000);
Thread_destroy_sem(sync.sem);
}
FUNC_EXIT_RC(rc);
return rc;
Expand Down

0 comments on commit 911488a

Please sign in to comment.