New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mosquitto_loop_misc function call on_disconnect callback twice when keepalive timeout #1067

Closed
xingchen02 opened this Issue Dec 4, 2018 · 4 comments

Comments

Projects
None yet
2 participants
@xingchen02
Copy link

xingchen02 commented Dec 4, 2018

My code is somehting like this:

void discon_cb(struct mosquitto *mosq, ...)
{
    //do something disconnect here
   ...
   mosquitto_destroy(mosq);
}

//init mosq
struct mosquitto *init_mosqtt_instence(...)
{
    struct mosquitto *mosq =  mosquitto_new(...);
    ...
     mosquitto_disconnect_callback_set(mosq, discon_cb);
     mosquitto_connect(mosq, host, port, MQTT_DEFAULT_KEEPALIVE);
    ...
    return mosq;
}

//main func
int main(...)
{
    ...
   struct mosquitto *mosq = init_mosqtt_instence(...);
   
   while(1)
   {
       //handle mosq msg here
       mosquitto_loop(mosq, 2000, 1);
   }

   ...
}

the mosquitto_loop function will call mosquitto_loop_misc function, when keepalive timeout, then mosquitto_loop_misc will do:

int mosquitto_loop_misc(struct mosquitto *mosq)
{
	...
        
        //will call on_disconnect in mosquitto__check_keepalive first time
	mosquitto__check_keepalive(mosq);
	...
		if(mosq->on_disconnect){
			mosq->in_callback = true;
                        
                        //will call on_disconnect here second time
			mosq->on_disconnect(mosq, mosq->userdata, rc);
			mosq->in_callback = false;
		}
	...
	return MOSQ_ERR_SUCCESS;
}
``
so, when this bug occured, my process will suspend or raise a aborted error because call on_disconnect twice(because mosquitto_destroy(mosq) twice)  .

so, I suggest that mosquitto__check_keepalive functiong shoud not call on_disconnect in it, just call on_disconnect in mosquitto_loop_misc function.

so I did this modify and make my process work well:
#ifdef WITH_BROKER
void mosquitto__check_keepalive(struct mosquitto_db *db, struct mosquitto *mosq)
#else
void mosquitto__check_keepalive(struct mosquitto *mosq)
#endif
{
	time_t next_msg_out;
	time_t last_msg_in;
	time_t now = mosquitto_time();
#ifndef WITH_BROKER
	int rc;
#endif

	assert(mosq);
#if defined(WITH_BROKER) && defined(WITH_BRIDGE)
	/* Check if a lazy bridge should be timed out due to idle. */
	if(mosq->bridge && mosq->bridge->start_type == bst_lazy
				&& mosq->sock != INVALID_SOCKET
				&& now - mosq->next_msg_out - mosq->keepalive >= mosq->bridge->idle_timeout){

		log__printf(NULL, MOSQ_LOG_NOTICE, "Bridge connection %s has exceeded idle timeout, disconnecting.", mosq->id);
		net__socket_close(db, mosq);
		return;
	}
#endif
	pthread_mutex_lock(&mosq->msgtime_mutex);
	next_msg_out = mosq->next_msg_out;
	last_msg_in = mosq->last_msg_in;
	pthread_mutex_unlock(&mosq->msgtime_mutex);
	if(mosq->keepalive && mosq->sock != INVALID_SOCKET &&
			(now >= next_msg_out || now - last_msg_in >= mosq->keepalive)){

		if(mosq->state == mosq_cs_connected && mosq->ping_t == 0){
			send__pingreq(mosq);
			/* Reset last msg times to give the server time to send a pingresp */
			pthread_mutex_lock(&mosq->msgtime_mutex);
			mosq->last_msg_in = now;
			mosq->next_msg_out = now + mosq->keepalive;
			pthread_mutex_unlock(&mosq->msgtime_mutex);
		}else{
#ifdef WITH_BROKER
			if(mosq->listener){
				mosq->listener->client_count--;
				assert(mosq->listener->client_count >= 0);
			}
			mosq->listener = NULL;
			net__socket_close(db, mosq);
#else 
#if 0 //a patch for bug: mosquitto_loop_misc function call on_disconnect callback twice  
			net__socket_close(mosq);
			pthread_mutex_lock(&mosq->state_mutex);
			if(mosq->state == mosq_cs_disconnecting){
				rc = MOSQ_ERR_SUCCESS;
			}else{
				rc = MOSQ_ERR_KEEPALIVE;
			}
			pthread_mutex_unlock(&mosq->state_mutex);
			pthread_mutex_lock(&mosq->callback_mutex);
			if(mosq->on_disconnect){
				mosq->in_callback = true;
				mosq->on_disconnect(mosq, mosq->userdata, rc);
				mosq->in_callback = false;
			}
			pthread_mutex_unlock(&mosq->callback_mutex);
#endif
#endif
		}
	}
}

please check and is anyone have another sugesstion about this?
@ralight

This comment has been minimized.

Copy link
Contributor

ralight commented Dec 4, 2018

My suggestion would be that you shouldn't call mosquitto_destroy() to delete everything to do with a mosquitto client instance when you are inside a callback for that instance.

In your specific case, if you use mosquitto_loop_forever(), then when you disconnect it should exit from that function and you can destroy the mosquitto instance afterwards.

@xingchen02

This comment has been minimized.

Copy link

xingchen02 commented Dec 6, 2018

@ralight
Thanks, at first, i'm sorry that there is something i havn't put the case clearly. in my main loop, i can not use mosquitto_loop_forever(), because it is a block function, my main loop looks more like :
//main func
int main(...)
{
...
struct mosquitto *mosq = init_mosqtt_instence(...);
...
while(1)
{
//handle mosq msg here
mosquitto_loop(mosq, 2000, 1);
//do something others...
...

}
...
}

anyway, it is no so good to call disconnect callback twice when keepalive time out, may be the keepalive mechanism need do better to avoiding this problem.

ralight added a commit that referenced this issue Dec 6, 2018

@ralight

This comment has been minimized.

Copy link
Contributor

ralight commented Dec 6, 2018

Yes, you're right there is a redundant bit of code there. I've now removed it, thanks for spotting it.

@ralight ralight added this to the 1.5.5 milestone Dec 6, 2018

@ralight ralight closed this Dec 8, 2018

@xingchen02

This comment has been minimized.

Copy link

xingchen02 commented Dec 17, 2018

Thank you for your professional job!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment