Bridge connection enters a connect-disconnect loop when a QoS 2 publish is incomplete and the local broker fails to persist for any reason. #57

Closed
ralight opened this Issue Mar 15, 2016 · 2 comments

ralight commented Mar 15, 2016

migrated from Bugzilla #467304
status UNCONFIRMED severity normal in component Mosquitto for 1.4
Reported in version 1.4 on platform PC
Assigned to: Roger Light

On 2015-05-14 02:46:43 -0400, jsaak jsaak wrote:

The bridge connection enters a connect-disconnect loop when a QoS 2 publish is incomplete and the local broker fails to persist for any reason.

Scenario:

  1. local mosq publishes QoS2 to remote mosq
  2. local mosq dies (fails to persist)
  3. local mosq restarts with "clean_session false"
  4. local mosq re-establishes bridge connection to remote mosq
  5. remote mosq replies with PUBREC
  6. local mosq does not find the corresponding message in the DB, gives error
  7. local mosq disconnects bridge connection
  8. goto 4.

My proposed solution is to change step 6:
if the mid is not found in the DB, reply anyway and log a WARNING.

--- a/mosquitto/lib/read_handle_shared.c
+++ b/mosquitto/lib/read_handle_shared.c
@@ -103,6 +103,10 @@ int _mosquitto_handle_pubrec(struct mosquitto *mosq)
     _mosquitto_log_printf(NULL, MOSQ_LOG_DEBUG, "Received PUBREC from %s (Mid: %d)", mosq->id, mid);
 
     rc = mqtt3_db_message_update(mosq, mid, mosq_md_out, mosq_ms_wait_for_pubcomp);
+    if (rc) {
+        rc = 0;
+        _mosquitto_log_printf(NULL, MOSQ_LOG_WARNING, "Received PUBREC is not in the DB, replying anyway");
+    }
 #else
     _mosquitto_log_printf(mosq, MOSQ_LOG_DEBUG, "Client %s received PUBREC (Mid: %d)", mosq->id, mid);
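
To make the effect of the patch easier to see outside the broker source, here is a minimal standalone sketch of the same idea. It is not Mosquitto code: `out_store`, `store_update`, `handle_pubrec` and the `lenient` flag are hypothetical names made up for illustration. The only protocol fact it relies on is that in a QoS 2 flow the sender answers PUBREC with PUBREL. In strict mode an unknown mid makes the handler fail, which corresponds to steps 6 and 7 of the scenario; in lenient mode it warns and lets the caller reply anyway.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the broker's outgoing-message store. */
enum msg_state { MS_WAIT_FOR_PUBREC, MS_WAIT_FOR_PUBCOMP };

struct out_msg {
    int mid;
    enum msg_state state;
};

struct out_store {
    struct out_msg *msgs;
    size_t count;
};

/* Returns 0 and moves the message to `next` if `mid` is known, 1 otherwise. */
static int store_update(struct out_store *s, int mid, enum msg_state next)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->msgs[i].mid == mid) {
            s->msgs[i].state = next;
            return 0;
        }
    }
    return 1;
}

/*
 * Handle an incoming PUBREC. Returns true if the caller should go on to
 * send PUBREL, false if it should treat the packet as an error and drop
 * the connection. With `lenient` set, an unknown mid is only logged.
 */
static bool handle_pubrec(struct out_store *s, int mid, bool lenient)
{
    int rc = store_update(s, mid, MS_WAIT_FOR_PUBCOMP);
    if (rc != 0) {
        if (!lenient) {
            return false; /* strict: bridge disconnects and the loop repeats */
        }
        fprintf(stderr, "Warning: PUBREC for unknown mid %d, replying anyway\n", mid);
    }
    return true;
}
```

Calling `handle_pubrec` on an empty store with `lenient` set to `false` returns `false` on every reconnect, which is the loop described above; with `lenient` set to `true` it returns `true`, so the flow can complete and the remote side can release the message.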

hmvp commented May 19, 2016

We see the same in our production environment with the latest version.

Unfortunately this seems endemic in MQTT implementations: see also eclipse/paho.mqtt.java#27 for the same bug in the Java client.

Under the right circumstances this happens for most or all acknowledgements, so PUBCOMP and PUBACK can probably trigger the same behaviour (possibly SUBACKs as well).
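
As a rough illustration of how the same tolerance could cover several acknowledgement types at once, here is a small decision function. It is a sketch under assumptions, not code from Mosquitto or Paho: the enum names, `next_action` and the `lenient` flag are all hypothetical, and SUBACK is left out. The only protocol fact used is that PUBREC is answered with PUBREL, while PUBACK and PUBCOMP end their flows and need no reply.

```c
#include <stdbool.h>

/* Hypothetical names: acknowledgement types a publisher can receive. */
enum ack_type { ACK_PUBACK, ACK_PUBREC, ACK_PUBCOMP };

/* Hypothetical names: what the receiver of the acknowledgement does next. */
enum ack_action { ACT_NONE, ACT_SEND_PUBREL, ACT_DISCONNECT };

/*
 * Decide what to do after an acknowledgement arrives.
 * `mid_known` says whether the message id was found in the local store.
 * In strict mode an unknown id is an error; in lenient mode the flow is
 * completed as if the id had been found.
 */
static enum ack_action next_action(enum ack_type type, bool mid_known, bool lenient)
{
    if (!mid_known && !lenient) {
        return ACT_DISCONNECT;
    }
    switch (type) {
    case ACK_PUBREC:
        return ACT_SEND_PUBREL; /* QoS 2: PUBREC is answered with PUBREL */
    case ACK_PUBACK:
    case ACK_PUBCOMP:
        return ACT_NONE;        /* flow is finished, nothing to send */
    }
    return ACT_DISCONNECT;      /* not reached for valid enum values */
}
```

For example, `next_action(ACK_PUBCOMP, false, false)` yields `ACT_DISCONNECT`, the behaviour that feeds the reconnect loop, while the same call with `lenient` set to `true` yields `ACT_NONE`.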

ralight added a commit that referenced this issue May 19, 2016

[57] Handle PUB* with unknown message id gracefully.
Allows message flow to complete where e.g. the broker didn't persist a
partially complete flow.

Thanks to jsaak jsaak and Hiram van Paassen.

Bug: #57

ralight commented May 19, 2016

Thanks for the nudge, I believe this is now fixed.
