CommsReceiver disconnecting on unknown PUBREC #27

Open
jpwsutton opened this Issue Feb 4, 2016 · 0 comments

@jpwsutton
Member

migrated from Bugzilla #462474
status UNCONFIRMED severity normal in component MQTT-Java for 1.1
Reported in version unspecified on platform PC
Assigned to: Bin Zhang

On 2015-03-18 10:57:11 -0400, Martijn Stellinga wrote:

Whenever the CommsReceiver class receives an MqttAck message, it looks up the token of the originally published message, and if the token is not found, it throws an exception (CommsReceiver line 123).

Unfortunately, we have a production system where our application crashed, resulting in our MQTT broker (mosquitto) sending a PUBREC message for a message that is not known in the application anymore.
Because the CommsReceiver throws an exception, it disconnects.
Unfortunately, because the PUBREC is never acknowledged, Mosquitto keeps sending the PUBREC message every time the application tries to connect. We can only solve this by clearing the Mosquitto message database, meaning we lose any other valid messages that are still queued.

It seems it would be more robust if the CommsReceiver acknowledged PUBREC messages even when it does not have a token for them, and logged a warning.
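To make the suggestion concrete, here is a minimal sketch of the tolerant behaviour. It is not Paho's code: the class `TolerantPubrecHandler`, its fields, and its methods are hypothetical stand-ins, and it only records what would be sent instead of writing to a socket. It assumes that replying with PUBREL is an acceptable answer to a PUBREC whose packet ID is unknown.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the receiver's ack handling; not the Paho implementation.
final class TolerantPubrecHandler {
    // Packet IDs for which the client still holds a delivery token.
    private final Set<Integer> inFlight = new HashSet<>();
    // Packets the client would write to the broker, recorded for inspection.
    final List<String> sent = new ArrayList<>();
    // Warnings that would go to the log instead of raising an exception.
    final List<String> warnings = new ArrayList<>();

    void trackPublish(int packetId) {
        inFlight.add(packetId);
    }

    // Always answer a PUBREC with a PUBREL; warn when no token is known.
    void onPubrec(int packetId) {
        if (!inFlight.contains(packetId)) {
            warnings.add("PUBREC for unknown packet ID " + packetId + "; acknowledging anyway");
        }
        sent.add("PUBREL " + packetId);
    }

    // Drop the token on PUBCOMP; an unknown ID is logged, not fatal.
    void onPubcomp(int packetId) {
        if (!inFlight.remove(packetId)) {
            warnings.add("PUBCOMP for unknown packet ID " + packetId + "; ignoring");
        }
    }
}
```

With this shape, a PUBREC left over from before a crash gets a PUBREL, so the broker can finish the exchange and stop resending it on every reconnect.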

On 2015-03-19 03:09:00 -0400, Bin Zhang wrote:

Per my understanding, the use case is:

Client publishes a message with QoS2, and server receives this message and replies with PUBREC, but client can never acknowledge this PUBREC because it cannot find a stored PUBLISH with the same packet ID.

I think the client shouldn't discard the message until it has received the PUBREC, so at this point the client is still the message owner. It should assume the message hasn't arrived at the server and resend the PUBLISH. This probably means the client data store is corrupted. And I don't understand why the server would keep sending the PUBREC unless it receives another PUBLISH from the client, because ownership of the message hasn't been transferred to the server yet.

And yes, I agree that it would be more robust to just acknowledge the PUBREC in this case, but I'm not sure it's a good idea. At the least, I think we need to make sure the client store does not lose any QoS 1 and 2 messages even if it crashes.

cc Ian, WDYT?

On 2015-11-02 11:00:45 -0500, Maarten van Schouwenburg wrote:

I may have found a situation where this can occur, without the client data store being corrupted.

If the Paho client sends a PUBLISH with the dup flag set at the exact moment the server sends the PUBREC, the server (mosquitto) will respond to the second message too, which results in 2 PUBREC messages being sent.

If we just ignore the fact that this one is unknown and call clientState.notifyReceivedAck((MqttAck)message); we may get another duplicate message on PUBCOMP.

excerpt from mosquitto.log:

1446478939: Received PUBLISH from backend1 (d0, q2, r0, m49638, '/v1/account/xxxxx/devices/xxxxxx/status', ... (103 bytes))
1446478939: Sending PUBREC to backend1 (Mid: 49638)
1446478959: Received PUBLISH from backend1 (d1, q2, r0, m49638, '/v1/account/xxxxx/devices/xxxxxx/status', ... (103 bytes))
1446478959: Sending PUBREC to backend1 (Mid: 49638)
1446478959: Sending PUBREC to backend1 (Mid: 49638)
1446478972: Received PUBREL from backend1 (Mid: 49638)
1446478972: Sending PUBCOMP to backend1 (Mid: 49638)
1446478973: Received PUBREL from backend1 (Mid: 49638)
1446478973: Sending PUBCOMP to backend1 (Mid: 49638)

@jpwsutton jpwsutton added the bug label Feb 12, 2016
@jpwsutton jpwsutton modified the milestone: Backlog Jun 1, 2016