New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Websocket connection doesn’t get messages on subscription after terminated reconnect #645

Closed
HyP3r- opened this Issue Dec 6, 2017 · 6 comments

Comments

Projects
None yet
4 participants
@HyP3r-

HyP3r- commented Dec 6, 2017

I’m using a STMicroelectronics Wireless Module (SPWF04S) as MQTT Websocket Client. This device doesn’t support any other protocol. After the connection is established the Wireless Module does subscribe on the topic c2d/dev/00000007/ to receive messages from the server which are dedicated to this device. If I now disconnect the device from the mains and connect it again (while this procedure the device doesn’t close the Websocket Connection), the device does again subscribe on the topic c2d/dev/00000007/ but then doesn’t receive any messages from this topic. If I reconnect the device from mains again, the device receives messages. That means every second time I connect the device, the device doesn't receive messages from the broker from this topic. I also can reproduce this fault with a Desktop Websockets MQTT Client when I terminate the executable (sigkill) instead of normally closing it.

I’m using mosquito 1.4.14 and libwebsockets 2.4.1 (fresh build today).

Here are the logs (with log_type all).

Normal Situation:

Received PUBLISH from SPWF04S (d0, q0, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q0, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q0, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
Sending PUBLISH to SPWF04S (d0, q0, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))…

Broken Situation:

Received PUBLISH from SPWF04S (d0, q0, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q0, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q0, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))

The Broker does not send the Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef to SPWF04S.

That’s how the connection looks like (always the same):

Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
@karlp

This comment has been minimized.

Show comment
Hide comment
@karlp

karlp Dec 6, 2017

Contributor

a) you can definitely use more than just websockets mqtt with that module. even st talks about native mqtt, http rest, anything you like. anyway...

b) you seem to be subscribing at qos2, but only publishing at qos 0? If that's the case, messages will not be queued for your subscriber when it goes offline. It's unclear from your logs when you are disconnecting things. I presume you're seeing them disconnect later, after the keepalive, or if you reconnect soon enough, you're seeing that "client xxxx already connected..." line.

c) your client is subscribing with what seems to be a fixed clientid, good, but without cleansession set to false, you're also not going to get anything queued for you, even if you fix your qos problem.

Contributor

karlp commented Dec 6, 2017

a) you can definitely use more than just websockets mqtt with that module. even st talks about native mqtt, http rest, anything you like. anyway...

b) you seem to be subscribing at qos2, but only publishing at qos 0? If that's the case, messages will not be queued for your subscriber when it goes offline. It's unclear from your logs when you are disconnecting things. I presume you're seeing them disconnect later, after the keepalive, or if you reconnect soon enough, you're seeing that "client xxxx already connected..." line.

c) your client is subscribing with what seems to be a fixed clientid, good, but without cleansession set to false, you're also not going to get anything queued for you, even if you fix your qos problem.

@HyP3r-

This comment has been minimized.

Show comment
Hide comment
@HyP3r-

HyP3r- Dec 7, 2017

Thank you for your fast answer.

I have double checked, the STMicroelectronics Wireless Module SPWF04S does only support MQTT over Websockets:

The MQTT layer is built on top of web sockets. This means that (when an MQTT broker has been connected through AT+S.MQTTCONN) the total amount of available web sockets decreases to just 1. Note that it is not possible to run another AT+S.WSOCKON to the same web socket server.

As next step I have fixed all publishes and subscriptions to QoS 2 but it was still doesn’t working. After I have changed the Fixed ID to a Random ID which I generate each time new, everything seems to work.

But I still don’t get it: after the client starts and establishes connection (with clean session and a fixed ID), the client should get new messages (I don’t think and want to talk about persistence) from the subscription. But why does the client don’t get this messages?

HyP3r- commented Dec 7, 2017

Thank you for your fast answer.

I have double checked, the STMicroelectronics Wireless Module SPWF04S does only support MQTT over Websockets:

The MQTT layer is built on top of web sockets. This means that (when an MQTT broker has been connected through AT+S.MQTTCONN) the total amount of available web sockets decreases to just 1. Note that it is not possible to run another AT+S.WSOCKON to the same web socket server.

As next step I have fixed all publishes and subscriptions to QoS 2 but it was still doesn’t working. After I have changed the Fixed ID to a Random ID which I generate each time new, everything seems to work.

But I still don’t get it: after the client starts and establishes connection (with clean session and a fixed ID), the client should get new messages (I don’t think and want to talk about persistence) from the subscription. But why does the client don’t get this messages?

@karlp

This comment has been minimized.

Show comment
Hide comment
@karlp

karlp Dec 7, 2017

Contributor

are you having problems gettting messages that were published while you were powered off? That's what I thought the first time.

If so, you must have a fixed ID, (NOT random id) and you must NOT have clean session.

Contributor

karlp commented Dec 7, 2017

are you having problems gettting messages that were published while you were powered off? That's what I thought the first time.

If so, you must have a fixed ID, (NOT random id) and you must NOT have clean session.

@HyP3r-

This comment has been minimized.

Show comment
Hide comment
@HyP3r-

HyP3r- Dec 8, 2017

are you having problems gettting messages that were published while you were powered off? That's what I thought the first time.

No. I do not receive any messages that were sent after the device was turned on and connected to the server.

For a better understanding, I have compiled the following protocol here (please mind the second connection, the first and third is working):

...
Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
...
Received PUBLISH from SPWF04S (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
Sending PUBLISH to SPWF04S (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
...
Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
...
Received PUBLISH from SPWF04S (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q2, r0, m0, 'c2d/dev/00000007/
...
Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
...
Received PUBLISH from SPWF04S (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
Sending PUBLISH to SPWF04S (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
...

HyP3r- commented Dec 8, 2017

are you having problems gettting messages that were published while you were powered off? That's what I thought the first time.

No. I do not receive any messages that were sent after the device was turned on and connected to the server.

For a better understanding, I have compiled the following protocol here (please mind the second connection, the first and third is working):

...
Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
...
Received PUBLISH from SPWF04S (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
Sending PUBLISH to SPWF04S (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
...
Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
...
Received PUBLISH from SPWF04S (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q2, r0, m0, 'c2d/dev/00000007/
...
Client SPWF04S already connected, closing old connection.
New client connected from 212.118.196.186 as SPWF04S (c1, k60, u'user').
Sending CONNACK to SPWF04S (0, 0)
Received SUBSCRIBE from SPWF04S
    c2d/dev/00000007/ (QoS 2)
SPWF04S 2 c2d/dev/00000007/
Sending SUBACK to SPWF04S
...
Received PUBLISH from SPWF04S (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Sending PUBLISH to ccfc508d-af4a-413b-93f0-c3e2753fe3d4 (d0, q2, r0, m0, 'd2c/dev/00000007/', ... (40 bytes))
Received PUBLISH from 864e7ddd-89f8-485d-a876-a24e98fe3cef (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
Sending PUBLISH to SPWF04S (d0, q2, r0, m0, 'c2d/dev/00000007/', ... (146 bytes))
...
@PierreF

This comment has been minimized.

Show comment
Hide comment
@PierreF

PierreF Jan 5, 2018

Contributor

I can reproduce this issue. It happen when a new client replace an existing one (i.e. using the same clientid) and cleansession is False.
This means that if you don't care about message sent while you are offline, using cleansession=true and a random clientid should fix your issue.


To reproduce this issue, I'm using:

cat > /tmp/mosquitto.conf << EOF
listener 1885
listener 1886
protocol websockets
EOF
mosquitto -c /tmp/mosquitto.conf -v

Run the following client:

cat > /tmp/mqtt-client.py << EOF
import paho.mqtt.client as mqtt
import time
TRANSPORT = 'websockets'
PORT = 1886
HOST = 'localhost'
def on_connect(client, userdata, flag, rc):
    if rc == 0:
        client.subscribe('topic')
def on_subscribe(client, userdata, mid, granted_qos):
    print('> Sending message')
    client.publish('topic', 'payload')
def on_message(client, userdata, msg):
    print('< Received message %s %s' % (msg.topic, msg.payload))

client1 = mqtt.Client(client_id='wsclient', clean_session=True, transport=TRANSPORT)
client2 = mqtt.Client(client_id='wsclient', clean_session=True, transport=TRANSPORT)
client1.on_connect = on_connect
client1.on_subscribe = on_subscribe
client1.on_message = on_message
client2.on_connect = on_connect
client2.on_subscribe = on_subscribe
client2.on_message = on_message

client1.connect(HOST, PORT)
client1.loop_start()

time.sleep(1)
print('Switching client...')
client1.loop_stop()
client2.connect(HOST, PORT)
client2.loop_start()
time.sleep(1)
print('Switching client...')
client2.loop_stop()
client1.loop_start()
time.sleep(60)
EOF
$ python3 /tmp/mqtt-client.py

You will see that the message is only received by first and third client.

> Sending message
< Received message topic b'payload'
Switching client...
> Sending message
Switching client...
> Sending message
< Received message topic b'payload'

On server logs:

1515150036: New client connected from 127.0.0.1 as wsclient (c1, k60).
1515150036: Sending CONNACK to wsclient (0, 0)
1515150036: Received SUBSCRIBE from wsclient
1515150036: 	topic (QoS 0)
1515150036: wsclient 0 topic
1515150036: Sending SUBACK to wsclient
1515150036: Received PUBLISH from wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150036: Sending PUBLISH to wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150037: Client wsclient already connected, closing old connection.
1515150037: New client connected from 127.0.0.1 as wsclient (c1, k60).
1515150037: Sending CONNACK to wsclient (0, 0)
1515150037: Received SUBSCRIBE from wsclient
1515150037: 	topic (QoS 0)
1515150037: wsclient 0 topic
1515150037: Sending SUBACK to wsclient
1515150037: Received PUBLISH from wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150038: Client wsclient disconnected.
1515150039: Client wsclient already connected, closing old connection.
1515150039: New client connected from 127.0.0.1 as wsclient (c1, k60).
1515150039: Sending CONNACK to wsclient (0, 0)
1515150039: Received SUBSCRIBE from wsclient
1515150039: 	topic (QoS 0)
1515150039: wsclient 0 topic
1515150039: Sending SUBACK to wsclient
1515150039: Received PUBLISH from wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150039: Sending PUBLISH to wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))

If you change TRANSPORT and PORT to "tcp" and 1885, the client will works as expected.


I think the issue come because mqtt3_handle_connect will add a duplicate key in contexts_by_id which is not permitted by uthash.
This only happen with WebSocket connection, because with classic connection, the old context is removed from contexts_by_id during do_disconnect.

Contributor

PierreF commented Jan 5, 2018

I can reproduce this issue. It happen when a new client replace an existing one (i.e. using the same clientid) and cleansession is False.
This means that if you don't care about message sent while you are offline, using cleansession=true and a random clientid should fix your issue.


To reproduce this issue, I'm using:

cat > /tmp/mosquitto.conf << EOF
listener 1885
listener 1886
protocol websockets
EOF
mosquitto -c /tmp/mosquitto.conf -v

Run the following client:

cat > /tmp/mqtt-client.py << EOF
import paho.mqtt.client as mqtt
import time
TRANSPORT = 'websockets'
PORT = 1886
HOST = 'localhost'
def on_connect(client, userdata, flag, rc):
    if rc == 0:
        client.subscribe('topic')
def on_subscribe(client, userdata, mid, granted_qos):
    print('> Sending message')
    client.publish('topic', 'payload')
def on_message(client, userdata, msg):
    print('< Received message %s %s' % (msg.topic, msg.payload))

client1 = mqtt.Client(client_id='wsclient', clean_session=True, transport=TRANSPORT)
client2 = mqtt.Client(client_id='wsclient', clean_session=True, transport=TRANSPORT)
client1.on_connect = on_connect
client1.on_subscribe = on_subscribe
client1.on_message = on_message
client2.on_connect = on_connect
client2.on_subscribe = on_subscribe
client2.on_message = on_message

client1.connect(HOST, PORT)
client1.loop_start()

time.sleep(1)
print('Switching client...')
client1.loop_stop()
client2.connect(HOST, PORT)
client2.loop_start()
time.sleep(1)
print('Switching client...')
client2.loop_stop()
client1.loop_start()
time.sleep(60)
EOF
$ python3 /tmp/mqtt-client.py

You will see that the message is only received by first and third client.

> Sending message
< Received message topic b'payload'
Switching client...
> Sending message
Switching client...
> Sending message
< Received message topic b'payload'

On server logs:

1515150036: New client connected from 127.0.0.1 as wsclient (c1, k60).
1515150036: Sending CONNACK to wsclient (0, 0)
1515150036: Received SUBSCRIBE from wsclient
1515150036: 	topic (QoS 0)
1515150036: wsclient 0 topic
1515150036: Sending SUBACK to wsclient
1515150036: Received PUBLISH from wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150036: Sending PUBLISH to wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150037: Client wsclient already connected, closing old connection.
1515150037: New client connected from 127.0.0.1 as wsclient (c1, k60).
1515150037: Sending CONNACK to wsclient (0, 0)
1515150037: Received SUBSCRIBE from wsclient
1515150037: 	topic (QoS 0)
1515150037: wsclient 0 topic
1515150037: Sending SUBACK to wsclient
1515150037: Received PUBLISH from wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150038: Client wsclient disconnected.
1515150039: Client wsclient already connected, closing old connection.
1515150039: New client connected from 127.0.0.1 as wsclient (c1, k60).
1515150039: Sending CONNACK to wsclient (0, 0)
1515150039: Received SUBSCRIBE from wsclient
1515150039: 	topic (QoS 0)
1515150039: wsclient 0 topic
1515150039: Sending SUBACK to wsclient
1515150039: Received PUBLISH from wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))
1515150039: Sending PUBLISH to wsclient (d0, q0, r0, m0, 'topic', ... (7 bytes))

If you change TRANSPORT and PORT to "tcp" and 1885, the client will works as expected.


I think the issue come because mqtt3_handle_connect will add a duplicate key in contexts_by_id which is not permitted by uthash.
This only happen with WebSocket connection, because with classic connection, the old context is removed from contexts_by_id during do_disconnect.

@ralight

This comment has been minimized.

Show comment
Hide comment
@ralight

ralight Sep 25, 2018

Contributor

This behaviour was fixed earlier, but I've now ensured that the old client gets removed from the hash before the duplicate one is added. This will be in 1.5.3.

Contributor

ralight commented Sep 25, 2018

This behaviour was fixed earlier, but I've now ensured that the old client gets removed from the hash before the duplicate one is added. This will be in 1.5.3.

@ralight ralight closed this Sep 25, 2018

@ralight ralight added this to the 1.5.3 milestone Sep 25, 2018

ralight added a commit that referenced this issue Sep 27, 2018

Fix incorrect hash usage with duplicate clients.
Fix duplicate clients being added to by_id hash before the old client
was removed.

Closes #645.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment