Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retain bit not set on messages published to bridges (protocol level 131) #1223

Closed
trarbr opened this issue Jun 20, 2019 · 8 comments · Fixed by #1226 or #1253
Closed

Retain bit not set on messages published to bridges (protocol level 131) #1223

trarbr opened this issue Jun 20, 2019 · 8 comments · Fixed by #1226 or #1253

Comments

@trarbr
Copy link
Contributor

trarbr commented Jun 20, 2019

Environment

  • VerneMQ Version: v1.8.0
  • OS: erlio/docker-vernemq
  • Erlang/OTP version (if building from source): N/A
  • VerneMQ configuration (vernemq.conf) or the changes from the default: N/A
  • Cluster size/standalone: Standalone
  • Mosquitto version: 1.6.3

Expected behavior

Imagine the following setup


                       +-----------+            +-----------+
+------------+         |           <------------+           |       +------------+
|            +<--------+ Mosquitto |            |  VerneMQ  +<------+            |
|  Client B  |         | Bridge    |            |  central  |       |  Client A  |
|            +-------->+           +----------->+           +------>+            |
+------------+         |           |            |           |       +------------+
                       +-----------+            +-----------+

The Mosquitto broker is connected to the VerneMQ broker as a bridge client, using the private bridge protocol (131). When Client A publishes a message to the VerneMQ broker with Retain set to 1, the Mosquitto bridge should receive the message with Retain set to 1.

Actual behaviour

In the above scenario, the Retain bit is currently set to 0 when sent from VerneMQ broker to Mosquitto bridge. Also, if Client B publishes a message to the Mosquitto bridge with Retain set to 1, VerneMQ receives the message with Retain set to 1 (as expected). So it looks like this:

                       +-----------+    R: 0    +-----------+
+------------+  R: 0   |           <------------+           | R: 1  +------------+
|            +<--------+ Mosquitto |            |  VerneMQ  +<------+            |
|  Client B  |         | Bridge    |            |  central  |       |  Client A  |
|            +-------->+           +----------->+           +------>+            |
+------------+  R: 1   |           |    R: 1    |           | R: 0  +------------+
                       +-----------+            +-----------+

But it should look like this (only thing changed is the Retain bit from VerneMQ to Mosquitto):

                       +-----------+ ** R: 1 **  +-----------+
+------------+  R: 0   |           <------------+           | R: 1  +------------+
|            +<--------+ Mosquitto |            |  VerneMQ  +<------+            |
|  Client B  |         | Bridge    |            |  central  |       |  Client A  |
|            +-------->+           +----------->+           +------>+            |
+------------+  R: 1   |           |    R: 1    |           | R: 0  +------------+
                       +-----------+            +-----------+

In fact, if the VerneMQ broker is swapped with a Mosquitto broker, the Retain bit is propagated as expected.

The following steps first reproduce the expected behaviour using Mosquitto, and then the behaviour with VerneMQ:

First create two configuration files for the Mosquitto brokers:

# mosq-central.conf
listener 1883 127.0.0.1
allow_anonymous true
# mosq-bridge.conf:
listener 11883 127.0.0.1

# Bridge configuration
connection central
address 127.0.0.1:1883
topic # out 1 to/central/ from/bridge/
topic # in 0 from/central/ to/bridge/
remote_clientid bridge_remote
local_clientid bridge_local
cleansession false
bridge_protocol_version mqttv31
try_private true
restart_timeout 10
  1. Start the mosq-central broker:
    /usr/local/sbin/mosquitto -c mosq-central.conf

  2. Start the mosq-bridge broker:
    /usr/local/sbin/mosquitto -c mosq-bridge.conf

  3. Start a client A that subscribes to mosq-central:
    mosquitto_sub -p 1883 -t '#' -F 'topic="%t" payload="%p" retained="%r"'

  4. Start a client B that subscribes to mosq-bridge:
    mosquitto_sub -p 11883 -t '#' -F 'topic="%t" payload="%p" retained="%r"'

  5. Publish a retained message to mosq-central on a bridged topic:
    mosquitto_pub -p 1883 -t 'to/bridge/central-says' -m '1' -r
    Client A prints (as expected)
    topic="to/bridge/central-says" payload="1" retained="0"
    Client B prints (as expected)
    topic="from/central/central-says" payload="1" retained="0"

  6. Now disconnect client A and reconnect it. Client A prints (as expected)
    topic="to/bridge/central-says" payload="1" retained="1"

  7. Now disconnect client B and reconnect it. Client B prints (as expected)
    topic="from/central/central-says" payload="1" retained="1"

Stop the mosq-central broker and start a VMQ broker instead: docker run -e "DOCKER_VERNEMQ_ALLOW_ANONYMOUS=on" -p 1883:1883 erlio/docker-vernemq:1.8.0-alpine. Go through steps 2-7. Everything will be the same, except step 7: When reconnected, Client B does not receive the retained message (as the Mosquitto bridge did not receive the message with the Retain bit set to 1).

@larshesel
Copy link
Contributor

Hi @trarbr

Thanks for the detailed write-up - we'll take a look as soon as possible!

@larshesel
Copy link
Contributor

Hi @trarbr

I just took a quick look at this and created a patch for it, see #1226. It would be awesome if you could test it out (it of course have to be approved in review as well).

@trarbr
Copy link
Contributor Author

trarbr commented Jun 21, 2019

Hi @larshesel

Thanks for the quick response. I just tested it out. Unfortunately, it does not solve the issue. Retained messages forwarded from VerneMQ to the Mosquitto bridge client still do not have the retain bit set to 1.

It seems like the changes in the PR relate to vmq_bridge, which I assume is only relevant when VerneMQ is a bridge client. Fixing that sounds like a good idea as well, but in our situation, VerneMQ is the central broker that the Mosquitto bridge client is connecting to.

@trarbr
Copy link
Contributor Author

trarbr commented Jul 5, 2019

I noticed that bridging is actually incorporated in MQTT v5, so I guess the features needed might already be in your MQTT v5 handler?

This is mentioned in the Subscription Options part of the MQTT v5 spec. To quote:

Bit 3 of the Subscription Options represents the Retain As Published option. If 1, Application Messages forwarded using this subscription keep the RETAIN flag they were published with. If 0, Application Messages forwarded using this subscription have the RETAIN flag set to 0. Retained messages sent when the subscription is established have the RETAIN flag set to 1.

I dived into the codebase a bit, and found this which handles the "retain as published" flag. I wonder if it might be possible to set this "retain as published" flag in the subscriber info for connections using protocol 131? I guess that should just fix it? (OK, I bet it's not simply, but hopefully food for thought...)

If not, we will just have to wait until Mosquitto's bridge can connect using MQTT v5 😄

@trarbr
Copy link
Contributor Author

trarbr commented Jul 5, 2019

Digging into it a little more...

The SubInfo tuple is created by vmq_fsm_util:to_vmq_subtopics/2, which is invoked here from vmq_mqtt_fsm.

Would it be acceptable to either add a special case to vmq_fsm_util:to_vmq_subtopics/2, or to handle it in vmq_mqtt_fsm with something like this?

connected(#mqtt_subscribe{message_id=MessageId, topics=Topics}, State) ->
    ...
                SubTopics = subtopics(MaybeChangedTopics, map:get(proto, State))  
   ...
.

%% Internal functions

subtopics(Topics, 131 = _proto) ->
    SubTopics = vmq_mqtt_fsm_util:to_vmq_subtopics(MaybeChangedTopics, undefined)
    lists:map(fun({T, QoS}) -> {T, #{rap => true}} end);
subtopics(Topics, _proto) ->
    vmq_mqtt_fsm_util:to_vmq_subtopics(MaybeChangedTopics, undefined).

(Please excuse my bad pseudo-erlang)

@larshesel
Copy link
Contributor

It seems like the changes in the PR relate to vmq_bridge, which I assume is only relevant when VerneMQ is a bridge client. Fixing that sounds like a good idea as well, but in our situation, VerneMQ is the central broker that the Mosquitto bridge client is connecting to.

Hi @trarbr sorry for me confusing the issue - I'd taken a quick look and noticed that the VerneMQ bridge didn't forward/handle the retained flag at all, which was wrong and then I assumed this was the problem...

So, I just looked at the MQTT (3.1.1) spec to see how the retain bit is handled, and the relevant snippet is this:

When sending a PUBLISH Packet to a Client the Server MUST set the RETAIN flag to 1 if a message is sent as a result of a new subscription being made by a Client [MQTT-3.3.1-8]. It MUST set the RETAIN flag to 0 when a PUBLISH Packet is sent to a Client because it matches an established subscription regardless of how the flag was set in the message it received [MQTT-3.3.1-9]. 

I'm pretty sure VerneMQ forwards the bit like this: when a retained message is published and a subscription already existed the retain flag is 0 and it is only set to 1 when the subscription is made an an retained message is already stored.

Does this make sense? Perhaps the tests you did didn't take this into account?

@trarbr
Copy link
Contributor Author

trarbr commented Jul 9, 2019

Hi @larshesel, no problem - the bridge terminology is quite confusing to me as well.

Yes, VerneMQ's current handling of the retain bit is exactly what is described in MQTT 3.1.1. This matches our observations. And this is great in the majority of cases, except when the clients connect with the try_private protocol (131). The reason protocol 131 exists is exactly to change the handling of retained bits (and to do loop detection). And both of these features were added to MQTT 5 as part of the subscription options (bits 2 and 3), in order to make bridges first-class citizens.

So while VerneMQ currently accepts connections with protocol 131, it doesn't actually implement these changes to the MQTT protocol. Since the logic for Retain as Published (and no-local I assume) is already implemented for MQTT 5, I imagine VerneMQ could implement protocol 131 by piggybacking on the logic you implemented for MQTT 5.

@trarbr
Copy link
Contributor Author

trarbr commented Jul 13, 2019

I just built VerneMQ from master and went through the steps outlined above - and I am happy to say that the issue with retained bits has been fixed 👏 Thank you so much for your work on VerneMQ 🍰

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants