Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MQTT5 "flow control" gets out of sync, prevents message sending (IDFGH-9028) #243

Closed
egnor opened this issue Dec 23, 2022 · 3 comments
Closed

Comments

@egnor
Copy link

egnor commented Dec 23, 2022

Note, I can spin up a PR to implement my recommended fix (see below) if that's helpful.

Repro case

  • make a program that sends MQTT5 QOS1 messages regularly (like 5/sec or something)
  • temporarily freeze the MQTT server (e.g. skill -STOP mosquitto, or whatever)
  • wait 45 seconds or so
  • unfreeze the MQTT server (e.g. skill -CONT mosquitto)
  • observe that, despite thawing the server, no further messages are found

This is just one example case, there are others (see analysis below).

While the server is frozen, this kind of message shows up

E (567293) mqtt5_client: Client send more than 20 QoS1 and QoS2 PUBLISH packet without no ack
I (567299) mqtt_client: esp_mqtt_client_enqueue check fail
E (567305) mqtt5_client: Client send more than 20 QoS1 and QoS2 PUBLISH packet without no ack
I (567313) mqtt_client: esp_mqtt_client_enqueue check fail
D (567319) mqtt_client: No data received in 0ms (all is quiet)
D (567324) mqtt_client: Sending Duplicated QoS1 message with id=11933
D (567330) mqtt5_client: Sent (49) qos > 0 publish packet without ack

In the failure state, debug messages look like this

E (599892) mqtt5_client: Client send more than 20 QoS1 and QoS2 PUBLISH packet without no ack
I (599898) mqtt_client: esp_mqtt_client_enqueue check fail
E (599904) mqtt5_client: Client send more than 20 QoS1 and QoS2 PUBLISH packet without no ack
I (599912) mqtt_client: esp_mqtt_client_enqueue check fail
D (600129) mqtt_client: No data received in 0ms (all is quiet)

Analysis

As per the MQTT5 spec, the MQTT5 client implements flow control. When a QOS=1 or QOS=2 PUBLISH is sent, esp_mqtt5_flow_control() increments client->send_publish_packet_count:

void esp_mqtt5_flow_control(esp_mqtt5_client_handle_t client)

When a PUBACK is received, esp_mqtt5_parse_puback() decrements the counter:

void esp_mqtt5_parse_puback(esp_mqtt5_client_handle_t client)

This counter is checked in `esp_mqtt5_client_publish_check():

if (client->send_publish_packet_count >= client->mqtt5_config->server_resp_property_info.receive_maximum) {

This is a bit fragile because any corner case that fails to increment or decrement appropriately can lead to an out-of-sync, stuck counter that's permanently too high... so it needs to be carefully checked to ensure that ALL transmissions of PUBLISH and receptions of PUBACK are logged correctly. In this case, I see at least three defects:

  1. esp_mqtt5_parse_puback() is only called from mqtt_process_receive() if is_valid_mqtt_msg() is true for the message-ID in question. However, the counter should be decremented for any PUBACK, even if, say, the original message has been deleted due to expiration
  2. esp_mqtt5_client_publish_check() isn't called for DUP retransmissions
  3. The counter isn't reset to zero on reconnection (as far as I can tell)

Recommendation?

Split out the increment/decrement logic from other code, since it needs to be called very specifically:

  • Rename esp_mqtt5_flow_control() to esp_mqtt5_track_outgoing_publish(), make sure it's called as close as possible to where a PUBLISH would be sent (I think this is good)
  • Move the code from esp_mqtt5_parse_puback() into a new esp_mqtt5_track_incoming_puback(), call it directly when a PUBACK is received before the message ID validity check
  • Fix the decrementing code to set a floor at zero, per the spec
  • Fix the DUP-resending code to call esp_mqtt5_client_publish_check()
  • Reset the counter for a new connection (find some logical place to do that)
@github-actions github-actions bot changed the title MQTT5 "flow control" gets out of sync, prevents message sending MQTT5 "flow control" gets out of sync, prevents message sending (IDFGH-9028) Dec 23, 2022
@ESP-YJM
Copy link
Contributor

ESP-YJM commented Dec 26, 2022

@egnor Thanks for reporting it. I will fixed it.

@ESP-YJM
Copy link
Contributor

ESP-YJM commented Dec 26, 2022

@egnor Could you please try the pacth whether can solve the issue you met.
0001-mqtt5-Fix-flow-control-will-regard-the-DUP-packet-an.patch

@egnor
Copy link
Author

egnor commented Dec 28, 2022

@ESP-YJM Yes, that seems to work, at least in my testing!

espressif-bot pushed a commit to espressif/esp-idf that referenced this issue Sep 4, 2023
git log --oneline dffabb067fb3c39f486033d2e47eb4b1416f0c82..301bd9e0282eb993523b84d6c56d2559346aa071

Detailed description of the changes:
* fix: Added missing update to message data
  - See merge request espressif/esp-mqtt!189
  - See commit espressif/esp-mqtt@cc41d1b
* PR: fixing typos in `mqtt5_error_reason_code`
  - See merge request espressif/esp-mqtt!188
  - feat: Add enum definition with typo to keep backwards compatibility (espressif/esp-mqtt@90b4a45)
  - fixing typos in `mqtt5_error_reason_code` (espressif/esp-mqtt@dc775bb)
* docs: Clarify keepalive timeout
  - See merge request espressif/esp-mqtt!186
  - See commit espressif/esp-mqtt@cb1e6cf
* fix: LOG format strings
  - See merge request espressif/esp-mqtt!187
  - See commit espressif/esp-mqtt@a3b04f2
* Fix: Stop client only if it's running.
  - See merge request espressif/esp-mqtt!183
  - See commit espressif/esp-mqtt@36eec6f
* fix: Error on publish message creation was ignored.
  - See merge request espressif/esp-mqtt!185
  - See commit espressif/esp-mqtt@585e3ba
* Fix: Allocation for connection buffer was incorrectly done.
  - See merge request espressif/esp-mqtt!182
  - See commit espressif/esp-mqtt@6c849c6
* Adds mqtt host tests to Ci
  - See merge request espressif/esp-mqtt!181
  - See commit espressif/esp-mqtt@4050df4
* Fix: Outbox was leaked in case of initialization failure
  - See merge request espressif/esp-mqtt!180
  - See commit espressif/esp-mqtt@5d491a4
* feat: Add option to bind interface of use
  - See merge request espressif/esp-mqtt!179
  - Closes espressif/esp-mqtt#253
  - See commit espressif/esp-mqtt@363fbf7
* Add outbox size control feature
  - See merge request espressif/esp-mqtt!141
  - feat: Introduces outbox limit (espressif/esp-mqtt@372ab7b)
  - Removes unused outbox functions. (espressif/esp-mqtt@21a5491)
  - refactor: Group access to output buffer in mqtt_connection_t (espressif/esp-mqtt@122875b)
* Add custom transport configuration
  - See merge request espressif/esp-mqtt!169
  - feat: Add custom transport configuration (espressif/esp-mqtt@a5c1b44)
* Removes leftover calls to event_handler
  - See merge request espressif/esp-mqtt!178
  - See commit espressif/esp-mqtt@a492935
* PR: Added support to set server common name.
  - See merge request espressif/esp-mqtt!173
  - Added support to set server common name. (espressif/esp-mqtt@6195762)
* Merge branch 'bugfix/return_on_qos0_disconnected' into 'master'
  - See merge request espressif/esp-mqtt!175
  - See commit espressif/esp-mqtt@5bd9724
* Merge branch 'bugfix/cpp_compilation' into 'master'
  - See merge request espressif/esp-mqtt!174
  - See commit espressif/esp-mqtt@70cbaca
* bug: Incorrect return on disconnect qos0 publish
  - See merge request espressif/esp-mqtt!172
* Fix: Compilation in C++ with multiple subscribe
  - See merge request espressif/esp-mqtt!171
* ci: Fix qemu build against 5.1
  - See merge request espressif/esp-mqtt!170
  - See commit espressif/esp-mqtt@94defb8
* Minor cleanups on mqtt client
  - See merge request espressif/esp-mqtt!168
  - Removes pending message count (espressif/esp-mqtt@da6d38a)
  - Bugfix: Dispatch transport error on all write operations (espressif/esp-mqtt@5729048)
  - Merge enqueue functions (espressif/esp-mqtt@72833c7)
* Adds a configuration for outbox data destination
  - See merge request espressif/esp-mqtt!166
  - feat: Adds a configuration for outbox data destination (espressif/esp-mqtt@2c71f9e)
* CI: Add configuration for ttfw
  - See merge request espressif/esp-mqtt!167
  - See commit espressif/esp-mqtt@d71dcf3
* mqtt5: Fix flow control will increase count when send fragmented packet
  - See merge request espressif/esp-mqtt!164
  - Closes espressif/esp-mqtt#255
  - See commit espressif/esp-mqtt@5cce2c4
* Add license information to queue
  - See merge request espressif/esp-mqtt!163
  - See commit espressif/esp-mqtt@36f0faa
* MQTTv5: Fixes and additions from GitHub PRs
  - See merge request espressif/esp-mqtt!162
  - Merges espressif/esp-mqtt#250
  - fix: Allow MQTT v5 zero length payload (espressif/esp-mqtt@65a4fda)
  - feature: Include subscribe_id in esp_mqtt5_event_property_t (espressif/esp-mqtt@1011e63)
* Feature:  Enable SUBSCRIBE to multiple topics
  - See merge request espressif/esp-mqtt!156
  - See commit espressif/esp-mqtt@3210255
* Adds Kconfig option to configure poll read timeout
  - See merge request espressif/esp-mqtt!159
  - Closes: espressif/esp-mqtt#245
  - See commit espressif/esp-mqtt@2fa945d
* Fix formatting when using printf nano
  - See merge request espressif/esp-mqtt!160
  - See commit espressif/esp-mqtt@e9b865e
* mqtt5: Fix flow control will regard the DUP packet and not consider PUBCOMP packet
  - See merge request espressif/esp-mqtt!158
  - Closes espressif/esp-mqtt#243
  - See commit espressif/esp-mqtt@ed76036
* Remove possible null pointer dereferences
  - See merge request espressif/esp-mqtt!157
  - Bugfix: Remove Remove possible null pointer dereferences (espressif/esp-mqtt@f80772b)
espressif-bot pushed a commit to espressif/esp-idf that referenced this issue Sep 18, 2023
* Update submodule:
git log --oneline dffabb067fb3c39f486033d2e47eb4b1416f0c82..05b347643f6679cc60a50a9664b86a24ebf3ae10

Detailed description of the changes:
* Fix: Mock test should include idf_additions.h
  - See merge request espressif/esp-mqtt!191
  - fix: Mock tests include idf_additions.h (espressif/esp-mqtt@f35aaa1)
* Adds missing documentation to outbox configuration.
  - See merge request espressif/esp-mqtt!190
  - docs: Adds missing documentation to outbox configuration. (espressif/esp-mqtt@c355e0b)
* fix: Added missing update to message data
  - See merge request espressif/esp-mqtt!189
  - See commit espressif/esp-mqtt@cc41d1b
* PR: fixing typos in `mqtt5_error_reason_code`
  - See merge request espressif/esp-mqtt!188
  - feat: Add enum definition with typo to keep backwards compatibility (espressif/esp-mqtt@90b4a45)
  - fixing typos in `mqtt5_error_reason_code` (espressif/esp-mqtt@dc775bb)
* docs: Clarify keepalive timeout
  - See merge request espressif/esp-mqtt!186
  - See commit espressif/esp-mqtt@cb1e6cf
* fix: LOG format strings
  - See merge request espressif/esp-mqtt!187
  - See commit espressif/esp-mqtt@a3b04f2
* Fix: Stop client only if it's running.
  - See merge request espressif/esp-mqtt!183
  - See commit espressif/esp-mqtt@36eec6f
* fix: Error on publish message creation was ignored.
  - See merge request espressif/esp-mqtt!185
  - See commit espressif/esp-mqtt@585e3ba
* Fix: Allocation for connection buffer was incorrectly done.
  - See merge request espressif/esp-mqtt!182
  - See commit espressif/esp-mqtt@6c849c6
* Adds mqtt host tests to Ci
  - See merge request espressif/esp-mqtt!181
  - See commit espressif/esp-mqtt@4050df4
* Fix: Outbox was leaked in case of initialization failure
  - See merge request espressif/esp-mqtt!180
  - See commit espressif/esp-mqtt@5d491a4
* feat: Add option to bind interface of use
  - See merge request espressif/esp-mqtt!179
  - Closes espressif/esp-mqtt#253
  - See commit espressif/esp-mqtt@363fbf7
* Add outbox size control feature
  - See merge request espressif/esp-mqtt!141
  - feat: Introduces outbox limit (espressif/esp-mqtt@372ab7b)
  - Removes unused outbox functions. (espressif/esp-mqtt@21a5491)
  - refactor: Group access to output buffer in mqtt_connection_t (espressif/esp-mqtt@122875b)
* Add custom transport configuration
  - See merge request espressif/esp-mqtt!169
  - feat: Add custom transport configuration (espressif/esp-mqtt@a5c1b44)
* Removes leftover calls to event_handler
  - See merge request espressif/esp-mqtt!178
  - See commit espressif/esp-mqtt@a492935
* PR: Added support to set server common name.
  - See merge request espressif/esp-mqtt!173
  - Added support to set server common name. (espressif/esp-mqtt@6195762)
* Merge branch 'bugfix/return_on_qos0_disconnected' into 'master'
  - See merge request espressif/esp-mqtt!175
  - See commit espressif/esp-mqtt@5bd9724
* Merge branch 'bugfix/cpp_compilation' into 'master'
  - See merge request espressif/esp-mqtt!174
  - See commit espressif/esp-mqtt@70cbaca
* bug: Incorrect return on disconnect qos0 publish
  - See merge request espressif/esp-mqtt!172
* Fix: Compilation in C++ with multiple subscribe
  - See merge request espressif/esp-mqtt!171
* ci: Fix qemu build against 5.1
  - See merge request espressif/esp-mqtt!170
  - See commit espressif/esp-mqtt@94defb8
* Minor cleanups on mqtt client
  - See merge request espressif/esp-mqtt!168
  - Removes pending message count (espressif/esp-mqtt@da6d38a)
  - Bugfix: Dispatch transport error on all write operations (espressif/esp-mqtt@5729048)
  - Merge enqueue functions (espressif/esp-mqtt@72833c7)
* Adds a configuration for outbox data destination
  - See merge request espressif/esp-mqtt!166
  - feat: Adds a configuration for outbox data destination (espressif/esp-mqtt@2c71f9e)
* CI: Add configuration for ttfw
  - See merge request espressif/esp-mqtt!167
  - See commit espressif/esp-mqtt@d71dcf3
* mqtt5: Fix flow control will increase count when send fragmented packet
  - See merge request espressif/esp-mqtt!164
  - Closes espressif/esp-mqtt#255
  - See commit espressif/esp-mqtt@5cce2c4
* Add license information to queue
  - See merge request espressif/esp-mqtt!163
  - See commit espressif/esp-mqtt@36f0faa
* MQTTv5: Fixes and additions from GitHub PRs
  - See merge request espressif/esp-mqtt!162
  - Merges espressif/esp-mqtt#250
  - fix: Allow MQTT v5 zero length payload (espressif/esp-mqtt@65a4fda)
  - feature: Include subscribe_id in esp_mqtt5_event_property_t (espressif/esp-mqtt@1011e63)
* Feature:  Enable SUBSCRIBE to multiple topics
  - See merge request espressif/esp-mqtt!156
  - See commit espressif/esp-mqtt@3210255
* Adds Kconfig option to configure poll read timeout
  - See merge request espressif/esp-mqtt!159
  - Closes: espressif/esp-mqtt#245
  - See commit espressif/esp-mqtt@2fa945d
* Fix formatting when using printf nano
  - See merge request espressif/esp-mqtt!160
  - See commit espressif/esp-mqtt@e9b865e
* mqtt5: Fix flow control will regard the DUP packet and not consider PUBCOMP packet
  - See merge request espressif/esp-mqtt!158
  - Closes espressif/esp-mqtt#243
  - See commit espressif/esp-mqtt@ed76036
* Remove possible null pointer dereferences
  - See merge request espressif/esp-mqtt!157
  - Bugfix: Remove Remove possible null pointer dereferences (espressif/esp-mqtt@f80772b)
CommanderRedYT pushed a commit to CommanderRedYT/esp-idf that referenced this issue Sep 24, 2023
git log --oneline dffabb067fb3c39f486033d2e47eb4b1416f0c82..301bd9e0282eb993523b84d6c56d2559346aa071

Detailed description of the changes:
* fix: Added missing update to message data
  - See merge request espressif/esp-mqtt!189
  - See commit espressif/esp-mqtt@cc41d1b
* PR: fixing typos in `mqtt5_error_reason_code`
  - See merge request espressif/esp-mqtt!188
  - feat: Add enum definition with typo to keep backwards compatibility (espressif/esp-mqtt@90b4a45)
  - fixing typos in `mqtt5_error_reason_code` (espressif/esp-mqtt@dc775bb)
* docs: Clarify keepalive timeout
  - See merge request espressif/esp-mqtt!186
  - See commit espressif/esp-mqtt@cb1e6cf
* fix: LOG format strings
  - See merge request espressif/esp-mqtt!187
  - See commit espressif/esp-mqtt@a3b04f2
* Fix: Stop client only if it's running.
  - See merge request espressif/esp-mqtt!183
  - See commit espressif/esp-mqtt@36eec6f
* fix: Error on publish message creation was ignored.
  - See merge request espressif/esp-mqtt!185
  - See commit espressif/esp-mqtt@585e3ba
* Fix: Allocation for connection buffer was incorrectly done.
  - See merge request espressif/esp-mqtt!182
  - See commit espressif/esp-mqtt@6c849c6
* Adds mqtt host tests to Ci
  - See merge request espressif/esp-mqtt!181
  - See commit espressif/esp-mqtt@4050df4
* Fix: Outbox was leaked in case of initialization failure
  - See merge request espressif/esp-mqtt!180
  - See commit espressif/esp-mqtt@5d491a4
* feat: Add option to bind interface of use
  - See merge request espressif/esp-mqtt!179
  - Closes espressif/esp-mqtt#253
  - See commit espressif/esp-mqtt@363fbf7
* Add outbox size control feature
  - See merge request espressif/esp-mqtt!141
  - feat: Introduces outbox limit (espressif/esp-mqtt@372ab7b)
  - Removes unused outbox functions. (espressif/esp-mqtt@21a5491)
  - refactor: Group access to output buffer in mqtt_connection_t (espressif/esp-mqtt@122875b)
* Add custom transport configuration
  - See merge request espressif/esp-mqtt!169
  - feat: Add custom transport configuration (espressif/esp-mqtt@a5c1b44)
* Removes leftover calls to event_handler
  - See merge request espressif/esp-mqtt!178
  - See commit espressif/esp-mqtt@a492935
* PR: Added support to set server common name.
  - See merge request espressif/esp-mqtt!173
  - Added support to set server common name. (espressif/esp-mqtt@6195762)
* Merge branch 'bugfix/return_on_qos0_disconnected' into 'master'
  - See merge request espressif/esp-mqtt!175
  - See commit espressif/esp-mqtt@5bd9724
* Merge branch 'bugfix/cpp_compilation' into 'master'
  - See merge request espressif/esp-mqtt!174
  - See commit espressif/esp-mqtt@70cbaca
* bug: Incorrect return on disconnect qos0 publish
  - See merge request espressif/esp-mqtt!172
* Fix: Compilation in C++ with multiple subscribe
  - See merge request espressif/esp-mqtt!171
* ci: Fix qemu build against 5.1
  - See merge request espressif/esp-mqtt!170
  - See commit espressif/esp-mqtt@94defb8
* Minor cleanups on mqtt client
  - See merge request espressif/esp-mqtt!168
  - Removes pending message count (espressif/esp-mqtt@da6d38a)
  - Bugfix: Dispatch transport error on all write operations (espressif/esp-mqtt@5729048)
  - Merge enqueue functions (espressif/esp-mqtt@72833c7)
* Adds a configuration for outbox data destination
  - See merge request espressif/esp-mqtt!166
  - feat: Adds a configuration for outbox data destination (espressif/esp-mqtt@2c71f9e)
* CI: Add configuration for ttfw
  - See merge request espressif/esp-mqtt!167
  - See commit espressif/esp-mqtt@d71dcf3
* mqtt5: Fix flow control will increase count when send fragmented packet
  - See merge request espressif/esp-mqtt!164
  - Closes espressif/esp-mqtt#255
  - See commit espressif/esp-mqtt@5cce2c4
* Add license information to queue
  - See merge request espressif/esp-mqtt!163
  - See commit espressif/esp-mqtt@36f0faa
* MQTTv5: Fixes and additions from GitHub PRs
  - See merge request espressif/esp-mqtt!162
  - Merges espressif/esp-mqtt#250
  - fix: Allow MQTT v5 zero length payload (espressif/esp-mqtt@65a4fda)
  - feature: Include subscribe_id in esp_mqtt5_event_property_t (espressif/esp-mqtt@1011e63)
* Feature:  Enable SUBSCRIBE to multiple topics
  - See merge request espressif/esp-mqtt!156
  - See commit espressif/esp-mqtt@3210255
* Adds Kconfig option to configure poll read timeout
  - See merge request espressif/esp-mqtt!159
  - Closes: espressif/esp-mqtt#245
  - See commit espressif/esp-mqtt@2fa945d
* Fix formatting when using printf nano
  - See merge request espressif/esp-mqtt!160
  - See commit espressif/esp-mqtt@e9b865e
* mqtt5: Fix flow control will regard the DUP packet and not consider PUBCOMP packet
  - See merge request espressif/esp-mqtt!158
  - Closes espressif/esp-mqtt#243
  - See commit espressif/esp-mqtt@ed76036
* Remove possible null pointer dereferences
  - See merge request espressif/esp-mqtt!157
  - Bugfix: Remove Remove possible null pointer dereferences (espressif/esp-mqtt@f80772b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants