Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] MQTT Client Received pubcomp for unknown message ID: XX #1108

Open
1 task
Leviathan09 opened this issue Aug 21, 2023 · 22 comments
Open
1 task

[Bug] MQTT Client Received pubcomp for unknown message ID: XX #1108

Leviathan09 opened this issue Aug 21, 2023 · 22 comments
Assignees
Labels
bug Something isn't working

Comments

@Leviathan09
Copy link

Platform

ESP32

Assembly

I did the assebly by myself

nRF24L01+ Module

nRF24L01+ plus

Antenna

external antenna

Power Stabilization

Elko (~100uF)

Connection picture

  • I will attach/upload an Image of my wiring

Version

0.7.36

Github Hash

ba218ed

Build & Flash Method

AhoyDTU Webinstaller

Setup

System Config is default
Static IP
Only 1 inverter, Intervall 6s, max reries 5
de.pool.ntp.org as NTP
Longtitude and Latitude set with 45 min offset and pause during night
MQTT Broker set with interval 0s, Broker is ioBroker
No Display

Debug Serial Log output

No response

Error description

Since the update from 0.6.0 to 0.7.36 i recieve many MQTT log entries in my ioBroker like this

'Client Received pubcomp for unknown message ID: XX`

I didn't change anything accept updating to a newer version.

@Leviathan09 Leviathan09 added the bug Something isn't working label Aug 21, 2023
@lumapu
Copy link
Owner

lumapu commented Aug 21, 2023

Interessting. I have ioBroker here and two ESPs (8266 and 32). Both of them are connected to the mqtt.0 instance. If I check my log in ioBroker, I can't see anything regarding your reported problem.
Do you use username and password for MqTT?

@Leviathan09
Copy link
Author

Interessting. I have ioBroker here and two ESPs (8266 and 32). Both of them are connected to the mqtt.0 instance. If I check my log in ioBroker, I can't see anything regarding your reported problem. Do you use username and password for MqTT?

Yes, username and password is used but without any special characters or something, just alphabetic character and numbers

@rmayergfx
Copy link

rmayergfx commented Aug 21, 2023

Check your mqtt topics, they changed as result of an issue with Home Assistant. Bu sure to select the right ones after update 0.7.x
#1066

TIpp: Set ntp to your Router IP, this is faster and more stable. You should block AhoyDTU from internet by setting or activating the rules inside your Router.

@Leviathan09
Copy link
Author

Check your mqtt topics, they changed as result of an issue with Home Assistant. Bu sure to select the right ones after update 0.7.x #1066

TIpp: Set ntp to your Router IP, this is faster and more stable. You should block AhoyDTU from internet by setting or activating the rules inside your Router.

iobroker is listening to all mqtt topics, so Home Assistant stuff does not applie here.

Good hint with the router IP for ntp

@rmayergfx
Copy link

rmayergfx commented Aug 23, 2023

Listening to all topics is not the problem, can you see all the topics from ahoydtu in iobroker?
The update changed the structure, so it could be possible that you need to select the right topic again.
Is TasmoAdmin in use? this can also be a reason for the messages!

@BerziOnline
Copy link

I have the same setup and it happens after a few hours the same in my case. Restarting the AhoyDTU or the MQTT adapter in my iobroker fixes the problem directly for the next few hours. It doesn't play a role which side of the MQTT connection I restart, I just have to do it for one side and it's directly working again. But after a few hours the logging starts again with the same messages and the DTU stops reacting on limit control commands over MQTT anymore. (but some values are still updating from DTU to broker, it just seems that the DTU stops reacting on MQTT commands)

@BerziOnline
Copy link

BerziOnline commented Sep 3, 2023

Because the DTU has freezed right now again, I checked the logs even more on iobroker side and I saw, that the reason is a huge amount of these error-messages:
2023-09-03_09-15

The error message sometimes changes:
2023-09-03_09-11_1

I have this problem since 4 days, but one day it wasn't that bad obviously: (very strange)
2023-09-03_09-11

Right now I am getting these messages (although the DTU is full functional and always reachable - distance to the router is 20cm):
2023-09-03_09-19

The log to the same time now on the DTU side:
2023-09-03_09-19_1

Rebooting the DTU directly stops the error messages in the iobroker logs and it's working again.

I have no idea why this is happening. I also investigated the logs trying figure out when this error message starts, but I couldn't see any strange behaviour or commands to the DTU before.

@BerziOnline
Copy link

The bug comes with the step from 0.7.26 to 0.7.36 very sure.

I went over to dev v0.7.45 and the problem still exists. A downgrade to 0.7.26 eliminates the full problem directly.

@Leviathan09
Copy link
Author

The bug comes with the step from 0.7.26 to 0.7.36 very sure.

I went over to dev v0.7.45 and the problem still exists. A downgrade to 0.7.26 eliminates the full problem directly.

Good to known that a downgrade help.
In the meantime i changed the log level of the mqtt adapter to warning which also helped

@I2e4per
Copy link

I2e4per commented Sep 6, 2023

I had this issue too.
I am controlling two inverters over iobroker -> jscript -> mqtt. (Powerlimit)
but i am still on 0.7.45. The only thing i messed around with was that the values i send to the inverter were of type float and not of type string. It expects type string, so i cast them to string.
Furthermore, i rounded the values down to only full decimal values.
Reseted the inverter, rebooted ahoy dtu and since then, this error messages are gone and everything seems working fine.

@Leviathan09
Copy link
Author

I use this script HoymilesZeroExport for controlling my inverter.
But maybe the float/string part could be a problem when ahydtu reports the stuff back to iobroker with mqtt

@I2e4per
Copy link

I2e4per commented Sep 6, 2023

I had a quick look at this *.py script.
as far as i can see there is a cast from float to int, which is basically sort of the same i am doing by rounding to full decimal values.
Therefore, yes. This could be related.

@I2e4per
Copy link

I2e4per commented Sep 6, 2023

see def SetLimit(pLimit):
...
NewLimit = CastToInt(pLimit*Factor)

So the def CastToInt() Method should be changed to Cast ToString() or additional be implemented.
A simple str(..) of the result might also do the trick already.

@I2e4per
Copy link

I2e4per commented Sep 6, 2023

The question is, wht exactly ahoy dtu or iobroker is expecting here.

@BerziOnline
Copy link

I don't think it's the source of the problem. The data sent from my iobroker to the inverter is always a string, because I am using not the percentage as limit, I am using the absulte values. So the limit is a built string with the number and the unit sign "W". This is how the mqtt limit field distinguishes if you want to have an absolute limit or not. So the only chance giving to the same datapoint an absolut limit is in adding "W" to your number, what is only possible if you build a string before. It's described in the documentation of the MQTT part of AhoyDTU.

So as a conclusion:
I already just sending the limit control as a string, because the AhoyDTU exactly needs that to understand it as an absolute value.

@Airwalk007
Copy link

Ich muss diese issue nochmal aufwärmen:

Mein Setup: AhoyDTU 0.8.111 mit 3x HM-800, HoymilesZeroExport-Skript, VzLogger, ioBroker mit MQTT-Server

Seit einigen Versionen (0.8.97 - 0.8.111) habe ich im ioBroker-Protokoll alle 10 Sekunden die folgende Fehlemeldung der MQTT-Servers:

Client [ahoydtu] Received pubrel for unknown message ID

2024-04-19 17_30_28-Window

Die IDs wiederholen sich alle 10 Sekunden bzw. werden irgendwann immer mehr ...

2024-04-19 17_44_48-Window

Diese "Infos" treten nach einigen Stunden nach dem Neustart der Ahoy DTU (oder der MQTT-Instanz) auf. Ich hab nun schon mehrere Ansätze verfolgt - vom Abschalten des HoymilesZeroExport-Skripts über das Aufsetzen einer neuen MQTT-Instanz im ioBroker und die Veränderung des Abfrageintervalls der AhoyDTU mit den Wechselrichtern (höher und niedriger als 10 Sek.) ... nichts hat geholfen (und die Fehlermeldung kam immer alle 10 Sekunden - egal, wie das Abfrageintervall in der AhoyDTU eingestellt ist). MQTT-Intervall steht auf 0 (Standard) in der Ahoy-DTU.

Hat irgendwer noch eine Idee wo man schauen könnte? Diese Problematik trat die letzten Monate nie auf und kam (meiner Meinung nach ohne Veränderungen am Setup) plötzlich. Die Ahoy-DTU habe ich immer geupdated (und die Config übernommen/angepasst) - seit 0.6.xx .... liegt darion vielleicht das Problem? Mich machen die 10 Sekunden stutzig (und die Fehleranzahl - es sind jeweils 50, 52 oder mal 118 Fehler gleichzeitig) ...

@RALFISBORN
Copy link

Gleiches Problem hier bei IOBroker MQTT Adapter V5.2.0 und AhoyDTU V0.8.83
image

@lumapu
Copy link
Owner

lumapu commented May 30, 2024

ich verwendet auch ioBroker, kann den Fehler aber garnicht sehen 🫤

@RALFISBORN
Copy link

bisher konnte ich den Fehler leider nicht mehr reproduzieren.

@stefan123t
Copy link
Collaborator

Ich habe das hier im ioBroker Issue Tracker in einem Kommentar zu 297 gefunden:

In fact whats needed would be a debug log from before and after when it happens to see what exactly the referenced message ids were,

If this error happens on adapter start (like inthe log on Dropbox) I can only assume that the adapter stopped "In the middl of message sending" because the adapter received a pubrel (which is the qos2 answer to pubrec) where he do not know the ID (because IDs were cleared on adapterstart), so I would consider this as an potential normal behaviour for a "restart" of the adapter. This is for the case that the adapter is the receiver of the message

QOS2 process is: 1. sender sends publish -> 2. receiver sends pubrec -> 3. sender sends pubrel -> 4. receiver sends pubcomp

In this case the adapter was ended after he sended out the first pubrec and before he sent out pubcomp and so the sender resends the pubrel, but we do not know these IDs anymore. But the sender remembered that these messages were in the queue unanswered.

We now could persist these inflight messages somehow and restore them on restart ... but we do not do that right now.

So the "pubrel" ones are ok and to be ignored for now ... in fact you know that you somehow did not received these number of messages from the last adapter run ...

@Leviathan09 @BerziOnline @RALFISBORN tritt der Fehler bei Euch noch auf oder können wir das Issue auf der Ahoy Seite schließen ?

@stefan123t stefan123t changed the title MQTT Client Received pubcomp for unknown message ID: XX [Bug] MQTT Client Received pubcomp for unknown message ID: XX Oct 29, 2024
@RALFISBORN
Copy link

@stefan123t Danke für den Hinweis. Bei mir tritt er nicht mehr auf.

@stefan123t
Copy link
Collaborator

Some more details about QOS2 reliability:

MQTT QoS 0, 1, 2 Explained: A Quickstart Guide
https://www.emqx.com/en/blog/introduction-to-mqtt-qos

What is QoS in MQTT?

QoS in MQTT refers to the level of guarantee for message delivery between the publisher and the subscriber. It offers three levels of service:

  • QoS 0 – At most once.
  • QoS 1 – At least once.
  • QoS 2 – Exactly once.

These levels correspond to increasing levels of reliability for message delivery. QoS 0 may lose messages, QoS 1 guarantees the message delivery but potentially exists duplicate messages, and QoS 2 ensures that messages are delivered exactly once without duplication. As the QoS level increases, the reliability of message delivery also increases, but so does the complexity of the transmission process.

In the publisher-to-subscriber delivery process, the publisher specifies the QoS level of a message in the PUBLISH packet. The broker typically forwards the message to the subscriber with the same QoS level. However, in some cases, the subscriber's requirements may necessitate a reduction in the QoS level of the forwarded message.

For example, if a subscriber specifies that they only want to receive messages with a QoS level of 1 or lower, the broker will downgrade any QoS 2 messages to QoS 1 before forwarding them to this subscriber. Messages with QoS 0 and QoS 1 will be transmitted to the subscriber with their original QoS levels unchanged.

MQTT QoS Downgrade

image
...

QoS 2 - Exactly Once

QoS 2 ensures that messages are not lost or duplicated, unlike in QoS 0 and 1. However, it also has the most complex interactions and the highest overhead, as it requires at least two request/response flows between the sender and receiver for each message delivery.

MQTT QoS 2

MQTT QoS 2

  1. To initiate a QoS 2 message transmission, the sender first stores and sends a PUBLISH packet with QoS 2 and then waits for a PUBREC response packet from the receiver. This process is similar to QoS 1, with the exception that the response packet is PUBREC instead of PUBACK.
  2. Upon receiving a PUBREC packet, the sender can confirm that the PUBLISH packet was received by the receiver and can delete its locally stored copy. It no longer needs and cannot retransmit this packet. The sender then sends a PUBREL packet to inform the receiver that it is ready to release the Packet ID. Like the PUBLISH packet, the PUBREL packet needs to be reliably delivered to the receiver, so it is stored for potential retransmission and a response packet is required.
  3. When the receiver receives the PUBREL packet, it can confirm that no additional retransmitted PUBLISH packets will be received in this transmission flow. As a result, the receiver responds with a PUBCOMP packet to signal that it is prepared to reuse the current Packet ID for a new message.
  4. When the sender receives the PUBCOMP packet, the QoS 2 flow is complete. The sender can then send a new message with the current Packet ID, which the receiver will treat as a new message.

Why Are QoS 2 Messages Not Duplicated?

The mechanisms used to ensure that QoS 2 messages are not lost are the same as those used for QoS 1, so they will not be discussed again here.

Compared with QoS 1, QoS 2 ensures that messages are not duplicated by adding a new process involving the PUBREL and PUBCOMP packets.

Before we go any further, let's quickly review the reasons why QoS 1 cannot avoid message duplication.

When we use QoS 1, for the receiver, the Packet ID becomes available again after the PUBACK packet is sent, regardless of whether the response has reached the sender. This means that the receiver cannot determine whether the PUBLISH packet it receives later, with the same Packet ID, is a retransmission from the sender due to not receiving the PUBACK response, or if the sender has reused the Packet ID to send a new message after receiving the PUBACK response. This is why QoS 1 cannot avoid message duplication.

MQTT PUBACK

MQTT PUBACK

In QoS 2, the sender and receiver use the PUBREL and PUBCOMP packets to synchronize the release of Packet IDs, ensuring that there is a consensus on whether the sender is retransmitting a message or sending a new one. This is the key to avoiding the issue of duplicate messages that can occur in QoS 1.

MQTT PUBREL and PUBCOMP

MQTT PUBREL and PUBCOMP

In QoS 2, the sender is permitted to retransmit the PUBLISH packet before receiving the PUBREC packet from the receiver. Once the sender receives the PUBREC and sends a PUBREL packet, it enters the Packet ID release process. The sender cannot retransmit the PUBLISH packet or send a new message with the current Packet ID until it receives a PUBCOMP packet from the receiver.

MQTT PUBREC

MQTT PUBREC

As a result, the receiver can use the PUBREL packet as a boundary and consider any PUBLISH packet that arrives before it as a duplicate and any PUBLISH packet that arrives after it as new. This allows us to avoid message duplication at the protocol level when using QoS 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

8 participants