-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistent lock binaryState/state in MQTT #51
Comments
Thanks, I'll look into it. |
Thanks to you. Another weird thing, that might be related: I found some strange topics in MQTT that were similar to nuki_hub topics, but with parts missing: _hub_hub with battery info, then I found another weird topic, I made a screenshot yesterday but can't find it...I'll'attach it as soon as I find it. It's like there was some problem in sending data to MQTT from Nuki Hub. It only happened 2 times...I'm on latest version of the firmware. UPDATE: found the screenshot. :) |
I hope it's a sw bug that will be solved. The inconsistency in the status of the Lock I noticed could also depend on this. |
There's nothing obvious, the state and binaryState are always updated together. If it's an MQTT problem, it'll get more complicated. I don't have any such malformed topics, but I'm not using mosquitto. As far as I can see the correct topic is passed to the MQTT stack (pubsubclient), what happens from there I don't know since it's not my code. |
Is it possible that memory on the ESP32 is getting corruped, maybe by a parallel thread? (I am thinking of the WiFi stack in particular, as that would be something you would not experience on your own device?) |
I've reviewed the code building MQTT paths. It's a rather simple method, still there were some stupid mistakes which should be fixed now, you can try release 6.1. If this fixes the actual issue with the binary state I don't know, please test. |
I don't use mosquitto either, I use EMQX v5. So same issue with 2 different brokers.
That's what I hoped: some sw bug that could easily be solved. Thank you. Strange that it did it never occur to you, and also strange why did it occur to me only after weeks of using Nuki Hub. Usually these stupid bugs trigger more often. :)
That inconsistency condition only happened 1 time since I installed Nuki Hub, so I will monitor things and report back in case I observe it again. What allows me to be optimistic is that I observed those malformed topics and the inconsistency in the same time-frame. Will keep you poste, thank you for reacting so quickly. |
After fixing those errors, I'm somewhat surprised that it did work before. I'm using iobrokers built-in MQTT broker, maybe it handles garbled messages/paths better than mosquitto. |
I think you have to follow a strict protocol/spec in order for an MQTT broker to accept data, so if the integrated broker is a true MQTT broker, it means it received the topics malformed, but correct from an MQTT spec perspective, and so it should process them as EMQX and Mosquitto did. The fact that it worked before also for me for almost a month will remain a magic mistery. :) I have some other issues with the web interface (also with previous 6.0 fw), pages render with no formatting often, if I refresh a couple of times I see the formatting (black background, etc.), but sometimes it throws an http error (a connection reset error) then after 2-3s it appears again. It seems like http connections are not managed very well...that's the feeling. Thanks a lot for your support. |
Probably related to issue #48, let's follow up there. |
Im still seeing this issue:
Using 6.1 |
didn't occur to me yet, after 6.1. But I used nuki_hub for quite some time without the issue, then one day it happened...when I started seeing malformed topics in mqtt. Do you have malformed topics? open all nuki_hub subtopics to check for weird names... |
Let's keep in mind that the malformed topics aren't necessarily related to the incosistent states. |
For me it seems te be related to having multiple nuki locks: #53 I have moved one back to another project and now I don't see this anymore. |
Correct, we don't know, that's why I asked, so we can check for patterns. |
I have updated my device to 6.1 and cleaned out all the messed up topics using |
The question is where this happens. As far as I can tell, the paths should be built correctly now. After that comes the MQTT library. Just to verify everything is correct on my side, I've built a version that outputs all paths to the serial console. |
Since 6.1 I haven't seen malformed topics, but I guess it's too soon to say it's fixed. I'll keep monitoring it and keep you posted. |
Unfortunately, for me they are still there. To make sure they are not artifacts, I cleaned up again after posting #51 (comment) and this is a screenshot of today: |
I attempted to find issues in the code but the only thing I found that could cause memory issues is a RSSI value that would not be not integer (not even sure that this possible). The itoa() would cause unexpected results. Or maybe it's another bug in the It would mean that checking for end of strings with "0x00" would not always work or that the fact that the generated path itself would cause issues because its array is too big and padded with "0x00". I am not familiar with this method of initializing string arrays with 0x00 so sorry if I say stupid things. |
@Mincka In C (and derived languages like C++), strings are pointers to zero-terminated arrays of characters. |
It's a mystery to me why this is happening. There's nothing too special about the rssi node, it's published via the publishInt() method, which is used by for a few other nodes too (batteryDrain, lockDistance, ledBrightness). Maybe the bug is triggered more easily with the rssi node because it's constantly updated ... same goes for wifiRssi and presenceDetection though. |
I have the issue with other nodes as well (see previous screenshots). It is not limited to any single topic. |
@technyon Do you have this issue yourself? Maybe we have something else in common that can be causing this. @giejay was thinking about something related to multiple locks. I have two locks with one ESP M5 Atom per lock. Do you have multiple locks @mundschenk-at and @alexdelprete ? |
Only one lock. Also, the ESP can only be paired to one lock anyway. |
I've thought about switching to to QoS 1, the library does support it ... it's just that QoS 0 is the default. It also supports QoS 2, there's some issue with it open though, I'd need to check. |
At least QoS 1 sounds like a good idea for this usecase. |
QoS 2 is a little bit too much for resource constrained device, I think QoS 1 is a good compromise. Do we need to tackle duplication in our use case? I'm studying a little bit MQTT because I never did before, and it's interesting. I wanted to understand better this Clean Sessions vs Persistent Sessions thing. And I think we definitely want to use Persistent Sessions, because it's true NH mainly publishes messages, but it's also a consumer of very important messages (commands) that it doesn't want to lose (e.g. after a sudden reboot), so with Persistent Sessions and QoS 1 we tell the broker to store those in a queue until they are consumed by NH: |
@alexdelprete Any news about the official MQTT support by the way? Did they have such details in their specs? Edit: Replying to myself. From your post: |
I'm not following them anymore, had a harsh discussion with Jurgen after I asked him a simple question: will MQTT support be available for the Nuki Bridge? He didn't answer clearly. So I pushed him again and he didn't want to answer. So from what I understand, they only want to support MQTT on SL3 model, to push sales (old SL2 users will have to upgrade). So I got a little bit upset and wrote in the forum that would be my last post supporting the initiative, and I also wrote that with a $10 device and Nuki Hub project we have the best bridge there is, with active developement and everything working fine (almost...). And I left them to their destiny...I don't like their approach, no transparency, they lie on functionalities (remember zigbee support? never happened in the end). On the technical side: QoS=2 might be a little overkill for a low resource device, but we could experiment, maybe a configurable QoS option with 1/2 to test things and then decide? cc @technyon |
Thanks for the feedback. Their hybrid approach with QoS=2 only for device control seems to make sense but is more complicated to implement. Is it possible that with QoS=1, we'd risk to have an |
I think the hybrid approach is the best one: in the end, that's why we have different QoS levels. Not all messages have the same intrinsic importance. As you said, Lock/Unlock should definitely have QoS=2 because of the risk of duplication. But the duplication can also be managed at the client level, ensuring you never send 2 times those commands by waiting for the ACK, that QoS=1 provides. It's a bit like choosing UDP vs TCP. |
QoS=0 is the worst of all, duplication is an issue at all lower levels than QoS=2, it's not a specific issue of QoS=1. So for me the baseline should be QoS=1, and QoS=2 for very critical commands. |
I'm a bit hesitant to go to QOS 2 because: arduino-libraries/ArduinoMqttClient#64 You can give QOS 1 a try though: I don't think receiving the lock action twice will cause any issues ... it's unlikely anyway everything is running over TCP. |
it should answer with PUBREL and wait for PUBCOMP. You are right, let's wait for them to anwer to that.
This has the CS flag set to false, right? QoS=1 goes with CS=False. |
From what I see in EMQX (nice admin interface, and there's even a HA addon in the works, thanks for the recommendation, @alexdelprete), it does not. |
@mundschenk-at thanks. I didn't know about the addon, it would be great, do you have a link? @technyon if it's not too much work, can you prepare a test fw with all options we discussed configurable?
It would be good so we can play with the settings and find a good configuration. |
Any chance you could compile the binaries yourself, would make a lot easier. Check the readme, I've provider a virtual machine to compile. RSSI interval is already configurable via the web interface. I've put QOS and CS into "Config.h", you just have to change the defines and recompile. |
@alexdelprete It's still quite new, but in the Community repository: https://github.com/hassio-addons/addon-emqx |
Oh, I don't follow the addons because I use HA Core. Every app/service is installed outside of HA environment (docker and LXCs), and then integrated through integrations. It's good that Frenck provided a better alternative MQTT broker. |
Sure, I'll try. If I have problems I'll let you know. Thanks. |
UPDATE: nevermind, I forgot to |
@technyon Jan, where do I set this? I need it to check what I have enabled/disabled while testing. |
I had already changed Thanks. |
They could be merged into one file actually. |
Yeah...probably it would be a good choice. Let me know if you merge them. |
So what are you testing first @alexdelprete ? |
I didn't allocate the time yet to do a proper test. For now I allocated time to import the VM in my Proxmox and customize the image because the settings weren't working for me. Now the VM is ok and I know how to compile etc., but I need to schedule the test activity. Probably by tomorrow. :) I will let you know...the only worry I have is that actually I'm using |
I understand. If it ain't broke, don't fix it. ;) That's better with your stable setup because we'll know if potential issues are related to the QoS overload. In my case, I could certainly not deduce anything from playing with this yet. |
I'm a tinkerer...so I get bored in steady states...:) I will play with settings and try to tell you if I notice some degradation. Also, I want to experiment with QoS=1. |
BTW: if you guys use discord, I setup a dedicated chat for Nuki Hub, you can join here. |
Today I noticed that lock sensor in HA was reporting unlocked when Nuki was locked, I checked MQTT and noticed this:
You will notice that
lock/binaryState
isunlocked
butlock/state
islocked
.I restarted Nuki Hub to force it to reset the states from the lock and everything was good again.
BTW: is it possible to add a restart button in the UI? right now I go into settings and hit save to force it to restart)
The text was updated successfully, but these errors were encountered: