MQTT to Home Assistant breaks every approx 24 hours #14
Same issue. |
Having the same issue, though mine seems to quit long before 24 hours. I will get your new branch loaded up as soon as I can to test. Update - Starting |
Same here. Please note that MAC addresses are defined for both miscale and miflora. Miscale is working, miflora is not.
|
The problem, as I understand it, occurs when another device is connected to
MiFlora when the gateway attempts to read from it. It will fail because you
can only connect one device. But it won’t be able to connect again for some
reason even after MiFlora is available.
As a temporary fix for myself, I have set the refresh to 30min and only use
the phone app in that window. This won’t stop others from trying to connect
and causing it to fail though.
I believe the new branch should solve this. However I have not tested it
myself.
|
Uhm I don't think so. I don't use official mi apps (mi home or flower care) because I love to see plants status directly from HA (in home or remotely). BTW with the new branch miflora doesn't work at all. I went back to the original branch. |
It doesn’t have to be the official app. It can be any BT device that tries
to connect while the gateway is trying.
Anyway. I will try to take a look tonight and post a possible fix.
Thanks for reporting.
|
I made an error on that branch. It's updated (commit rebased), please check now. I don't have a miflora myself so I cannot verify it. The exception about the undeclared mac variable should be fixed. |
Updated: Now I received the first payload, but also an error:
|
I updated the miflora.py worker file. It seems to be working for now: I let it run a while using ./gateway.py -d without issue. I will reboot the Pi and let the service start up on its own and run. I'll keep an eye on it and see if it has the same issue as before. Thank you for your efforts. |
Try connecting via official app while the gateway is connecting. It should
fail. Force close the app and see if the gateway keeps working. If so, the
problem is finally solved.
|
@jokerigno Are you using an RPi? Be sure to use the latest bluez (compiled manually from the latest code). BT / bluez on the RPi 3 / RPi Zero sometimes seems not very stable. That error doesn't provide much info, so I'm not sure what might be wrong. If a later update gets data, maybe it's OK. I've been running the gateway with eq3 thermostats and miscale for many weeks without any problems. Sometimes a BT update randomly fails; I think it's a problem with the BT stack on the RPi or on the devices, so I don't care much about it. From my experience BT is not the most reliable protocol for getting data from devices. That's why I wrote the gateway: to keep it separate from hassio and to ensure only one BT device is updated / connected per data refresh. |
https://scribles.net/updating-bluez-on-raspberry-pi-5-43-to-5-48/ Replace every instance of 5.48 with 5.49 |
Yes, I'm using a Pi3. I've run apt-get update and then installed bluez 5.48 using your link (thank you).
But the issue persists:
|
That error is inside the btlewrap lib, so it looks like there is a bug there. Maybe a newer / older version of that lib would solve this. I'm also not sure there is anything to handle: from that output it looks like you are reading values correctly from both miscale and miflora? If that exception is annoying, https://github.com/ChristianKuehnel/btlewrap looks like the place to debug the issue in that lib. |
I am also on a Pi3 - just using a miflora. The data does not come through after approx. 18 hours and then I need to reboot the pi to make it work again. Not sure how I can help debug. |
@jeroenterheerdt Did you try this branch: https://github.com/zewelor/bt-mqtt-gateway/tree/test_miflora_hanging_fix ? |
@zewelor oops, I missed that. Trying that one now - it seems to run fine (so no errors as above). I will keep running it for a while and see if the same lock-up happens. Thanks! |
@zewelor I am getting this output:
So the first update works fine but then there is an exception after which the update never succeeds. |
That backtrace only shows an error from the btlewrap library. I don't have any experience with it, and I'm not sure what can cause it. Maybe try to raise the issue on that lib's GitHub ( https://github.com/ChristianKuehnel/btlewrap ) so we can get more insight into what's wrong or how to change the code in the miflora worker. Maybe that worker uses the lib incorrectly, but I don't know. It's hard for me to debug without having a miflora device. That's the only worker I'm not using. |
So it looks like you're trying to disconnect from a sensor that you are not connected to. So what I can do is check in the disconnect method whether we are connected at all, and only then call disconnect. I just pushed a patch for this. Please try with the latest version from master: |
Thanks for looking into it. From what I understand there is already a check for being connected or not? ( https://github.com/ChristianKuehnel/btlewrap/blob/cd34f4096360d56d4b64ca5faad45c15bec1ac6c/btlewrap/base.py#L16 ). It looks like it gets disconnected twice? I had a quick look at the miflora code and yours, and I think there is a possible bug here (?): miflora usage: If I understand it correctly there is a double disconnect? On exit and on del? Still, I'm not sure how that would prevent the next values update, as the errors are caught. Maybe there is some weird state left in the bluetooth adapter / wrapper? Some more cleanup needed after each update? The worker is rather simple and recreates the miflora poller on each update ( https://github.com/zewelor/bt-mqtt-gateway/blob/test_miflora_hanging_fix/workers/miflora.py#L30 ). @ChristianKuehnel do you see anything wrong here? |
Well, I wanted to make sure that we disconnect from the device whenever the program, or part of the program, ends. This is to make sure that we leave a clean system state in any case, even if we crash. In the normal case end should be called when exiting the context manager. If that fails for whatever reason, the destructor __del__ should be called anyway. Thus disconnect might be called multiple times, which is safe as long as the disconnect function can deal with multiple calls. It should be able to now, as I check for None before calling disconnect(). |
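A minimal sketch of the "safe to call twice" disconnect described above. The Connection class, its _handle attribute, and the disconnect_calls counter are hypothetical stand-ins for illustration; this is not btlewrap's actual code.

```python
class Connection:
    """Hypothetical connection wrapper (illustrative only, not btlewrap's class)."""

    def __init__(self):
        self._handle = None          # None means "not connected"
        self.disconnect_calls = 0    # counts real teardowns, for illustration

    def connect(self, mac):
        self._handle = "handle-for-" + mac
        return self

    def disconnect(self):
        # Guard: only tear down if we are actually connected,
        # so repeated calls become no-ops.
        if self._handle is None:
            return
        self._handle = None
        self.disconnect_calls += 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.disconnect()
        return False  # do not swallow exceptions from the with body

    def __del__(self):
        # Safety net if __exit__ never ran; harmless if it did.
        self.disconnect()


conn = Connection().connect("AA:BB:CC:DD:EE:FF")
with conn:
    pass              # __exit__ disconnects here
conn.disconnect()     # second call does nothing
print(conn.disconnect_calls)
```

With the guard in place, the context manager exit, an explicit call, and the destructor can all run without a second teardown being attempted.
|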
@ChristianKuehnel how would I use the new version of |
Hey, I tried to reproduce the error and I can't. I tested it with btlewrap 0.0.2. |
I actually tested it with the updated bluepy wrapper from btlewrap. The problem is the same: after a first timeout, it looks like we always get timeouts... |
OK, I think I have a solution for the problem, but I'm not sure about it. @ChristianKuehnel @bbbenji @zewelor
    def __del__(self):
        if self._lock.locked():
            self._lock.release()
|
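A small sketch of why a leaked lock could make every later update time out, and how a __del__ safety net like the one proposed would release it. The Backend class and its class-level _lock are assumptions made for illustration; whether btlewrap shares its lock this way is not confirmed in this thread.

```python
import threading


class Backend:
    """Hypothetical backend; the shared class-level lock is an assumption."""

    _lock = threading.Lock()  # shared by every instance

    def connect(self):
        self._lock.acquire()

    def disconnect(self):
        if self._lock.locked():
            self._lock.release()

    def __del__(self):
        # Safety net: release the lock if disconnect() never ran.
        if self._lock.locked():
            self._lock.release()


first = Backend()
first.connect()   # lock taken; imagine the call is interrupted before disconnect()
del first         # CPython runs __del__ once the last reference is dropped

second = Backend()
acquired = Backend._lock.acquire(blocking=False)  # would be False if the lock had leaked
if acquired:
    Backend._lock.release()
print(acquired)
```

Note that locked() only reports that someone holds the lock, not that this object does, so a destructor like this could also release a lock held by another instance. That is one reason to treat it as a workaround rather than a complete fix.
|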
@hobbypunk90 Thanks for investigating, that sounds like a possible cause. I'm not a Python expert either, just learning step by step. I was suspecting something like this, which is why I changed the worker to recreate the miflora poller on each update ( c4ddaae ). From my understanding that should work around it by recreating the backend, so no lock should still be alive? If adding the del method solves the problem, it would be best to open an issue on https://github.com/ChristianKuehnel/btlewrap . Also, did you try the https://github.com/zewelor/bt-mqtt-gateway/compare/test_miflora_hanging_fix branch to see if it works? Maybe it's fixed there. I don't have a miflora device to test it. |
@zewelor |
so... what do we do? |
@hobbypunk90 Well, the lock is only valid for workers that use btlewrap. It's not global, so you still can't be sure that only one device accesses the bluetooth hardware. That's a general problem with current bluetooth support, and it's why I made the gateway to work around it. The only thing I can see now is to remove that lock from the outside in each worker? That doesn't seem good. I would first try opening an issue in btlewrap to see @ChristianKuehnel's opinion on this case. |
@zewelor you are absolutely right. I only talked about btlewrap 😉 |
I had a short look at the Python documentation: https://docs.python.org/3/reference/datamodel.html#object.__exit__ This says that __exit__ is also called when an exception is raised. |
@ChristianKuehnel I know, but I tested it: it is not called if the scheduler timeout kills the function call. |
I'm not familiar with this. Can you point me to the definition of the
scheduler and how you're using it to kill functions?
Are you aware that miflora has a retry mechanism in place to deal with
failed read attempts?
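A minimal sketch of the documented behaviour referred to above: __exit__ runs when an exception is raised inside the with body. The Tracked class and the events list are illustrative names only; this does not reproduce the scheduler timeout case discussed in this thread.

```python
events = []


class Tracked:
    """Illustrative context manager that records when it is entered and exited."""

    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is the exception class, or None on a normal exit.
        events.append("exit:" + (exc_type.__name__ if exc_type else "None"))
        return False  # let the exception propagate


try:
    with Tracked():
        raise TimeoutError("simulated read timeout")
except TimeoutError:
    events.append("caught")

print(events)  # ['enter', 'exit:TimeoutError', 'caught']
```
|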
|
@ChristianKuehnel sorry, i'm also not familiar with the scheduler or how it works. |
It's usually a bad idea to just kill a function, as it might leave the system in some inconsistent state. So I would advise against doing this, regardless of the library used. And this example shows it. If you do not want to wait for poller = MifloraPoller(interface, retries=0) And I just added a test case to make sure that, if an exception is thrown, it's handled correctly in the |
@hobbypunk90 Every update from the workers is put into a single queue and run in a single thread, to solve concurrent usage of the BT hardware. The timeout is handled by https://pypi.org/project/interruptingcow/ . I added it to be sure a hanging worker won't block the entire gateway. I'm using it with the thermostat worker and it has been stable for the past few months. The thermostat lib uses a different bluetooth library internally. As for retries, I'm not confident they will solve all edge cases, for example with weirdly behaving bluetooth firmware on the miflora. From my experience IoT bluetooth devices have weird bluetooth stack bugs, so I won't count on their stability. That's why I wanted a hard timeout to keep gateway updates going. |
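A rough sketch of the "single queue, single thread" idea described above, using only the standard library. The names update_queue, run_updates, and the lambda jobs with their values are hypothetical, and the hard timeout that the gateway adds via interruptingcow is deliberately left out.

```python
import queue
import threading

update_queue = queue.Queue()
results = []


def run_updates():
    """Run queued jobs one at a time so only one touches the adapter at once."""
    while True:
        job = update_queue.get()
        if job is None:              # sentinel: stop the worker thread
            update_queue.task_done()
            break
        name, func = job
        try:
            results.append((name, func()))
        except Exception as exc:     # a failing worker must not stop the loop
            results.append((name, exc))
        finally:
            update_queue.task_done()


worker = threading.Thread(target=run_updates, daemon=True)
worker.start()

# Hypothetical jobs standing in for real sensor reads.
update_queue.put(("miflora", lambda: {"moisture": 42}))
update_queue.put(("miscale", lambda: {"weight": 70.5}))
update_queue.put(None)
update_queue.join()
worker.join()
print([name for name, _ in results])
```

Because a single consumer thread drains the queue, jobs run strictly in the order they were queued and never overlap.
|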
@zewelor i understand your problem and i would resolve this problem in the same way😉 |
The change is merged in the master branch of miflora, so you can test it from there. If this fixes the issue we then need to request a new release from miflora. |
@ChristianKuehnel I tested it, it seems to work 🙂 I can block the sensor with the app, and after closing the app we can collect data again 👍😁 |
I have been running the miflora lib from master for the last few days and the bug doesn't occur anymore. |
Great, closing :D |
I am still having issues after I did a pull of bt-mqtt-gateway. What do I need to do to get the latest version working?
Doing a |
Did you update miflora to the version from git? |
@hobbypunk90 this is going to be a stupid question, but how would I do that? what do I need to run where to update miflora? |
Found it. I'll leave it here in case anyone is wondering. First remove any installed |
I've added miflora from GH as miflora worker requirement, so it should install newest version from GH by default. |
After approx. 24 hours bt-mqtt-gateway messages do not make it over to Home Assistant any more. I do not see any errors in the bt-mqtt-gateway service status and a simple restart resolves the issue. It is annoying though. I am using MQTT on the Home Assistant side, which functions fine (other devices are using it). My source for the bt-mqtt-gateway is a miflora sensor that has not moved.... can I help debug somehow?