Fix issue where switchbot stops responding due to "disc" status #39

heisenberg2980 · 2022-02-18T16:28:16Z

Commenting line 283 out to avoid the issue where switchbot stops responding until HA restart.
If the switchbot is in disc state, it tries to read a response but it will never get one. By removing disc from the while loop, it will fall into BTLEDisconnectError exception and retry properly the next time around, without any deadlocks.

Commenting line 283 out to avoid the issue where switchbot stops responding until HA restart. If the switchbot is in disc state, it tries to read a response but it will never get one. By removing disc from the while loop, it will fall into BTLEDisconnectError exception and retry properly the next time around, without any deadlocks.

Danielhiversen · 2022-02-18T17:25:38Z

@RenierM26 any comment?

switchbot/__init__.py

RenierM26 · 2022-02-19T10:22:08Z

Hi There,

This will cause a lot of retries if the user has multiple devices. The bluepy helper returns disconnect after any operation finishes...usually an issue if you have 3+ devices. A disconnect could also happen before the scan finishes. (The bluepy library does a lot of connect/disconnects...)

The connect function also has a built in timeout so it won't be stuck waiting for a disconnect (The default is close to a min, can't remember the exact time). It would probably be better just to call the connect function with a shorter timeout. (code is already there)

The actual problem is most likely related to something else and this would just be a workaround.

Just on a side note: I have a raspberry pi 4 with latest firmware (manually updated) running for 2.5 months without any issues.

heisenberg2980 · 2022-02-19T10:34:06Z

I agree this is a workaround, but for all the users having issues from December (i.e. home-assistant/core#61535) this workaround will be really useful as it is really frustrating having to restart HA every time switchbot stops working. In my case that happens just every 3-4 days (I think that is because my switchbot and my HA server are really close), but some people mention they have to restart every day, which is painful.
Happy to test any other fixes or solutions, but unless this PR causes extra issues to people with multiple devices, in the meantime I really think this PR should be merged.

pascalwinters · 2022-02-19T12:48:33Z

@RenierM26 I also agree this is a workaround. But for me, Im using Switchbot since this week, and it hasn't been stable for a day.
I seems that after a command (sometimes 2 commands) the integration is broken and a HA reboot is necessary. I have at this moment 2 curtains combined to 1 group. So in HA it's one device.

I'm also happy to test any other fixes, but what @heisenberg2980 says. Can we in the meantime merge this PR?

Just on a side note: I have a raspberry pi 4 with latest firmware (manually updated) running for 2.5 months without any issues.

Which firmware do you mean? The Switchbot curtain firmware? My curtains are running on v3.1.

RenierM26 · 2022-02-19T13:00:39Z

Hi @pascalwinters,

The raspberry itself. There where a few Bluetooth fixes in the firmware and wasn't included in Hass OS at that time. (October/November I think)

My DEV and Main PI both only have the Switchbot bluetooth integration running. (no other bluetooth integrations)

Are you running any other bluetooth integrations in addition to Switchbot? Might explain the lockups.

pascalwinters · 2022-02-19T13:02:44Z

Hi @RenierM26 ,

No, I have special bought for this integration a bluetooth dongle. So it's only used for the Switchbot integration.

RenierM26 · 2022-02-19T13:03:33Z

Thank you for the feedback.

I have no objections to this pull request, just trying to find the root cause.

RenierM26 · 2022-02-19T13:04:25Z

Do you perhaps have the make and model of the dongle?

pascalwinters · 2022-02-19T13:05:46Z

I agree, the root cause is important to find. Will it help if there is more debug logging added, so we can enable it and provide you with more info?

I have no objections to this pull request, just trying to find the root cause.

@Danielhiversen Can you merge this PR please?

pascalwinters · 2022-02-19T13:05:52Z

Do you perhaps have the make and model of the dongle?

TP LINK UB400 in combination with an Ondroid N2+

heisenberg2980 · 2022-02-19T13:59:42Z

@RenierM26 I also have a Raspberry Pi 4 with the latest version of HA Core and Supervisor and this is the only integration using Bluetooth, but I still had this issue with my switchbot every 3-4 days until I removed the line proposed in this PR. Not sure why yours is not failing, maybe you could try moving your switchbot far away from your Raspberry to see if that force the issue to occur.

RenierM26 · 2022-02-19T15:51:15Z

Hi @heisenberg2980,

Thanks for the info! I'm able to replicate the issue by moving it out of range.

I'm going to test a timeout on the connect function and see if this solves the problem.

pascalwinters · 2022-02-19T17:37:21Z

@RenierM26

I'm going to test a timeout on the connect function and see if this solves the problem.

It’s very good news you can reproduce it.

When do you expect you can test this and creating a PR?

RenierM26 · 2022-02-19T18:10:09Z

Looks like timeout won't be a practical fix. There's more methods that need to be overridden before this will work.

Looks like this pull request is best fix for now.

heisenberg2980 · 2022-02-19T18:15:44Z

Perfect, let´s merge it and if we find a better solution in the future we will implement it.

@Danielhiversen can you merge this PR?

pascalwinters · 2022-02-19T18:16:17Z

Looks like timeout won't be a practical fix. There's more methods that need to be overridden before this will work.

Looks like this pull request is best fix for now.

Ok. Thanks for your time to test it. I appreciate it.
@Danielhiversen regarding the info from @RenierM26, is it possible to merge this PR and create a release?

RenierM26 · 2022-02-19T19:15:16Z

Hi All,

Found the reason why the timeout isn't working. I have also added a 40 seconds timeout if the device is not reachable for some or other reason - should prevent the lockup issue.

Mind testing? #40

Nevermind....Looks like PR 40 needs some additional bug fixing.

RenierM26 · 2022-02-20T07:44:14Z

Hi @pascalwinters and @heisenberg2980,

Made a couple of improvements to the handling of disc states. Mind testing #40 ?

pascalwinters · 2022-02-20T08:29:16Z

Hi @pascalwinters and @heisenberg2980,

Made a couple of improvements to the handling of disc states. Mind testing #40 ?

HI @RenierM26 , I’m happy to test. But I don’t know how to test this? Because I’m using HA OS. So I can add it as custom component, but I can’t replace the python package. Any tips?
Is it possible to create an pre-release? Because when I added as custom component, I can bump the python package myself.

I'm running the integration on your branch. First impression looks good, sometimes the delay is bigger because of this warning:
2022-02-20 11:12:40 WARNING (SyncWorker_2) [switchbot] Bluepy busy, waiting before retry

I see this warning frequently, I'm not sure if it's a good sign.

heisenberg2980 · 2022-02-20T10:34:09Z

Happy to test it but haven´t tested a PR before, are there any instructions/tutorial I can follow? do I need to disable the original integration and download the branch to the custom component folder?

pascalwinters · 2022-02-20T10:36:31Z

@heisenberg2980

Yes, you can copy the switchbot integration from the core repository to the custom component directory and replace the manifest content with this:

{
  "domain": "switchbot",
  "name": "SwitchBot",
  "documentation": "https://www.home-assistant.io/integrations/switchbot",
  "requirements": ["git+https://github.com/RenierM26/pySwitchbot.git@pyswitchbot_timeout#PySwitchbot==0.13.3"],
  "config_flow": true,
  "codeowners": ["@danielhiversen", "@RenierM26"],
  "iot_class": "local_polling",
  "loggers": ["switchbot"],
  "version": "1.0.0"
}

This all will automatically overwrite the core integration until you delete the custom switchbot component.

heisenberg2980 · 2022-02-20T11:25:26Z

@pascalwinters when you say "copy the switchbot integration from the core repository" do you mean the original version of the integration (master branch) or the PR (pyswitchbot_timeout branch)? Also where is the manifest file?

pascalwinters · 2022-02-20T11:28:28Z

@pascalwinters when you say "copy the switchbot integration from the core repository" do you mean the original version of the integration (master branch) or the PR (pyswitchbot_timeout branch)? Also where is the manifest file?

I mean this repository:
https://github.com/home-assistant/core/tree/dev/homeassistant/components/switchbot

heisenberg2980 · 2022-02-20T11:49:38Z

Thanks @pascalwinters, testing it now

pascalwinters · 2022-02-20T12:05:14Z

Thanks @pascalwinters, testing it now

Nice. Good luck.

RenierM26 · 2022-02-20T16:16:54Z

I see this warning frequently, I'm not sure if it's a good sign.

Bluepy doesn't handle the "disc" status all that well. It captures any disconnect from any device (disconnect is the last stage of any connection to any device) I have added warning logging when this happens to aid in troubleshooting. Could change it to a lower level later on to prevent log spamming (Debugging perhaps)

Connection failures should work a lot better.

I think the btle scanning function, in the bluepy library, might ultimately be the root cause (it has aggressive retries if the helper drops a connection before the scan completes....which kind of leads to a controlled race condition.) You can connect to more than one device at a time but the last bluepy release doesn't handle it all that well.

Sorry for the long explanation.

pascalwinters · 2022-02-21T06:54:24Z

@RenierM26 Thank you for explaining. For me it's stable since the beginning of using this fix. Two remarks/questions:

Maybe it's good (before merge this PR), to mark this line as DEBUG:
2022-02-20 11:12:40 WARNING (SyncWorker_2) [switchbot] Bluepy busy, waiting before retry
I'm missing the log of reading each (I mean) 60 seconds the current status. Is this log event deleted?

heisenberg2980 · 2022-02-21T17:08:48Z

After 24 hours of testing, the fix looks promising, with no lockups yet, but there is something weird I would like to share, my switchbot seems to take in average much longer to respond than before, taking sometimes up to 30 seconds to switch, which very rarely happened before. I don´t have exact numbers, but I would say now it takes 20-30 seconds 50% of the times (taking 5-10 seconds the other 50%), and before it usually took less than 10 seconds most of the times (taking over 20 seconds just 10-15% of the times).

Obviously this could be due to external factors or interferences and not related with this fix, but wanted to mention it just in case @pascalwinters or someone else is also experiencing it.

BTW, even with this performance issue, if I have to decide between now and before the fix I would choose the fix, it is always better to have a response (even if the response is a bit slow) rather than a lockup that requires a HA restart.

pascalwinters · 2022-02-21T17:43:18Z

@heisenberg2980 I have the same experience. Sometimes it takes more time to respond. I think it's waiting for a timeout or so?
But for me, it's now priority having a working environment and after that looking for a solution with better performance.

Maybe @RenierM26 can explain the delays?

@RenierM26 Is it an idea to merge this and make a PR for HA (I can do that if you want) so it has been fixed for all the users in HA with the next release. But we must make progress, because the beta is almost there.

heisenberg2980 · 2022-02-21T19:30:35Z

@heisenberg2980

Yes, you can copy the switchbot integration from the core repository to the custom component directory and replace the manifest content with this:
{
  "domain": "switchbot",
  "name": "SwitchBot",
  "documentation": "https://www.home-assistant.io/integrations/switchbot",
  "requirements": ["git+https://github.com/RenierM26/pySwitchbot.git@pyswitchbot_timeout#PySwitchbot==0.13.3"],
  "config_flow": true,
  "codeowners": ["@danielhiversen", "@RenierM26"],
  "iot_class": "local_polling",
  "loggers": ["switchbot"],
  "version": "1.0.0"
}
This all will automatically overwrite the core integration until you delete the custom switchbot component.

@pascalwinters Out of curiosity, how did you update the requirements line of the manifest to point to the PR? is there any link/document to understand the logic?

pascalwinters · 2022-02-21T19:33:27Z

@heisenberg2980
Yes, you can copy the switchbot integration from the core repository to the custom component directory and replace the manifest content with this:
{
  "domain": "switchbot",
  "name": "SwitchBot",
  "documentation": "https://www.home-assistant.io/integrations/switchbot",
  "requirements": ["git+https://github.com/RenierM26/pySwitchbot.git@pyswitchbot_timeout#PySwitchbot==0.13.3"],
  "config_flow": true,
  "codeowners": ["@danielhiversen", "@RenierM26"],
  "iot_class": "local_polling",
  "loggers": ["switchbot"],
  "version": "1.0.0"
}
This all will automatically overwrite the core integration until you delete the custom switchbot component.
@pascalwinters Out of curiosity, how did you update the requirements line of the manifest to point to the PR? is there any link/document to understand the logic?

https://developers.home-assistant.io/docs/creating_integration_manifest/#custom-requirements-during-development--testing

heisenberg2980 · 2022-02-21T20:18:53Z

@RenierM26 @pascalwinters considering #40 looks promising, unless you have any objection I will close this PR

pascalwinters · 2022-02-21T20:20:41Z

@RenierM26 @pascalwinters considering #40 looks promising, unless you have any objection I will close this PR

Seems good to me.

heisenberg2980 · 2022-02-22T13:00:10Z

Closing this PR as #40 seems to be a better solution

heisenberg2980 mentioned this pull request Feb 18, 2022

Switchbot integration malfunction, inconsistant communication with the bots home-assistant/core#61535

Closed

heisenberg2980 changed the title ~~Update __init__.py~~ Fix issue where switchbot stops responding due to "disc" status Feb 18, 2022

heisenberg2980 mentioned this pull request Feb 18, 2022

pySwitchbot hangs if the switchbot is in "disc" state #38

Closed

Danielhiversen reviewed Feb 18, 2022

View reviewed changes

switchbot/__init__.py Outdated Show resolved Hide resolved

Update switchbot/__init__.py

7e3eb52

heisenberg2980 closed this Feb 22, 2022

heisenberg2980 deleted the patch-1 branch February 22, 2022 15:28

Fix issue where switchbot stops responding due to "disc" status #39

Fix issue where switchbot stops responding due to "disc" status #39

Uh oh!

Conversation

heisenberg2980 commented Feb 18, 2022

Uh oh!

Danielhiversen commented Feb 18, 2022

Uh oh!

Uh oh!

RenierM26 commented Feb 19, 2022

Uh oh!

heisenberg2980 commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pascalwinters commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RenierM26 commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pascalwinters commented Feb 19, 2022

Uh oh!

RenierM26 commented Feb 19, 2022

Uh oh!

RenierM26 commented Feb 19, 2022

Uh oh!

pascalwinters commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pascalwinters commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

heisenberg2980 commented Feb 19, 2022

Uh oh!

RenierM26 commented Feb 19, 2022

Uh oh!

pascalwinters commented Feb 19, 2022

Uh oh!

RenierM26 commented Feb 19, 2022

Uh oh!

heisenberg2980 commented Feb 19, 2022

Uh oh!

pascalwinters commented Feb 19, 2022

Uh oh!

RenierM26 commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RenierM26 commented Feb 20, 2022

Uh oh!

pascalwinters commented Feb 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

heisenberg2980 commented Feb 20, 2022

Uh oh!

pascalwinters commented Feb 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

heisenberg2980 commented Feb 20, 2022

Uh oh!

pascalwinters commented Feb 20, 2022

Uh oh!

heisenberg2980 commented Feb 20, 2022

Uh oh!

pascalwinters commented Feb 20, 2022

Uh oh!

RenierM26 commented Feb 20, 2022

Uh oh!

pascalwinters commented Feb 21, 2022

Uh oh!

heisenberg2980 commented Feb 21, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pascalwinters commented Feb 21, 2022

Uh oh!

heisenberg2980 commented Feb 21, 2022

Uh oh!

pascalwinters commented Feb 21, 2022

Uh oh!

heisenberg2980 commented Feb 21, 2022

Uh oh!

heisenberg2980 commented Feb 19, 2022 •

edited

Loading

pascalwinters commented Feb 19, 2022 •

edited

Loading

RenierM26 commented Feb 19, 2022 •

edited

Loading

pascalwinters commented Feb 19, 2022 •

edited

Loading

pascalwinters commented Feb 19, 2022 •

edited

Loading

RenierM26 commented Feb 19, 2022 •

edited

Loading

pascalwinters commented Feb 20, 2022 •

edited

Loading

pascalwinters commented Feb 20, 2022 •

edited

Loading

heisenberg2980 commented Feb 21, 2022 •

edited

Loading