fix: retry sooner on transient DNS failures during device update#161

Merged
iprak merged 2 commits into iprak:main from mfncl9991:fix/dns-retry-on-update
Apr 19, 2026

@mfncl9991
Contributor

Summary

During live testing of PR #158, occasional transient DNS failures were observed:

HomeAssistantError: Error communicating with Winix: Cannot connect to host us.api.winix-iot.com:443 ssl:default [Timeout while contacting DNS servers]

DNS recovers on its own, but with no retry logic the coordinator waits the full scan interval (30s) before trying again, leaving entities unavailable in the meantime.

Change

Catch ClientConnectorDNSError in _async_update_data and raise UpdateFailed(retry_after=timedelta(seconds=15)), letting the DataUpdateCoordinator framework retry sooner rather than waiting the full interval.

The driver already wraps ClientConnectorDNSError in HomeAssistantError, so the DNS case is detected via the exception's __cause__. All other HomeAssistantError cases (HTTP errors, timeouts) are re-raised unchanged.

try:
    for device_wrapper in self._device_wrappers:
        await device_wrapper.update()
except HomeAssistantError as err:
    if isinstance(err.__cause__, aiohttp.ClientConnectorDNSError):
        raise UpdateFailed(retry_after=timedelta(seconds=15)) from err
    raise

References

When a DNS lookup for us.api.winix-iot.com fails transiently, the
driver raises HomeAssistantError wrapping aiohttp.ClientConnectorDNSError.
Previously this propagated unhandled, leaving entities unavailable until
the next full coordinator interval (30s).

Catch the DNS-specific case in _async_update_data and raise
UpdateFailed(retry_after=15s) so the coordinator retries sooner rather
than waiting the full scan interval.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Owner

@iprak left a comment

Thanks for looking into this and getting to it before me.

Comment thread custom_components/winix/manager.py Outdated
Per review feedback, replace the isinstance(__cause__) check with a
dedicated exception class that makes retry intent explicit.

- Add WinixTransientError(HomeAssistantError) to driver.py; raised from
  get_state for ClientError (covers DNS, connection) and TimeoutError
- manager.py catches WinixTransientError and retries once via
  UpdateFailed(retry_after=15s); on second consecutive failure resets
  the flag and raises UpdateFailed() to resume normal poll interval
- LOGGER.info added for both the retry trigger and the give-up case
- Remove now-unused aiohttp and HomeAssistantError imports from manager.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
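
The retry-once behaviour described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the merged code: `HomeAssistantError` and `UpdateFailed` are local stand-ins that model only the `retry_after` attribute, and the `Manager` class, the `_retried` flag name, `FlakyDevice` and `poll` are hypothetical. Resetting the flag after a successful update is also an assumption, inferred from the phrase "second consecutive failure".

```python
import asyncio
from datetime import timedelta


class HomeAssistantError(Exception):
    """Local stand-in for homeassistant.exceptions.HomeAssistantError."""


class UpdateFailed(Exception):
    """Local stand-in for the coordinator's UpdateFailed; models retry_after only."""

    def __init__(self, message="", retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


class WinixTransientError(HomeAssistantError):
    """Raised by the driver for errors that are expected to clear on their own."""


class Manager:
    """Hypothetical, trimmed-down coordinator that keeps only the retry flag."""

    RETRY_DELAY = timedelta(seconds=15)

    def __init__(self, device_wrappers):
        self._device_wrappers = device_wrappers
        self._retried = False  # assumed flag name

    async def _async_update_data(self):
        try:
            for device_wrapper in self._device_wrappers:
                await device_wrapper.update()
        except WinixTransientError as err:
            if not self._retried:
                # First failure: ask for one early retry.
                self._retried = True
                raise UpdateFailed(str(err), retry_after=self.RETRY_DELAY) from err
            # Second consecutive failure: give up and resume the normal interval.
            self._retried = False
            raise UpdateFailed(str(err)) from err
        # Assumption: a successful update also clears the flag.
        self._retried = False


class FlakyDevice:
    """Hypothetical device wrapper that fails a fixed number of times."""

    def __init__(self, failures):
        self._failures = failures

    async def update(self):
        if self._failures > 0:
            self._failures -= 1
            raise WinixTransientError("Timeout while contacting DNS servers")


async def poll(manager):
    """Run one update and report the requested retry delay, or "ok" on success."""
    try:
        await manager._async_update_data()
    except UpdateFailed as err:
        return err.retry_after
    return "ok"
```

With `Manager([FlakyDevice(2)])`, the first `poll` returns the 15-second delay, the second returns `None` (normal interval), and the third succeeds. Capping the early retry at one attempt keeps a longer outage from turning into a tight retry loop.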
@iprak iprak merged commit 705e7cd into iprak:main Apr 19, 2026
1 check passed