Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid InvalidStateError on disconnect #67

Merged
merged 1 commit into from
Jun 27, 2021

Conversation

functionpointer
Copy link
Contributor

I have a python script with this library subscribed to two topics on a self-hosted mosquitto broker. Rebooting the router between the two caused the script to crash with an InvalidStateError in force_disconnect() (full stack trace below).

I think this error happens because i have two coroutines with an async for waiting for messages from two separate client.filtered_messages(). One of them gets the error first and runs __aexit__() and ends up calling force_disconnect(). When the second one gets there, the _disconnect future already has a result and set_result() causes the InvalidStateError.

__aexit__() already has code to prevent this:

    async def __aexit__(
        self, exc_type: Type[Exception], exc: Exception, tb: TracebackType
    ) -> None:
        """Disconnect from the broker."""
        # Early out if already disconnected...
        if self._disconnected.done():

However, this code is ineffective due to the await calls between the check and set_result(). My pull request fixes this problem by introducing another check with no await between the check and set_result().

Stack trace for InvalidStateError ``` Jun 27 17:13:54 lignin python3[132815]: DEBUG:send_serial:sent b'PC: light 0\n' Jun 27 17:14:48 lignin python3[132815]: WARNING:mqtt:Could not gracefully disconnect due to "Operation timed out". Forcing disconnection. Jun 27 17:14:48 lignin python3[132815]: ERROR:asyncio:Task exception was never retrieved Jun 27 17:14:48 lignin python3[132815]: future: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/locks.py", line 226, in wait Jun 27 17:14:48 lignin python3[132815]: await fut Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.CancelledError Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 492, in wait_for Jun 27 17:14:48 lignin python3[132815]: fut.result() Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.CancelledError Jun 27 17:14:48 lignin python3[132815]: The above exception was the direct cause of the following exception: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 337, i> Jun 27 17:14:48 lignin python3[132815]: return await asyncio.wait_for(fut, timeout=timeout, **kwargs) Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 494, in wait_for Jun 27 17:14:48 lignin python3[132815]: raise exceptions.TimeoutError() from exc Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.TimeoutError Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 193, in read_until_failure Jun 27 17:14:48 lignin python3[132815]: await asyncio.gather(*tasks, return_exceptions=False) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 187, in write Jun 27 17:14:48 lignin python3[132815]: await client.publish(topic, data) Jun 27 17:14:48 lignin python3[132815]: await client.publish(topic, data) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 232, i> Jun 27 17:13:23 lignin python3[132815]: DEBUG:send_serial:sent b'PC: light 0\n' Jun 27 17:13:53 lignin python3[132815]: DEBUG:read_mqtt:received light msg: b'{"state": "ON", "brightness": "255"}' Jun 27 17:13:53 lignin python3[132815]: DEBUG:send_serial:sent b'PC: light 255\n' Jun 27 17:13:54 lignin python3[132815]: DEBUG:read_mqtt:received light msg: b'{"state": "OFF", "brightness": "255"}' Jun 27 17:13:54 lignin python3[132815]: DEBUG:send_serial:sent b'PC: light 0\n' Jun 27 17:14:48 lignin python3[132815]: WARNING:mqtt:Could not gracefully disconnect due to "Operation timed out". Forcing disconnection. Jun 27 17:14:48 lignin python3[132815]: ERROR:asyncio:Task exception was never retrieved Jun 27 17:14:48 lignin python3[132815]: future: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/locks.py", line 226, in wait Jun 27 17:14:48 lignin python3[132815]: await fut Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.CancelledError Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 492, in wait_for Jun 27 17:14:48 lignin python3[132815]: fut.result() Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.CancelledError Jun 27 17:14:48 lignin python3[132815]: The above exception was the direct cause of the following exception: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 337, i> Jun 27 17:14:48 lignin python3[132815]: return await asyncio.wait_for(fut, timeout=timeout, **kwargs) Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 494, in wait_for Jun 27 17:14:48 lignin python3[132815]: raise exceptions.TimeoutError() from exc Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.TimeoutError Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 193, in read_until_failure Jun 27 17:14:48 lignin python3[132815]: await asyncio.gather(*tasks, return_exceptions=False) Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 337, i> Jun 27 17:14:48 lignin python3[132815]: return await asyncio.wait_for(fut, timeout=timeout, **kwargs) Jun 27 17:14:48 lignin python3[132815]: return await asyncio.wait_for(fut, timeout=timeout, **kwargs) Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 494, in wait_for Jun 27 17:14:48 lignin python3[132815]: raise exceptions.TimeoutError() from exc Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.TimeoutError Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 193, in read_until_failure Jun 27 17:14:48 lignin python3[132815]: await asyncio.gather(*tasks, return_exceptions=False) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 187, in write Jun 27 17:14:48 lignin python3[132815]: await client.publish(topic, data) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 232, i> Jun 27 17:14:48 lignin python3[132815]: await self._wait_for(confirmation.wait(), timeout=timeout) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 339, i> Jun 27 17:14:48 lignin python3[132815]: await self._wait_for(confirmation.wait(), timeout=timeout) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 339, i> Jun 27 17:14:48 lignin python3[132815]: raise MqttError("Operation timed out") Jun 27 17:14:48 lignin python3[132815]: asyncio_mqtt.error.MqttError: Operation timed out Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 492, in wait_for Jun 27 17:14:48 lignin python3[132815]: fut.result() Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.CancelledError Jun 27 17:14:48 lignin python3[132815]: The above exception was the direct cause of the following exception: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 337, i> Jun 27 17:14:48 lignin python3[132815]: return await asyncio.wait_for(fut, timeout=timeout, **kwargs) Jun 27 17:14:48 lignin python3[132815]: File "/usr/lib/python3.9/asyncio/tasks.py", line 494, in wait_for Jun 27 17:14:48 lignin python3[132815]: raise exceptions.TimeoutError() from exc Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.TimeoutError Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 529, i> Jun 27 17:14:48 lignin python3[132815]: await self.disconnect() Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 193, i> Jun 27 17:14:48 lignin python3[132815]: await self._wait_for(self._disconnected, timeout=timeout) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 339, i> Jun 27 17:14:48 lignin python3[132815]: raise MqttError("Operation timed out") Jun 27 17:14:48 lignin python3[132815]: asyncio_mqtt.error.MqttError: Operation timed out Jun 27 17:14:48 lignin python3[132815]: During handling of the above exception, another exception occurred: Jun 27 17:14:48 lignin python3[132815]: Traceback (most recent call last): Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 202, in read_mqtt Jun 27 17:14:48 lignin python3[132815]: await read_until_failure() Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/water_arduino.py", line 193, in read_until_failure Jun 27 17:14:48 lignin python3[132815]: await asyncio.gather(*tasks, return_exceptions=False) Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 535, i> Jun 27 17:14:48 lignin python3[132815]: await self.force_disconnect() Jun 27 17:14:48 lignin python3[132815]: File "/home/username/git/water_guard/venv/lib/python3.9/site-packages/asyncio_mqtt/client.py", line 196, i> Jun 27 17:14:48 lignin python3[132815]: self._disconnected.set_result(None) Jun 27 17:14:48 lignin python3[132815]: asyncio.exceptions.InvalidStateError: invalid state
</details>

@frederikaalund frederikaalund merged commit 26df531 into sbtinstruments:master Jun 27, 2021
@frederikaalund
Copy link
Member

Thanks for the pull request. Let me have a look. :)

Ouch, that's a very subtle race condition. Nice find and thanks for the fix. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants