Skip to content
This repository has been archived by the owner. It is now read-only.

Race condition in voice connection flow #2056

imayhaveborkedit opened this issue Apr 9, 2019 · 0 comments

Race condition in voice connection flow #2056

imayhaveborkedit opened this issue Apr 9, 2019 · 0 comments


Copy link

@imayhaveborkedit imayhaveborkedit commented Apr 9, 2019

There was a report of an unusual voice related error.

Task exception was never retrieved
future: <Task finished coro=<VoiceClient._create_socket() done, defined at /usr/local/lib/python3.7/site-packages/discord/> exception=AttributeError("'NoneType' object has no attribute 'close'")>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/discord/", line 195, in _create_socket
AttributeError: 'NoneType' object has no attribute 'close'

At first glance this looks a bit odd and upon further inspection it looks downright peculiar. Apparently is None and something is trying to close it. How has this happened? Surely it's not the user since if the user ever got ahold of a VoiceClient it would be properly constructed. We must dig deeper.

  1. The only time is None is between __init__ and right until DiscordVoiceWebSocket.from_client returns in VoiceClient.connect().
  2. The error occurred in VoiceClient._create_socket(), which is only called by ConnectionState.parse_voice_server_update() in,
  3. _create_socket is where the library opens the voice socket after receiving a VOICE_SERVER_UPDATE event, which means waiting for discord to send it.
  4. parse_voice_server_update gets a VoiceClient by key and checks if it exists to call _create_socket.

This is all happening while the user is trying to create a VoiceClient object via VoiceChannel.connect(). The code for this is:

voice = VoiceClient(state=state, timeout=timeout, channel=self)
state._add_voice_client(key_id, voice)
    await voice.connect(reconnect=reconnect)
except asyncio.TimeoutError:

The issue here is the time between adding the VoiceClient+key and connect() returning. It's during VoiceClient.connect() that is set, but not before the handshake is complete. The handshake is complete when VoiceClient._create_socket returns successfully. The line from the error is the line in _create_socket, but this is only run if the handshake event has been set, which is only ever set in this function at the very end.

For this error to occur, a VoiceClient needs to be in the process of connecting, before the ws attribute has been set, has to have completed the handshake once, and then have another handshake triggered before the waiter in VoiceClient.start_handshake() returns and sets the ws attribute.

To me, this seems like the combination of a very close race condition combined with Discord's signature Eventual Consistency™. A second VOICE_SERVER_UPDATE event would have to arrive before the connection flow completed.

As for how to fix this, I have two possible ideas at the moment:

  1. Add a check for the ws attribute before attempting to close it.
  2. Add a new Event (or other state object) to indicate the VoiceClient is in the middle of the connection flow and should ignore or defer connection related calls (probably just _create_socket) until connecting has finished.

Once I do some testing to see if I can reproduce this I will see if I can come up with a suitable fix.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

2 participants