Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in voice connection flow #2056

Closed
imayhaveborkedit opened this issue Apr 9, 2019 · 0 comments

Comments

@imayhaveborkedit
Copy link
Contributor

commented Apr 9, 2019

There was a report of an unusual voice related error.

Task exception was never retrieved
future: <Task finished coro=<VoiceClient._create_socket() done, defined at /usr/local/lib/python3.7/site-packages/discord/voice_client.py:168> exception=AttributeError("'NoneType' object has no attribute 'close'")>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/discord/voice_client.py", line 195, in _create_socket
    await self.ws.close(4000)
AttributeError: 'NoneType' object has no attribute 'close'

At first glance this looks a bit odd and upon further inspection it looks downright peculiar. Apparently VoiceClient.ws is None and something is trying to close it. How has this happened? Surely it's not the user since if the user ever got ahold of a VoiceClient it would be properly constructed. We must dig deeper.

  1. The only time VoiceClient.ws is None is between __init__ and right until DiscordVoiceWebSocket.from_client returns in VoiceClient.connect().
  2. The error occurred in VoiceClient._create_socket(), which is only called by ConnectionState.parse_voice_server_update() in state.py,
  3. _create_socket is where the library opens the voice socket after receiving a VOICE_SERVER_UPDATE event, which means waiting for discord to send it.
  4. parse_voice_server_update gets a VoiceClient by key and checks if it exists to call _create_socket.

This is all happening while the user is trying to create a VoiceClient object via VoiceChannel.connect(). The code for this is:

voice = VoiceClient(state=state, timeout=timeout, channel=self)
state._add_voice_client(key_id, voice)
try:
    await voice.connect(reconnect=reconnect)
except asyncio.TimeoutError:
    ...

The issue here is the time between adding the VoiceClient+key and connect() returning. It's during VoiceClient.connect() that VoiceClient.ws is set, but not before the handshake is complete. The handshake is complete when VoiceClient._create_socket returns successfully. The line from the error is the self.ws.close(4000) line in _create_socket, but this is only run if the handshake event has been set, which is only ever set in this function at the very end.

For this error to occur, a VoiceClient needs to be in the process of connecting, before the ws attribute has been set, has to have completed the handshake once, and then have another handshake triggered before the waiter in VoiceClient.start_handshake() returns and sets the ws attribute.

To me, this seems like the combination of a very close race condition combined with Discord's signature Eventual Consistency™. A second VOICE_SERVER_UPDATE event would have to arrive before the connection flow completed.

As for how to fix this, I have two possible ideas at the moment:

  1. Add a check for the ws attribute before attempting to close it.
  2. Add a new Event (or other state object) to indicate the VoiceClient is in the middle of the connection flow and should ignore or defer connection related calls (probably just _create_socket) until connecting has finished.

Once I do some testing to see if I can reproduce this I will see if I can come up with a suitable fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.