Fix voice connection issues and upgrade to voice v8 #10210

Merged
7 commits merged into Rapptz:master from DA-344:fix/voice-issues on Jun 30, 2025

Conversation

DA-344
Contributor

@DA-344 DA-344 commented Jun 16, 2025

Summary

Fixes the 4006 errors caused by how discord.py (dpy) handled the ports of voice endpoints. Also upgrades the voice gateway to v8.

Related post: https://discord.com/channels/336642139381301249/1380627762968264844

Checklist

  • If code changes were made then they have been tested.
    • I have updated the documentation to reflect the changes.
  • This PR fixes an issue.
  • This PR adds something new (e.g. new method or parameters).
  • This PR is a breaking change (e.g. methods or parameters removed/renamed)
  • This PR is not a code change (e.g. documentation, README, ...)

@DigiDuncan

Does this PR need further testing? I'd consider getting this fix in urgent, as bots that require VC connections are unstable at best and unusable at worst, and some of those bots are accessibility tools.

@DA-344
Contributor Author

DA-344 commented Jun 18, 2025

I still need to implement buffered resuming. After that, and once I've tested that it works, yeah, it can be merged.

@gutbash

gutbash commented Jun 19, 2025

i agree with @DigiDuncan this feels widespread and urgent. thank you for the work yall are doing to fix this!

@DA-344
Contributor Author

DA-344 commented Jun 19, 2025

A quick note, saying "you should merge this soon" won't make this be merged soon.

@gutbash

gutbash commented Jun 19, 2025

A quick note, saying "you should merge this soon" won't make this be merged soon.

who said that?

@DA-344
Contributor Author

DA-344 commented Jun 19, 2025

I said a quick note, not that anyone said it.

@gutbash

gutbash commented Jun 19, 2025

I said a quick note, not that anyone said it.

right

@itsTheFae

In order to test resume, should I just leave a voice client connected to a channel or does this need to be playing audio as well?
Otherwise this seems to be working as expected.

Thank you for working on this!

@DA-344
Contributor Author

DA-344 commented Jun 19, 2025

You'll need to find a way to crash the discord voice or something in order to test buffered resume lol, but I guess it should work fine?

@itsTheFae

Classic discord. We'll just find out later if/when it is a problem. lol
Thanks again for fixing the issue.

@DigiDuncan

DigiDuncan commented Jun 20, 2025

Received this error on testing the branch:

Traceback (most recent call last):
  File "C:\Users\DigiDuncan\Documents\GitHub\DECBot\decbot\cogs\tts.py", line 189, in queueTask
    await vc.play_until_done(audio)
  File "C:\Users\DigiDuncan\Documents\GitHub\DECBot\decbot\discordplus\voiceclient.py", line 24, in play_until_done
    return await play_until_done_future(self, source)
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\player.py", line 775, in run
    self._do_run()
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\player.py", line 764, in _do_run
    play_audio(data, encode=not self.source.is_opus())
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\voice_client.py", line 585, in send_audio_packet
    packet = self._get_voice_packet(encoded_data)
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\voice_client.py", line 382, in _get_voice_packet
    return encrypt_packet(header, data)
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\voice_client.py", line 388, in _encrypt_aead_xchacha20_poly1305_rtpsize
    box = nacl.secret.Aead(bytes(self.secret_key))
AttributeError: module 'nacl.secret' has no attribute 'Aead'

@DA-344
Contributor Author

DA-344 commented Jun 20, 2025

Received this error on testing the branch:

Traceback (most recent call last):
  File "C:\Users\DigiDuncan\Documents\GitHub\DECBot\decbot\cogs\tts.py", line 189, in queueTask
    await vc.play_until_done(audio)
  File "C:\Users\DigiDuncan\Documents\GitHub\DECBot\decbot\discordplus\voiceclient.py", line 24, in play_until_done
    return await play_until_done_future(self, source)
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\player.py", line 775, in run
    self._do_run()
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\player.py", line 764, in _do_run
    play_audio(data, encode=not self.source.is_opus())
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\voice_client.py", line 585, in send_audio_packet
    packet = self._get_voice_packet(encoded_data)
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\voice_client.py", line 382, in _get_voice_packet
    return encrypt_packet(header, data)
  File "C:\Users\DigiDuncan\miniconda3\envs\decbot\lib\site-packages\discord\voice_client.py", line 388, in _encrypt_aead_xchacha20_poly1305_rtpsize
    box = nacl.secret.Aead(bytes(self.secret_key))
AttributeError: module 'nacl.secret' has no attribute 'Aead'

That error isn't from this branch; update PyNaCl.
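For anyone hitting the same traceback, here is a minimal sketch of a check for whether the installed PyNaCl exposes the class the traceback complains about. The helper name `has_aead` is made up for illustration and is not part of discord.py or PyNaCl.

```python
import importlib


def has_aead() -> bool:
    """Return True if the installed PyNaCl provides nacl.secret.Aead."""
    try:
        # Import lazily so a missing PyNaCl is reported as False, not a crash.
        secret = importlib.import_module("nacl.secret")
    except ImportError:
        return False
    return hasattr(secret, "Aead")
```

If this returns False, the installed PyNaCl is either missing or too old for the encryption mode shown in the traceback, and upgrading it should be the first thing to try.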

@Wubbity

Wubbity commented Jun 20, 2025

This fixed the issue with my bot joining and leaving rapidly - related to #10207

@dolfies
Contributor

dolfies commented Jun 20, 2025

This fixed the issue with my bot joining and leaving rapidly - related to #10207

Is the bot being kicked?

@Khasimir

This fixed the issue with my bot joining and leaving rapidly - related to #10207

Is the bot being kicked?

Unclear, but per the related issue (#10207), the bots are connecting and disconnecting several times. Thanks for considering merging this fix.

@Wubbity

Wubbity commented Jun 21, 2025

This fixed the issue with my bot joining and leaving rapidly - related to #10207

Is the bot being kicked?

My bot's (a 420 related bot) purpose is to join the most populated voice channel in any given discord server at a specific time, wait until X:20 of that hour then play a "Cheers" related sound. Recently, I've gotten reports of people saying the bot would rapidly join and leave each time the automated scheduling would happen. No code has been changed.

Using this pull request's branch has fixed the issue. Much appreciated, and apologies for the late response.

@benrucker

Just chiming in that I agree that this is a high-urgency problem to fix. Though I understand that it will take time to merge! Thanks for making a PR for this ❤️

@gutbash

gutbash commented Jun 21, 2025

A quick note, saying "you should merge this soon" won't make this be merged soon.

@chrismuzyn

Somehow, even though I experienced this issue 24 hours ago, today, without doing anything, my voice bot works again. Are they slow rolling the update or something? I definitely did not pull this merge request.

@itsTheFae

I have another silly question in regard to network ports that are used by discord or the library for voice. Does anyone know what ports or ranges discord typically uses for voice? (It's a long shot but figured it might be worth asking.)

When I first used this patch, for about three days or so, it would connect just fine.
After that I had to allow UDP ports 2053 and 19300 and did so for bi-directional traffic. Previously it would only use a high-range, system assigned port for outbound connections. Needing the specific ports is fine, but struck me as out of the norm since it had not changed in several years up to now.
While it could just be my system, I felt it was also worth sharing in case anyone else is having similar issues.

@DA-344
Contributor Author

DA-344 commented Jun 22, 2025

You can make a GET request to /voice/regions to get all available region URIs: https://discord.com/developers/docs/resources/voice#list-voice-regions
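A minimal sketch of that request using only the standard library. The API base URL (v10), the `Bot` authorization scheme, the User-Agent string, and both function names are assumptions made for illustration; check the linked documentation for the exact requirements.

```python
import json
import urllib.request

# Assumed API base; the thread only names the /voice/regions route.
API_BASE = "https://discord.com/api/v10"


def build_voice_regions_request(bot_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the voice regions list."""
    return urllib.request.Request(
        f"{API_BASE}/voice/regions",
        headers={
            "Authorization": f"Bot {bot_token}",
            # Placeholder User-Agent for illustration only.
            "User-Agent": "DiscordBot (https://example.com, 0.1)",
        },
        method="GET",
    )


def fetch_voice_regions(bot_token: str) -> list:
    """Send the request and decode the JSON array of voice region objects."""
    with urllib.request.urlopen(build_voice_regions_request(bot_token)) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes it easy to inspect the URL and headers without touching the network.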

@DA-344
Contributor Author

DA-344 commented Jun 22, 2025

Somehow, even though I experienced this issue 24 hours ago, today, without doing anything, my voice bot works again. Are they slow rolling the update or something? I definitely did not pull this merge request.

This is because your bot connected to a voice region whose endpoint really does use port 443. The library defaulted to that port, so the connection works.
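To illustrate the port behaviour described here, a minimal sketch of endpoint parsing that keeps an explicit port and falls back to 443 only when none is given. This is not the library's actual implementation; `split_endpoint`, `voice_gateway_url`, the example hostnames, and the exact URL shape are assumptions.

```python
DEFAULT_PORT = 443  # the historical default mentioned in this thread


def split_endpoint(endpoint: str) -> tuple[str, int]:
    """Split 'host[:port]' into (host, port), defaulting to port 443."""
    endpoint = endpoint.removeprefix("wss://")
    host, sep, port = endpoint.rpartition(":")
    if sep and port.isdigit():
        # Keep the port the server announced instead of discarding it.
        return host, int(port)
    return endpoint, DEFAULT_PORT


def voice_gateway_url(endpoint: str, version: int = 8) -> str:
    """Build a voice WebSocket URL for the given endpoint and version."""
    host, port = split_endpoint(endpoint)
    return f"wss://{host}:{port}/?v={version}"
```

With this approach, a hypothetical endpoint such as `example.discord.media:2087` keeps port 2087, while `example.discord.media` alone resolves to port 443, which matches why some regions kept working without the patch.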

@dolfies
Contributor

dolfies commented Jun 22, 2025

I have another silly question in regard to network ports that are used by discord or the library for voice. Does anyone know what ports or ranges discord typically uses for voice? (It's a long shot but figured it might be worth asking.)

When I first used this patch, for about three days or so, it would connect just fine.
After that I had to allow UDP ports 2053 and 19300 and did so for bi-directional traffic. Previously it would only use a high-range, system assigned port for outbound connections. Needing the specific ports is fine, but struck me as out of the norm since it had not changed in several years up to now.
While it could just be my system, I felt it was also worth sharing in case anyone else is having similar issues.

Discord no longer uses a defined port for voice. Previously, it was always 443 but that was an implementation detail.

@DigiDuncan

That error isn't from this branch; update PyNaCl.

Yep! This fixes it, and the patch seems to work flawlessly. Thank you so much!

@mebesto123

I am testing this merge with my bot. The connection issue is fixed but the audio is not playing. I am not sure if any of the changes impacted the player. Is the player working for others?

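            # Runs inside an async callback; channel, path, song and repoPath are defined earlier.
            # Requires: import asyncio, os, platform, discord
            vc = await channel.connect()
            path = os.path.join(path, song)
            if platform.system() == 'Windows':
                exe = os.path.join(repoPath, 'ffmpeg', 'bin', 'ffmpeg.exe')
            else:
                exe = '/usr/bin/ffmpeg'
            vc.play(discord.FFmpegPCMAudio(path, executable=exe))
            while vc.is_playing():
                # Yield to the event loop while audio plays; a blocking sleep() would stall it.
                await asyncio.sleep(0.1)
            await vc.disconnect()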

Console Output:

2025-06-24 18:49:27 INFO     discord.voice_state Connecting to voice...
2025-06-24 18:49:27 INFO     discord.voice_state Starting voice handshake... (connection attempt 1)
2025-06-24 18:49:27 INFO     discord.voice_state Voice handshake complete. Endpoint found: XXXXXXX.discord.media:443
2025-06-24 18:49:27 INFO     discord.voice_state Voice connection complete.
2025-06-24 18:49:30 INFO     discord.player ffmpeg process 24260 successfully terminated with return code of 0.
2025-06-24 18:49:30 INFO     discord.voice_state The voice handshake is being terminated for Channel ID XXXXXXX (Guild ID XXXXXX)

@DA-344
Contributor Author

DA-344 commented Jun 24, 2025

This does not edit anything related to the Player, maybe it's an issue on your side.

@benrucker

Been running this commit for about a day and a half in a single server with fairly consistent voice use, connects + reconnects, and so far haven't run into any issues.

@Wubbity

Wubbity commented Jun 25, 2025

My bot has now been using this consistently for ~2 days, joining at least one voice channel every hour for the past 48 hours without issue. About 4 days total.

Just updating

Edit: CheersBot joins multiple servers at the same time to play a sound, not just a single server.

harp0030 added a commit to harp0030/temp_til_pull that referenced this pull request Jun 25, 2025
…WebSocket errors

Technical Implementation:
- Upgrade Discord.py to v8 voice protocol (commit 398bdbec from DA-344/fix/voice-issues)
- Fix voice endpoint parsing that was incorrectly stripping port information
- Implement sequence acknowledgment (seq_ack) handling in voice gateway
- Add buffered resume capability for voice connection recovery
- Update voice WebSocket from v4 to v8 with heartbeat payload structure
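As a rough illustration of the sequence acknowledgment idea in the list above, here is a minimal sketch that tracks the last `seq` seen on incoming voice gateway messages and echoes it back as `seq_ack` in heartbeat and resume payloads. The class name, the initial value of -1, and the opcode constants are assumptions for the sketch; it is not code from the PR or from this commit.

```python
from dataclasses import dataclass

# Assumed voice gateway opcodes for heartbeat and resume.
HEARTBEAT = 3
RESUME = 7


@dataclass
class SeqTracker:
    """Remember the last sequence number received from the voice gateway."""

    seq_ack: int = -1  # assumed "nothing received yet" marker

    def observe(self, message: dict) -> None:
        # Only messages that carry a sequence number update the tracker.
        seq = message.get("seq")
        if isinstance(seq, int):
            self.seq_ack = seq

    def heartbeat(self, nonce: int) -> dict:
        # Heartbeat data carries the nonce plus the last acknowledged sequence.
        return {"op": HEARTBEAT, "d": {"t": nonce, "seq_ack": self.seq_ack}}

    def resume(self, server_id: str, session_id: str, token: str) -> dict:
        # On resume, seq_ack tells the server which buffered messages to resend.
        return {
            "op": RESUME,
            "d": {
                "server_id": server_id,
                "session_id": session_id,
                "token": token,
                "seq_ack": self.seq_ack,
            },
        }
```

Keeping the counter in one small object means the heartbeat loop and the resume path always report the same value.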

Voice Connection Enhancements:
- Retry logic with exponential backoff for 4006 errors (3-9s delays)
- Error classification for session invalid vs recoverable errors
- Handling for US Discord servers most affected by 4006 errors
- Remove conflicting custom workarounds that interfered with official fix
- Maintain compatibility with native Discord.py connection methods
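The commit message only states "exponential backoff" with "3-9s delays", so the following is a guess at what such a schedule could look like, not the commit's code. The function name, the doubling factor, and the choice to clamp at 9 seconds are all assumptions.

```python
from typing import Iterator


def backoff_delays(attempts: int, base: float = 3.0, cap: float = 9.0) -> Iterator[float]:
    """Yield one delay per retry attempt, doubling from `base` up to `cap`."""
    for attempt in range(attempts):
        # Double the delay each attempt, but never exceed the cap.
        yield min(base * 2 ** attempt, cap)
```

For example, `list(backoff_delays(4))` gives `[3.0, 6.0, 9.0, 9.0]`, which stays inside the 3 to 9 second range the commit message mentions.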

Dependency Updates:
- Update PyNaCl requirements (>=1.6.0) for new voice encryption methods
- Ensure compatibility with nacl.secret.Aead for voice v8 protocol
- Force reinstall Discord.py to latest fix branch with all commits

Root Cause Resolution:
The 4006 "session no longer valid" error was caused by Discord's infrastructure
changes that moved from fixed port 443 to dynamic ports, requiring voice protocol
v4→v8 upgrade with proper endpoint parsing and sequence acknowledgment handling.
This primarily affected US Discord voice servers (Dallas, LA, Chicago, etc.).

Resolves: Rapptz/discord.py#10207
References: Rapptz/discord.py#10210
@mas6y6

mas6y6 commented Jun 26, 2025

2025-06-25 23:14:12,619 - INFO - Voice handshake complete. Endpoint found: c-atl11-2c71f790.discord.media
2025-06-25 23:14:12,763 - ERROR - Failed to connect to voice... Retrying in 9.0s...
Traceback (most recent call last):
  File "/home/mas6y65/XCAIS/env/lib/python3.11/site-packages/discord/voice_state.py", line 402, in _connect
    await self._handshake_websocket()
  File "/home/mas6y65/XCAIS/env/lib/python3.11/site-packages/discord/voice_state.py", line 553, in _handshake_websocket
    await self.ws.poll_event()
  File "/home/mas6y65/XCAIS/env/lib/python3.11/site-packages/discord/gateway.py", line 1037, in poll_event
    raise ConnectionClosed(self.ws, shard_id=None, code=self._close_code)
discord.errors.ConnectionClosed: Shard ID None WebSocket closed with 4006
2025-06-25 23:14:12,764 - INFO - The voice handshake is being terminated for Channel ID XXXXX (Guild ID XXXXX)

Got this error when testing the library

@VenusMods

Using this PR, my Bot was running for a good 2 days before my logs got spammed with similar error messages like this.

2025-06-25 02:41:35,894 - ERROR - Attempting a reconnect in 142.79s
Traceback (most recent call last):
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1512, in _create_direct_connection
    hosts = await self._resolve_host(host, port, traces=traces)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1128, in _resolve_host
    return await asyncio.shield(resolved_host_task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1159, in _resolve_host_with_throttle
    addrs = await self._resolver.resolve(host, port, family=self._family)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/resolver.py", line 40, in resolve
    infos = await self._loop.getaddrinfo(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 867, in getaddrinfo
    return await self.run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/socket.py", line 962, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/discord/client.py", line 701, in connect
    self.ws = await asyncio.wait_for(coro, timeout=60.0)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/discord/gateway.py", line 381, in from_client
    socket = await client.http.ws_connect(str(url))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/discord/http.py", line 554, in ws_connect
    return await self.__session.ws_connect(url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/client.py", line 1061, in _ws_connect
    resp = await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/client.py", line 770, in _request
    resp = await handler(req)
           ^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/client.py", line 725, in _connect_and_send_request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 622, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1189, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/venus/Documents/venus_bot_updated/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1518, in _create_direct_connection
    raise ClientConnectorDNSError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorDNSError: Cannot connect to host gateway-us-east1-b.discord.gg:443 ssl:default [Temporary failure in name resolution]

@Rapptz
Owner

Rapptz commented Jun 26, 2025

Both of these errors are part of normal operating procedure. 4006 in the voice websocket is just an invalid session, you can see the client attempting to reconnect as part of the normal flow. The DNS error is just an error with your DNS and not something the library has any control over. Add a secondary DNS so you have a backup or use something like 1.1.1.1.
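To confirm that a failure like the second traceback is a local DNS problem rather than anything in the library, one option is to try resolving the hostname directly. A minimal sketch using only the standard library; `can_resolve` is a hypothetical helper name.

```python
import socket


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the system resolver can look up `host`."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Same exception family as "Temporary failure in name resolution".
        return False
    return True
```

Calling `can_resolve("gateway-us-east1-b.discord.gg")` on the affected machine while the errors are occurring would show whether the resolver is the part that is failing.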

Owner

@Rapptz Rapptz left a comment

This looks fine, thanks.

@Rapptz Rapptz merged commit 2175bd5 into Rapptz:master Jun 30, 2025
8 checks passed
@DA-344 DA-344 deleted the fix/voice-issues branch June 30, 2025 09:22
Development

Successfully merging this pull request may close these issues.

Error 4006 causing bot to repeatedly connect to vc and fail