Some shard's connection is broken, but can't reconnect until the program exit #1294

Closed
3 tasks done
JellyBrick opened this issue May 10, 2020 · 9 comments

@JellyBrick

JellyBrick commented May 10, 2020

General Troubleshooting

  • I have checked for similar issues.
  • I have updated to the latest JDA version.
  • I have checked the branches or the maintainers' PRs for upcoming bug fixes.

Bug Report

Some shards' connections break and do not reconnect until the program exits.
(These warnings are logged when it happens.)

...
[20:07:31.088] [WARN ] [WebSocketClient]: Hit the WebSocket RateLimit! This can be caused by too many presence or voice status updates (connect/disconnect/mute/deaf). Regular: 0 Voice: 19 Chunking: 0
[20:07:36.516] [WARN ] [WebSocketClient]: Missed 2 heartbeats! Trying to reconnect...
[20:08:44.271] [WARN ] [WebSocketClient]: Hit the WebSocket RateLimit! This can be caused by too many presence or voice status updates (connect/disconnect/mute/deaf). Regular: 0 Voice: 19 Chunking: 0
...

It appears to be related to this PR (#1282).

Expected Behavior

The connection should recover automatically shortly after it breaks.

Code Example or Reproduction Steps

N/A

Code for JDABuilder or DefaultShardManagerBuilder Used

    DefaultShardManagerBuilder.createDefault(token)
        .setAutoReconnect(true)
        .setAudioSendFactory(
            AsyncPacketProviderFactory.adapt(NativeAudioSendFactory())
        )
        .addEventListeners(eventListener)
        .setUseShutdownNow(true)
        .setBulkDeleteSplittingEnabled(true)
        .build()

Exception or Error

N/A

@JellyBrick JellyBrick changed the title Some shards stop working intermittently Some shard's connection was broken, but can't reconnect until the program exit May 10, 2020
@JellyBrick JellyBrick changed the title Some shard's connection was broken, but can't reconnect until the program exit Some shard's connection is broken, but can't reconnect until the program exit May 10, 2020
@MinnDevelopment MinnDevelopment added the status: need more info in need of additional details label May 10, 2020
@MinnDevelopment
Member

What is the shard status? Do you have any exceptions, warnings, or error logs? Can you provide a thread dump?

@JellyBrick
Author

JellyBrick commented May 10, 2020

  1. The bot goes offline in one or more guilds (total shards: 30).
  2. These are the warning messages
    (they repeat in the console at regular intervals):
[WARN ] [WebSocketClient]: Hit the WebSocket RateLimit! This can be caused by too many presence or voice status updates (connect/disconnect/mute/deaf). Regular: 0 Voice: 19 Chunking: 0
[WARN ] [WebSocketClient]: Missed 2 heartbeats! Trying to reconnect...
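When the same warning repeats at regular intervals, it can help to check whether the three numbers at the end of the line ever change. The sketch below pulls them out of a log line in the format shown above. It is only an illustration: the class and method names are hypothetical, and reading the numbers as pending request counts is an assumption based on the wording of the warning, not something confirmed in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JDA: extracts the three counters from the
// "Hit the WebSocket RateLimit!" warning line quoted in this issue.
final class RateLimitWarningSketch {
    private static final Pattern COUNTS =
            Pattern.compile("Regular: (\\d+) Voice: (\\d+) Chunking: (\\d+)");

    /** Returns {regular, voice, chunking}, or empty if the line is not a rate limit warning. */
    static Optional<int[]> parseCounts(String logLine) {
        Matcher m = COUNTS.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new int[] {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3))
        });
    }

    public static void main(String[] args) {
        String line = "[WARN ] [WebSocketClient]: Hit the WebSocket RateLimit! "
                + "Regular: 0 Voice: 19 Chunking: 0";
        parseCounts(line).ifPresent(c ->
                System.out.println("regular=" + c[0] + " voice=" + c[1] + " chunking=" + c[2]));
    }
}
```

Feeding every WARN line through `parseCounts` and comparing consecutive results would show whether a value such as Voice: 19 stays fixed between warnings.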

@Andre601
Contributor

Do you have any privileged intents enabled or disabled?
I ran into a similar issue, but with "Chunking". The fix was to change the ChunkingFilter to NONE and also disable the intents for PRESENCE, VOICE_STATE (I assume you need that one?) and CLIENT_STATUS.

Not sure if disabling some of them would help.

@MinnDevelopment
Member

I need the following:

  1. JDA#getStatus for the shard that is stuck
  2. All WARN/ERROR log messages of that session (or at least leading up to the stuck state)
  3. A thread dump taken while it is stuck: jstack -l <pid>

@JellyBrick
Author

Privileged intents are disabled. As for the rest:

  1. Every shard reports the status CONNECTED.
  2. These are all the warnings:
[WARN ] [daemon-pool-gateway-4-thread-2] [WebSocketClient]: Hit the WebSocket RateLimit! This can be caused by too many presence or voice status updates (connect/disconnect/mute/deaf). Regular: 0 Voice: 19 Chunking: 0
[WARN ] [daemon-pool-gateway-4-thread-3] [WebSocketClient]: Missed 2 heartbeats! Trying to reconnect...

[daemon-pool-gateway-4-thread-2]

priority:5 - threadId:daemon-pool-gateway-4-thread-2 - state:WAITING
stackTrace:
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method)
- parking to wait for <0x0000000481611850> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.6/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.6/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.6/ScheduledThreadPoolExecutor.java:1177)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.6/ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.6/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.6/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.6/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.6/Thread.java:834)

[daemon-pool-gateway-4-thread-3]

priority:5 - threadId:daemon-pool-gateway-4-thread-3 - state:WAITING
stackTrace:
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method)
- parking to wait for <0x0000000481611850> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.6/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.6/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.6/ScheduledThreadPoolExecutor.java:1177)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.6/ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.6/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.6/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.6/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.6/Thread.java:834)
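
For readers comparing against their own dumps: both traces above show a worker parked in ScheduledThreadPoolExecutor$DelayedWorkQueue.take, which is what an idle worker of a scheduled pool looks like while it waits for its next task. The sketch below reproduces that state with a plain JDK executor. The class name and pool size are illustrative assumptions and have nothing to do with how JDA configures its gateway pool.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: shows the thread state of an idle scheduled-pool worker.
final class IdleWorkerSketch {
    static Thread.State idleWorkerState() {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        try {
            // Run one trivial task so the pool starts its worker and we learn which thread it is.
            AtomicReference<Thread> worker = new AtomicReference<>();
            pool.submit(() -> worker.set(Thread.currentThread())).get();
            Thread thread = worker.get();

            // With nothing left to run, the worker parks in DelayedWorkQueue.take().
            // Poll briefly (up to 5 seconds) until it gets there.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (thread.getState() != Thread.State.WAITING && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return thread.getState();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("Idle worker state: " + idleWorkerState());
    }
}
```

Two parked workers on their own therefore say little about where a shard is stuck, since the threads doing the actual work are not in the excerpt.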

@MinnDevelopment
Member

That's only 2 daemon threads; this isn't a thread dump. Please provide a complete thread dump. You can post it on https://gist.github.com and link it here.
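
If running jstack -l <pid> against the process is not practical, a complete dump can also be produced from inside the JVM. The following is a minimal sketch using the JDK's ThreadMXBean; the class and method names are made up for illustration, and the output format is simplified compared with jstack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative helper: collects every live thread, not just a hand-picked few.
final class ThreadDumpSketch {
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only request lock details when this JVM supports them.
        boolean monitors = bean.isObjectMonitorUsageSupported();
        boolean synchronizers = bean.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" id=").append(info.getThreadId())
               .append(" state=").append(info.getThreadState())
               .append('\n');
            // Print the full stack; ThreadInfo.toString() would cut it short.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

The resulting text can be written to a file while the shard is stuck and posted as a gist in the same way as jstack output.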

@MinnDevelopment
Member

It looks like you somehow killed the WriteThread for shard 4.

@MinnDevelopment
Member

MinnDevelopment commented May 11, 2020

Update to 4.1.1_148 and check if the behavior returns.

@JellyBrick
Author

JellyBrick commented May 12, 2020

Thanks! It seems like the problem is solved!
