Heartbeat dropping crashes the bot until manually restarted #430
Comments
I am experiencing the same issue. My bot stops working as soon as it can't get a heartbeat, and right now it needs to be restarted about once an hour. :(
Also, I am using an Azure VM, so the connection should be pretty stable.
I believe I know the cause for this issue and it's a simple fix, however, we're planning on refactoring the gateway which would eliminate this problem anyhow. (Or at least in theory, it would.) There's a current experiment you can try running to see if this produces the issue or not:
What I believe is happening is that for every This is honestly an oversight on my part, and we can probably do away entirely with the subtraction of the random value. We did this, however, because it worked well for most cases that needed us to check for jitter. The Gateway does handle network latency between you (the client) and the server, but we didn't know just how precise it was. The Developers documentation also mentioned that it was okay for us to do this, so we bit the bullet and did it anyhow.
Do you mean this will introduce the error, or that it should fix it? I am available to try this and let you know.
Just FYI, I removed the - random() arithmetic from the Heartbeat class and am still having the same issue.
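For anyone following along, the change being tested above can be pictured roughly as follows. This is only an illustrative sketch: `heartbeat_delay`, `interval_ms`, and `subtract_random` are hypothetical names, not the library's actual Heartbeat code, and the interval value in the comment is just an example.

```python
import random


def heartbeat_delay(interval_ms, subtract_random=True):
    """Return how many seconds to wait before sending the next heartbeat.

    interval_ms is the heartbeat interval in milliseconds. With
    subtract_random=True, up to one second is shaved off the wait,
    mirroring the "- random()" arithmetic discussed in this thread.
    """
    delay = interval_ms / 1000
    if subtract_random:
        delay -= random.random()  # random.random() returns a float in [0.0, 1.0)
    return max(delay, 0.0)  # never return a negative wait


# Example with an assumed interval of 41250 ms:
# heartbeat_delay(41250, subtract_random=False) waits the full 41.25 seconds.
```

Removing the subtraction only changes the wait by under a second, which fits the reports here that the crashes continued after that change.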
Does the bot crash more often, less often, or about as often as before?
I left my bot running overnight and it wasn't working when I woke up. I'll restart it, monitor it, and let you know. I can record some average times between crashes for you.
Same issue after the fix: 01/19/2022 12:35:26 AM INFO: [] Starting bot. Seems to be crashing every 2 hours.
I also timed my first crash at ~2 hours just now.
I've identified the cause of the bug during my testing, and it will be addressed in the next release.
Is this still a quick fix, something I can go ahead and change to get the bot working? Or will we need to wait for the release? Thanks for checking into this and identifying the issue!
We've initiated a PR that will address this. See issue #452, which goes into the specific details of why this is happening. Because another issue has been set up for this, we will be closing this one. I highly recommend moving any further conversation from here to there.
This has been discussed in the Discord server, but I thought I'd file an issue for the sake of completeness.
When a bot can't get a heartbeat, there's no connection logic to reconnect the bot afterwards. This means that bots, even on stable connections (I'm using a Contabo VPS), can have trouble staying online for any extended period of time.
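As a rough illustration of the kind of reconnect logic that is missing, here is a minimal sketch. It assumes a hypothetical `connect` coroutine that opens the gateway session and raises a hypothetical `HeartbeatLost` exception when a heartbeat goes unanswered; neither name comes from the library, and the backoff values are arbitrary.

```python
import asyncio
import random


class HeartbeatLost(Exception):
    """Hypothetical error raised when a heartbeat is not acknowledged."""


async def run_with_reconnect(connect, max_reconnects=5, base_delay=1.0,
                             max_delay=60.0, sleep=asyncio.sleep):
    """Run connect() and start it again whenever the heartbeat is lost.

    connect is assumed to return only on a clean shutdown. Returns the
    number of reconnects that were needed; re-raises HeartbeatLost once
    max_reconnects has been exceeded.
    """
    reconnects = 0
    while True:
        try:
            await connect()
            return reconnects
        except HeartbeatLost:
            reconnects += 1
            if reconnects > max_reconnects:
                raise
            # Exponential backoff, capped at max_delay, plus up to 10% jitter
            # so that repeated restarts do not hammer the server.
            delay = min(max_delay, base_delay * 2 ** (reconnects - 1))
            await sleep(delay + random.uniform(0, delay / 10))
```

The `sleep` parameter exists only so the loop can be exercised without real waiting; in normal use the default `asyncio.sleep` applies.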
For now, as a bodge, I'm testing using GNU timeout to restart my bot by force as a pre-emptive measure. If anyone wants the shell script, let me know.
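The same forced-restart idea can also be sketched in Python for anyone who would rather not depend on a shell script. This is not the script mentioned above, just a minimal stand-in built on the standard `subprocess` module; `supervise`, its parameters, and the `bot.py` filename in the comment are assumptions made for illustration.

```python
import subprocess
import sys
import time


def supervise(command, max_runtime, restarts, pause=1.0):
    """Run command, kill it after max_runtime seconds, then start it again.

    Returns a list with one entry per run: the process exit code, or None
    if the process had to be killed for exceeding max_runtime.
    """
    results = []
    for _ in range(restarts):
        process = subprocess.Popen(command)
        try:
            results.append(process.wait(timeout=max_runtime))
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()  # reap the killed child so it does not linger
            results.append(None)
        time.sleep(pause)  # short pause before the next start
    return results


# Example (hypothetical filename): restart the bot every hour, 24 times.
# supervise([sys.executable, "bot.py"], max_runtime=3600, restarts=24)
```

Like the timeout approach, this restarts the bot whether or not it has actually hung, so it is a stopgap and not a fix.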