Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock Scenario On Reconnect #513

Open
ErikMcClure opened this issue Feb 28, 2018 · 5 comments
Open

Deadlock Scenario On Reconnect #513

ErikMcClure opened this issue Feb 28, 2018 · 5 comments
Assignees
Labels
Milestone

Comments

@ErikMcClure
Copy link

@ErikMcClure ErikMcClure commented Feb 28, 2018

The Open() function calls OnEvent() twice after calling session.Lock(). Normally, this second call returns a Ready packet, which causes no problems. However, after a read error that triggers a reconnection, the discord API will NOT necessarily return a READY/RESUMED packet, because the API doesn't always know that a reconnection even happened, and will instead simply send whatever packet happened to be next. An example of the error messages gotten when this happens:

2018/02/28 06:03:52 [DG1] wsapi.go:226:listen() error reading from gateway wss://gateway.discord.gg/?v=6&encoding=json websocket, websocket: close 1006 (abnormal closure): unexpected EOF
2018/02/28 06:03:54 [DG1] wsapi.go:178:Open() Expected READY/RESUMED, instead got: &discordgo.Event{Operation:0, Sequence:362355, Type:"PRESENCE_UPDATE", RawData:json.RawMessage{ [cut]

The problem is that OnEvent assumes that the session write lock is NOT being held, because this is how it is normally called at wsapi:247. Consequently, two different packets will always deadlock if they happen to show up during the reconnect attempt:

  1. If a op 7 disconnect is, for whatever reason, sent immediately after a reconnect attempt, this will try to call Close() from inside of Open() and instantly deadlocks.
  2. If an op 11 heartbeat ack is sent, this calls s.Lock() and immediately deadlocks.

Even worse, should this simply be a normal event, this event is actually sent to event processing from inside of Open(), which could do anything or call anything in response to it, including Close(), which would also deadlock. Normally this isn't a problem, because the only event handler one expects is the OnReady event handler, but during a reconnect the Ready packet is often not sent and consequently any event handler could potentially be called.

There are two possible resolutions to this problem: Augment OnEvent() so that it can completely reject an unexpected event if a particular packet is being expected and ensure any code in the expected events doesn't do anything weird, or release the session lock before calling OnEvent() the second time, and make a custom handler for op 10 by breaking out the event preprocessing code from OnEvent() (or combine this with the above option and allow it to reject unexpected packets). Rejecting packets, however, could result in dropping potentially important events, such as user adds.

It's important to note that if the session lock is still being held while processing the Ready() packet, this still allows a user of the library to instantly deadlock it by doing something unusual inside of the OnReady handler, because the handler itself doesn't give any indication of the special circumstances in which it might be called.

@Mexican-Man
Copy link

@Mexican-Man Mexican-Man commented Aug 4, 2021

Any update on this bad boy? Looking at the code/stack trace, it looks like code involved has remained relatively untouched since this issue was posted. On bad days, I get this panic about every third time I try to disconnect; pretty rough.

@ErikMcClure
Copy link
Author

@ErikMcClure ErikMcClure commented Aug 4, 2021

I am no longer maintaining my discord bot that uses discordgo, so I won't be able to provide any additional details, or reproduce this issue.

@jame-developer
Copy link

@jame-developer jame-developer commented Sep 11, 2021

I haven't tested, actually I didn't used discordgo, but saw on the golang discord gophers server and checked it out ran tests, linters etc. and stumbled over

defer s.RUnlock()

Here is a defer call in a for loop!

@CarsonHoffman
Copy link
Collaborator

@CarsonHoffman CarsonHoffman commented Sep 13, 2021

@jame-developer is there an actually an issue here? If we hit line 825, we will never perform another iteration of the loop, by the return on line 836.

@jame-developer
Copy link

@jame-developer jame-developer commented Sep 15, 2021

@CarsonHoffman Yes you are right, in the current state of the code there is no issue, I was a bit to focused on the defer. My linters bringing up those lines and I had a bad experience with defer and for loops, so I was a bit to alerted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants