Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hang when Websocket Read is interrupted #108

Closed
wants to merge 2 commits into from

Conversation

@tuxcanfly
Copy link
Contributor

@tuxcanfly tuxcanfly commented Jul 3, 2014

Updated to catch the quit signal and break out of the loop. Tested that ^C shutsdown btcwallet as expected, with the same env in which it was hanging earlier.

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 3, 2014

This helps when the SIGINT is received between messages but not when a message is already being read.

Loading

@jrick
Copy link
Member

@jrick jrick commented Jul 3, 2014

I'll need to take another look at how this is implemented by btcd, but I almost half wonder whether it would makes sense to not increment/decrement the rpcServer WaitGroup for this goroutine to receive the websocket client messages.

If that change is made, then the goroutine that reads from wsc.allRequests (which WebsocketClientRead sends to, and closes when finished) should be switched from a for-range to a for-select to include not just that channel, but also the rpcServer quit chan.

Loading

@jrick
Copy link
Member

@jrick jrick commented Jul 3, 2014

btcd does this the same way as in your PR, so this may be ok, but I'll let @davecgh comment on which design would be better. Personally I think the btcd design can still result in hangs if you time it right.

edit: nvm, I'm wrong. The wait group that btcd increments for the "inHandler" is specific for the client, not the whole server.

Loading

@davecgh
Copy link
Member

@davecgh davecgh commented Jul 3, 2014

This looks correct to me.

Loading

@jrick
Copy link
Member

@jrick jrick commented Jul 3, 2014

Not sure how it can be. If the server has not begun shutting down (so the select falls through to the default case) and ReadMessage is called and blocks, this can block the shutdown of the entire server until the client disconnects or sends an invalid websocket frame.

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 3, 2014

Yes, as I mentioned in the earlier comment, there's still an issue when the timing is right and the server gets blocked on ReadMessage.

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 4, 2014

I removed WebsocketClientRead from the waitgroup and it doesn't hang on ReadMessage anymore. But again with the right timing it hangs on RequestHandler, as per the following goroutine stacktrace:

goroutine 50 [chan receive]:
github.com/conformal/btcrpcclient.FutureNotifyReceivedResult.Receive(0x18b122c0, 0x0, 0x0)
        /home/tuxcanfly/Work/golang/src/github.com/conformal/btcrpcclient/notify.go:823 +0x7e
github.com/conformal/btcrpcclient.(*Client).NotifyReceived(0x18cfc5b0, 0xb6291e60, 0x1, 0x1, 0x0, 0x0)
        /home/tuxcanfly/Work/golang/src/github.com/conformal/btcrpcclient/notify.go:901 +0x62
main.(*Account).NewAddress(0x18aee1e0, 0x0, 0x0, 0x0, 0x0)
        /home/tuxcanfly/Work/golang/src/github.com/conformal/btcwallet/account.go:619 +0x3d5
main.GetNewAddress(0xb74eeae0, 0x18e74300, 0x0, 0x0, 0x0, 0x0)
        /home/tuxcanfly/Work/golang/src/github.com/conformal/btcwallet/rpcserver.go:1490 +0x293
main.(*rpcServer).RequestHandler(0x18a7e200)
        /home/tuxcanfly/Work/golang/src/github.com/conformal/btcwallet/rpcserver.go:995 +0xc8
created by main.(*rpcServer).Start
        /home/tuxcanfly/Work/golang/src/github.com/conformal/btcwallet/rpcserver.go:300 +0x84

I tried a similar workaround, i.e. remove RequestHandler from the waitgroup also and it works but not sure if that's the correct way.

Loading

@jrick
Copy link
Member

@jrick jrick commented Jul 4, 2014

Two things:

First, that looks like a btcrpcclient hang.

Second, you modified RequestHandler from a for-select, which reads from s.requests and s.quit, to a for-range that only reads from s.requests. Don't do this. It will now never close, since the s.requests channel is not closed anywhere.

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 4, 2014

OK, after removing the RequestHandler from waitgroup I saw that it never received s.quit so I removed it form the for-select. I've reverted the changeset.

I tried that because I was still facing a hang, here's the complete goroutine dump: https://gist.github.com/tuxcanfly/ad54fdab0a90306cf491

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 7, 2014

It's a hack but the last commit seems to help, I didn't find any hang on RequestHandler or FutureNotifyReceivedResult in my tests. Tried guarding the client calls with client.Disconnected in the caller itself, but it didn't help so I had to put it in NewAddress.

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 7, 2014

As discussed on IRC, there's still one failing case here, although its less frequent:

https://gist.github.com/tuxcanfly/ef0bc10ddebacbfff9f9

Loading

@@ -617,8 +617,10 @@ func (a *Account) NewAddress() (btcutil.Address, error) {
if err != nil {
return nil, err
}
if err := client.NotifyReceived([]btcutil.Address{addr}); err != nil {
return nil, err
if !client.Disconnected() {
Copy link
Member

@jrick jrick Jul 8, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed a btcrpclient fix that causes requests to error when the client has already shutdown, so this check shouldn't be necessary (and isn't really right to begin with, a disconnected client could reconnect and the request would then be reissued).

Loading

Copy link
Contributor Author

@tuxcanfly tuxcanfly Jul 8, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, it was a hack. Removed.

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 8, 2014

With the latest rpcclient, most hangs seem to have been solved. Only issue I'm seeing is:

goroutine 53 [chan send]:
main.(*rpcServer).RequestHandler(0x18b20360)
    /home/tuxcanfly/Work/golang/src/github.com/conformal/btcwallet/rpcserver.go:1055 +0x240
created by main.(*rpcServer).Start
    /home/tuxcanfly/Work/golang/src/github.com/conformal/btcwallet/rpcserver.go:305 +0x87

which seems to be fixed by:

- r.response <- handlerResponse{result, jsonErr}
+ select {
+ case r.response <- handlerResponse{result, jsonErr}:
+ case <-s.quit:
+ break out
+ }

I guess sending to the response chan was blocked because the receiver had already shutdown.

Loading

AcctMgr.Release()
case <-s.quit:
break out
default:
Copy link
Member

@jrick jrick Jul 8, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just make this an empty default case and let it fall through, then we don't lose another indent.

Loading

@jrick jrick closed this in 9036d36 Jul 8, 2014
@tuxcanfly tuxcanfly deleted the fix-ws-read-hang branch Jul 8, 2014
@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 9, 2014

Sorry for nagging but I found at least one instance of a possible hangup. In my tests, this was rare so consider it low priority.

https://gist.github.com/tuxcanfly/629e4b9db61441ffebd6

I tried out a fix by using a goroutine to fetch the response and it seems to help:

tuxcanfly@a439399

Loading

@tuxcanfly
Copy link
Contributor Author

@tuxcanfly tuxcanfly commented Jul 18, 2014

Loading

jrick added a commit to jrick/btcwallet that referenced this issue Nov 15, 2016
This is only being done for compatibility with older dcrwallet
versions that refuse to decode word list seeds that are not 33 words
long when entering them from the console (the gRPC API accepts seeds
as simply a byte array so long as they are within the correct length
range).

Fixes btcsuite#108.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants