Fix issues related to the 2015-04-02 14:32 CEST deadlock #100

Closed
stapelberg opened this issue Apr 2, 2015 · 0 comments


Symptom: clients were disconnected after a while. /debug/pprof/goroutine reveals that more than 70 goroutines are stuck acquiring the applyMu mutex.

From the log on alp:

I0402 12:32:55.480121       1 api.go:97] Proxying request ("/robustirc/v1/0x13ced24bfa07d924/message") to leader "ridcully.robustirc.net:60667"
I0402 12:32:55.494351       1 api.go:97] Proxying request ("/robustirc/v1/0x13cad43f066bfec3/message") to leader "ridcully.robustirc.net:60667"
I0402 12:32:55.508024       1 robustirc.go:358] Apply(msg.Type=irc_from_client)
I0402 12:32:55.510380       1 server.go:1775] http: panic serving 172.17.42.1:40364: Assumption violated: current time 1427977975510311504 is older than the timestamp of the last processed message (1427977975515627970)
goroutine 3075187 [running]:
net/http.func·011()
        /home/michael/go/src/net/http/server.go:1130 +0xbb
github.com/robustirc/robustirc/ircserver.(*IRCServer).NewRobustMessage(0xc208354000, 0x2, 0x13ced089c4d38cbf, 0x0, 0xc221be8d80, 0xe, 0x0)
        /home/michael/gocode/src/github.com/robustirc/robustirc/ircserver/ircserver.go:239 +0x247
main.handlePostMessage(0x7fe9102a19c8, 0xc220823d60, 0xc2194c2dd0, 0xc211856f60, 0x1, 0x1)
        /home/michael/gocode/src/github.com/robustirc/robustirc/api.go:184 +0x717
github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc21ab040c0, 0x7fe9102a19c8, 0xc220823d60, 0xc2194c2dd0)
        /home/michael/gocode/src/github.com/julienschmidt/httprouter/router.go:293 +0x18e
net/http.(*ServeMux).ServeHTTP(0xc20800ca80, 0x7fe9102a19c8, 0xc220823d60, 0xc2194c2dd0)
        /home/michael/go/src/net/http/server.go:1541 +0x17d
net/http.serverHandler.ServeHTTP(0xc21bbc1500, 0x7fe9102a19c8, 0xc220823d60, 0xc2194c2dd0)
        /home/michael/go/src/net/http/server.go:1703 +0x19a
net/http.(*conn).serve(0xc21f1695e0)
        /home/michael/go/src/net/http/server.go:1204 +0xb57
created by net/http.(*Server).Serve
        /home/michael/go/src/net/http/server.go:1751 +0x35e
I0402 12:32:55.613561       1 robustirc.go:358] Apply(msg.Type=irc_from_client)

There are two problems to be addressed:

  1. panic() calls in HTTP handlers should make RobustIRC exit instead of being recovered by net/http (see the "http: panic serving" line in the log above). That fixes the deadlock itself.
  2. Possibly we should try to avoid calling NewRobustMessage on non-leaders in the first place. Still thinking about that.
stapelberg added a commit that referenced this issue Apr 2, 2015
See the comment for rationale. This is half of the fix for #100.

In the common case (stable leadership), this saves a bit of resources
and lock contention. In the case of leadership transferring, this change
might require the client to resend a message, i.e. incurs a roundtrip
(possibly with exponential backoff).