Auto reconnect capability #20

shusugmt · 2015-01-05T01:16:26Z

It would be nice if Lita reconnected automatically when the connection to HipChat is lost.

I run my bot on my laptop. Once I close my PC, lita never reconnects automatically so I need to stop and launch it again every time, which is a bit annoying.

shusugmt · 2015-01-05T01:21:15Z

I've written a poc. @jimmycuadra before I update specs, can you see my implementation?
shusugmt/lita-hipchat@0d9369d

jimmycuadra · 2015-01-08T11:56:01Z

Thanks for the POC. I think trying to manage reconnection within the client like this is probably a little too complex. What do you think about making Lita crash (or just calling shut_down explicitly) when the connection is lost? Then it can easily be restarted by the user's preferred process manager/init system. I like the idea of just letting it crash and then starting up again with a totally clean slate.

shusugmt · 2015-01-09T00:25:22Z

@jimmycuadra Thank you. Your idea sounds better. I'll write a new POC on another branch.

justintime · 2015-04-30T16:18:33Z

Just an opinion, but it feels to me like you're hitting a nail with a sledgehammer by crashing lita. What if lita is using more than 1 adapter, like slack and hipchat? In this case, if hipchat goes down for more than a few seconds, lita will be in a join/leave cycle in slack until hipchat comes back up.

Also, many process managers will only attempt to restart a service a finite number of times before giving up. If hipchat were down for long enough, lita would never be restarted once the process manager gave up.

jimmycuadra · 2015-04-30T19:18:37Z

A single Lita process can't be used with more than one adapter, so that is not a problem in this case. I'm not aware of a process manager that can't keep attempting to restart a process infinitely – do you have examples? What would you suggest rather than letting the process exit?

justintime · 2015-04-30T21:00:01Z

While I'm not new to relying on external services from long running processes, I'm really new to lita. Thanks for pointing out the single adapter thing.

systemd units will often use Restart=on-failure, which means that if lita exits with status 0, systemd won't restart the service. In upstart, there's the respawn limit stanza.
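To illustrate the exit-status point: under an on-failure restart policy, the process only comes back if it exits with a non-zero status. The following is a minimal sketch, not code from lita or lita-hipchat. The method name `run_until_disconnected`, the constant `EXIT_CONNECTION_LOST`, and the choice of which exception classes count as a lost connection are all assumptions made for illustration.

```ruby
# Hypothetical sketch: turn a lost connection into a non-zero exit status
# so that a supervisor using an on-failure restart policy will restart us.
# Nothing here is part of Lita's or xmpp4r's API.
EXIT_CONNECTION_LOST = 1

# Runs the given block (standing in for the bot's main loop) and returns
# the status the process should exit with.
def run_until_disconnected
  yield
  0 # the block returned normally: treat as a clean, intentional shutdown
rescue IOError, SystemCallError => e
  # Assumption: connection loss surfaces as IOError or an Errno::* error.
  warn "connection lost: #{e.class}: #{e.message}"
  EXIT_CONNECTION_LOST
end
```

A caller would then end the process with something like `exit(run_until_disconnected { main_loop })`, where `main_loop` is a placeholder for whatever keeps the bot running.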

As to how to deal with it, I'd think that applying graceful degradation similar to the way @s2ugimot did in shusugmt/lita-hipchat@0d9369d would require less work from the sysadmin and would ultimately be a more "set-it-and-forget-it" approach.

Hubot does it similarly: hipchat/hubot-hipchat#174

BTW: didn't get to thank you for your ChatOps panel at DevOpsDaysRox. I was the one in the openspace that talked about the "clock in" feature that was the killer feature that got buy-in from the users. I've been working on replacing Hubot with Lita this week and have been really happy so far.

jimmycuadra · 2015-04-30T21:09:36Z

Interesting... thanks for the additional details. I'm not sure how I feel about where/how this issue should be handled. I'll keep thinking about it.

Glad you've been enjoying Lita so far! Getting feedback about it that can lead to more improvements is the best thanks I could ask for. :}

dblessing · 2015-05-01T04:02:26Z

Another point to consider - a process supervisor is not a given. sysvinit is still widely used.

I also imagine how I might feel if a Java or Rails app I manage simply exited because a 3rd party service went away for a moment and it didn't try to reconnect.

Thanks for continuing to think about this.

willejs · 2015-05-28T11:16:55Z

@jimmycuadra I'm not a huge fan of this approach. If the websocket is closed, it's an error? We should handle errors, right?

Would you be averse to some retry logic that is configurable, perhaps an exponential backoff?

jimmycuadra · 2015-05-28T11:55:39Z

lita-hipchat doesn't use WebSockets, but in any case, what type of error are you referring to? If there's something specific that is recoverable, then probably yes. I'm still hesitant about adding reconnection logic inside the adapter. I'd love to hear more specifics on how letting a process manager restart it has been a problem for folks.

dblessing · 2015-05-28T14:41:35Z

@jimmycuadra Do you have any examples of other applications that take this approach? I'm not sure you'll get many specifics on how it's worked for people because I'm not sure it's a common thing to do. Currently we use CentOS 6 and sysvinit so there is no process supervisor. We don't have any processes that routinely bomb out when external services fail and I would call it a bug if they did.

jimmycuadra · 2015-06-18T08:41:51Z

I guess I'm amenable to this feature after more thought (thank you for all the comments). I don't love the idea of having to maintain it, but clearly there is demand and the goal is to make users' lives easier. I'm really just kind of gun shy about handling additional complexity in the adapter since XMPP and the xmpp4r library are already quite nasty, and I don't want to introduce things that might create more subtle bugs in the future. But if someone wants to take this on, I will accept a really solid PR. I think it will need to handle reconnection with an exponential backoff and a maximum number of attempts to be safe.
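For anyone picking this up, reconnection with exponential backoff and a maximum number of attempts could look roughly like the sketch below. It is only an illustration of the technique under stated assumptions: `with_backoff`, `ReconnectFailed`, and every parameter name are hypothetical, the rescued exception classes are a guess at how a dropped connection would surface, and none of it is taken from lita-hipchat or xmpp4r.

```ruby
# Hypothetical sketch of retry with exponential backoff and an attempt cap.
# None of these names exist in lita-hipchat or xmpp4r.
class ReconnectFailed < StandardError; end

# Yields the attempt number (starting at 1) to the block, which should try
# to connect. On a connection error it sleeps base_delay * 2**(attempt - 1)
# seconds, capped at max_delay, and tries again, up to max_attempts times.
# `sleeper` is injectable so the delays can be observed without waiting.
def with_backoff(max_attempts: 5, base_delay: 1.0, max_delay: 60.0,
                 sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue IOError, SystemCallError => e
    if attempt >= max_attempts
      raise ReconnectFailed, "gave up after #{attempt} attempts: #{e.message}"
    end
    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    sleeper.call(delay)
    retry # re-runs the begin block, incrementing attempt again
  end
end
```

Usage would be along the lines of `with_backoff(max_attempts: 8) { connect_to_hipchat }`, where `connect_to_hipchat` is a placeholder. Raising `ReconnectFailed` once the cap is reached still lets the process exit, so a process manager can take over from there.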

sjernigan · 2016-01-13T02:16:42Z

We're running lita in kubernetes with built-in process monitoring. We have other processes running under God and even custom scripts. While I appreciate the views about it reconnecting, I also like simplicity. I think process monitoring isn't that unusual, and I would suggest running lita under something even if it did handle this reconnect issue. After all, this isn't the only reason lita might fail, and process monitors have a rich feature set for dealing with these events. So practically, if making lita crash is an easy development task, I'd welcome it as an acceptable solution that moves lita forward. When/if a pull request comes with a reconnect feature, I'll heed Jimmy's caution and let it bake a while before upgrading. Till then, it's manual intervention every time HipChat's servers have a blip. Thanks to everyone who's contributed to lita.

jimmycuadra mentioned this issue Jan 9, 2015

Connection timed out when there is no activity #22

Closed

jimmycuadra mentioned this issue May 28, 2015

Lita stops if Slack closes websocket litaio/lita-slack#38

Closed

jimmycuadra added effort/hard status/accepted type/improvement labels Sep 16, 2015

jimmycuadra mentioned this issue Oct 29, 2015

FR: Reconnect when the link is disconnected. litaio/lita-slack#15

Closed

sjernigan mentioned this issue Jan 13, 2016

Disconnecting from HipChat #34

Open

alindeman added a commit to alindeman/lita-hipchat that referenced this issue Feb 19, 2016

Ignores socket errors that occur during shut_down

987125c

If the socket has already disconnected, e.g., we don't want to raise _another_ error (see litaio#34, litaio#20) when trying to close it cleanly.

alindeman mentioned this issue Feb 19, 2016

Ignores socket errors that occur during shut_down #38

Merged

alindeman added a commit to alindeman/lita-hipchat that referenced this issue Feb 21, 2016

Ignores socket errors that occur during shut_down

f6ab143

If the socket has already disconnected, e.g., we don't want to raise _another_ error (see litaio#34, litaio#20) when trying to close it cleanly.
