[Gitter] Handle bad tokens more gracefully, so that all gateways don't go down #455

Closed
patcon opened this issue Jun 25, 2018 · 9 comments
Labels
enhancement New feature or request

Comments

@patcon
Contributor

patcon commented Jun 25, 2018

Is your feature request related to a problem? Please describe.
Gitter token went stale and brought down the whole matterbridge app.

Describe the solution you'd like
It would be great if matterbridge handled this gracefully and just logged an error.

Describe alternatives you've considered
None.

Additional context
This is what happened recently with Gitter. But tokens can go stale for many reasons (a coworker resetting them, etc.), and it would be nice if this didn't break the other gateways :)
http://blog.gitter.im/2018/06/11/gitter-token-leak-security-issue-notification/

Thanks for everything!

@42wim
Owner

42wim commented Jun 27, 2018

Did it crash and panic?
Do you have a debug log or any other log output?

@42wim 42wim added the waiting for feedback Further information is requested label Jun 29, 2018
@patcon
Contributor Author

patcon commented Jun 30, 2018

Ah shoot, I don't have the logs that far back anymore, so I'll have to make time to repro. Thanks! (I didn't notice "panic" in the output, but I also noticed in another log file that it's easy to miss.)

@akhmerov

akhmerov commented Jul 6, 2018

We had a similar problem. I believe this is due to how matterbridge handles not being able to set up a relay (it refuses to continue).

I believe setting up a relay to a private mattermost channel and booting the bot out of it would have the same effect.

As a possible fix, I suggest introducing a (possibly configurable) way for matterbridge to handle failures to set up a gateway. This is especially important because users cannot see whether the bridge is running or not.

As a sane default I'd say matterbridge could ignore all failures. Alternatively it could report changes in its status to a separate channel (relay initiated/relay failed/shutting down).

@42wim
Owner

42wim commented Jul 7, 2018

Matterbridge refuses to start up if a bridge is down. This seems like a sane default?

When all bridges are OK, matterbridge doesn't do any more checks; the different libraries handle the reconnects.

@akhmerov

akhmerov commented Jul 7, 2018

Indeed, I was assuming a certain usage pattern, which isn't enough to reason about what the best default would be; my apologies.

Yet, for some patterns of usage I believe a different treatment of failure would be more suitable. Let me describe our situation and explain why it is a relevant feature.

  • We use a single matterbridge instance to handle multiple gateways (about 5 between mattermost and gitter, and about 20 between different teams within mattermost).
  • The matterbridge instance is dockerised and configured via ansible.
  • Whenever a new relay is added, the matterbridge container is restarted (or deleted and started anew).

If one gateway cannot be set up (which can happen because Gitter invalidates a token or because the bot user is booted from a private channel), all gateways silently go down. The only way for users to notice that something doesn't work is to see that messages stop being relayed.

This inability to see the failure, combined with all channels failing, can be rather disruptive. I therefore believe that, for our use case, it would be better for matterbridge to keep working after failing to set up one gateway.

@42wim 42wim added enhancement New feature or request and removed waiting for feedback Further information is requested labels Jul 14, 2018
@42wim
Owner

42wim commented Jul 14, 2018

OK, I'll add an option to continue even if a bridge has errors.

@akhmerov

Thanks!

@Arinerron

@42wim What was the option (and how do I use it)? I don't know Go.

@Arinerron

Figured it out. Add IgnoreFailureOnStart="true" right after [general] in the TOML config file.
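
For reference, a config fragment with the setting placed as described above. The value is written exactly as quoted in this comment; the thread does not confirm whether an unquoted boolean would behave the same, and the rest of the file (accounts, gateways) is omitted here.

```toml
[general]
# Keep the remaining gateways running if one fails to start
# (value written as quoted in the comment above).
IgnoreFailureOnStart="true"
```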

zeridon pushed a commit to zeridon/matterbridge that referenced this issue Feb 12, 2020