More than one reporter? #47

Closed
alexanderjardim-zz opened this issue Mar 18, 2014 · 5 comments

Comments

@alexanderjardim-zz

Hi,

I am thinking of using SmartStack for service discovery. Currently, I use monit to keep my services alive and run health checks.

Since I would be using nerve to run the same health checks, I wonder whether it makes sense to configure each health check in two places and run it twice, or whether it would make more sense to have nerve report to both my alarm system and zk at the same time for the same failed health check.

So, does it make any sense to have more than one reporter registered at the same time?

@alexanderjardim-zz
Author

Just marking my issue as a question

@igor47
Collaborator

igor47 commented Mar 18, 2014

Short answer: just use both nerve and monit side by side.

Long answer:

There are two types of things here:

  • health checks
  • acting on the results of health checks

We have two separate components. Nerve only does health checks, and the only action it takes is publishing the results of those health checks. Synapse acts on the results of the health check to configure haproxy.
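
The split described above can be sketched roughly as follows. This is a hypothetical illustration in Python, not nerve's or synapse's actual code: `HealthResult`, `tcp_check`, `publish`, and `backends_for` are made-up names, and a plain dict stands in for ZooKeeper.

```python
import socket
from dataclasses import dataclass


@dataclass
class HealthResult:
    service: str
    host: str
    port: int
    healthy: bool


def tcp_check(service, host, port, timeout=1.0):
    """Run one check: can a TCP connection to the service be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            healthy = True
    except OSError:
        healthy = False
    return HealthResult(service, host, port, healthy)


def publish(registry, result):
    """The checker's only action: record the result in a shared registry.

    `registry` is a dict standing in for ZooKeeper (an assumption made
    for this sketch). Healthy instances are added, unhealthy ones removed.
    """
    instances = registry.setdefault(result.service, set())
    address = f"{result.host}:{result.port}"
    if result.healthy:
        instances.add(address)
    else:
        instances.discard(address)


def backends_for(registry, service):
    """A consumer (the synapse role) only reads the published state."""
    return sorted(registry.get(service, set()))
```

In this shape the check runs once, and any number of consumers can read the published state without the checker knowing about them.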

We have additional alerting capabilities at Airbnb which also consume the results of the health checks to generate alerts, like monit would do. However, we don't have any component that actively tries to restore health. This is because we're worried that actively trying to restore health would only cause more problems.

Ideally, if your code encounters bugs, it would fail fast. We use runit to run all of our services, so they would get automatically restarted. This is how we run nerve and synapse in prod as well.

However, if you are failing health checks because of failing upstream dependencies, restarting the service would not help and might cause harm as a starting service hammers your dependencies. This would argue against the use of monit for actively intervening in failing health checks.

@alexanderjardim-zz
Author

OK, forget about monit restarting my services. The point is: both monit and nerve will run the same health checks. Monit will trigger my alarm routines, and nerve will notify zk that a node is out. Does it make sense to put both alarm reporting and service discovery reporting in nerve?

@igor47
Collaborator

igor47 commented Mar 19, 2014

Like I said, I think it is best to have a separate tool do the alerting, one that is linked with the rest of your systems' alerting; we're planning on open-sourcing such a tool soon.

@jolynch
Collaborator

jolynch commented Apr 18, 2017

I think there are lots of options here: either do what igor has mentioned, or do what we do at Yelp and monitor the other end of the equation in Synapse (check that enough instances are actually in HAProxy).
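
One way to check the HAProxy side is to count the servers that HAProxy's CSV statistics report as UP. The sketch below is only an assumption about how such a check might look, not Yelp's actual tooling: `count_up_servers`, `has_enough_instances`, and the sample data are hypothetical, and the sample is trimmed to three columns where real output has many more.

```python
import csv
import io

# Hypothetical, heavily trimmed sample of HAProxy CSV stats output.
SAMPLE_STATS = """# pxname,svname,status
web,FRONTEND,OPEN
web,10.0.0.1:8080,UP
web,10.0.0.2:8080,DOWN
web,10.0.0.3:8080,UP
web,BACKEND,UP
"""


def count_up_servers(stats_csv, backend):
    """Count servers in `backend` whose status starts with "UP".

    The header line of the CSV begins with "# pxname,svname,...", so the
    leading "#" is stripped before parsing. The FRONTEND and BACKEND rows
    are aggregates rather than servers, so they are skipped.
    """
    text = stats_csv.lstrip()
    if text.startswith("#"):
        text = text[1:].lstrip()
    up = 0
    for row in csv.DictReader(io.StringIO(text)):
        if row.get("pxname") != backend:
            continue
        if row.get("svname") in ("FRONTEND", "BACKEND"):
            continue
        if (row.get("status") or "").startswith("UP"):
            up += 1
    return up


def has_enough_instances(stats_csv, backend, minimum):
    """True when at least `minimum` servers in `backend` are UP."""
    return count_up_servers(stats_csv, backend) >= minimum
```

An alerting tool could call `has_enough_instances(stats, "web", 2)` on a schedule and raise an alarm when it returns false, which monitors what clients actually see rather than repeating each health check.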

Since this hasn't had any action for a few years I'm going to close this. Feel free to re-open if these answers are insufficient :-)

@jolynch jolynch closed this as completed Apr 18, 2017

3 participants