Monitoring Framework #25

Closed
irskep opened this Issue Sep 29, 2011 · 8 comments

4 participants
@irskep
Contributor

irskep commented Sep 29, 2011

Tron needs a framework for doing monitoring.

Some potential monitoring systems:

  • Email notification of failures
  • Nagios notifications
  • IRC updates
  • Logging

Mostly, we need a flexible framework for recording Tron events and deciding what to pass on to configured monitors. Some of this may be integrated with how we do the web service framework. For example, a Nagios monitor could fairly easily use the web service interface to find the status of jobs. However, sending an email on a failure would not work well with this system.
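
As a rough illustration of "recording events and deciding what to pass on", here is a minimal sketch. The names `Event`, `EventRecorder`, `register`, and `record` are hypothetical and are not part of Tron; each monitor is assumed to be a pair of a filter function and a notify function.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    """A hypothetical Tron event: which job changed and its new state."""
    job_name: str
    state: str  # e.g. "succeeded" or "failed"


class EventRecorder:
    """Keeps a history of events and forwards each one to interested monitors."""

    def __init__(self) -> None:
        self.history: List[Event] = []
        self._monitors: List[Tuple[Callable[[Event], bool], Callable[[Event], None]]] = []

    def register(self, wants: Callable[[Event], bool],
                 notify: Callable[[Event], None]) -> None:
        """Add a monitor: `wants` filters events, `notify` handles the ones it accepts."""
        self._monitors.append((wants, notify))

    def record(self, event: Event) -> None:
        """Store the event, then pass it to every monitor whose filter accepts it."""
        self.history.append(event)
        for wants, notify in self._monitors:
            if wants(event):
                notify(event)


if __name__ == "__main__":
    recorder = EventRecorder()
    # A logging-style monitor that sees everything.
    recorder.register(lambda e: True, lambda e: print("log:", e))
    # A failure-only monitor, standing in for an email or IRC notifier.
    recorder.register(lambda e: e.state == "failed", lambda e: print("alert:", e))
    recorder.record(Event("nightly_backup", "succeeded"))
    recorder.record(Event("nightly_backup", "failed"))
```

A push-style monitor (email, IRC) would register a notifier like the second one, while a pull-style monitor (such as a Nagios check) could instead read `history` through whatever status interface is exposed.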

We now have support for crash emails, which is part of solving this guy.

Copied from rhettg#2

@irskep

Contributor

irskep commented Oct 25, 2011

I'm thinking some of this would be simpler and more flexible with an event hook system. Example config:

hooks:
  succeed:
    - "echo 'succeeded: %(action_name)s'"
  fail:
    - "email user@domain.com 'failed: %(action_name)s'"
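
A minimal sketch of how hooks like these might be rendered and run, assuming the config has already been loaded into a plain dict. `HOOKS`, `render_hooks`, and `run_hooks` are hypothetical names, not Tron code, and `echo` stands in for the `email` command so the example has no external dependency.

```python
import shlex
import subprocess

# Hypothetical in-memory form of the "hooks" config section.
HOOKS = {
    "succeed": ["echo 'succeeded: %(action_name)s'"],
    "fail": ["echo 'failed: %(action_name)s'"],
}


def render_hooks(hooks, event, context):
    """Return the command strings for `event` with %(name)s placeholders filled in."""
    return [template % context for template in hooks.get(event, [])]


def run_hooks(hooks, event, context):
    """Run each rendered command and return (command, exit_code) pairs."""
    results = []
    for command in render_hooks(hooks, event, context):
        # shlex.split turns the command string into an argv list, so no shell is involved.
        proc = subprocess.run(shlex.split(command), capture_output=True, text=True)
        results.append((command, proc.returncode))
    return results


if __name__ == "__main__":
    for command, code in run_hooks(HOOKS, "fail", {"action_name": "backup"}):
        print(code, command)
```

Context values are substituted into the command unchanged, so this sketch assumes names like `action_name` come from trusted config rather than from user input.
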
@irskep

Contributor

irskep commented Mar 22, 2012

There has been no traffic on this. Still worth doing/keeping around as a ticket?

@dnephin

Contributor

dnephin commented Apr 14, 2012

I still like this idea, but it may be a few releases out.

@justincinmd

justincinmd commented Oct 31, 2014

Are there any plans to do this? It would be extremely useful to have this for jobs with many downstream dependencies.

@dnephin

Contributor

dnephin commented Nov 1, 2014

I don't think we have any plans to work on this any time soon.

We have to figure out how to fix the underlying architectural issues with Tron before we make any additions to it.

@solarkennedy

Collaborator

solarkennedy commented Jan 26, 2018

I've started a prototype for monitoring in #335.

@solarkennedy

Collaborator

solarkennedy commented Feb 9, 2018

@irskep I'm going to close this ticket, as we have the check_tron_jobs command now. It isn't as flexible as running arbitrary commands for hooks, but it is useful for Yelp developers because it uses the same pysensu-yelp interface available to them in other tools (like paasta).

@irskep

Contributor

irskep commented Feb 9, 2018

OK! I haven't worked on Tron since 2012, so you know best. :-)
