Add a command to query consumer status #194

ralphbean opened this Issue Oct 4, 2013 · 5 comments

3 participants

Fedora Infrastructure member

Background: fedmsg consumers run as plugins to a single fedmsg-hub daemon. For instance, a fedmsg-hub daemon runs on badges-backend01. It loads a single BadgesConsumer plugin that does the work of awarding badges.

It is possible, even common, for the daemon to start up fine but for one of its consumers to raise an exception at startup. The daemon continues to run, but with zero plugins loaded; it does zero work.

This is deceptive: when nagios looks at the process list, it sees fedmsg-hub running and concludes that everything is fine.

We need some fedmsg CLI command that interrogates a fedmsg twisted process and asks it for the list of consumers that it successfully loaded. We then need to write nagios/nrpe checks to ensure consumers we want to run are running.
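
To make the nagios side of this concrete, here is a minimal sketch of the comparison such a check might perform once it has the list of loaded consumers. The function name `check_consumers` and the message wording are hypothetical; only the exit codes (0 for OK, 2 for CRITICAL) follow the usual Nagios plugin convention. How the list is obtained from the running hub is left open.

```python
OK, CRITICAL = 0, 2  # conventional Nagios plugin exit codes


def check_consumers(loaded, required):
    """Compare the consumers a hub reports as loaded against the ones we need.

    `loaded` and `required` are iterables of consumer names. Returns a
    (exit_code, message) pair in the style of a Nagios plugin.
    """
    missing = sorted(set(required) - set(loaded))
    if missing:
        return CRITICAL, "CRITICAL: consumers not loaded: " + ", ".join(missing)
    return OK, "OK: all %d required consumers loaded" % len(set(required))
```

A wrapper script would print the message and exit with the returned code, so a hub that is running with zero consumers shows up as CRITICAL instead of passing a process-list check.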

Moksha had something like this in the past. Is it still there? Does it work?

This issue blocks infra ticket

Fedora Infrastructure member

The moksha cli that you speak of simply iterated over the entry-points to display the list of consumers. It didn't actually query a running Twisted reactor.

A couple of possibilities come to mind:

  • Add some sort of status service listening to a specific port that can be queried by nagios/etc to get consumer/producer stats
  • Create a "lockfile" for each consumer on disk so we could check in /var/run/fedmsg/consumers to see a list of them.
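
The lockfile idea could look roughly like the sketch below. The helper names `mark_loaded` and `list_loaded` are made up for illustration, and the choice to store the hub's PID in each file is an assumption, not something proposed in this thread.

```python
import os


def mark_loaded(directory, consumer_name):
    """Create a marker file named after a consumer that loaded successfully."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, consumer_name)
    with open(path, "w") as f:
        # Assumption: record the PID so a check could spot stale files.
        f.write(str(os.getpid()))
    return path


def list_loaded(directory):
    """Return the consumer names that have a marker file, sorted."""
    try:
        return sorted(os.listdir(directory))
    except FileNotFoundError:
        # No directory at all means no consumer ever registered.
        return []
```

One caveat with this approach: a file left behind by a hub that died would still be listed, so a check would need some way to detect stale entries.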

Fedora Infrastructure member

Add some sort of status service listening to a specific port that can be queried by nagios/etc to get consumer/producer stats

@lmacken, I think I like this the most. It could be implemented in any of several ways:

  • A unix socket on the filesystem
  • A tcp socket
  • A zeromq ipc REQ/REP socket.
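
As a rough illustration of the first option, the sketch below serves a JSON list of consumer names over a unix socket and reads it back. It uses only the Python standard library and is not how moksha or fedmsg implement anything; the function names, the JSON shape, and the one-shot server are all assumptions made for the example.

```python
import json
import os
import socket
import tempfile
import threading


def serve_status_once(server_sock, consumers):
    """Accept one connection and send the loaded consumer names as JSON."""
    conn, _ = server_sock.accept()
    with conn:
        payload = json.dumps({"consumers": sorted(consumers)})
        conn.sendall(payload.encode("utf-8"))
    # Leaving the `with` block closes the connection, which signals EOF.


def query_status(path):
    """Connect to the status socket at `path` and return the decoded reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


def demo():
    """Run a one-shot server in a thread and query it from this thread."""
    path = os.path.join(tempfile.mkdtemp(), "status.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    worker = threading.Thread(
        target=serve_status_once, args=(server, ["BadgesConsumer"])
    )
    worker.start()
    try:
        return query_status(path)
    finally:
        worker.join()
        server.close()
        os.unlink(path)
```

A tcp socket or a zeromq REQ/REP pair would follow the same request and reply shape; the unix socket has the side benefit that filesystem permissions control who can query it.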

It would probably fit best as a new feature for moksha instead of just fedmsg, no?

Is there any reason not to exit if any of the consumer/producers crash?

@p3ck, a third party package might install a consumer on the system. If that package has a bug, you don't want it to stop your consumers from running at all.

We ran into this early on with fedmsg, where the old-old fedoracommunity webapp had installed an unused moksha consumer on our app servers. When we tried installing fedmsg for the very first time, the hubs would crash on startup when they loaded that consumer, which we never intended to run.
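
The trade-off described here (keep going when one consumer fails, but remember which ones failed) can be sketched as follows. This is a hypothetical illustration, not moksha's actual loading code; `load_consumers` and the shape of `factories` are assumptions.

```python
def load_consumers(factories):
    """Instantiate each consumer, isolating failures from one another.

    `factories` maps a consumer name to a zero-argument callable that
    builds it. Returns (loaded, failed): `loaded` maps names to instances,
    `failed` maps names to a description of the exception raised.
    """
    loaded, failed = {}, {}
    for name, factory in factories.items():
        try:
            loaded[name] = factory()
        except Exception as exc:
            # A broken third-party consumer must not stop the others.
            failed[name] = repr(exc)
    return loaded, failed
```

Keeping the `failed` mapping around is what makes a status query useful: the hub stays up, and a check can still tell that a consumer it cares about never loaded.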

Fedora Infrastructure member

There is progress on this in mokshaproject/moksha#11. Yay!

Fedora Infrastructure member

This is all done and deployed.

@ralphbean ralphbean closed this May 12, 2014