{"spm":{"readme":"[Scalable Performance Monitoring\n(SPM)]( is the enterprise-class,\ncloud-based System/OS, JVM, and Application Performance Monitoring\nSaaS.\n\nNotable features:\n* rich graphs and charts\n* slice and dice by time, server, and one or more application-specific\n dimensions at once\n* averages, sums, min, max, etc.\n* integrated alerts\n* integrated email report subscriptions\n* no loss of precision/granularity over time\n* multiple choices of metric granularity\n\nCurrently, SPM has built-in detailed monitoring for:\n* [Apache Solr](\n* [ElasticSearch](\n* [SenseiDB](\n* [HBase](\n\n-\n-\n","git":""},"ganglia":{"readme":"Ganglia is a scalable distributed monitoring system for high-performance \ncomputing systems such as clusters and Grids. It is based on a hierarchical \ndesign targeted at federations of clusters. It leverages widely used \ntechnologies such as XML for data representation, XDR for compact, portable \ndata transport, and RRDtool for data storage and visualization. It uses \ncarefully engineered data structures and algorithms to achieve very low \nper-node overheads and high concurrency. The implementation is robust, has been \nported to an extensive set of operating systems and processor architectures, \nand is currently in use on thousands of clusters around the world. It has been \nused to link clusters across university campuses and around the world and can \nscale to handle clusters with 2000 nodes.\n\nGanglia is a BSD-licensed open-source project that grew out of the University \nof California, Berkeley Millennium Project which was initially funded in large \npart by the National Partnership for Advanced Computational Infrastructure \n(NPACI) and National Science Foundation RI Award EIA-9802069. NPACI is funded \nby the National Science Foundation and strives to advance science by creating a \nubiquitous, continuous, and pervasive national computational infrastructure: \nthe Grid. 
Current support comes from Planet Lab: an open platform for \ndeveloping, deploying, and accessing planetary-scale services.\n\n\n","git":""},"rivermuse":{"readme":"\nOnce touted as the ultimate opensource central event management and logging console.\nused to be vapourware for a long time .. \n\n\n\nRiverMuse CORE is an Open Source Fault and Event Management platform.\n\n\n\n\n\n\n\n","git":""},"mod-log-firstbyte":{"readme":"Apache 2.x module that measures time to first byte and exposes this as a new\nlog format string. Code originally developed at Google and largely abandoned now.\n\nBe nice to get some more traction on this and even get it built into the official\nApache release.\n\n\n","git":""},"redis-cluster-monitor":{"readme":"# redis-cluster-monitor\n\n---\n\nUPDATE: please note that this is just a toy. A proper \"redis-cluster\" is on the\nofficial Redis roadmap. There's also the [twine\nproject]( to consider.\n\n---\n\nRedis supports master-slave (1:N) replication but does not support *automatic*\nfailover. That is, if the master \"goes down\" for any reason, your sysadmin\n(read: you) has to reconfigure one of the Redis slaves to be the new master.\n\nOne could use monit or God or whatever alongside redis-cli to check if a host is\nup, then send the other hosts a SLAVEOF command to reconfigure the cluster\naround a new master.\n\nI created a Python script that does this instead. It only took an hour to do so\nno big loss. OK, I didn't think of using redis-cli initially :)\n\nPerhaps this project could at least be a part of the infrastructure (perhaps in\nconcept only) for the forthcoming \"redis-cluster\" project.\n\nThe [original mailing list thread](\ndiscusses the origins of this project a bit.\n\n## Requirements\n\nYou must have already installed the Redis Python client.\n\nA cluster of Redis instances :)\n\n## Usage\n\n1. Configure your redis cluster with one master and 0 or more slaves.\n1. 
Run on a host outside this cluster:\n\n python ip1:port ip2:port ... ipN:port\n\n## Example\n\nHere's a sample cluster running entirely on the same host for simplicity's sake.\n\n $ egrep '^(port|slaveof) ' *.conf\n redis-master.conf:port 6379\n redis-slave1.conf:port 6380\n redis-slave1.conf:slaveof 6379\n redis-slave2.conf:port 6381\n redis-slave2.conf:slaveof 6379\n\nFire up the master:\n\n $ ./redis-server redis-master.conf \n 29 Oct 20:45:27 - Server started, Redis version 1.050\n 29 Oct 20:45:27 - DB loaded from disk\n 29 Oct 20:45:27 - The server is now ready to accept connections on port 6379\n\nFire up the first slave:\n\n $ ./redis-server redis-slave1.conf \n 29 Oct 20:45:47 - Server started, Redis version 1.050\n 29 Oct 20:45:47 - DB loaded from disk\n 29 Oct 20:45:47 - The server is now ready to accept connections on port 6380\n 29 Oct 20:45:48 . DB 0: 2 keys (0 volatile) in 4 slots HT.\n 29 Oct 20:45:48 . 0 clients connected (0 slaves), 3280 bytes in use, 0 shared objects\n 29 Oct 20:45:48 - Connecting to MASTER...\n 29 Oct 20:45:49 - Receiving 35 bytes data dump from MASTER\n 29 Oct 20:45:49 - MASTER <-> SLAVE sync succeeded\n\nFire up the second slave:\n\n $ ./redis-server redis-slave2.conf \n 29 Oct 20:46:15 - Server started, Redis version 1.050\n 29 Oct 20:46:15 - DB loaded from disk\n 29 Oct 20:46:15 - The server is now ready to accept connections on port 6381\n 29 Oct 20:46:16 . DB 0: 2 keys (0 volatile) in 4 slots HT.\n 29 Oct 20:46:16 . 
0 clients connected (0 slaves), 3280 bytes in use, 0 shared objects\n 29 Oct 20:46:16 - Connecting to MASTER...\n 29 Oct 20:46:16 - Receiving 35 bytes data dump from MASTER\n 29 Oct 20:46:16 - MASTER <-> SLAVE sync succeeded\n\nFire up the monitor which will auto-determine the role of each host in the cluster.\nWatch as (by default) every 5 seconds it checks on the cluster:\n\n $ python\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n \nKill the master on port 6379. Watch the monitor's output:\n\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n INFO:redis-cluster-monitor:2 slave(s) with as master.\n WARNING:redis-cluster-monitor:... is down!\n INFO:redis-cluster-monitor:picking new master from slaves:,\n INFO:redis-cluster-monitor:master is offline; picked new master to be\n INFO:redis-cluster-monitor:commiting MASTER status for\n INFO:redis-cluster-monitor:making existing slave a slave of\n INFO:redis-cluster-monitor:commiting SLAVE status for as slave of\n INFO:redis-cluster-monitor:1 slave(s) with as master.\n WARNING:redis-cluster-monitor:... is down!\n INFO:redis-cluster-monitor:1 slave(s) with as master.\n ...\n WARNING:redis-cluster-monitor:... is down!\n INFO:redis-cluster-monitor:1 slave(s) with as master.\n\n## Issues?\n\nPlease let me know if you find any issues. \n\nThere's likely to be some corner cases as I just started this project.\n\n## Hang on a second!\n\nRight, you might be wondering how this is useful since you still have a single\nwrite-master and N read-slaves in a cluster but most if not all Redis client\nlibraries require a single host:port to connect to. 
If the master goes down in\nthe cluster another will be brought up but your client will have no idea that\nthis happened.\n\nExactly.\n\nSo, this isn't useful *yet* in practice. You need a smarter client library that\nis \"cluster-aware\". I'll be patching the Python Redis client to this end soon.\nThe long term goal is a redis-cluster project where all these smarts will live.\n\n## License\n\nMIT\n\n## Copyright\n\nCopyright (C) 2009 Fictorial LLC.\n","git":""},"zabbix":{"readme":"Zabbix is an enterprise-class open source distributed monitoring solution.\n\n\n","git":""},"datadog":{"readme":"Datadog is a cloud-based service that brings into one convenient place metrics and events from your systems, applications and cloud providers.\n\n\n","git":""},"cucumber-nagios":{"readme":"cucumber-nagios allows you to write high-level behavioural tests of web application, and plug the results into Nagios.\n\n","git":"git://"},"etsy-nagios_tools":{"readme":"Nagios tools from the good folks at Etsy. For example, \"check_graphite\"\n\n\n","git":""},"nagios":{"readme":"Nagios is a popular open source computer system and network monitoring software \napplication. It watches hosts and services, alerting users when things go wrong \nand again when they get better.\n\n\n","git":"git://"},"cepmon":{"readme":"Send your graphite metric stream through the powerful Esper CEP engine for better real-time data analysis.\n\n\n\n","git":""},"critical":{"readme":"Critical is my take on network/infrastructure monitoring. Here are the big\nideas:\n\n* Infrastructure as code: The monitoring system should be an internal DSL so it\n can natively interact with any part of your infrastructure you can find or\n write a library for. You should also be able to productively alter its guts if\n you need to. 
This is a monitoring system for ops people who write code and\n coders who do ops.\n* Client-based: This scales better, and is actually easier to configure if you\n use configuration management, which you should be doing anyway.\n* Continuous verification: Critical has a single shot mode in\n addition to the typical daemonized operation. This allows you to verify the\n configuration on a host after making changes and then continuously monitor the\n state of the system using the same verification tests.\n* Declarative: Declare what the state of your system is supposed to be.\n* Alerting and Trending together: a client/agent can do both of these at the\n same time with less configuration overhead. It makes sense to keep them\n separate on the server side.\n* Licensing: \"Do what thou wilt shall be the whole of the law,\" except for\n patent trolls, etc. So, Apache 2.0 it is.\n \n","git":""},"molog":{"readme":"\nMolog : \n\n\n\nBuild a scalable monitor and search solution for application and system logs with Nagios.\n\nMolog is processing layer for a scalable logging infrastructure which consumes LogStash processed logs from RabbitMQ and sends updates to Nagios and ElasticSearch.\n\n* A stand alone daemon with configurable parallel workers.\n* Workers consume LogStash processed messages from RabbitMQ and perform ignore regex matching on them.\n* Regex rules can be applied to all LogStash generated fields.\n* Sends Nagios passive check results back to RabbitMQ.\n* Nagios check results can be consumed from RabbitMQ into Nagios using Krolyk.\n* Forwards and stores all messages to ElasticSearch.\n* Only stores references to ElasticSearch records in a MongoDB instance.\n* Provides REST API for querying and manipulating references and regexes.\n* Includes molog_cli an interactive REST client to manipulate records, matches and regexes. \n\nMoLog is written in Python. 
\n","git":""},"servo":{"readme":"[link](\n\nThe goal of Servo is to provide a simple interface for exposing and publishing application metrics in Java.\n\nThe primary requirements are:\n\n+ **Leverage JMX**: JMX is the standard monitoring interface for Java and can be queried by many existing tools.\n+ **Keep it simple**: It should be trivial to expose metrics and publish metrics without having to write lots of code such as \"MBean interfaces\":\n+ **Flexible publishing**: Once metrics are exposed, it should be easy to regularly poll the metrics and make them available for internal reporting systems, logs, and services like \"Amazon's CloudWatch\":\n","git":""},"jmxtrans":{"readme":"# jmxtrans\nInput: JMX\nOutput: Statsd, Graphite, Ganglia, RRD\n\nSort of a logstash for JMX metrics. Uses JSON for config files:\n\n[](Project on github)\n","git":""},"byteman":{"readme":"Byteman is a tool which simplifies tracing and testing of Java programs. Byteman allows you to insert extra Java code into your application, either as it is loaded during JVM startup or even after it has already started running. The injected code is allowed to access any of your data and call any application methods, including where they are private. You can inject code almost anywhere you want and there is no need to prepare the original source code in advance nor do you have to recompile, repackage or redeploy your application. In fact you can remove injected code and reinstall different code while the application continues to execute.\n\nThe simplest use of Byteman is to install code which traces what your application is doing. This can be used for monitoring or debugging live deployments as well as for instrumenting code under test so that you can be sure it has operated correctly. By injecting code at very specific locations you can avoid the overheads which often arise when you switch on debug or product trace. 
Also, you decide what to trace when you run your application rather than when you write it so you don't need 100% hindsight to be able to obtain the information you need.\n\n[link](\n","git":""},"esper":{"readme":"Esper is a component for complex event processing (CEP), available \nfor Java as Esper, and for .NET as NEsper.\n\nEsper and NEsper enable rapid development of applications that \nprocess large volumes of incoming messages or events. Esper and NEsper \nfilter and analyze events in various ways, and respond to conditions\nof interest in real-time.\n\n[tutorial](\n[related](\n\n","git":""},"overwatch":{"readme":"Overwatch is a monitoring application designed for flexibility in all aspects,\nfrom how data is collected to the way notifications are handled.\n\n\n","git":""},"response":{"readme":"Response - Monitoring doesn't have to suck.\n\nResponse is a simple Graphite proxy with pluggable alerting support. Its first priority is to deliver messages to Graphite rapidly. In the event that Graphite goes down, messages will be buffered until Graphite is back. Eventually Redis will be optionally supported to provide arbitrary levels of durability/persistence.\n\n\n\n","git":""},"nervous":{"readme":"Nervous - Monitoring doesn't have to suck.\nAbout Nervous\n\nNervous is a simple plugin-based monitoring system with support for sending data to Graphite or Response. Nervous makes it really easy to get data into Graphite. \n\n\n","git":""},"shinken":{"readme":"\n\nNagios rewrite in Python. Lean and clean code base that is actually maintainable.\n\nDistributed architecture based on Pyro. Scalable while retaining the configuration aspects of Nagios.\n\nCan act as a conduit to send performance data to Graphite and PNP4Nagios. 
Graphite frontend integration.\n\nModular nature permits integration and flexibility in any part of the system.\n\n\n\n\n\n\n","git":""},"folsom":{"readme":"Erlang application metrics gathering library.\n\n\n","git":""},"hobbit":{"readme":"\nAn advanced version of Big Brother \n\n\n\n\n\nAs of 2010-07-09, this project may now be found at\n","git":""},"nagios-api":{"readme":"A REST-like, JSON interface to Nagios complete with CLI interface\n\n[link](\n","git":""},"cloudkick-plugins":{"readme":"[link](\n[homepage](\n\nA collection of custom plugins for the Cloudkick agent and Monitoring-As-A-Service provider. Easily adapted to any number of monitoring tools.\n","git":""},"boundary":{"readme":"[product page](\n\n[github team page](\n\nBoundary is a network-monitoring-as-a-service platform providing 1s status updates and real-time aggregation of bandwidth by port and IP. \n\nCurrently in private beta.\n","git":""},"isotope11_traffic_lights":{"readme":"## Monitoring your Continuous Integration Server with Traffic Lights and an Arduino\n### A way for Isotope11 to visually monitor the testing status of its many software projects.\nToday I am going to walk through our recent continuous integration Traffic light\nnotifier project that we just finished at the office. This project stemmed from\nmy company's desire to immediately know if a developer has broken a software\nproject, and what better way to do that than to have a huge red light flashing\nin your face. We connected an old salvaged traffic light fixture to our Jenkins\nCI-server that monitors the testing status of all of our current software\nprojects. If all our tests are passing, the light stays green; if any test fails,\nthe light turns red to provide a visual notification of a problem. 
While Jenkins\nis running a test suite on any project, the yellow light will flash to let us\nknow of the activity.\n\n<iframe width=\"560\" height=\"315\" src=\"\" frameborder=\"0\" allowfullscreen></iframe>\n\nSo how does one connect a 48” tall traffic light to a continuous integration\nserver? With a Ruby script, an Arduino, and a few relays of course.\n\n<a href=\"\" title=\"Traffic Light by knewter, on Flickr\"><img src=\"\" width=\"375\" height=\"500\" alt=\"Traffic Light\"></a>\n\nThe Ruby code will create a serial connection with the Arduino to send data,\nthen create a web connection with the CI server to request the build status data\nvia our CI server's built in API. A quick look through the returned data will\ngive us a chance to see if there are any problems – if so, we'll send a signal\nto the Arduino to change the light status, otherwise it stays green. The Ruby\nscript requires 3 gem dependencies to run: faraday, json, and serialport – all\navailable from (eg. `gem install faraday`).\n\n # Isotope11 continous integration server Traffic-light\n # Ruby script to monitor json output from Jenkins CI-server, and output the status of projects to a Traffic-light. \n # If all builds are passing, the light is green. \n # If a job is currently building, the yellow light flashes. \n # If any job is failing, the red light is turned on and green turned off.\n require \"serialport\"\n require \"json\"\n require \"faraday\"\n require \"net/http\"\n\n # create a new Serial port for the Arduino Uno which uses port /dev/ttyACM0. 
Older Arduinos should use /dev/ttyUSB0\n sp = SerialPort.new(\"/dev/ttyACM0\", 9600)\n\n # wait for connection\n sleep(1)\n\n # create a new Faraday connection with the Jenkins server to read the status of each job\n conn = Faraday.new('')\n puts 'go to loop'\n\n loop do\n begin\n # grab the json from the jenkins api\n response = conn.get('/api/json')\n # parse the response into a list of jobs that are being monitored\n jobs = JSON.parse(response.body)[\"jobs\"]\n\n # search each job to see if it contains either \"anime\" (building) or \"red\" (failing)\n should_blink = jobs.detect{|j| j[\"color\"] =~ /anime/ }\n should_red = jobs.detect{|j| j[\"color\"] =~ /red/ }\n rescue\n # if no response, assume server is down – turn on Red and Yellow lights solid\n server_down = true\n end\n\n # check results of job colors\n if should_blink\n # something is building... flash yellow light!\n puts \"Something is building... flash yellow light!\"\n sp.write(\"1\")\n else\n # nothing is building... turn yellow light Off.\n #sp.write(\"2\")\n end\n\n if should_red\n # something is red... turn On red light!\n puts \"Something is broken... turn On red light!\"\n sp.write(\"3\")\n else\n # nothing is red... turn On green light.\n sp.write(\"4\")\n end\n\n if server_down\n sp.write(\"5\")\n end\n\n # wait 5 seconds\n sleep(5)\n end\n\n # close serial data line\n sp.close\n\nThe Arduino board is fitted inside of the traffic light housing, and mounts to a\nperforated prototyping board from Radio Shack using some male-pin headers. Above\nthe Arduino are two small PC-mount relays capable of switching up to 1 amp at\n120 VAC – perfect for some low-wattage light bulbs. 
The relay coils are\ncontrolled using a 5v signal, and only consume about 90mA at that voltage level,\nso we can use the Arduino's onboard 5v regulator to power the relay coils.\nUnfortunately, we cannot simply drive the relays directly from an Arduino pin\nbecause it can only supply around 40mA per pin and the inductive switching\nproperties present in a relay might cause damage to the Arduino. Instead, we can\nuse 2 small N-type signal transistors (either bjt or mosfet) to interface\nbetween each relay and the Arduino output pin. Building the relay board might\nrequire some hands-on tinkering, but is a rewarding task when complete (circuit\nschematic file included).\n\n<a href=\"\" title=\"Arduino mounted inside Traffic Light by knewter, on Flickr\"><img src=\"\" width=\"500\" height=\"375\" alt=\"Arduino mounted inside Traffic Light\"></a>\n<a href=\"\" title=\"traffic_light schematic by knewter, on Flickr\"><img src=\"\" width=\"500\" height=\"368\" alt=\"traffic_light schematic\"></a>\n\nThe Arduino code is simple, basically listening on the serial port for 1 of\nabout 5 signals. If the Arduino detects a recognized serial byte, it will carry\nout a function to control the Traffic lights - there is no extra fluff, just\nwhat is needed. 
If you are having trouble locating an old Traffic light, or\nwould like to build a smaller desktop version of the notifier, you can do so\nwith only an Arduino and a few LEDs (red, yellow, and green) - you don't even\nhave to solder anything!\n\n<a href=\"\" title=\"Poor man's traffic light by knewter, on Flickr\"><img src=\"\" width=\"500\" height=\"281\" alt=\"Poor man's traffic light\"></a>\n\n // Isotope11 CI-server traffic light\n // Arduino Uno with 2 relays (SPDT) attached to pins 4 and 7\n // 2-9-12\n\n // declare variables and output pins:\n int inByte; // create a variable to hold the serial input byte\n long lastTx = 0; // create a “long” variable type to hold the millisecond timer value\n int yellow_light = 4; // create an output variable attached to pin 4\n int red_green_light = 7; // create an output variable attached to pin 7\n\n void setup() {\n Serial.begin(9600); // start Arduino serial monitor at 9600bps\n pinMode(yellow_light, OUTPUT); // set up pin 4 as an output\n pinMode(red_green_light, OUTPUT); // set up pin 7 as an output\n }\n\n void loop() {\n // check serial buffer\n if (Serial.available() > 0){\n inByte = Serial.read(); // read serial byte\n Serial.println(inByte); // print serial byte\n lastTx = millis(); // set the lastTx time-stamp variable equal to the current system timer value\n\n // the serial byte values “49” - “53” are the ASCII codes received when the numeric keys “1” – “5” are pressed on the keyboard.\n switch(inByte){\n case 49: // if serial value received is \"49\" (number 1), blink yellow light\n digitalWrite(yellow_light, HIGH);\n delay(1000);\n digitalWrite(yellow_light, LOW);\n break;\n case 50: // if serial value is \"50\" (number 2), turn yellow light off\n digitalWrite(yellow_light, LOW);\n break;\n case 51: // if serial value is \"51\" (number 3), turn red light on (green off)\n digitalWrite(red_green_light, HIGH);\n break;\n case 52: // if serial value is \"52\" (number 4), turn green light on (red off)\n digitalWrite(red_green_light, LOW);\n 
break;\n case 53: // if serial value is \"53\" (number 5), turn green and yellow lights on solid (api error)\n digitalWrite(red_green_light, LOW);\n digitalWrite(yellow_light, HIGH); \n } \n }\n else {\n if ((millis() - lastTx) > 10000) {\n // it has been more than 10 seconds (10000 milliseconds) since any serial information has been received\n // assume there is a break in the PC connection, and turn red and yellow lights on solid.\n digitalWrite(red_green_light, HIGH);\n digitalWrite(yellow_light, HIGH);\n }\n }\n }\n\n\n### Repo\n[The github repository is here.](\n\n### Parts list:\n1. an old Traffic light\n2. Arduino Uno, Radio Shack part # 276-128 - $34.99\n3. PC prototyping board, Radio Shack part #276-168 - $3.19\n4. (2) PC pin relays, Radio Shack part #275-240 - $4.69 ea\n5. (2) NPN transistors or mosfets, Radio Shack part #276-2016 - $1.19 ea\n6. (2) 10kohm resistors, Radio Shack part #271-1335 - $1.19 pk\n7. (2) 100kohm resistors, Radio Shack part #271-1347 - $1.19 pk\n8. (20) male-pin breakaway headers, Sparkfun part#PRT-00116 - $1.50\n9. (4) 8mm bolts, 1” long (with nuts) - $1.00\n\n### Tools needed:\n1. wire/wire snips\n2. solder/soldering iron\n3. drill/drill bit\n","git":""},"vacuumetrix":{"readme":"[link](\n\nSucks up metrics from various external sources and puts the data into internal systems like Graphite and Ganglia.\n\n\n","git":""},"javasimon":{"readme":"Java Simon is a simple monitoring API that allows you to follow and better understand your application. Monitors (familiarly called Simons) are placed directly into your code and you can choose whether you want to count something or measure time/duration.\n\n[link](\n","git":""},"logster":{"readme":"Logster is a utility for reading log files and generating metrics in Graphite or \nGanglia. It is ideal for visualizing trends of events that are occurring in your \napplication/system/error logs. 
For example, you might use logster to graph the \nnumber of occurrences of HTTP response code that appears in your web server logs.\n","git":""},"metrics":{"readme":"A Java application metrics gathering library.\n\n\n\n","git":""},"":{"readme":"# Tool Repo\n===========\nThis repository serves as a sort of master repository of various tools that people have come across.\n\nThe format for structure should be as follows:\n\n\ttop-level repo -\n\t\t\tproject_name -\n\t\t\t\t (review/information)\n\t\t\t\t git submodule to repo (if appropriate)\n\nPlease do not put any actual code in here.\n\n\n## Example\n\nAdding a repo:\n\n\tmkdir project_name\n\tgit submodule add project_name/repo\n\techo \"[link](\" > project_name/\n\tgit add project_name\n\tgit commit -am 'added project_name'\n","git":""},"munin":{"readme":"Munin is a networked resource monitoring tool that can help analyze resource \ntrends and \"what just happened to kill our performance?\" problems. It is \ndesigned to be very plug and play. A default installation provides a lot of \ngraphs with almost no work.\n\n\n","git":""},"collectd-graphite":{"readme":"This plugin acts as bridge between collectd's huge base of \navailable plugins and graphite's excellent graphing capabilities. \nIt sends collectd data directly to your graphite server.\n\n","git":""},"cube":{"readme":"# Cube\n\n**Cube** is a system for collecting timestamped events and deriving metrics. By collecting events rather than metrics, Cube lets you compute aggregate statistics *post hoc*. It also enables richer analysis, such as quantiles and histograms of arbitrary event sets. Cube is built on [MongoDB]( and available under the [Apache License](/square/cube/blob/master/LICENSE).\n\n[See the wiki.](/square/cube/wiki)\n","git":""},"porkchop":{"readme":"Porkchop is a simple network-based system information server. 
You write plugins for it and it responds with the data based on your request.\n\nCheck the readme for more info.\n\n[link](\n","git":""},"graylog2":{"readme":"[homepage](\n\n[server source](\n\n[webui source](\n\nGraylog2 is a log-storing framework using a MongoDB backend and a very nice UI for filtering and search.\n\nTaulia is hosting a live Graylog2 [demo]( Log in with the user admin or user and the password graylog2\n","git":""},"groundwork":{"readme":"\n\n\nGroundWork \n\nBuilt on top of Nagios, Cacti and other tools\n\n@botchagalupe once coined the term \"To pull a GroundWork\", meaning taking an Open Source project, building a wrapper around it, then selling it off as your own.\n\n\n","git":""},"extopus":{"readme":"Tobi Oetiker is working on a new Monitoring Aggregator \n\n\n\n \n According to the home page: \n\n \"Extopus is an aggregating frontend to monitoring systems. Its plug-in architecture provides an easy route to integrating output from a wide array of monitoring systems into a single instance of Extopus.\n\nIntegration can range from simple iframe-ing a particular page from another monitoring system to accessing data from another system using rpc calls and displaying the results in a custom presentation plugin inside the Extopus frontend.\n\nThe Extopus backend is written in Perl (using Mojolicious) and the frontend is written in javascript (with Qooxdoo). 
This architecture provides a high level of implementation flexibility and allowed us to create a highly interactive end user experience.\n\nWhether you have a small setup with a few hundred or a large one with millions of items, Extopus will provide a user friendly interface to accessing your monitoring data.\"\n","git":""},"statsd":{"readme":"A network daemon for aggregating statistics (counters and timers), rolling them \nup, then sending them to Graphite.\n\n-\n-\n\nThere is a wide variety of statsd-compatible servers written in almost every language.\nFor additional statsd server implementations, see here:\n","git":""},"parfait":{"readme":"Parfait is a performance monitoring library for Java which provides mechanisms for collecting counter and timing metrics, then exposing them through a variety of mechanisms including JMX beans and the open-source cross-platform [Performance Co-Pilot](\n\n\n[link](\n","git":""},"collectd":{"readme":"collectd gathers statistics about the system it is running on and stores \nthis information. Those statistics can then be used to find current\nperformance bottlenecks (i.e. performance analysis) and predict future \nsystem load (i.e. capacity planning). Or if you just want pretty graphs \nof your private server and are fed up with some homegrown solution you're\nat the right place, too ;).\n\n","git":""},"rocksteady":{"readme":"Rocksteady is a Java application that reads metrics from RabbitMQ, parses them, and turns them into events so that Esper (CEP) can query those metrics and react to events matched by the query.\n\nRocksteady is an effort to use a complex event processing engine to analyze user-defined metrics. The end goal is to derive root-cause conclusions from metric-driven events. 
Rocksteady is only the metric analysis part of the whole picture, but we also present a solution including metric convention, metric sending, load balancing, and graphing that work well for us.\n\n\n","git":""},"graphite":{"readme":"Graphite is an enterprise-scale monitoring tool that runs well on cheap \nhardware. It was originally designed and written by Chris Davis at Orbitz in \n2006 as side project that ultimately grew to be a foundational monitoring tool. \nIn 2008, Orbitz allowed Graphite to be released under the open source Apache \n2.0 license. Since then Chris has continued to work on Graphite and has \ndeployed it at other companies including Sears, where it serves as a pillar of \nthe e-commerce monitoring system. Today many large companies use it.\n\n\n","git":""},"xymon":{"readme":"\n\n\nXymon is a system for monitoring of hosts and networks, inspired by the Big Brother system. It provides real-time monitoring, an easy web-interface, historical data, availability reports and performance graphs. Xymon was previously known as \"Hobbit\"\n\n\n\n\n","git":""},"logstash":{"readme":"[link](\ncollects, parses, stores logs.\n\nCreated by Jordan Sissel (@jordansissel) who now works at Loggly.\n","git":""},"remote_syslog":{"readme":"[link](\n","git":""},"bigbrother":{"readme":"\nThere isn't anything out there that is more oldschool than Big Brother\n\nThe blinking green , red and orange lights are in the mind of every senior unix person.\n\n\n\n\n","git":""},"stajistics":{"readme":"Stajistics is a free monitoring and runtime performance statistics collection API for Java.\n\n[link](\n","git":""},"rackspace-cloud-monitoring":{"readme":"## Rackspace Cloud Monitoring\n\nRackspace Cloud Monitoring analyzes cloud services and dedicated infrastructure\nusing a simple, yet powerful API. The API currently includes monitoring for\nexternal services. 
The key benefits you receive from using this API include the\nfollowing:\n\n### Use of Domain Specific Language (DSL)\nThe Rackspace Cloud Monitoring API uses a DSL, which makes it a powerful tool\nfor configuring advanced monitoring features. For example, typically complex\ntasks, such as defining triggers on thresholds for metrics or performing an\ninverse string match, become much easier with a concise, special-purpose language\ncreated for defining alarms. For more information, see Alarms.\n\n### Monitoring from Multiple Datacenters\nRackspace Cloud Monitoring allows you to simultaneously monitor the performance\nof different resources from multiple datacenters and provides a clear picture of\noverall system health. It includes tunable parameters to interpret mixed results\nwhich help you to create deliberate and accurate alerting policies. See Alert\nPolicies for more information.\n\n### Alarms and Notifications\nWhen an alarm occurs on a monitored resource, Rackspace Cloud Monitoring sends\nyou a notification so that you can take the appropriate action to either prevent\nan adverse situation from occurring or rectify a situation that has already\noccurred. These notifications are sent based on the severity of the alert as\ndefined in the notification plan.\n\n## Links\n\n[Documentation](\n\n## Contact\n\n* IRC - #cloudmonitoring @ Freenode\n* Email - cmbeta [at] rackspace [dot] com\n","git":""},"epicnms":{"readme":"Monitoring framework for time-based numeric data.\n\n[link](\n","git":""},"flapjack":{"readme":"Flapjack is a scalable and distributed monitoring system. It natively\ntalks the Nagios plugin format, and can easily be scaled from\n1 server to 1000.\n\nFlapjack tries to adhere to the following tenets:\n\n* it should be simple to set up, configure, and maintain\n* it should easily scale from a single host to multiple\n\n","git":""},"nagios-dashboard":{"readme":"","git":""},"clockwork":{"readme":"Clockwork is a cron replacement. 
It runs as a lightweight, long-running Ruby \nprocess which sits alongside your web processes (Mongrel/Thin) and your \nworker processes (DJ/Resque/Minion/Stalker) to schedule recurring work at \nparticular times or dates. For example, refreshing feeds on an hourly basis,\nsending reminder emails on a nightly basis, or generating invoices once a\nmonth on the 1st.\n\n","git":""},"moncli":{"readme":"\n\n\nMoncli is a generic MONitoring CLIent which executes and processes requests on an external system in order to interact with the host's local information sources, which are normally not available over the network. Once Moncli has executed and evaluated the request, it submits the check results back into the message broker infrastructure, where the results are ready to be consumed by another process.\n","git":""},"opentsdb":{"readme":"OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on \ntop of HBase. OpenTSDB was written to address a common need: store, index\nand serve metrics collected from computer systems (network gear, \noperating systems, applications) at a large scale, and make this data \neasily accessible and graphable.\n\nThanks to HBase's scalability, OpenTSDB allows you to collect many thousands\nof metrics from thousands of hosts and applications, at a high rate (every \nfew seconds). OpenTSDB will never delete or downsample data and can easily \nstore billions of data points. As a matter of fact, StumbleUpon uses it to \nkeep track of hundreds of thousands of time series and collects over \n100 million data points per day in their main production cluster.\n\n\n","git":""},"resmon":{"readme":"Resmon is a lightweight utility for local host monitoring that can be queried\nby tools such as Nagios over HTTP. One of the main design goals is portability:\nResmon should require nothing more than a default install of Perl. 
Built\nwith the philosophy that \"we are smart because we are dumb,\" that is, local\nrequirements should be minimal to ease deployment on multiple platforms.\n\n\n","git":""},"cacti":{"readme":"\n","git":""},"disqus-nagios-plugins":{"readme":"[link](\n\nOnly check_graphite.rb for now.\n","git":""},"pandora-fms":{"readme":"\n\nPandora FMS is famous for being the guys that were throwing bottles of beer from the 2nd floor of the Roi d'Espagne during a FOSDEM Beer event.\n\nIt was also the last year that the Roi d'Espagne hosted the FOSDEM Beer event.\n\n\n\n\n","git":""},"riemann":{"readme":"A network event stream processor with three years of private production use. Clojure and protobufs.\n\n[Riemann](\n","git":""},"zenoss":{"readme":"\n\n","git":""},"sensu":{"readme":"[link]\n\nSensu is a Ruby-based monitoring framework using Redis for data storage\nand RabbitMQ for communication.\n\nCreated at by @portertech and released as open source in 2011.\n\n\n\n\n","git":""},"jolokia":{"readme":"Jolokia is remote JMX with JSON over HTTP.\n\nIt is fast, simple, polyglot and has unique features. It's JMX on Capsaicin.\n\n[link](\n","git":""},"boomerang":{"readme":"End user oriented web performance testing and beaconing\n","git":""},"metis":{"readme":"Metis\n=====\n\nMetis is an implementation of the Nagios NRPE daemon in Ruby. It provides an easy framework to write your own monitors in Ruby, have them running in a daemon, and distribute/share them with others.\n\nGoals\n-----\n\nMetis is built around the idea of:\n\n* **Monitors in Ruby** \n Why? Ruby is a great language with a rich ecosystem of users, gems/libraries, and culture of testing. The existing ecosystem of Nagios monitors has a lot of variance in what language they're in. Some are Bash, Python, Perl, etc. That is an awesome strength, but also means less commonality.\n* **Testable monitors** \n We test our applications, don't we? 
Our monitors are what are supposed to be making sure our applications and servers are running correctly. They should be tested too. And they should be distributed with tests as well. You're running a monitor from a 3rd party on all your servers... do you have full confidence it was written well and is bug free?\n* **Easy distribution of monitors** \n Nagios has a great community and tons of available monitors for you to grab. But grabbing monitors others have written can be hairy. They can have varying dependencies such as modules from CPAN in Perl, or EasyInstall in Python. If you don't know those languages, you can easily get confused. They have varying requirements, such as a check in Python requiring v2.7 while your OS release only has v2.6. Metis focuses on building in dependency handling and a framework to describe the configuration of the checks.\n* **Easy deployment** \n Metis works to cleanly separate monitor definition from configuration. It utilizes a simple Ruby DSL modeled after [Chef]( for configuration of monitor parameters (username/passwords, warning/critical thresholds) as well as the monitor definition itself. It also strives for easy integration with chef-server, so that the two can work hand-in-hand for self configuration.\n* **Making monitors simple** \n If you've ever written any of your own Nagios monitors, there can sometimes be a lot of setup. Beyond just performing the check, you might also need to parse command-line parameters, remember exit codes, and ensure the proper messages get propagated. It's wasted time and effort. Metis provides a quick and simple way to define the output of your monitor and returns the most important parts.\n\nInstallation\n------------\n\nInstalling Metis is a simple matter of running:\n\n```\ngem install metis\nmetis-server\n```\n\nBoom, you're up and ready... 
though you won't have any monitors defined.\n\n\nDefining Monitors\n-----------------\n\nMonitors are defined as `define` blocks containing configuration attributes and an `execute` block that defines what to actually do. The checks are defined in files under `checks/*.rb` from the working directory by default.\n\nA simple monitor might be:\n\n```ruby\ndefine :simple do\n  execute do\n    \"Hello World\"\n  end\nend\n```\n\nYou can set the result of the monitor using `critical()`, `warn()`, or `ok()`. By default, Metis assumes the monitor is OK and, if the `execute` block returns a string, sets it as the message.\n\n```ruby\ndefine :eod do\n  execute do\n    # Time.now.hour is the current hour of the day (0-23)\n    warn(\"Getting close to the end of the day\") if Time.now.hour >= 21\n    critical(\"Real close now!\") if Time.now.hour >= 23\n    ok(\"We're all good\")\n  end\nend\n```\n\nMonitors can define attributes that can be configured outside of the monitor logic itself using the `attribute` keyword. They are then accessible within the monitor through a `params` hash. For instance, to make a configurable warning/critical threshold:\n\n```ruby\ndefine :eod do\n  attribute :warning, :default => 21\n  attribute :critical, :default => 23\n  execute do\n    warn(\"Getting close to the end of the day\") if Time.now.hour >= params[:warning]\n    critical(\"Real close now!\") if Time.now.hour >= params[:critical]\n    ok(\"We're all good\")\n  end\nend\n```\n\nHow to set these will be covered in the next section.\n\nMonitors can also define external libraries or gems they might be dependent on using `require_gem`. These are only required when the monitor is triggered, return a critical result and message if not found, and will soon be installed as part of the deployment process.\n\n```ruby\ndefine :check_mysql do\n  require_gem 'mysql'\n  execute do\n    # Connect to mysql and query it\n  end\nend\n```\n\nConfiguring Monitors\n--------------------\n\nBy default, Metis will look for a `config.rb` file in the working directory that should contain all the extra configuration settings for monitors. 
Building on the `:eod` example from the last section, you could configure its alert thresholds using the `configure` block:\n\n```ruby\nconfigure :eod do\n  warning 21\n  critical 23\nend\n```\n\nIf you defined a more advanced monitor that required a username and password to connect to a resource, you could include all of those:\n\n```ruby\nconfigure :check_mysql do\n  username \"foo\"\n  password \"bar\"\n  port 3306\nend\n```\n\nTesting Monitors\n----------------\n\nHelpers for writing tests against your monitors will be coming soon.\n\n\nContributing\n------------\n\n* Fork the project.\n* Make your feature addition or bug fix.\n* Add tests for it. This is important.\n* Commit, do not mess with Rakefile, version, or history.\n (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)\n* Send me a pull request. Bonus points for topic branches.\n\nCopyright\n---------\n\nCopyright (c) 2011 Ken Robertson. See LICENSE for details.\n","git":""},"hyperic":{"readme":"\nHyperic HQ, \n\ngenerated a lot of buzz back in 2007.\nthey then got acquired by SpringSource \nwhich then got acquired by VMware\ngone mostly silent\n\n\n\n","git":""},"pencil":{"readme":"Graphite dashboard system\n\n\n","git":""},"opennms":{"readme":"OpenNMS is the world’s first enterprise-grade network management application\nplatform developed under the open source model.\n\n\n","git":""},"reconnoiter":{"readme":"Reconnoiter's goal is to better the world of monitoring by marrying fault \ndetection and trending together. Through ease of configuration and ongoing \nmaintenance, Reconnoiter encourages monitoring important technical metrics \nalongside critical business metrics to improve awareness and ultimately \naccountability.\n\n\n","git":""},"naglite2":{"readme":"Full screen Nagios viewer intended for NOC/monitoring screens\n","git":""},"fitb":{"readme":"FITB is a tool that automatically polls every port on a list of switches you give it. 
Simple configuration, precise polling, easy searching and automatic discovery of both new ports and ports that go offline are the goals of FITB. \n","git":"git://"},"extrememon":{"readme":"\n\n\n\nFrom the Extreme Monitoring Manifesto:\n\nLive, with Subsecond temporal resolution where possible, as fast as doesn’t disrupt service, elsewhere\nDisplay on a meaningful representation, and in real-time.\n Simple Text-based Internet-Friendly Subscription Push API \n Implicit Provisioning (Test-driven infrastructure)\n Agent push the data as it is gathered\n Hot-pluggable components\n Re-use as many ubiquitous technologies as possible\n Extremon-Display is an implementation of the first and third, 6th and 7th targets: \n Live, with Subsecond temporal resolution where possible,\n Simple Text-based Internet-Friendly Subscription Push API\n Hot-pluggable components\n and Re-use as many ubiquitous technologies as possible\n\nsee\n\nFor the second target, see the ExtreMon-Display project\n\n\n\n\n","git":""}}