Commit

Merge remote-tracking branch 'refs/remotes/naparuba/master' into WebUI_doc

Conflicts:
	doc/source/01_introduction/about.rst
mohierf committed Oct 22, 2015
2 parents 27e5d71 + a75b3c9 commit 15e029b
Showing 53 changed files with 1,542 additions and 223 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -9,6 +9,8 @@ INSTALLED_FILES
.settings
.gitignore
modules/*
.ropeproject
.coverage

# Logs and databases #
######################
28 changes: 28 additions & 0 deletions Changelog
@@ -2,6 +2,34 @@
Shinken ChangeLog
########################

2.4.2 - 01/10/2015
-------------------
CORE ENHANCEMENT
Add: Service excludes/includes/overrides extension
Add: Implementation of initial_state
Add: Broks when host/service downtime is scheduled
Add: Solaris SMF manifests
Enh: Allow white space between foreach elements
Enh: Increase default http thread_pool to 16

CORE FIXES
Fix: Duplicate Service from template using duplicate_foreach with the same name
Fix: Service_(includes|excludes) template recursion
Fix: Service description with multi level inheritance
Fix: Default business rule notification options

2.4.1 - 15/07/2015
-------------------
CORE ENHANCEMENT
Add: Safe Pickle
Add: Better Debian 8 Jessie support

CORE FIXES
Fix: Display_name when using duplicate_foreach
Fix: template definition loop did segfault python
Fix: Service Description inheritance when using several level of inheritance
Fix: cpu looping for receiver

2.4 - 04/05/2015
-------------------
CORE ENHANCEMENT
78 changes: 37 additions & 41 deletions doc/source/01_introduction/about.rst
@@ -5,44 +5,40 @@
About Shinken
==============

Shinken is an open source monitoring framework written in Python under the terms of the `GNU Affero General Public License`_ .
It was created in 2009 as a simple proof of concept of a `Nagios`_ patch. The first release of Shinken was the December 1st of 2009 as simple monitoring tool.
Since the 2.0 version (April 2014) Shinken is described as a monitoring framework due to its high number of modules.
For the same reason, modules are now in separated repositories. You can find some in the `shinken-monitoring organization's page`_ on Github
Shinken is an open source monitoring framework written in Python under the terms of the `GNU Affero General Public License`_. It was created in 2009 as a simple proof of concept of a `Nagios`_ patch. The first release of Shinken was the December 1st of 2009 as simple monitoring tool. Since the 2.0 version (April 2014) Shinken is described as a monitoring framework due to its high number of modules. For the same reason, modules are now in separated repositories. You can find some in the `shinken-monitoring organization's page`_ on Github.



Shinken Project
================
The Shinken Project
===================

Shinken is now an open source monitoring *framework* but was first created to be a open source monitoring *solution*.
Shinken is now an open source monitoring *framework*, but was first created to be an open source monitoring *solution*.
This difference is important for the team: a framework does not have the same use as an all-in-one solution.
The main idea when developing Shinken is the flexibility which is our definition of framework.
Nevertheless, Shinken was first made differently and we try to keep all the good things that made it a monitoring solution :
* Easy to install : install is mainly done with pip but some packages are available (deb / rpm) and we are planning to provide nightly build.
* Easy for new users : once installed, Shinken provide a simple command line interface to install new module and packs.
* Easy to migrate from Nagios : we want Nagios configuration and plugins to work in Shinken so that it is a "in place" replacement.
Plugins provide great flexibility and are a big legacy codebase to use. It would be a shame not to use all this community work
* Multi-platform : python is available in a lot of OS. We try to write generic code to keep this possible.
* Utf8 compliant : python is here to do that. For now Shinken is compatible with 2.6-2.7 version but python 3.X is even more character encoding friendly.
* Independent from other monitoring solution : our goal is to provide a modular *tool* that can integrate with others through standard interfaces). Flexibility first.
* Flexible : in an architecture point view. It is very close to our scalability wish. Cloud computing is make architecture moving a lot, we have to fit to it.
* Fun to code : python ensure good code readability. Adding code should not be a pain when developing.
The main idea when developing Shinken is the flexibility which is our definition of framework. Nevertheless, Shinken was first made differently and we try to keep all the good things that made it a monitoring solution:
* Easy to install: install is generally done with pip, but some packages are available (deb / rpm) and we are planning to provide nightly builds.
* Easy for new users: once installed, Shinken provides a simple command line interface to install new modules and packs.
* Easy to migrate from Nagios: we want Nagios configuration and plugins to work in Shinken, so that it is an "in place" replacement.
Plugins provide great flexibility and are a big legacy codebase to use. It would be a shame not to use all this community work.
* Multi-platform: Python is available for many Operating Systems. We try to write generic code to keep this possible.
* Utf8 compliant: Python is here to do that. Shinken is currently compatible with Python 2.6 and 2.7, but Python 3.X is even more character encoding friendly.
* Independent from other monitoring solutions: Our goal is to provide a modular *tool* that can integrate with others through standard interfaces. Flexibility first.
* Flexible: From an architecture point of view, scalability is our first design principle. Cloud computing makes architectures change a lot, and we have to adapt to it.
* Fun to code: Python ensures good code readability. Adding code should not be a pain when developing.

This is basically what Shinken is made of. Maybe add the "keep it simple" Linux principle and it's prefect. There is nothing we don't want, we consider every features / ideas.
This is basically what Shinken is made of. Maybe add the "keep it simple" Linux principle and it is perfect. There is nothing we don't want; we consider every feature and idea.


Features
=========

Shinken has a lot of features, we started to list some of them in the last paragraph. Let's go into details:
Shinken has a lot of features; we started to list some of them in the last paragraph. Let us go into detail:

* Role separated daemons : we want a daemon to do one thing but doing it good. There are 6 of them but one is not compulsory.
* Great flexibility : you didn't got that already? Shinken modules allow it to talk to almost everything you can imagine.
* Role separated daemons: we want a daemon to do one thing, and one thing well. There are 6 of them but one is not compulsory.
* Great flexibility: you didn't get that already? Shinken modules allow it to talk to almost everything you can imagine.

Those to points involve all the following :
These two points involve all the following:

* Data export to databases :
* Data export to databases:

* Graphite
* InfluxDB
@@ -54,7 +50,7 @@ Shinken has a lot of features, we started to list some of them in the last parag
* MySQL (NDO reimplementation)
* Oracle (NDO reimplementation)

* Integration with web user interface :
* Integration with web user interfaces:

* WebUI (Shinken own User Interface: https://github.com/shinken-monitoring/mod-webui/wiki)
* Thruk
@@ -66,7 +62,7 @@ Shinken has a lot of features, we started to list some of them in the last parag
* Centreon (With NDO, not fully working, not recommended)


* Import config from databases :
* Import config from databases:

* GLPI
* Amazon EC2
@@ -75,7 +71,7 @@ Shinken has a lot of features, we started to list some of them in the last parag
* Canonical Landscape


* Shinken provide sets of configuration, named packs, for a huge number of services :
* Shinken provides sets of configurations, named packs, for a huge number of services:

* Databases (Mysql, Oracle, MSSQL, memcached, mongodb, influxdb etc.)
* Routers, Switches (Cisco, Nortel, Procurve etc.)
@@ -85,36 +81,36 @@ Shinken has a lot of features, we started to list some of them in the last parag
* Application (Weblogic, Exchange, Active Directory, Tomcat, Asterisk, etc.)
* Storage (IBM-DS, Safekit, Hacmp, etc.)

* Smart SNMP polling : The SNMP Booster module is a must have if you have a huge infrastructure of routers and switches.
* Smart SNMP polling: The SNMP Booster module is a must have if you have a huge infrastructure of routers and switches.

* Scalability : no server overloading, you just have to install new daemons on another server and load balancing is done.
* Scalability: no server overloading, you just have to install new daemons on another server and load balancing is done.


But Shinken is even more :
But Shinken is even more:

* Realm concept : you can monitor independent environments / customer
* DMZ monitroing : some daemon have passive facilities so that firewall don't block monitoring.
* Business impact : Shinken can differentiate impact of a critical alert on a toaster versus the web store
* Efficient correlation between parent-child relationship and business process rules
* High availability : daemons can have spare ones.
* Business rules : For a higher level of monitoring. Shinken can notify you only if 3 out 5 of your server are down
* Very open-minded team : help is always welcome, there is job for everyone.
* Realm concept: you can monitor independent environments / customers.
* DMZ monitoring: some daemons have passive facilities, so that firewalls don't block monitoring.
* Business impacts: Shinken can differentiate impact of a critical alert on a toaster versus the web store.
* Efficient correlation between parent-child relationship and business process rules.
* High availability: daemons can have spare ones.
* Business rules: For a higher level of monitoring, Shinken can notify you only if 3 out of 5 of your servers are down.
* Very open-minded team: help and suggestions are always welcome.


Release cycle
==============


Shinken team is trying to setup a new release cycle with an objective of 4 release per year.
Each release is divided into three part : re-factoring (few weeks), features (one month), freezing (one month).
The Shinken team is setting up a new release cycle with an objective of 4 releases per year.
Each release is divided into three parts: re-factoring (few weeks), features (one month), freezing (one month).
The roadmap is available in a `specific Github issue`_; feature additions can be discussed there.
Technical point of view about a specific feature are discussed in a separated issue.
The technical point of view about a specific feature is discussed in a separate, individual issue.


Release code names
===================

I (Jean Gabès) keep the right to name the code name of each release. That's the only thing I will keep for me in this project as its founder. :)
Jean Gabès keeps the right to name the code name of each release. That is the only thing Jean will keep for himself in this project as its founder. :)


.. _Nagios: http://www.nagios.org
1 change: 1 addition & 0 deletions doc/source/05_thebasics/notifications.rst
@@ -185,3 +185,4 @@ If you want to try out a non-traditional method of notification, you might want
.. _Network Audio System (NAS): http://radscan.com/nas
.. _QuickPage: http://www.qpage.org/
.. _Sendpage: http://www.sendpage.org/
.. _SMSEagle: http://www.smseagle.eu/shinken.php
81 changes: 76 additions & 5 deletions doc/source/07_advanced/objectinheritance.rst
@@ -518,6 +518,18 @@ Its syntax is:

It could be summarized as "*For the service bound to me, named ``xxx``, I want the directive ``yyy`` set to ``zzz`` rather than the inherited value*"

The service description selector (represented by ``xxx`` in the previous example) may be:

A service name (default)
The ``service_description`` of one of the services attached to the host.

``*`` (wildcard)
Means *all the services attached to the host*

A regular expression
A regular expression against the ``service_description`` of the services attached to the host (it has to be prefixed by ``r:``).


Example:

::
@@ -543,8 +555,22 @@ Example:
...
}
...
define host {
host_name web-back-02
hostgroups web
service_overrides *,notification_options c
...
}
...
define host {
host_name web-back-03
hostgroups web
service_overrides r:^HTTP,notification_options r
...
}
...

In the previous example, we defined only one instance of the HTTP service, and we enforced the service ``notification_options`` for the web servers composing the backend. The final result is the same, but the second example is shorter, and does not require the second service definition.
In the previous example, we defined only one instance of the HTTP service, and we enforced the service ``notification_options`` for some web servers composing the backend. The final result is the same, but the second example is shorter, and does not require the second service definition.
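
The three selector forms described above (service name, ``*``, and ``r:`` regular expression) can be sketched in Python as follows. This is only an illustration, not Shinken's actual implementation: the helper names ``selector_matches`` and ``parse_service_override`` are hypothetical, and the use of ``re.search`` (match anywhere unless the pattern is anchored) is an assumption. The ``selector,directive value`` layout is taken from the examples above.

```python
import re


def selector_matches(selector, service_description):
    """Return True if a service description selector matches a service.

    Hypothetical helper, not part of Shinken's API. It follows the three
    documented forms: "*", "r:<regex>", or a plain service name.
    """
    if selector == "*":
        # Wildcard: every service attached to the host matches.
        return True
    if selector.startswith("r:"):
        # Regular expression applied to the service_description.
        # Assumption: the pattern may match anywhere unless anchored.
        return re.search(selector[2:], service_description) is not None
    # Default: the selector is an exact service name.
    return selector == service_description


def parse_service_override(value):
    """Split 'selector,directive value' into its three parts.

    Assumption: the selector itself contains no comma.
    """
    selector, rest = value.split(",", 1)
    directive, new_value = rest.strip().split(None, 1)
    return selector.strip(), directive, new_value.strip()


if __name__ == "__main__":
    sel, directive, val = parse_service_override("r:^HTTP,notification_options r")
    print(sel, directive, val, selector_matches(sel, "HTTP"))
```

With this sketch, ``r:^HTTP`` selects every service whose description starts with ``HTTP``, whereas the plain selector ``HTTP`` selects only the service named exactly ``HTTP``.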

Using packs allows an even shorter configuration.

@@ -566,6 +592,20 @@ Example:
...
}
...
define host {
use http
host_name web-back-02
service_overrides HTTP,notification_options c
...
}
...
define host {
use http
host_name web-back-03
service_overrides HTTP,notification_options r
...
}
...

In the packs example, the web server from the front-end cluster uses the value defined in the pack, and the one from the backend cluster has the ``notification_options`` directive of its HTTP service (also inherited from the HTTP pack) overridden.

@@ -612,26 +652,57 @@ In this situation, there is several ways to manage the situation:

None of these options are satisfying.

There is a last solution that consists of excluding the corresponding service from the specified host. This may be done using the ``service_excludes directive``.
There is a last solution that consists of excluding the corresponding service from the specified host. This may be done using the ``service_excludes`` directive.

Its syntax is:

::

service_excludes xxx

The service description selector (represented by ``xxx`` in the previous example) may be:

A service name (default)
The ``service_description`` of one of the services attached to the host.

``*`` (wildcard)
Means *all the services attached to the host*

A regular expression
A regular expression against the ``service_description`` of the services attached to the host (it has to be prefixed by ``r:``).

Example:


::

define host {
use web-fromt
use web-front
host_name web-back-01
...
}

define host {
use web-fromt
use web-front
host_name web-back-02 ; The virtual server
service_excludes Management interface
...
}
...
define host {
use web-front
host_name web-back-03 ; The virtual server
service_excludes *
...
}
...
define host {
use web-front
host_name web-back-04 ; The virtual server
service_excludes r:^Management
...
}
...


In the case you want the opposite (exclude all except) you can use the service_includes directive
If you want the opposite (exclude everything except the listed services), you can use the ``service_includes`` directive, which is its counterpart.
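
As a rough sketch of how ``service_excludes`` and ``service_includes`` relate, the following Python filters a host's list of service descriptions. It is not Shinken's code: ``filter_services`` and ``_matches`` are hypothetical names, the ``Disks`` service used in the demo is made up, and giving ``includes`` precedence when both are supplied is an assumption.

```python
import re


def _matches(selector, description):
    # Same three selector forms as documented: "*", "r:<regex>", or a name.
    if selector == "*":
        return True
    if selector.startswith("r:"):
        # Assumption: the regex may match anywhere unless anchored.
        return re.search(selector[2:], description) is not None
    return selector == description


def filter_services(descriptions, excludes=(), includes=()):
    """Return the service descriptions that remain attached to a host.

    Hypothetical illustration:
    - with ``includes``, only matching services are kept (exclude all except);
    - otherwise, services matching any ``excludes`` selector are dropped.
    """
    if includes:
        return [d for d in descriptions
                if any(_matches(s, d) for s in includes)]
    return [d for d in descriptions
            if not any(_matches(s, d) for s in excludes)]


if __name__ == "__main__":
    services = ["HTTP", "Management interface", "Disks"]
    print(filter_services(services, excludes=["r:^Management"]))
    print(filter_services(services, includes=["HTTP"]))
```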
6 changes: 5 additions & 1 deletion doc/source/08_configobjects/host.rst
@@ -27,6 +27,7 @@ parents *host_names*
hostgroups *hostgroup_names*
check_command *command_name*
initial_state [o,d,u]
initial_output *output*
**max_check_attempts** **#**
check_interval #
retry_interval #
@@ -148,7 +149,10 @@ check_command
This directive is used to specify the *short name* of the :ref:`command <configobjects/command>` that should be used to check if the host is up or down. Typically, this command would try and ping the host to see if it is "alive". The command must return a status of OK (0) or Shinken will assume the host is down. If you leave this argument blank, the host will *not* be actively checked. Thus, Shinken will likely always assume the host is up (it may show up as being in a "PENDING" state in the web interface). This is useful if you are monitoring printers or other devices that are frequently turned off. The maximum amount of time that the notification command can run is controlled by the :ref:`host_check_timeout <configuration/configmain#host_check_timeout>` option.

initial_state
By default Shinken will assume that all hosts are in UP states when in starts. You can override the initial state for a host by using this directive. Valid options are: **o** = UP, **d** = DOWN, and **u** = UNREACHABLE.
By default Shinken will assume that all hosts are in PENDING state when it starts. You can override the initial state for a host by using this directive. Valid options are: **o** = UP, **d** = DOWN, and **u** = UNREACHABLE.

initial_output
As with the initial state, the initial check output may also be overridden by this directive.

max_check_attempts
This directive is used to define the number of times that Shinken will retry the host check command if it returns any state other than an OK state. Setting this value to 1 will cause Shinken to generate an alert without retrying the host check again.
6 changes: 5 additions & 1 deletion doc/source/08_configobjects/service.rst
@@ -26,6 +26,7 @@ servicegroups servicegroup_names
is_volatile [0/1]
**check_command** ***command_name***
initial_state [o,w,u,c]
initial_output *output*
**max_check_attempts** **#**
**check_interval** **#**
**retry_interval** **#**
@@ -164,13 +165,16 @@ check_command
If at least one of the apaches on servers websrv1 and websrv2 is OK and if the oracle database on dbsrv1 is OK then the rule and thus the service is OK

initial_state
By default Shinken will assume that all services are in OK states when in starts. You can override the initial state for a service by using this directive. Valid options are:
By default Shinken will assume that all services are in PENDING state when it starts. You can override the initial state for a service by using this directive. Valid options are:

* **o** = OK
* **w** = WARNING
* **u** = UNKNOWN
* **c** = CRITICAL.

initial_output
As with the initial state, the initial check output may also be overridden by this directive.
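
The ``initial_state`` letters documented for hosts and services, together with ``initial_output``, can be summarized by the following Python sketch. It only mirrors the documentation in this commit: the function ``resolve_initial`` is hypothetical, and using an empty string as the default output is an assumption.

```python
# Letter codes as documented in host.rst and service.rst.
HOST_INITIAL_STATES = {"o": "UP", "d": "DOWN", "u": "UNREACHABLE"}
SERVICE_INITIAL_STATES = {
    "o": "OK",
    "w": "WARNING",
    "u": "UNKNOWN",
    "c": "CRITICAL",
}


def resolve_initial(kind, initial_state=None, initial_output=None):
    """Return the (state, output) pair an object would start with.

    Hypothetical helper. ``kind`` is "host" or "service". Without an
    ``initial_state`` directive the object starts in PENDING, as documented.
    """
    table = HOST_INITIAL_STATES if kind == "host" else SERVICE_INITIAL_STATES
    if initial_state is None:
        state = "PENDING"
    elif initial_state in table:
        state = table[initial_state]
    else:
        raise ValueError("invalid initial_state %r for %s" % (initial_state, kind))
    # Assumption: an unset initial_output means an empty output string.
    return state, initial_output or ""


if __name__ == "__main__":
    print(resolve_initial("host"))
    print(resolve_initial("service", "c", "Not checked yet"))
```

Note that the letter ``u`` means UNREACHABLE for a host but UNKNOWN for a service, which is why the two tables are kept separate.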

max_check_attempts
This directive is used to define the number of times that Shinken will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Shinken to generate an alert without retrying the service check again.

2 changes: 2 additions & 0 deletions for_fedora/systemd/shinken-arbiter.service
@@ -5,6 +5,8 @@ After=syslog.target
[Service]
Type=forking
ExecStart=/usr/sbin/shinken-arbiter -d -r -c /etc/shinken/shinken.cfg
KillMode=process
TimeoutStopSec=3

[Install]
WantedBy=multi-user.target
2 changes: 2 additions & 0 deletions for_fedora/systemd/shinken-broker.service
@@ -5,6 +5,8 @@ After=syslog.target
[Service]
Type=forking
ExecStart=/usr/sbin/shinken-broker -d -c /etc/shinken/daemons/brokerd.ini
KillMode=mixed
TimeoutStopSec=3

[Install]
WantedBy=multi-user.target
2 changes: 2 additions & 0 deletions for_fedora/systemd/shinken-poller.service
@@ -5,6 +5,8 @@ After=syslog.target
[Service]
Type=forking
ExecStart=/usr/sbin/shinken-poller -d -c /etc/shinken/daemons/pollerd.ini
KillMode=process
TimeoutStopSec=3

[Install]
WantedBy=multi-user.target
