Shinken ChangeLog
0.6 - 03/05/2011
* Add : discovery script and rule engine
* Enh : lots of code cleanup from Gregory Starck, big thanks :)
* Add : manage several brokers by realm
* Add : LiveStatus Wait commands
* Add : Receiver daemon, for distributed realms in distant LAN
* Add : passive poller for DMZ management
* Add : (David Guenault) a way for people who install the shinken lib in a non standard place (like /opt) to get it working with the python path.
* Add : poller and reactionner modules
* Add : nrpe poller module for tuning nrpe calls without having to fork
* Add : human_timestamp_log option so the log will output time in a human readable format.
* Add : make the local log rotate with the logging module, keeping at most 5 files.
* Add : the crash catching in the log for all daemons
* Add : the local log is now enabled by default; if the admin wants it off, he can disable it.
* Add : (Eric Beaulieu) Windows installation script
* Add : The configuration is read as UTF8
* Add : round robin dispatching between the workers of a module for the satellites.
* Add : set a check_interval for satellites, so we do not try to ping them every second; once a minute is quite enough.
* Add : sending USR1 triggers a memory dump of the daemon
* Add : Manage Pyro4.2 and higher.
* Add : manage no notification period for hosts and services
* Add : print the sleep_time as a load average, smoothed over one minute, so it is a good scheduler load indicator
* Add : force LANG=us and UTF8 in init scripts. It makes no sense for an admin to set another language for the system and hope for tools to work.
* Add : print the scheduler latencies as a 95th percentile
* Add : when a hook point fails, set the module to restart, not to kill, even external ones
* Add : support for multisite's double-requests (extcmd+query) to the livestatus module
* Add : now the nagios.cmd named pipe reading is an external module
* Add : statistics for the livestatus module (multisite needs it)
* Add : retention modules for broker and arbiter
* Add : reactionner tag, like poller tag, but for reactionner
* Add : (Denis GERMAIN) check_shinken plugin and interface in the arbiter to get data.
* Add : (Rémi BUISSON) common init script for easy launch
* Add : + in libexec.
* Add : hot poller addition commands
* Add : automatic VMware host-> VM dependencies modules, with VMotion support
* Add : "Unknown phase", so limit useless notifications
* Fix : the livestatus module so that multiple external commands in one request are possible. (thruk uses this for mass operations)
* Fix : the livestatus, so that host_name and service_description are shown correctly for downtimes and comments.
* Fix : (reported by : Raynald de Lahondès) for python2.7, if shell launch, do not go in shlex split pass.
* Fix : catch an exception when deregistering Pyro. (When the scheduler was started as the only process and no communication happened)
* Fix : Add the path of the current script (ex: bin/shinken-arbiter) and ".." to the list of paths where Python looks for shinken modules.
* Fix : default config had a duplicate command definition: check_http
* Fix : clean Command(s) now subclass Item(s) as any "shinken.objects"
* Fix : (reported by Raynald de Lahondès) if a hostgroup member got no host defined, create a dummy service for no host, so failed.
* Fix : bad UTC hour for scheduler.
* Fix : in a rare case, the pickle needs to call the __init__ of notification before the __get_state__, but it will call it without args.
* Fix : (reported by : Raynald de Lahondès) bad hostdep definition (link host) was badly detected.
* Fix : utf8 management in the db class.
* Fix : the * member was wrong in service calling hostgroup of this kind.
* Fix : the schedule immediate check now works without having to force it.
* Fix : correct the check launch under python2.7 with utf8 char.
* Fix : (reported by Denis GERMAIN) set by default retention update interval to 60 minutes.
* Fix : bad sys.path management for ndo module
* Fix : too much looping in daemons! do_loop_turn must WAIT a little! To be managed in the main_loop part I think.
* Fix : if the scheduler got the same conf but was told to wait for a new one before, it became a zombie scheduler.
* Fix : (reported by : Hienz Michael) do not close local file in the daemon pass.
* Fix : (Raynald de Lahondès) for bsd systems.
* Fix : service without host will be just dropped, like Nagios.
* Fix : Nagios allows elements without contacts. Do the same.
* Fix : add protection against adding the same actions again and again if the scheduler asks for it, like for orphaned checks.
* Fix : sys.path management, now far clearer and error proof
* Fix : (reported by capibaru) add a strip pass before loading cfg files.
* Fix : raise only one log for orphaned instead of xK useless logs.
* Fix : (Venelin Petkov) missing customs on services copy of a based one (multiple hosts, hostgroups, etc)
* Fix : (reported by Ronny Lindner) manage the not host expressions for services.
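The 95th percentile latency entry above can be illustrated with a minimal sketch. It assumes a nearest-rank percentile; the function name percentile_95 is hypothetical and is not taken from the Shinken code.

```python
def percentile_95(latencies):
    """Return the nearest-rank 95th percentile of a list of latencies (seconds).

    Hypothetical helper for illustration only, not Shinken's implementation.
    """
    if not latencies:
        return 0.0
    ordered = sorted(latencies)
    # Nearest-rank method: ceil(0.95 * n), computed with integers
    # to avoid floating point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

Unlike a plain average, this value ignores the few slowest checks, so a handful of outliers does not hide the typical scheduler latency.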
0.5.1 - 19/01/2011
*Add : Business rules
*Add : Downtime for contacts
*Add : Escalations based on time, with notification period shorting capabilities
*Add : options allowed_hosts and max_logs_age to the livestatus broker
*Add : some rarely used operators to the livestatus module (!>=)
*Add : SSL connections between daemons with certificates and a CA
*Add : module exception/kill catch in the scheduler.
*Add : use the binary format for the pickle, so it takes less space.
*Add : (Hartmut Goebel) use universal open way for conf reading.
*Add : support for unix sockets to the livestatus module
*Add : criticity value for host/services, with problem/impacts max criticity management
*Add : min_criticity definition in contact and notificationways.
*Add : pylint and coverage pass in the integration server
*Add : the new column pnpgraph_present to the livestatus module
*Add : now create the pickle retention file with a .tmp, so in case of problem, we do not lose the old one.
*Add : event handler commands can now be sent by external commands
*Fix : (Laurent Guyon) select with no timeout in NSCA arbiter module.
*Fix : shinken init script: enable use of another "default" shinken file than hardcoded one by env variable.
*Fix : (current_service_groups needs to return an empty list instead of string) in the livestatus module
*Fix : ' -h install' now also exit
*Fix : () crash for some bad conf, should raise a message instead.
*Fix : missing check for no args in 'shinken' init script
*Fix : a bug in livestatus Servicegroup.members, minor cosmetics, test case for thruk
*Fix : a bug in host.parents livestatus representation to make thruk happy
*Fix : check for /dev/shm access for the satellites.
*Clean : Redesign of the livestatus module
*Fix : testing with multisite and thruk
*Clean: factorized .is_correct() call for all object types & added log to see more clearly where the error is.
*Clean: factorization/simplification of code in (and related) for spawning checks processes.+ clean of old deprecated commented code (& "related" too).
*Fix : downtime and comment are now pickled in a dict, not a list.
*Fix : pickle pass now looks at the type, so downtimes and comments from 0.4 are still ok.
*Fix : acknowledge got too much information in the pickle pass, making the pickle save very very huge. Now it shrinks from 100Mo to... 2Mo :)
*Clean : big clean of hasattr->getattr with default value
*Clean : replace dict for properties with real objects
*Fix : Implement in_check_period/in_notification_period for livestatus to make multisite happy
*Fix : Remove a leftover attribute from timeperiod&daterange
*Fix : Transmit dateranges in timeperiod-full_status-broks
*Fix : Replaced the deprecated StatsGroupBy, implemented Stats: for log entries, making Multisite happy with shinken-livestatus
*Fix : manage the 'null' for inheritance.
*Fix : add timeout to the status_dat module so that the status.dat is written even if no broks are sent.
*Fix : escalations were offset of notif number by -1.
*Fix : Replace Queue with an own implementation of LifoQueue for Python 2.4 (livestatus)
*Fix : Fallback to sqlite 1.x for Python 2.4 (livestatus)
*Fix : bug in the table structure where logging messages are kept (livestatus)
*Fix : problem/impacts should be list, not string.
*Fix : missing customs values in host/service tables in livestatus and Thruk was not happy.
*Fix : is_impact/is_problem got bad format in livestatus tables.
*Fix : (Kristoffer Moegle) missing check in generic object configuration module.
*Fix : a bug in livestatus. Catch the exception if a peer is not listening for the response
*Fix : support for hosts without check_command (assumed to be always up)
*Fix : hostgroup realm assoc was broken. Now it's tested.
*Fix : (Maximilien Bersoult) fix mysql_db module search path.
*Fix : bug in compensate time when the core got event handler
*Fix : a bug in the npcd module (spoolfile timestamp extension was float, not int)
*Fix : windows registry paths.
*Fix : problem with Nagios retention that was not happy about host properties type.
*Fix : pickle/nagios retention was loading a retention host/service in the comment.ref link!
*Fix : now only previously notified contacts are sent recovery notifications.
*Fix : bug in NDO module for hostgroups
*Fix : (0.5.1) bugs in LiveStatus module for Service get_full_name call and queries with no space after :
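The two retention entries above (binary pickle format, and writing through a .tmp file so the old file is not lost) can be sketched as follows. This is a minimal illustration under assumptions: the function names save_retention and load_retention are hypothetical, and the real retention code may differ.

```python
import os
import pickle


def save_retention(path, data):
    """Write retention data to path without risking the previous file.

    Hypothetical helper: dump to path + '.tmp' first, then swap it in.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        # A binary pickle protocol is more compact than the text protocol 0.
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
    # Replace the old file only once the new one is fully written.
    os.replace(tmp_path, path)


def load_retention(path):
    """Read back what save_retention wrote."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

If the dump fails halfway, only the .tmp file is damaged and the previous retention file stays readable.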
0.4 - 08/12/2010
*Add : Service generators
*Add : "Limit :" in livestatus
*Add : the scheduler now saves retention data before stopping or taking a new conf
*Add : the broker cleanly quits the modules before quitting
*Add : better output to know which external process for the broker is who in the log
*Add : NodeSet lib use if available for the [X-Y] keys in service generators.
*Add : retention modules, Memcache, Redis or simple file.
*Add : lot of tests, even a end_to_end one for Ha and load balanced installations
*Add : user can put what he wants as MACRO in resources.cfg
*Add : lot of log output, and clean a lot of others
*Add : conf sample for PNP integration.
*Add : (Nicolas Dupeux) add a NSCA server module for the Arbiter! (only XOR and none encryption from now)
*Add : now the retention_update_interval parameter is managed.
*Add : the ! character before a host_name is now managed in the services. (even if host was defined in a hostgroup). And with test.
*Add : perfdata command management for host/service
*Add : manage modules in the Arbiter and in the schedulers
*Add : now the whole documentation is done in the wiki
*Add : obsess_over_host/service and executing oc*p_commands like eventhandlers
*Add : "templates" and modes and more for service/host perf data module.
*Add : now hosts with no address are filled with host_name for this value.
*Add : timeperiod inheritance
*Add : Allow "members *" in a hostgroup definition
*Add : manage inherit_parents for dependencies.
*Add : system time change catch for satellites.
*Add : enable_environment_macros now create or not the env dict for checks
*Add : O*HP command management
*Fix : Some missing properties in the livestatus tables
*Fix : Some missing properties in the NDO/Mysql export
*Fix : parents property was not stripped(), and an error value was not caught as an error
*Fix : missing some error catches in contact definitions
*Fix : Nagios allows contact_name to be missing if there is an alias
*Fix : Nagios allows a contact with no 'action' if his options are n/n
*Fix : Macro resolution could loop forever with special output. Now limited to 32 loops max!
*Fix : the env_macros were enabled if we used the tweaks, not good. And they are REALLY CPU killers.
*Fix : LiveStatus : do not close the socket before we are sure the other peer send us nothing. If so, we can close it.
*Fix : solve a case where config files do not end with a line return and will mix parameters.
*Fix : broker spare did not look at pollers/reactionners when it became active.
*Fix : now the poller/reactionner REALLY raise broks to the broker (it was clear before...)
*Fix : bug in the Broker that in some cases made broks get lost for external modules if they came from the arbiter. (like logs)
*Fix : add a workaround to the livestatus module so it can handle requests from thruk 0.71.1 (which uses strange Stats: requests)
*Fix : rename all 'binaries' without the .py extension so distrib will be happy.
*Fix : livestatus now works with Python 2.4
*Fix : (Hermann Lauer) important bug in status.dat (and in fact all other 'external modules') that made the broks not managed in the right order in some cases or on Arbiter restart. Thanks a lot to Hermann Lauer who helped me a LOT with all the debug logs!
*Fix : (Zoran Zaric) big indentation cleanup
*Fix : error handling in timeperiod inheritance
*Fix : clean on the default configuration
*Fix : manage additional_freshness_latency parameter with a test for check_freshness now.
*Fix : can create a zip file (egg) for the library under Centos. It's not a good thing. It should avoid it
*Fix : From Nicolas Dupeux : error in livestatus split.
*Fix : From Nicolas Dupeux : fix typos in host code
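The macro resolution fix above (limit at 32 loops) can be sketched as a bounded expansion loop. This is only an illustration under assumptions: the names resolve_macros, MAX_PASSES and MACRO_RE are hypothetical, and only simple $NAME$ macros are handled.

```python
import re

# Hypothetical names for illustration; only the limit of 32 comes from the entry above.
MAX_PASSES = 32
MACRO_RE = re.compile(r"\$([A-Z0-9_]+)\$")


def resolve_macros(text, macros):
    """Expand $NAME$ macros, including nested ones, in at most MAX_PASSES passes."""
    for _ in range(MAX_PASSES):
        # Unknown macros are left untouched.
        new_text = MACRO_RE.sub(lambda m: macros.get(m.group(1), m.group(0)), text)
        if new_text == text:
            break  # nothing left to expand
        text = new_text
    return text
```

For example, resolve_macros("$ARG1$", {"ARG1": "$USER1$/check", "USER1": "/usr/lib"}) returns "/usr/lib/check", while a macro that expands to itself stops after 32 passes instead of looping forever.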
0.3 - 06/10/2010
*Add : complex hostgroup matching with & ( ) | !
*Add : resultmodulation code and tests
*Add : brok information about problem/impacts
*Add : livestatus export about problem/impacts
*Add : clean quit on daemons (pid file and sub processes)
*Add : maintenance_period parameter in hosts and services
*Add : even more unit test cases
*Add : now external commands raised in livestatus module are taken by the arbiter
*Add : satellites states are now exported by livestatus
*Add : arbiter module management
*Add : GLPI import arbiter module :)
*Add : notificationways for contacts
*Add : warning about unmanaged parameters
*Add : log rotation and syslog management
*Fix : install crash is now caught with Pyro 4 in Centos (python 2.4)
*Fix : host/service dep were not filled with default properties
*Fix : catch realm configuration errors
*Fix : bug in status.dat about parents printing
*Fix : problem with the Columns table in livestatus module
*Fix : next valid time was one minute delay for cases with excludes
*Fix : livestatus export in json was bad for service group members
*Fix : bug in windows check launch
*Clean : dispatcher code about useless options
*Clean : tests cases setUp
0.2 - 06/09/2010
*New code layout
*Installation is easy with the process (first version from Maximilien Bersoult)
*Now compatible with Pyro 3 AND 4
*Now compatible with Python 2.4 and 2.5 too
*Add sticky acknowledgement. Non-permanent ack-comments are now automatically removed
*Add host acknowledgement and acknowledgement stickiness
*Finished service problem acknowledgement. one more testcase
*Now broker get broks from pollers and reactionners. (Useful for Logs)
*Give Broker a way to make broks :) (like for its own log)
*Add a problem/incident change states when apply. But it does not interfere with the standard check way of doing (or at least should not).
*Add some LSB init.d scripts
*Add max_plugins_output_length parameter to limit the checks output size.
*"Hack" the old nagios parameters : now status_file and nagios_log are caught. If the user defined them, but did not define the right broker modules, we create them "on the fly". I hope one day we will remove it...
*Nested macros are managed (like USERN in ARGN macro).
*Add a pass about changing Nagios2 properties to Nagios3 ones.
*Add json outputformat to the livestatus module
*Add a broker module npcdmod (plus test_npcdmod) which writes a perfdata file suitable for pnp4nagios
*Add check_period implicitly inherited by service from host.
*Redesign of the notifications (far easier to understand than the old async way)
*Notice about unused parameters and explain why it can be removed from conf.
*Catch non standard return code in so we can add stderr to the output for such cases.
*Now arbiter host_name property is not mandatory. But WARNING : for a multiple arbiter conf, it must be set.
*Updated cfg documentation (Author: Luke L <>)
*Add documentation about date range format because it was not documented.
*Update the nagios to shinken migration file
*Change the way broks are send from Arbiter to Broker : before, the Broker connect to the Arbiter, take broks like for schedulers. But Arbiter also connect to broker. That's a nightmare about hangout. Now, Arbiter push the broks. It's far more easy and efficient.
*Add template handling to servicedependencies
*Add test_dependencies as the regression test
*Less status_dat verbosity :)
*Add a last_perf_data + macros to access last perfdatas as in
*HUGE clean on shinken-specific.cfg file.
*Add a README file
*Add a little note about how migrate from Nagios to Shinken
*Add a hint about how solve 'cannot find my own arbiter' error message.
*Add bin directory with some bash scripts to launch/stop the whole application.
*Relative path, now we can have a easy portable sample configuration. (Gerhard)
*Add two missing operators in
*Big clean up conf sample!
*No more modulespath need in brokerd.ini. Will be easier for packagers.
*Acknowledgement test cases
*Add some hard tests about timeperiods calculations
*Add a script for Hudson test (launch all tests)
*Add a problem/impact test.
*Now external modules can return objects (from now nobody use it, but it can be useful in the future)
*Make it easier to raise checks/notifications from deep inside object classes.
*repository cosmetics (Luke L)
*Fix : now merlin is correctly filled with update program value
*Bug Fix : ndo do not have command_file, so do not export it.
*Bug fix : retention load was loading not good tab (impacts ones) and so cause problem with remove (not the good object!) (nicolas.dupeux)
*Fix a bug in ACKNOWLEDGE_SVC_PROBLEM ext. command. Sticky can be 0/1/2, not bool
*Bug fix in find_day_by_weekday_offset.
*Bug Fix : when a date was calculated before the ref time for a weekday it was not recalculated, so problem.
*Bug Fix : error in get_end_of_day. It was giving the first second of the next day, so some excludes had problems with it.
*Bug Fix : shell like commands were not good :(. Thanks to Gilles Seban for pointing it out and to Hiren Patel for giving a list of shell characters (so we know if we should use shell or not :) )
*Bug fix: external commands to send checks should work now
*Bug fix : Arbiter no longer crashes when scheduler is down and broker is up (not being initialized caused a missing parameter)
*Bug Fix : check orphaned had its status badly set. Thanks Pylint.
*Bug fix : hosts in unreach were set DOWN in state, but unreach in state_id. Now it clearly says it's UNREACHABLE.
*Bug: retention was loading services objects from retention file. It's not good at all.
*Fix a and -> or bug in the dependencies
*Fix a bug in livestatus. state_type is now a number instead of HARD/SOFT
*Fix a bug in the livestatus module. Eventhandler command is now serializable
*Fix a bug in execute_unix. If there is an exception during plugin execution, use its string representation as plugin_output
*Fix a bug in the livestatus module. Multiline input is now possible
*Bug : patch from David GUÉNAULT about stopping all brokers
*Correct launched/lanched typo everywhere (Grégory Starck)
*Fixed scheduler.add so that master notifications (without contact) don't create a status brok
*Patch from Nicolas DUPEUX about typo correction in
*Reduce CPU consumption of livestatus broker (thanks flox for the patch)
*Fix a bug in the npcdmod test case
*Fix : configuration files could be mixed if the previous one did not finish with a line return (Sebastian Reimers)
*Fix: Correct a bad default arbiter pid configuration (Sebastian Reimers)
*Bug fix a missing save of about path in relative mode
*Global external commands now create an update_program_status_brok instead of program_status_brok
*Fix a bug in the status_dat_broker (incorrect servicegroup-definition in objects.cache)
*Fix : add Gerhard in print screen :)
*Bug Fix: add duplication check for elements (and groups). Only service is allowed to have duplicated (will Warning, but no error).
*Bug Fix : patch from Nicolas Dupeux. Thruk socket shutdowns are now handled in an exception
*Bug Fix (Sven Velt): patch about recursive dir load and check timeperiod typo
*NO MORE nap in code, now all are shinken :)
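The get_end_of_day fix above (returning the last second of the day instead of the first second of the next day) can be sketched like this. It is a minimal illustration under assumptions: the signature shown here is hypothetical and works in local time through the standard time module.

```python
import time


def get_end_of_day(year, month, day):
    """Return the local-time timestamp of 23:59:59 on the given day.

    Hypothetical signature for illustration; the point is that the result
    still belongs to the requested day, not to the following one.
    """
    # tm_isdst = -1 lets the C library decide whether DST applies.
    return int(time.mktime((year, month, day, 23, 59, 59, 0, 0, -1)))
```

With this boundary, an exclude period ending on a given day never leaks into 00:00:00 of the next day.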
0.1 - 31/05/2010
*Initial release