Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Fetching contributors…

Cannot retrieve contributors at this time

328 lines (306 sloc) 21.011 kb
########################
Shinken ChangeLog
#######################
0.6 - 03/05/2011
------------------
ENHANCEMENTS
* Add : discovery script and rule engine
* Enh : lot of code clean from Gregory Starck, big Thanks :)
* Add : manage several brokers by realm
* Add : LiveStatus Wait commands
* Add : Recevier daemon, for distributed realms in distant LAN
* Add : passive poller for DMZ management
* Add : (David Guenault) a way for people that install shinken lib in non standard place (like /opt) to get it working for python path.
* Add : poller and reactionner modules
* Add : nrpe poller module for tuning nrpe calls without having to 'forks'
* Add : human_timestamp_log option so the log will output time as human readable format.
* Add : make the local log rotate with the logging module, max keep file at 5.
* Add : the crash catching in the log for all daemons
* Add : enable by default local log, should be enable by default, and if the admin want it down, he can.
* Add : (Eric Beaulieu) Windows installation script
* Add : Thte configuration is read as UTF8
* Add : round robin dispatching way between the worker of a module for the satellites.
* Add : set a check_interval to satellites, so we do not try every second to ping them, each a minute is quite enouth.
* Add : sending USR1 do a memory dump of the daemon
* Add : Manage Pyro4.2 and higer.
* Add : manage no notification period for hosts and services
* Add : print the sleep_time as a load average, so will smooth in one minute average, so is a good scheduler load indicator
* Add : force LANG=us and UTF8 in init scripts. No sense admin can set another langage for system and hope for tools to work.
* Add : print the latencies for scheduler in 95percentile way
* Add : when a hook point failed set the module to restart, not to kill, even external ones
* Add : support for multisite's double-requests (extcmd+query) to the livestatus module
* Add : now the nagios.cmd named pipe reading is an external module
* Add : statistics for the livestatus module (multisite needs it)
* Add : retention modules for broker and arbiter
* Add : reationner tag, like poller tag, but for reactionner
* Add : (Denis GERMAIN) check_shinken plugin and interface in the arbiter to get data.
* Add : (Rémi BUISSON) common init script for easy launch
* Add : sendmailhost.pl + sendmailservices.pl in libexec.
* Add : hot poller addition commands
* Add : automatic VMware host-> VM dependencies modules, with VMotion support
* Add : "Unknown phase", so limit useless notifications
FIXES
* Fix :the livestatus module so that multiple external commands in one request are possible. (thruk uses this for mass operations)
* Fix :the livestatus, so that host_name and service_description are shown correctly for downtiems and comments.
* Fix : (reported by : Raynald de Lahondès) for python2.7, if shell launch, do not go in shlex split pass.
* Fix : catch an exception when deregistering Pyro. (When the scheduler was started as the only process and no communication happened
* Fix : Add the path of the current script (ex: bin/shinken-arbiter) and ".." to the list of paths where Python looks for shinken modules.
* Fix: default config had duplicate command name define: check_http
* Fix : clean Command(s) now subclass Item(s) as any "shinken.objects"
* Fix : (reported by Raynald de Lahondès) if a hostgroup member got no host defined, create a dummy service for no host, so failed.
* Fix : bad UTC hour for scheduler.
* Fix : in a rare case, the picle need to call the __init__ of notification before the __get_state__, but it will call it without args.
* Fix : (reported by : Raynald de Lahondès) bad hostdep definition (link host) was badly detected.
* Fix : utf8 management in the db class.
* Fix : the * member was wrong in service calling hostgroup of this kind.
* Fix : the schedule immediate check now work without having to force it.
* Fix : correct the check launch under python2.7 with utf8 char.
* Fix : (reported by Denis GERMAIN) set by default retention update inverval to 60 minutes.
* Fix : bad sys.path management for ndo module
* Fix : too much loop in daemons! do_loop_tunr must WAIT a little! To managed in the main_loop part I think.
* Fix : if the scheduler gotthe same conf but got a wait a new one before, it do a zombie scheduler then....
* Fix : (reported by : Hienz Michael) do not close local file in the daemon pass.
* Fix : (Raynald de Lahondès) setup.py for bsd systems.
* Fix : service without host will be just droped, like Nagios.
* Fix : Nagios allow elements without contacts. Do alike.
* Fix : add protection against adding again and again the same actions ifthe scheduler ask it, like for orphaned checks.
* Fix : sys.path management, not far more clear and error proof
* Fix : (reported by capibaru) add a strip pass before loading cfg files.
* Fix : raise only one log for orphaned instead of xK useless logs.
* Fix : (Venelin Petkov) missing customs on services copy of a based one (multiple hosts, hostgroups, etc)
* Fix : (reported by Ronny Lindner) manage the not host expressions for services.
0.5.1 - 19/01/2011
------------------
ENHANCEMENTS
*Add : Business rules
*Add : Downtime for contacts
*Add : Escalations based on time, with notification period shorting capabilities
*Add : options allowed_hosts and max_logs_age to the livestatus broker
*Add : some rarely used operators to the livestatus module (!>=)
*Add : SSL connexions between daemons with certificates and a CA
*Add : module exception/kill catch in the scheduler.
*Add : use the binary format for the pickle, so it take less space.
*Add : (Hartmut Goebel) use universal open way for conf reading.
*Add : support for unix sockets to the livestatus module
*Add : criticity value for host/services, with problem/impacts max criticity management
*Add : min_criticity definition in cotnact and notificationways.
*Add : pylint and coverage pass in the integration server
*Add : the new column pnpgraph_present to the livestatus module
*Add : now create the pickle retention file with a .tmp, so in case of problem, we do not lost the old one.
*Add : event handlers command can now be send by external commands
FIXES
*Fix : (Laurent Guyon) select with no timeout in NSCA arbiter module.
*Fix : shinken init script: enable use of another "default" shinken file than hardcoded one by env variable.
*Fix : (current_service_groups needs to return an empty list instead of string) in the livestatus module
*Fix : 'setup.py -h install' now also exit
*Fix : () crash for some bad conf, should raise a message instead.
*Fix : missing check for no args in 'shinken' init script
*Fix : a bug in livestatus Servicegroup.members, minor cosmetics, test case for thruk
*Fix : a bug in host.parents livestatus representation to make thruk happy
*Fix : check for /dev/shm access for the satellites.
*Clean : Redesign of the livestatus module
*Fix : testing with multisite and thruk
*Clean: factorized .is_correct() call for all object types & added log to see more clearly wherer the error is.
*Clean: factorization/simplification of code in action.py (and related) for spawning checks processes.+ clean of old deprecated commented code (& "related" too).
*Fix : downtime and comment are now pickle in a dict, not a list.
*Fix : pickle pass for look at tyype, so downtime and comment from 0.4 still ok.
*Fix : acknoledge got too much information in the pickle pass, making the pickle save very very huge. Now fit from 100Mo to..2Mo :)
*Clean : big clean of hasattr->getattr with default value
*Clean : repalce dict for properties with real objects
*Fix : Implement in_check_period/in_notification_period for livestatus to make multisite happy
*Fix : Remove a leftover atribute from timeperiod&daterange
*Fix : Transmit dateranges in timeperiod-full_status-broks
*Fix : Replaced the deprecated StatsGroupBy, implemented Stats: for log entries, making Multisite happy with shinken-livestatus
*Fix : manage the 'null' for inheritance.
*Fix : add timeout to the status_dat module so that the status.dat is written even if no broks are sent.
*Fix : escalations were offset of notif number by -1.
*Fix : Replace Queue with an own implementation of LifoQueue for Python 2.4 (livestatus)
*Fix : Fallback to sqlite 1.x for Python 2.4 (livestatus)
*Fix : bug in the table structure where logging messages are kept (livestatus)
*Fix : problem/impacts should be list, not string.
*Fix : missing customs values in host/service tables in livestatus and Thruk was not happy.
*Fix : is_impact/is_problem got bad format in lviestatus tables.
*Fix : (Kristoffer Moegle) missing check in generic object configuration module.
*Fix : a bug in livestatus. Catch the exception if a peer is not listening for the response
*Fix : support for hosts without check_command (assumed to be always up)
*Fix : hostgroup realm assoc was broken. Now it's tested.
*Fix : (Maximilien Bersoult) fix mysql_db module search path.
*Fix : bug in compensate time when thecore got event handler
*Fix : a bug in the npcd module (spoolfile timestamp extension was float, not int)
*Fix : windows registry paths.
*Fix : problem with Nagios retention that was not happy about host properties type.
*Fix : pickle/nagios retention was loading a retention host/service in the comment.ref link!
*Fix : now only previously notified contacts are send for recovery notifications.
*Fix : bug in NDO module for hostgroups
*Fix : (0.5.1) bugs in LiveStatus module for Service get_full_name call and queries with no space after :
0.4 - 08/12/2010
------------------
ENHANCEMENTS
*Add : Service generators
*Add : "Limit :" in livestatus
*Add : the scheduler now save retention data before stop or take a new conf
*Add : the broker clean quit the modules before quitting
*Add : better output to know which external process for the broker is who in the log
*Add : NodeSet lib use if available for the [X-Y] keys in service generators.
*Add : retention modules, Memcache, Redis or simple file.
*Add : lot of tests, even a end_to_end one for Ha and load balanced installations
*Add : user can put what he want as MACRO in resources.cfg
*Add : lot of log output, and clean a lot of others
*Add : conf sample for PNP integration.
*Add : (Nicolas Dupeux) add a NSCA server module for the Arbiter! (only XOR and none encryption from now)
*Add : now the retention_update_interval parameter is managed.
*Add : the ! character before a host_name is now managed in the services. (even if host was defined in a hostgroup). And with test.
*Add : perfdata command management for host/service
*Add : manage modules in the Arbtier and in the schedulers
*Add : nowthe whole documentation is done in the wiki
*Add : obsess_over_host/service and executing oc*p_commands like eventhandlers
*Add : "templates" and modes and more for service/host perf data module.
*Add : now host with no address are fill with host_name for this value.
*Add : timeperiod inheritance
*Add : Allow "members *" in a hostgroup definition
*Add : manage inherit_parents for dependencies.
*Add : system time change catch for satellites.
*Add : enable_environment_macros now create or not the env dict for checks
*Add : O*HP command management
FIXES
*Fix : Some missing properties in the livestatus tables
*Fix : Some missing properties in the NDO/Mysql export
*Fix : parents property was not stripped(), and a error value was not catched as error
*fix : missing some errors catch in contact definitions
*Fix : Nagios allow contact_name to miss if there is an alias
*Fix : Nagios allow a contact with no 'action' if his options are n/n
*Fix : Resolv macro can loop forever with special output. Now limit it at 32 loop max!
*Fix : the env_macros were enable if we use the tweaks, not good. And they are REALLY CPU killers.
*Fix : LiveStatus : do not close the socket before we are sure the other peer send us nothing. If so, we can close it.
*Fix : solve a case where config files do not end with a line return and will mix parameters.
*Fix : broker spare not look at pollers/reactionners when he come active.
*Fix : now the poller/reactionner REALLY raise broks to the broker (it was clear before...)
*Fix : bug in the Broker that make in some cases broks lost for extarnal modules if they come from the arbiter. (like logs)
*Fix : add a workaround to the livestatus module so it can handle requests from thruk 0.71.1 (which uses strange Stats: requests)
*Fix : rename all 'binaries' without the .py extention so distrib will be happy.
*Fix : livestatus work now with Python 2.4
*Fix : (Hermann Lauer) important bug in status.dat (and in fact all other 'external modules') that make the brok not manage in the good order in some case or Arbtier restart. Thanks a lot to Hermann Lauer that help me a LOT with all the debug logs!
*Fix : (Zoran Zaric) big indentation cleanup
*Fix : error handling in timeperiod inheritance
*Fix : clean on the default configuration
*Fix : manage additional_freshness_latency parameter with a test for check_freshness now.
*Fix : setup.py can create a zip file (egg) for the librairy under Centos. It's not a good thing. It should avoid it
*Fix : From Nicolas Dupeux : error in livestatus split.
*Fix : From Nicolas Dupeux : fix typos in host code
0.3 - 06/10/2010
------------------
ENHANCEMENTS
*Add : complex hostgroup amtching with & ( ) | !
*Add : resultmodulation code and tests
*Add : brok information about problem/impacts
*Add : livestatus export about problem/impacts
*Add : clean quit on daemons (pid file and sub processes)
*Add : maintenance_period parameter in hosts and services
*Add : even more unit test cases
*Add : now external commands raised in livestatus module are taken by the arbiter
*Add : satellites states are now exported by livestatus
*Add : arbiter module managment
*Add : GLPI import arbiter module :)
*Add : notificationways for contacts
*Add : warning about unmanaged parameters
*Add : log rotation and syslog managment
FIXES
*Fix : install crash is now catch with Pyro 4 in Centos (python 2.4)
*Fix : host/service dep where not filled with default properties
*Fix : catch realm configuration errors
*Fix : but in status.dat about parents printing
*Fix : problem with the Collums talbe in livestatus module
*Fix : next valid time was one minute delay for cases with excludes
*Fix : livestatus export in json was bad for service group members
*Fix : bug in windows check launch
*Clean : dispatcher code about useless options
*Clean : tests cases setUp
0.2 - 06/09/2010
------------------
ENHANCEMENTS
*New code layout
*Installation is easy with the setup.py process (first version from Maximilien Bersoult)
*Now compatible with Pyro 3 AND 4
*Now compatible with Python 2.4 and 2.5 too
*Add sticky acknowledgement. Non-permanent ack-comments are now automatically removed
*Add host acknowledgement and acknowledgement stickiness
*Finished service problem acknowledgement. one more testcase
*Add REMOVE_HOST/SVC_ACKNOWLEDGEMENT external command
*Now broker get broks from pollers and reactionners. (Useful for Logs)
*Give Broker a way to make broks :) (like for it's own log)
*Add a problem/incident change states when apply. But it do not interfer with the standard check way of doing (or at least should not).
*Add some LSB init.d scripts
*Add max_plugins_output_length parameter to limit the checks output size.
*"Hack" the old nagios parameters : now status_file and nagios_log are catched. If the user defined them, but do not defined the good broker modules, we create them "on the fly". I hope one day we will remove it...
*Nested macros are managed (like USERN in ARGN macro).
*Add a pass about changing Nagios2 properties to Nagios3 ones.
*Add json outputformat to the livestatus module
*Add a broker module npcdmod (plus test_npcdmod) which writes a perfdata file suitable for pnp4nagios
*Add check_period implicitly inheritate to service from host.
*Redesign of the notifications (far easier to understand than the old async way)
*Notice about unused parameters and explain why it can be removed from conf.
*Catch non standard return code in actions.py so we can add stderr to the output for such cases.
*Now arbiter host_name property is not mandatory. But WARNING : for a multiple arbiter conf, it must be set.
*Updated cfg documentation (Author: Luke L <lukehasnoname@gmail.com>)
*Add documentation about date range format because it was not documented.
*Update the nagios to shinken migration file
*Change the way broks are send from Arbiter to Broker : before, the Broker connect to the Arbiter, take broks like for schedulers. But Arbiter also connect to broker. That's a nightmare about hangout. Now, Arbiter push the broks. It's far more easy and efficient.
*Add template handling to servicedependencies
*Add test_dependencies as the regression test
*Less status_dat verbosity :)
*Add a last_perf_data + macros to access last perfdatas as in https://sourceforge.net/apps/trac/shinken/ticket/76
*HUGE clean on shinken-specific.cfg file.
*Add a README file
*Add a little note about how migrate from Nagios to Shinken
*Add a hint about how solve 'cannot find my own arbiter' error message.
*Add bin directory with some bash scripts to launch/stop the whole application.
*Relative path, now we can have a easy portable sample configuration. (Gerhard)
*Add two missing operators in livestatus.py
*Big clean up conf sample!
*No more modulespath need in brokerd.ini. Will be easier for packagers.
*Acknowledgement test cases
*Add some hard tests about timeperiods calculations
*Add a test.sh script for Hudson test (launch all tests)
*Add a problem/impact test.
*Now external modules can return objects (from now nobody use it, but it can be useful in the future)
*Make easier to raise checks/notificatiosn from in deep objects class.
*repository cosmetics (Luke L)
FIXES
*Fix : now merlin is correctly filled with update program value
*Bug Fix : ndo do not have command_file, so do not export it.
*Bug fix : retention load was loading not good tab (impacts ones) and so cause problem with remove (not the good object!) (nicolas.dupeux)
*Fix a bug in ACKNOWLEDGE_SVC_PROBLEM ext. command. Sticky can be 0/1/2, not bool
*Bug fix in find_day_by_weekday_offset.
*Bug Fix : when a date was calcl before teh ref time for a weekday it was not recalculated, so problem.
*Bug Fix : error in get_end_of_day. It was given the first secon of the next day, so some exclude make problem with it.
*Bug Fix : shell like commands where not good :(. Thanks to Gilles Seban for pointing it and to Hiren Patel for giving a list of shell caracters (so we know if we should use shell or not :) )
*Bug fix: external commands to send checks should work now
*Bug fix : Arbiter do not more crash when scheduler is down and broker is up (not initialized make a missign parameter)
*Bug Fix : cehck orphaned was badly status set. Thanks Pylint.
*Bug fix : host in unreach were set DOWN un state, but unreach in state_id. Now it say clearly it's UNREACHABLE.
*Bug: retention was loading services objects from retention file. It's not good at all.
*Fix a and -> or bug in the dependancies
*Fix a bug in livestatus. state_type is now a number instead of HARD/SOFT
*Fix a bug in the livestatus module. Eventhandler command is now serializable
*Fix a bug in execute_unix. If there is an exception during plugin execution, use it's string representation as plugin_output
*Fix a bug in the livestatus module. Multiline input is now possible
*Bug : patch from David GUÉNAULT about stopping all brokers
*Correct launched/lanched type everywhere (Grégory Starck)
*Fixed scheduler.add so that master notifications (without contact) don't create a status brok
*Patch from Nicolas DUPEUX about typo correction in service.py
*Reduce CPU comsumption of livestatus broker (thanks flox for the patch)
*Fix a bug in the npcdmod test case
*Fix : configurations files can be mix if the previous do not finished with a line return (Sebastian Reimers)
*Fix: Correct a bad default arbiter pid configuration (Sebastian Reimers)
*Bug fix a missing save of shinken-reactionner.py about path in relative mode
*Global external commands now create an update_program_status_brok instead of program_status_brok
*Fix a bug in the status_dat_broker (incorrect servicegroup-definition in objects.cache)
*Fix : add Gerhard in print screen :)
*Bug Fix: add duplication check for elements (and groups). Only service is allowed to have duplicated (will Warning, but no error).
*Bug Fix : patch from Nicolas Dupeux. Thruk socket shutdowns are now handled in an exception
*Bux Fix (Sven Velt): patch about recursive dir load and check timeperiod typo
*NO MORE nap in code, now all are shinken :)
0.1 - 31/05/2010
------------------
ENHANCEMENTS
*Initial realease
Jump to Line
Something went wrong with that request. Please try again.