Fetching contributors…
Cannot retrieve contributors at this time
118 lines (73 sloc) 7.77 KB
-- Fix jobmon_status_text table so that it works properly with a pg_dump of wherever pg_jobmon is installed (Github Issue #2, Pull Request #5).
-- Fixed bug in check_job_status() that would cause it to erroneously return an error about the input argument not being greater than or equal to the longest configured threshold. This would occur if you had an entry in the job_check_config table with (active = false) and a longer threshold than any actually active jobs.
-- Fixed check_job_status() to obey "active" column setting properly in all cases. Was still returning failed jobs even though active had been set to false in some cases.
-- Enforce there only being one row in the dblink mapping table. WARNING: If you have more than a single entry in this table, all functions that use pg_jobmon will break the next time they run after you install this update. They could have likely broken at any other time, it's just that the single row being returned when jobmon authenticated itself was the right one. You were lucky! Ensure only a single, correct entry before updating to this version.
-- Renamed dblink_mapping table to dblink_mapping_jobmon. This was causing issues with other extensions with a similiarly named table (mimeo) when they're installed in the same schema.
-- Avoid some false positives in check_job_status() that were reporting currently running or incomplete jobs as being blocked by another transaction
-- New configuration column "escalate" to provide alert escalation capability. Allows for when a job fails at a certain alert level a given number of times, the alert level in the monitoring is raised to the next level up. (Experimental. Please provide feedback)
-- Adjust monitor trigger on job_log table to be able to handle custom alert codes better.
-- Included alert_text value in description returned by check_job_status to make it clearer how many of each alert status occurred.
-- Fixed pgTAP tests to pass properly and account for new return values. Added tests for escalation.
-- Fix "make install" to work in PostgreSQL 9.3.x without throwing an error.
-- Bugfix: Fixed show_running() to only match against non-idle queries when joining against pg_stat_activty. Still a chance of false result (see doc file), but much less likely now.
-- Fixed failing jobs not showing up in check_job_status() if no missing jobs existed
-- Critical Bug Fix: Version 1.0 accidentally removed the creation of the trigger on the job_log table so that failing jobs would never cause check_job_status() to report a failed job. Jobs that were configured to run within a certain time period were still monitored for. This only affects new installations of pg_jobmon since 1.0. If you've upgraded from a previous version, the trigger is still working properly.
-- Redesigned check_job_status() to return more detailed, and more easily filtered data on the current status of running jobs. Please check how your monitoring software used this function to ensure it can handle the new output format properly. Each problem job is returned in its own row instead of all results being returned in a single row. If a single row is still desired, the highest alert level job in alphabetical order of job_name is always returned first, so a LIMIT 1 can be used as an easy solution. More advanced filtering is now possible, though. See the updated doc for some examples.
-- Wrote pgTAP tests and some other custom tests to better validate future changes
-- Fixed unhandled case in check_job_status where if a job had been run but never finished with an end_time or status set in job_log, it wouldn't raise an alert that is was missing
-- Set functions that can be marked as STABLE
-- Made the check_job_status() error clearer for when a job is added to job_check_config but has never run. Would say a job was missing, but wouldn't say which one.
-- check_job_status() will now report when sensitivity threshold is broken for jobs giving a level 2 (WARNING) status. Ex: A job with a sensitivity of zero will show up with a level 2 alert if it has shown up in job_log with just a single level 2 status.
-- IMPORTANT NOTE: fail_job() & _autonomous_fail_job() functions have been dropped and recreated with new arguments. Please check function permissions before and after update.
-- fail_job() can now take an optional second argument to set the final alert code level that the job should fail with in the job_log table. Allows jobs to fail with level 2 (WARNING) instead of only level 3 (CRITICAL). Default is level 3.
-- New check_job_status() function that doesn't require an argument. Will automatically get longest threshold interval from job_check_config table if it exists and use that. Recommend using only this version of fuction from now on.
-- check_job_status(interval) will now throw an exception if you pass an interval that is shorter than the longest job period that is being monitored. If nothing is set in the config table, interval doesn't matter, so will just run normally checking for 3 consecutive failures. Changed documentation to only mention the no-argument version since that's the safest/easiest way to use it.
-- Added ability for check_job_status() to monitor for three level 2 alerts in a row. Added another column to job_check_log table to track alert level of the job failure. Fixed trigger function on job_log table to set the alert level in job_check_log.
-- Updated show_running() function to be compatible with PostgreSQL 9.2
-- Updated Makefile to allow setting of grep binary if needed during building.
-- Created CHANGELOG file.
-- Update monitor function to handle procpid column name change to pid in pg_stat_activity
-- Added port column to dblink_mapping table to allow changing the default port.
-- No code changes to extension itself. Restructured sql file organization and modified Makefile to handle it accordingly.
-- Did make the cancel_job test a little clearer on what to do in the testjob function.
-- Created job_log_clear() function to clear log data previous to a given interval. Logs it too!
-- Fixed job_detail table to have ON DELETE CASCADE foreign key to make this function easier to write
-- Adds sql_job() and sql_step() functions for simple query logging.
-- Fixed check_job_status to use the end_time instead of the start_time in job_log to determine the last run of a job. Also more clearly lets you know if a job is blocked by an object lock
-- Make column names in job_status_text table more consistent with the check_job_status() function return column names.
-- Fix check_job_status() returning extra spaces and ; in the alert_text when a job has failed
-- Fix check_job_status() to use alert_text value for code 3 instead of hardcoded value 'CRITICAL'
-- See update sql file for important instructions.
-- Turn off the dump of table data for the log tables. pg_dump isn't handling this properly and will dump all the data out for these tables even in a --schema-only dump.
-- Data for other tables is minimal, and more critical, so not removing their dump settings.
-- Data for these tables can be dumped if it's needed by temporarily removing the table from the extention and then adding it back.
-- Fixed bug in check_job_status result when all jobs were ok
IMPORTANT NOTE: Version 0.3.0 introduced backward incompatibilities and a direct upgrade is not possible. Please backup data tables and reinstall extension.
All constraints were given specific names so future updates can more easily be done using the extension system.
update_step() parameters have changed.
close, fail and cancel functions can no longer take a custom config table. Was a nice idea but doesn't work as I'd hoped.
job_alert_nagios table changed to job_status_text.
job_detail elapsed_time column data type changed.