v1.2.10-stable-luminosity
earthgecko committed Nov 19, 2018
2 parents afbe92d + 8d89f59 commit 2faefc8
Showing 11 changed files with 203 additions and 31 deletions.
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -103,7 +103,7 @@ def setup(app):
# The short X.Y version.
version = u'1.2'
# The full version, including alpha/beta/rc tags.
release = u'1.2.9-stable'
release = u'1.2.10-stable'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
5 changes: 4 additions & 1 deletion docs/development/dawn.rst
@@ -47,10 +47,13 @@ Example usage:

.. code-block:: bash
# Fetch
# Fetch needs wget installed
wget -O /tmp/skyline.dawn.sh https://raw.githubusercontent.com/earthgecko/skyline/master/utils/dawn/skyline.dawn.sh
# Always review scripts before running them
cat /tmp/skyline.dawn.sh
if [ -f /etc/redhat-release ]; then
yum -y install net-tools
fi
# Determine public IP address
USE_IP=$(ifconfig | grep -v "127.0.0.1" | grep "inet addr:" | cut -d':' -f2 | cut -d' ' -f1)
if [ -f /etc/redhat-release ]; then
14 changes: 9 additions & 5 deletions docs/overview.rst
@@ -10,6 +10,10 @@ Welcome to Skyline.
What is Skyline?
----------------

Skyline is your ears and eyes; it is remarkably good at telling you when a state
changes. It does not just chew bubblegum, it blows bubbles too.

But really Skyline is ...
For those interested in anomaly detection and deflection in streamed time series
data.

@@ -18,14 +22,14 @@ detection data?

Skyline is a Python based anomaly detection/deflection stack that analyses,
anomaly detects, deflects, fingerprints and learns vast amounts of streamed
time series data.
time series data. Skyline has a number of isolated modules/apps that:

- Skyline ingests streamed metric time series data - skyline/horizon
- Skyline uses a ```CONSENSUS``` of 3-sigma algorithms to detect anomalies on
- ingest streamed metric time series data - skyline/horizon
- use a ``CONSENSUS`` of 3-sigma algorithms to detect anomalies on
batch processed, streamed metric time series data - skyline/analyzer - anomaly
detector
- It handles large and small seasonality in the data - skyline/mirage -
anomaly deflector and detector
- handle large and small seasonality in the data - skyline/mirage - anomaly
deflector and detector
- You can train it on what is NOT anomalous and it learns - skyline/ionosphere -
anomaly deflector
- It records all your anomalies - skyline/panorama - anomaly memory
1 change: 1 addition & 0 deletions docs/releases.rst
@@ -5,6 +5,7 @@ Release Notes
:maxdepth: 1
:glob:

releases/1_2_10
releases/1_2_9
releases/1_2_8
releases/1_2_7
87 changes: 87 additions & 0 deletions docs/releases/1_2_10.rst
@@ -0,0 +1,87 @@
==============================
1.2.10 - the luminosity branch
==============================

v1.2.10-luminosity - November 19, 2018

Bug fixes release
-----------------------

- Bug fixes are described below.

Changes from v1.2.9
-------------------

- Bumped version to v1.2.10
- Stop analyzer stalling if there is no stale_metrics_to_alert_on list to delete
(2492)
- Remove Ionosphere check files if a check cache key exists instead of failing
the check file. The failure was there for initial debugging and is no longer
needed (2680)
- To reduce the amount of I/O used by Mirage in the alert loop and the number
of log entries for 'not alerting - Ionosphere metric', a check is made whether
the metric_name has already been checked, and if so the loop continues (2682)
- Handle the metric_vars_file not having been set and still being False (2684),
so that path.isfile does not error with:
``'TypeError: coercing to Unicode: need string or buffer, bool found'``
- Handle 0.0 float in value variable (2708, 2234)
- Noted wget and net-tools required in dawn docs

Update notes
------------

- NOTE: If you are running v1.x you CANNOT upgrade from v1.x directly to v1.2.10
- You can only upgrade to v1.2.9 from v1.2.8
- There is no change to the DB
- There are no changes to settings.py

How to update from v1.2.9
-------------------------

- There is no requirement for a full upgrade as described in previous release
notes, but you can if you want to as per previous release notes or if you have
some other reason related to configuration management or git referencing.
- To simply upgrade v1.2.9 in situ, run:

.. code-block:: bash
CURRENT_SKYLINE_PATH="/opt/skyline/github/skyline" # Your Skyline path
GITHUB_TREE_URL="https://raw.githubusercontent.com/earthgecko/skyline/v1.2.10-stable-luminosity"
cd $CURRENT_SKYLINE_PATH
cp skyline/skyline_version.py skyline/skyline_version.py.v1.2.9.bak
cp skyline/analyzer/analyzer.py skyline/analyzer/analyzer.py.v1.2.9.bak
cp skyline/mirage/mirage.py skyline/mirage/mirage.py.v1.2.9.bak
cp skyline/ionosphere/ionosphere.py skyline/ionosphere/ionosphere.py.v1.2.9.bak
cp skyline/ionosphere/learn.py skyline/ionosphere/learn.py.v1.2.9.bak
cp skyline/webapp/ionosphere_backend.py skyline/webapp/ionosphere_backend.py.v1.2.9.bak
wget -O skyline/skyline_version.py "${GITHUB_TREE_URL}/skyline/skyline_version.py"
wget -O skyline/analyzer/analyzer.py "${GITHUB_TREE_URL}/skyline/analyzer/analyzer.py"
wget -O skyline/mirage/mirage.py "${GITHUB_TREE_URL}/skyline/mirage/mirage.py"
wget -O skyline/ionosphere/ionosphere.py "${GITHUB_TREE_URL}/skyline/ionosphere/ionosphere.py"
wget -O skyline/ionosphere/learn.py "${GITHUB_TREE_URL}/skyline/ionosphere/learn.py"
wget -O skyline/webapp/ionosphere_backend.py "${GITHUB_TREE_URL}/skyline/webapp/ionosphere_backend.py"
# Restart analyzer, mirage, ionosphere and webapp
SKYLINE_SERVICES="analyzer
mirage
ionosphere
webapp"
for i in $SKYLINE_SERVICES
do
/etc/init.d/$i restart
done
- Check the logs

.. code-block:: bash
# How are they running
tail -n 20 /var/log/skyline/*.log
# Any errors - each app
find /var/log/skyline -type f -name "*.log" | while read skyline_logfile
do
echo "#####
# Checking for errors in $skyline_logfile"
cat "$skyline_logfile" | grep -B2 -A10 -i "error ::\|traceback" | tail -n 60
echo ""
echo ""
done
13 changes: 9 additions & 4 deletions skyline/analyzer/analyzer.py
@@ -1782,12 +1782,17 @@ def smtp_trigger_alert(alert, metric, context):
except:
logger.error(traceback.format_exc())
logger.error('error :: could not send alert on stale digest email')
del stale_metrics_to_alert_on
del alert
del metric
# @modified 20181117 - Feature #2492: alert on stale metrics
# Stop analyzer stalling if there is no
# stale_metrics_to_alert_on list to delete
try:
del stale_metrics_to_alert_on
del alert
del metric
except:
pass
else:
logger.info('there are no stale metrics to alert on')
del stale_metrics_to_alert_on

run_time = time() - now
total_metrics = str(len(unique_metrics))
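
Review note on the hunk above: the new `try`/`except` protects the `del` statements against names that were never bound. The following is a minimal sketch of that failure mode, not Skyline code. The `cleanup` function and the metric name are hypothetical, and the sketch catches `NameError` specifically, whereas the commit uses a bare `except:`.

```python
def cleanup(have_stale_metrics):
    """Hypothetical stand-in for the end of the analyzer run loop."""
    if have_stale_metrics:
        stale_metrics_to_alert_on = ['stats.example.metric']
    try:
        # Raises UnboundLocalError when the list was never created
        del stale_metrics_to_alert_on
    except NameError:
        # UnboundLocalError subclasses NameError, so this covers both
        return 'nothing to delete'
    return 'deleted'


print(cleanup(True))
print(cleanup(False))
```

Catching the narrower exception keeps unrelated errors visible, which may or may not matter at this point in the analyzer loop.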
22 changes: 16 additions & 6 deletions skyline/ionosphere/ionosphere.py
@@ -624,9 +624,15 @@ def engine_disposal(engine):
if not check_done:
logger.info('check done check - no check cache key - %s' % ionosphere_check_cache_key)
else:
logger.error('error :: a check cache key exists - %s' % ionosphere_check_cache_key)
logger.error('error :: failing check to prevent multiple iterations over this check')
fail_check(skyline_app, metric_failed_check_dir, str(metric_check_file))
# @modified 20181113 - Task #2680: Remove Ionosphere check files is key exists
# This was here for initially debugging, no longer needed
# logger.error('error :: a check cache key exists - %s' % ionosphere_check_cache_key)
# logger.error('error :: failing check to prevent multiple iterations over this check')
# fail_check(skyline_app, metric_failed_check_dir, str(metric_check_file))
logger.info('a check cache key exists - %s' % (ionosphere_check_cache_key))
logger.info('to prevent multiple iterations over this check removing %s' % (
str(metric_check_file)))
self.remove_metric_check_file(str(metric_check_file))
return
try:
check_process_start = int(time())
@@ -697,9 +703,13 @@ def engine_disposal(engine):
value = None

if not value:
logger.error('error :: failed to load value variable from check file - %s' % (metric_check_file))
fail_check(skyline_app, metric_failed_check_dir, str(metric_check_file))
return
# @modified 20181119 - Bug #2708: Failing to load metric vars
if value == 0.0:
pass
else:
logger.error('error :: failed to load value variable from check file - %s' % (metric_check_file))
fail_check(skyline_app, metric_failed_check_dir, str(metric_check_file))
return

from_timestamp = None
try:
10 changes: 7 additions & 3 deletions skyline/ionosphere/learn.py
@@ -751,9 +751,13 @@ def learn_engine_disposal(engine):
value = None

if not value:
logger.error('error :: learn :: failed to load value variable from check file - %s' % (metric_check_file))
remove_work_list_from_redis_set(learn_metric_list)
continue
# @modified 20181119 - Bug #2708: Failing to load metric vars
if value == 0.0:
pass
else:
logger.error('error :: learn :: failed to load value variable from check file - %s' % (metric_check_file))
remove_work_list_from_redis_set(learn_metric_list)
continue

from_timestamp = None
try:
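
Review note on the `value == 0.0` checks added for Bug #2708 in ionosphere.py and learn.py (the same change is made in mirage.py): `not value` is true for both `None` and `0.0`, so a metric whose value is a genuine zero was treated as a failed load. A minimal sketch of an equivalent check follows. It assumes a failed load always leaves `value` as `None`, as the `value = None` context lines suggest, and `value_loaded` is a hypothetical helper that does not exist in Skyline.

```python
def value_loaded(value):
    """Return True when a value was read, including a legitimate 0.0."""
    # `not value` is True for None and for 0.0, so test for None explicitly
    return value is not None


for candidate in (None, 0.0, 3.5):
    # Columns: the candidate, what `not value` gives, what the helper gives
    print(candidate, not candidate, value_loaded(candidate))
```

The nested `if value == 0.0: pass` form in the commit has the same effect for floats. The explicit `None` test would behave differently only if a failed load could leave some other falsy value, such as an empty string.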
46 changes: 40 additions & 6 deletions skyline/mirage/mirage.py
@@ -115,9 +115,12 @@ def __init__(self, parent_pid):
# @added 20170603 - Feature #2034: analyse_derivatives
# @modified 20180519 - Feature #2378: Add redis auth to Skyline and rebrow
if settings.REDIS_PASSWORD:
self.redis_conn = StrictRedis(password=settings.REDIS_PASSWORD, unix_socket_path=settings.REDIS_SOCKET_PATH)
self.redis_conn = StrictRedis(
password=settings.REDIS_PASSWORD,
unix_socket_path=settings.REDIS_SOCKET_PATH)
else:
self.redis_conn = StrictRedis(unix_socket_path=settings.REDIS_SOCKET_PATH)
self.redis_conn = StrictRedis(
unix_socket_path=settings.REDIS_SOCKET_PATH)

def check_if_parent_is_alive(self):
"""
@@ -262,7 +265,7 @@ def mirage_load_metric_vars(self, metric_vars_file):
return False
except:
logger.info(traceback.format_exc())
logger.error('error :: failed to load metric variables from check file - %s' % (metric_check_file))
logger.error('error :: failed to load metric variables from check file - %s' % (metric_vars_file))
return False

logger.info('debug :: metric_vars for %s' % str(metric))
@@ -393,8 +396,13 @@ def spin_process(self, i, run_timestamp):
logger.error('error :: failed to read value variable from check file - %s' % (metric_check_file))
return
if not value:
logger.error('error :: failed to load value variable from check file - %s' % (metric_check_file))
return
# @modified 20181119 - Bug #2708: Failing to load metric vars
if value == 0.0:
pass
else:
logger.error('error :: failed to load value variable from check file - %s' % (metric_check_file))
return


# if len(metric_vars.hours_to_resolve) == 0:
# return
@@ -1193,27 +1201,53 @@ def smtp_trigger_alert(alert, metric, second_order_resolution_seconds, context):
logger.error(traceback.format_exc())
logger.error('error :: failed to add an Ionosphere anomalous_metric for %s' % base_name)

# @added 20181114 - Bug #2682: Reduce mirage ionosphere alert loop
# To reduce the amount of I/O used by Mirage in this loop check
# and reduce the number of log entries for 'not alerting - Ionosphere metric'
# a check is made if the metric_name has already been checked, if
# so continue
not_alerting_for_ionosphere = 'none'

for alert in settings.ALERTS:
# @added 20181114 - Bug #2682: Reduce mirage ionosphere alert loop
not_an_ionosphere_metric_check_done = 'none'

for metric in self.anomalous_metrics:
# @added 20161228 - Feature #1830: Ionosphere alerts
# Branch #922: Ionosphere
# Bringing Ionosphere online - do alert on Ionosphere
# metrics if Ionosphere is up
metric_name = '%s%s' % (settings.FULL_NAMESPACE, str(metric[1]))
if metric_name in ionosphere_unique_metrics:

# @added 20181114 - Bug #2682: Reduce mirage ionosphere alert loop
if not_alerting_for_ionosphere == metric_name:
continue

ionosphere_up = False
try:
ionosphere_up = self.redis_conn.get('ionosphere')
except Exception as e:
logger.error('error :: could not query Redis for ionosphere key: %s' % str(e))
if ionosphere_up:
logger.info('not alerting - Ionosphere metric - %s' % str(metric[1]))

# @added 20181114 - Bug #2682: Reduce mirage ionosphere alert loop
not_alerting_for_ionosphere = metric_name

continue
else:
logger.error('error :: Ionosphere not report up')
logger.info('taking over alerting from Ionosphere if alert is matched on - %s' % str(metric[1]))
else:
logger.info('not an Ionosphere metric checking whether to alert - %s' % str(metric[1]))
# @modified 20181114 - Bug #2682: Reduce mirage ionosphere alert loop
# logger.info('not an Ionosphere metric checking whether to alert - %s' % str(metric[1]))
if not_an_ionosphere_metric_check_done == metric_name:
# Do not log multiple times for this either
not_an_ionosphere_metric_check_done = metric_name
else:
logger.info('not an Ionosphere metric checking whether to alert - %s' % str(metric[1]))
not_an_ionosphere_metric_check_done = metric_name

ALERT_MATCH_PATTERN = alert[0]
METRIC_PATTERN = metric[1]
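
Review note on the Bug #2682 change above: the commit remembers the last metric name in a single variable so that the Redis lookup and the 'not alerting' log line are not repeated for it. The sketch below illustrates the same idea with a set, so a repeat is skipped even when it is not consecutive. This is an alternative under stated assumptions, not the commit's method: `log_ionosphere_skips` and all sample data are hypothetical, and the Redis check for the `ionosphere` key is left out.

```python
def log_ionosphere_skips(alerts, anomalous_metrics, ionosphere_metrics):
    """Collect one 'not alerting' line per Ionosphere metric across all alerts."""
    already_logged = set()
    log_lines = []
    for alert in alerts:
        for metric in anomalous_metrics:
            metric_name = metric[1]
            if metric_name not in ionosphere_metrics:
                continue
            if metric_name in already_logged:
                # Seen under an earlier alert tuple, skip the repeat work
                continue
            already_logged.add(metric_name)
            log_lines.append('not alerting - Ionosphere metric - %s' % metric_name)
    return log_lines


alerts = [('stats.*', 'smtp', 3600), ('stats.*', 'slack', 3600)]
anomalous_metrics = [(1.0, 'stats.a'), (2.0, 'stats.b')]
for line in log_ionosphere_skips(alerts, anomalous_metrics, {'stats.a'}):
    print(line)
```

With two alert tuples and one Ionosphere metric, the line is collected once rather than once per alert tuple.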
2 changes: 1 addition & 1 deletion skyline/skyline_version.py
@@ -4,7 +4,7 @@
# @modified 20170109 - Feature #1854: Ionosphere learn
# Added learn
# __version_info__ = ('1', '1', '0')
__version_info__ = ('1', '2', '9')
__version_info__ = ('1', '2', '10')
__branch__ = 'luminosity'
__version_tag__ = 'stable'
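
Review note on the version bump: `__version_info__` is a tuple of strings, and with the move from `'9'` to `'10'` a plain tuple comparison no longer orders the two releases numerically, because strings compare character by character. Whether Skyline compares these tuples anywhere is not shown in this commit, so the sketch below is only a caution for code that might. `version_key` is a hypothetical helper.

```python
def version_key(version_info):
    """Convert a tuple of numeric strings into a tuple of ints for comparison."""
    return tuple(int(part) for part in version_info)


old = ('1', '2', '9')
new = ('1', '2', '10')
print(new > old)                            # string comparison of the parts
print(version_key(new) > version_key(old))  # numeric comparison of the parts
```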

32 changes: 28 additions & 4 deletions skyline/webapp/ionosphere_backend.py
@@ -478,7 +478,22 @@ def new_load_metric_vars(metric_vars_file):

metric_vars_ok = False
metric_vars = ['error: could not read metrics vars file', metric_vars_file]
if path.isfile(metric_vars_file):

# @added 20181114 - Bug #2684: ionosphere_backend.py - metric_vars_file not set
# Handle if the metrics_var_file has not been set and is still False so
# that the path.isfile does not error with
# TypeError: coercing to Unicode: need string or buffer, bool found
metric_vars_file_exists = False
if metric_vars_file:
try:
if path.isfile(metric_vars_file):
metric_vars_file_exists = True
except:
logger.error('error :: metric_vars_file %s was not found' % str(metric_vars_file))

# @modified 20181114 - Bug #2684: ionosphere_backend.py - metric_vars_file not set
# if path.isfile(metric_vars_file):
if metric_vars_file_exists:
try:
# @modified 20170104 - Feature #1842: Ionosphere - Graphite now graphs
# Feature #1830: Ionosphere alerts
@@ -491,9 +506,14 @@ def new_load_metric_vars(metric_vars_file):
metric_vars = new_load_metric_vars(metric_vars_file)
metric_vars_ok = True
except:
trace = traceback.format_exc()
logger.error(trace)
metric_vars_ok = False
logger.error(traceback.format_exc())
# logger.error(traceback.format_exc())
fail_msg = metric_vars
logger.error('%s' % fail_msg)
logger.error('error :: failed to load metric_vars from: %s' % str(metric_vars_file))
raise # to webapp to return in the UI

# TODO
# Make a sample ts for lite frontend
@@ -613,9 +633,13 @@ def new_load_metric_vars(metric_vars_file):
password = str(settings.WEBAPP_AUTH_USER_PASSWORD)
try:
if settings.WEBAPP_AUTH_ENABLED:
r = requests.get(url, timeout=2, auth=(user, password))
# @modified 20181106 - Bug #2668: Increase timeout on requests panorama id
# r = requests.get(url, timeout=2, auth=(user, password))
r = requests.get(url, timeout=settings.GRAPHITE_READ_TIMEOUT, auth=(user, password))
else:
r = requests.get(url, timeout=2)
# @modified 20181106 - Bug #2668: Increase timeout on requests panorama id
# r = requests.get(url, timeout=2)
r = requests.get(url, timeout=settings.GRAPHITE_READ_TIMEOUT)
panorama_resp = True
except:
logger.error(traceback.format_exc())
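
Review note on the Bug #2684 guard in this file: the commit checks that `metric_vars_file` is truthy before calling `path.isfile`, because the variable may still hold `False`. A minimal sketch of the same guard with an explicit type check follows. It is written for Python 3, where paths are `str`, and `metric_vars_file_exists` is a hypothetical helper rather than a function in Skyline.

```python
import os
import tempfile


def metric_vars_file_exists(metric_vars_file):
    """Return True only for a string path that points at an existing file."""
    # An unset path may still be False or None, so check the type first
    if not isinstance(metric_vars_file, str):
        return False
    return os.path.isfile(metric_vars_file)


with tempfile.NamedTemporaryFile() as real_file:
    print(metric_vars_file_exists(real_file.name))
print(metric_vars_file_exists(False))
```

Checking the type rather than truthiness also rejects other non-path values, at the cost of rejecting path-like objects that a caller might legitimately pass.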