## Anomaly Detection with Azure Machine Learning APIs

When is an anomaly detection service useful?  Some situations worth watching for:

* Too many login failures
* Spikes or dips in customer checkouts
* An increase in the dynamic range of file ingestion speeds in a cloud service
* An upward trend in system temperature

These are situations, surfaced by monitoring a system, that may call for a closer look.  They indicate abnormal (anomalous) behavior and could point to a problem.  The data could stream from a device or come from log files; whatever the source, an anomaly detection model can help identify when a system needs to be examined further.

#### A few imports for Python 2 and 3 compatibility

In [2]:
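# Import compatibility libraries (Python 2/3 support).
# __future__ imports must come before any other statement in the cell,
# otherwise Python raises a SyntaxError.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import json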

# Python 3
try:
    from urllib.request import urlopen, Request
    from urllib.parse import urlparse, urlencode
    from http.client import HTTPSConnection
# Python 2.7
except ImportError:
    from urlparse import urlparse
    from urllib import urlencode
    from urllib2 import Request, urlopen
    from httplib import HTTPSConnection 

**Data**

This is non-seasonal time series data.

In [8]:
body = json.loads('''
{
  "data": [
    [ "9/21/2014 11:05:00 AM", "1.3" ],
    [ "9/21/2014 11:10:00 AM", "9.09" ],
    [ "9/21/2014 11:15:00 AM", "2.4" ],
    [ "9/21/2014 11:20:00 AM", "2.5" ],
    [ "9/21/2014 11:25:00 AM", "2.6" ],
    [ "9/21/2014 11:30:00 AM", "2.1" ],
    [ "9/21/2014 11:35:00 AM", "3.5" ],
    [ "9/21/2014 11:40:00 AM", "0" ],
    [ "9/21/2014 11:45:00 AM", "2.8" ],
    [ "9/21/2014 11:50:00 AM", "2.3" ]
  ],
  "params": {
    "tspikedetector.sensitivity": "4",
    "zspikedetector.sensitivity": "4",
    "trenddetector.sensitivity": "3.25",
    "bileveldetector.sensitivity": "3.25",
    "postprocess.tailRows": "0"
  }
}
''')

print(body)

{u'params': {u'postprocess.tailRows': u'0', u'trenddetector.sensitivity': u'3.25', u'zspikedetector.sensitivity': u'4', u'tspikedetector.sensitivity': u'4', u'bileveldetector.sensitivity': u'3.25'}, u'data': [[u'9/21/2014 11:05:00 AM', u'1.3'], [u'9/21/2014 11:10:00 AM', u'9.09'], [u'9/21/2014 11:15:00 AM', u'2.4'], [u'9/21/2014 11:20:00 AM', u'2.5'], [u'9/21/2014 11:25:00 AM', u'2.6'], [u'9/21/2014 11:30:00 AM', u'2.1'], [u'9/21/2014 11:35:00 AM', u'3.5'], [u'9/21/2014 11:40:00 AM', u'0'], [u'9/21/2014 11:45:00 AM', u'2.8'], [u'9/21/2014 11:50:00 AM', u'2.3']]}
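
The same request body can also be assembled from in-memory readings instead of a hand-written JSON string. The following is a minimal sketch: `format_timestamp` and `build_body` are hypothetical helper names, and the timestamp layout (no zero padding on month, day, or hour) is an assumption inferred from the sample data above, which only shows an hour of 11.

```python
from datetime import datetime, timedelta

def format_timestamp(ts):
    """Format a datetime like '9/21/2014 11:05:00 AM'.

    Assumption: month, day, and hour are not zero-padded, as the sample suggests.
    """
    hour12 = ts.hour % 12 or 12          # 0 -> 12 AM, 13 -> 1 PM
    meridiem = "AM" if ts.hour < 12 else "PM"
    return "{0}/{1}/{2} {3}:{4:02d}:{5:02d} {6}".format(
        ts.month, ts.day, ts.year, hour12, ts.minute, ts.second, meridiem)

def build_body(readings, params):
    """Build a request body from (datetime, value) pairs and a params dict."""
    return {
        # Both the timestamp and the value are sent as strings, as in the sample
        "data": [[format_timestamp(ts), str(value)] for ts, value in readings],
        "params": dict(params),
    }

# Three readings taken five minutes apart, mirroring the first sample rows
start = datetime(2014, 9, 21, 11, 5, 0)
values = [1.3, 9.09, 2.4]
readings = [(start + timedelta(minutes=5 * i), v) for i, v in enumerate(values)]

example_body = build_body(readings, {
    "tspikedetector.sensitivity": "4",
    "zspikedetector.sensitivity": "4",
    "trenddetector.sensitivity": "3.25",
    "bileveldetector.sensitivity": "3.25",
    "postprocess.tailRows": "0",
})
print(example_body["data"])
```

Building the timestamp by hand avoids `strftime` directives for unpadded fields, which are not portable across platforms.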


**The headers and parameters**

The subscription key for Microsoft Azure DataMarket is stored in `config.json`.  You can find your key under your account in the [Azure DataMarket](https://datamarket.azure.com/account/keys) (you may need to register).

In [9]:
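import base64

# Load the config file that holds the subscription key
f = urlopen('https://gist.githubusercontent.com/antriv/a6962d2c7580a0f7db4b7aabd6d768c5/raw/38a66f77c7fd0641324c8cbbff77828207041edc/config.json')
config_text = f.read().decode('utf-8')
CONFIG = json.loads(config_text)

subscription_key = CONFIG['subscription_key_ADM']

# b64encode operates on bytes: encode the credentials first, then decode
# the result back to text so it can be concatenated into the header (Python 3)
creds = base64.b64encode(('userid:' + subscription_key).encode('utf-8')).decode('ascii')

headers = {'Content-Type':'application/json', 'Authorization':('Basic '+ creds)}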

# Query parameters that could be URL-encoded and appended to the request URL.
# Empty here because this request needs none (params is not used below),
# although we could have included 'selection' and 'offset' - see docs
params = urlencode({})

**Make the request using the REST API**

Note that we are using mock, non-seasonal time series data.

In [11]:
# Initialize so the fallback print below works even if the request fails
data = None
try:
    # POST request - note: the body is serialized from a dict to a JSON string
    conn = HTTPSConnection('api.datamarket.azure.com')

    conn.request("POST", "/data.ashx/aml_labs/anomalydetection/v2/Score/",
                 body = json.dumps(body), headers = headers)

    response = conn.getresponse()
    # read() returns bytes on Python 3, so decode before parsing
    data = response.read().decode('utf-8')
    conn.close()
except Exception as e:
    print("[Error: {0}] ".format(e))

try:
    # Print the results - the 'ADOutput' field is itself a JSON-encoded string
    print(json.dumps(json.loads(json.loads(data)['ADOutput']),
               sort_keys=True,
               indent=4,
               separators=(',', ': ')))
except Exception as e:
    # Fall back to the raw response (None if the request itself failed)
    print(data)

{
    "ColumnNames": [
        "Time",
        "Data",
        "TSpike",
        "ZSpike",
        "rpscore",
        "rpalert",
        "tscore",
        "talert"
    ],
    "ColumnTypes": [
        "DateTime",
        "Double",
        "Double",
        "Double",
        "Double",
        "Int32",
        "Double",
        "Int32"
    ],
    "Values": [
        [
            "9/21/2014 11:05:00 AM",
            "1.3",
            "0",
            "0",
            "-0.687952590518378",
            "0",
            "-0.687952590518378",
            "0"
        ],
        [
            "9/21/2014 11:10:00 AM",
            "9.09",
            "1",
            "0",
            "-1.07030497733224",
            "0",
            "-0.884548154298423",
            "0"
        ],
        [
            "9/21/2014 11:15:00 AM",
            "2.4",
            "0",
            "0",
            "-1.30229513613974",
            "0",
            "-1.173800281031",
            "0"
        ],
        [


Meaning of the output columns, from the [docs](https://azure.microsoft.com/en-us/documentation/articles/machine-learning-apps-anomaly-detection/):
* Time (input)
* Data (input)
* TSpike: Binary indicator of whether a spike is detected by the TSpike Detector (1 = spike)
* ZSpike: Binary indicator of whether a spike is detected by the ZSpike Detector (1 = spike)
* RPScore: A floating-point number representing the anomaly score on bidirectional level change
* RPAlert: 1/0 value indicating whether there is a bidirectional level change anomaly, based on the input sensitivity
* TScore: A floating-point number representing the anomaly score on positive trend
* TAlert: 1/0 value indicating whether there is a positive trend anomaly, based on the input sensitivity
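
To act on these columns, it can help to pair each value with its column name and keep only the rows where an indicator fired. The sketch below is illustrative rather than part of the service's API: `rows_as_dicts` and `flagged_rows` are hypothetical helpers, the payload is an abbreviated copy of the first three rows printed above, and it assumes the indicators arrive as the strings `"1"` and `"0"`, as they do in that output.

```python
import json

# Abbreviated ADOutput payload: the first three rows from the response above
ad_output = json.loads('''
{
  "ColumnNames": ["Time", "Data", "TSpike", "ZSpike",
                  "rpscore", "rpalert", "tscore", "talert"],
  "Values": [
    ["9/21/2014 11:05:00 AM", "1.3", "0", "0",
     "-0.687952590518378", "0", "-0.687952590518378", "0"],
    ["9/21/2014 11:10:00 AM", "9.09", "1", "0",
     "-1.07030497733224", "0", "-0.884548154298423", "0"],
    ["9/21/2014 11:15:00 AM", "2.4", "0", "0",
     "-1.30229513613974", "0", "-1.173800281031", "0"]
  ]
}
''')

# Columns that carry a 1/0 indicator (names as they appear in the response)
ALERT_COLUMNS = ("TSpike", "ZSpike", "rpalert", "talert")

def rows_as_dicts(output):
    """Pair every value in each row with its column name."""
    names = output["ColumnNames"]
    return [dict(zip(names, values)) for values in output["Values"]]

def flagged_rows(output, alert_columns=ALERT_COLUMNS):
    """Return (time, [indicators that fired]) for rows with at least one alert."""
    flagged = []
    for row in rows_as_dicts(output):
        raised = [col for col in alert_columns if row.get(col) == "1"]
        if raised:
            flagged.append((row["Time"], raised))
    return flagged

print(flagged_rows(ad_output))
# [('9/21/2014 11:10:00 AM', ['TSpike'])]
```

In this sample only the 11:10 reading of 9.09 is flagged, by the TSpike detector.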