In [None]:
########################################################################
# File            : $Id: $
# Version         : $Revision: 001 $
# Modified On     : $Date: 4th August 2021$
#
# Language        : Python/Jupyter

## Unsupervised Anomaly Detection in Time Series Data

Many applications need to decide whether a new observation belongs to the same distribution as existing observations (an inlier) or should be considered different (an outlier). This ability is often used to monitor assets.


The workflow of this notebook is as follows: <br>

1. [Provide Credentials](#packageLoad)
2. [Load Dataset](#dataLoad)
3. [Compose Anomaly Service and Submit Job](#pipelineCreation)
4. [Monitor Job](#thresholdstats)
5. [Result Analysis](#groundtruth)

### Credentials <a id="packageLoad"></a>

This notebook requires two credentials: a client ID and a client secret. Obtain your own credentials before adapting this notebook to your own work. A trial subscription is available at __[Anomaly Detection @ IBM](https://developer.ibm.com/apis/catalog/ai4industry--anomaly-detection-product/Introduction)__.

In [12]:
# Credentials required for running notebook

Client_ID = "replace-with-valid-client-ID"
Client_Secret = "replace-with-valid-client-Secret"
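
Hard-coding secrets in a notebook makes them easy to leak when the notebook is shared. As an alternative, the sketch below reads both values from environment variables. The helper `load_credentials` and the variable names `ANOMALY_CLIENT_ID` and `ANOMALY_CLIENT_SECRET` are hypothetical choices for illustration, not names defined by the service.

```python
import os

def load_credentials(env=os.environ):
    """Read the two credentials from environment variables (names are hypothetical)."""
    client_id = env.get("ANOMALY_CLIENT_ID")
    client_secret = env.get("ANOMALY_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise KeyError("set ANOMALY_CLIENT_ID and ANOMALY_CLIENT_SECRET first")
    return client_id, client_secret

# Usage in this notebook:
# Client_ID, Client_Secret = load_credentials()
```

Passing the environment as an argument keeps the helper easy to check with a plain dictionary.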

### Load Dataset<a id="dataLoad"></a>

The data below contains a single sensor field representing values from one component. The actual meaning of the values is not important for this example.

The task of anomaly detection is to predict a label of either '1' or '-1' for each time point, along with an anomaly score. A label of '1' means the sample at that time point is normal. A label of '-1' means the sample is an outlier (anomalous). Anomaly models are used to generate alarms in real time.
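
To make the label convention concrete, here is a minimal sketch that turns scores into '1'/'-1' labels with a fixed cut-off. It assumes that a higher score means a more anomalous point. The helper `labels_from_scores`, the example scores, and the threshold value are all made up for illustration; this is not necessarily how the service derives its labels.

```python
def labels_from_scores(scores, threshold):
    """Map each score to -1 (anomalous) if it exceeds the threshold, else 1 (normal)."""
    return [-1 if score > threshold else 1 for score in scores]

# Illustrative values only, not output from the service
example_scores = [0.4, 0.7, 12.5, 0.9]
print(labels_from_scores(example_scores, threshold=5.0))  # [1, 1, -1, 1]
```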

In [None]:
# try reading datasets from local files
import pandas as pd
datafile_name = 'sample.csv'
data_df = pd.read_csv("./datasets/univariate/" + datafile_name)

data_df.head(10)

The plot below shows the sensor data for each variable. The user can also select or resample the data based on domain knowledge.

In [None]:
data_df.plot(subplots=True, figsize=(15, 5))
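
Resampling is one way to apply such domain knowledge before submitting the data. The sketch below averages a series into coarser time buckets with pandas. It uses a small synthetic frame instead of `sample.csv`, and assumes columns named `Time` and `Value` to match the request parameters used later in this notebook; the 30-minute bucket size is an arbitrary choice.

```python
import pandas as pd

# Synthetic stand-in for the sensor data: one reading every 10 minutes
demo_df = pd.DataFrame({
    "Time": pd.date_range("2021-01-01 00:00", periods=6, freq="10min"),
    "Value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Index by timestamp, then average the readings in each 30-minute bucket
resampled = demo_df.set_index("Time")["Value"].resample("30min").mean()
print(resampled)
```

Here the six readings collapse into two buckets with means 2.0 and 5.0. For real data, parse the time column first, for example with `pd.to_datetime`.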

### Anomaly Service Creation and Job Submission <a id="pipelineCreation"></a>

Now we compose the anomaly service request. The user needs to provide a local file name and some metadata about the data (target_column, time_column, time_format, etc.). Details of these parameters are available at __[IBM API Hub @ IBM](https://developer.ibm.com/apis/catalog/ai4industry--anomaly-detection-product/api/API--ai4industry--anomaly-detection-api#batch_uni)__ for the univariate anomaly detection service.

In [None]:
import requests

file_path = './datasets/univariate/' + datafile_name

# Metadata describing the uploaded file and the requested model
data = {
    'target_column': 'Value',
    'time_column': 'Time',
    'time_format': None,
    'prediction_type': 'entire',
    'algorithm_type': 'DeepAD',
    'lookback_window': 'auto',
    'observation_window': 10,
    'scoring_method': 'iid',
    'scoring_threshold': 10,
    'anomaly_estimator': 'Default',
}

headers = {
    'X-IBM-Client-Id': Client_ID,
    'X-IBM-Client-Secret': Client_Secret,
    'accept': "application/json",
}

# Open the file in a context manager so the handle is closed after the upload
with open(file_path, 'rb') as data_file:
    files = {'data_file': (datafile_name, data_file)}
    post_response = requests.post(
        "https://api.ibm.com/ai4industry/run/anomaly-detection/timeseries/univariate/batch",
        data=data,
        files=files,
        headers=headers)

# Keep the job ID for the monitoring step; print the full response on failure
post_r_json = post_response.json()
anomaly_service_jobId = None
if 'jobId' in post_r_json:
    anomaly_service_jobId = post_r_json['jobId']
    print('submitted successfully job : ', post_r_json['jobId'])
else:
    print(post_r_json)
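
The cell above prints whatever JSON comes back and leaves `anomaly_service_jobId` as `None` on failure. A stricter variant, sketched below, raises an error instead so that later cells never run with a missing job ID. `extract_job_id` is a hypothetical helper, not part of the service or of `requests`; it relies only on the `jobId` key used above.

```python
def extract_job_id(status_code, payload):
    """Return the job ID from a submission response, or raise with the server's payload."""
    if 200 <= status_code < 300 and isinstance(payload, dict) and "jobId" in payload:
        return payload["jobId"]
    raise RuntimeError(f"job submission failed (HTTP {status_code}): {payload}")

# Usage with the objects from the cell above:
# anomaly_service_jobId = extract_job_id(post_response.status_code, post_r_json)
```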

### Monitor Anomaly Job <a id="thresholdstats"></a>

Each anomaly detection service call generates one job ID. We now track the progress of the job. Details of job execution are covered at __[Get Result](https://developer.ibm.com/apis/catalog/ai4industry--anomaly-detection-product/api/API--ai4industry--anomaly-detection-api#get_result_by_id)__.

In [None]:
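import http.client

headers = {
    'X-IBM-Client-Id': Client_ID,
    'X-IBM-Client-Secret': Client_Secret,
    'accept': "application/json"
}

if anomaly_service_jobId is None:
    # The submission step did not return a job ID, so there is nothing to query
    print('No job ID available; submit the job first.')
else:
    conn = http.client.HTTPSConnection("api.ibm.com")
    conn.request("GET", "/ai4industry/run/result/" + anomaly_service_jobId, headers=headers)
    res = conn.getresponse()
    result_bytes = res.read()
    conn.close()
    print(result_bytes.decode("utf-8"))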
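
A single request only shows the job state at that moment. To wait for completion, the request can be repeated at intervals. The sketch below is a generic polling loop: `poll_until_done` is a hypothetical helper, and the `status` field with the value `"done"` in the usage comment is an assumption, since the exact shape of the result payload is not shown in this notebook. Check the Get Result documentation for the real field names.

```python
import time

def poll_until_done(fetch, is_done, interval_s=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until is_done(result) is true or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if is_done(result):
            return result
        if attempt < max_attempts:
            sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"job not finished after {max_attempts} attempts")

# Usage sketch (field name and value are assumptions):
# result = poll_until_done(fetch=get_job_result,
#                          is_done=lambda r: r.get("status") == "done")
```

Here `get_job_result` stands for any function that performs the GET request above and returns the parsed JSON. Injecting `fetch`, `is_done`, and `sleep` keeps the loop independent of the HTTP client and easy to exercise without network access.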

### Anomaly Results <a id="groundtruth"></a>

Now we plot the anomaly score and anomaly label detected by the service.

In [None]:
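import matplotlib.pyplot as plt

# Fill these lists with the values returned in the job result above
anomaly_score = []  # anomaly scores, one per observation
anomaly_label = []  # anomaly labels (1 or -1), one per observation

# plot anomaly score
plt.plot(anomaly_score)
plt.title('Anomaly Score')
plt.xlabel('Observation')
plt.ylabel('Anomaly Score')
plt.show()

# plot anomaly label
plt.plot(anomaly_label)
plt.title('Anomaly Label')
plt.xlabel('Observation')
plt.ylabel('Anomaly Label')
plt.show()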
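
Besides plotting, it is often useful to list where the anomalies occur. The sketch below groups consecutive '-1' labels into index ranges. `anomaly_runs` is a hypothetical helper, and the label list in the example is made up rather than taken from the service.

```python
def anomaly_runs(labels):
    """Return inclusive (start, end) index pairs for each run of -1 labels."""
    runs = []
    start = None
    for i, label in enumerate(labels):
        if label == -1 and start is None:
            start = i                    # a new anomalous run begins
        elif label != -1 and start is not None:
            runs.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(labels) - 1))  # run extends to the last sample
    return runs

# Illustrative labels only
print(anomaly_runs([1, 1, -1, -1, 1, -1]))  # [(2, 3), (5, 5)]
```

The resulting index pairs can be used to look up the matching timestamps in `data_df`.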