## Goal
The goal of this project is to build one or more anomaly detection models that identify the anomalies in the data, using the other columns as features.

## Load Dataset
First, I import the required libraries.

In [23]:
import pandas as pd  # load the data into pandas DataFrames
import warnings  # used to suppress warnings
warnings.filterwarnings('ignore')

The dataset I have chosen is the Rogue Agent Key Hold dataset from [The Numenta Anomaly Benchmark (NAB)](https://github.com/numenta/NAB). It records key-hold times for several users of a computer, where the anomalies represent a change in the user. The data are ordered, timestamped, single-valued metrics.

In [24]:
key_hold_df = pd.read_csv('../data/rogue_agent_key_hold.csv')
key_hold_df.head()

Unnamed: 0,timestamp,value
0,2014-07-06 20:10:00,0.064535
1,2014-07-06 20:15:00,0.064295
2,2014-07-06 20:20:00,0.06388
3,2014-07-06 20:25:00,0.065692
4,2014-07-06 20:35:00,0.056301
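Because the data are described as ordered and timestamped, it can be worth confirming this before modelling. The following is a minimal sketch that is not part of the original notebook. It rebuilds the five rows shown above in a stand-in frame called `sample_df` (a hypothetical name) and measures the gaps between consecutive timestamps. On the full dataset, the same lines could be run against `key_hold_df`.

```python
import pandas as pd

# Stand-in for key_hold_df, built from the five rows displayed above.
sample_df = pd.DataFrame({
    "timestamp": [
        "2014-07-06 20:10:00",
        "2014-07-06 20:15:00",
        "2014-07-06 20:20:00",
        "2014-07-06 20:25:00",
        "2014-07-06 20:35:00",
    ],
    "value": [0.064535, 0.064295, 0.06388, 0.065692, 0.056301],
})

# Parse the strings into real datetimes so they can be compared and subtracted.
ts = pd.to_datetime(sample_df["timestamp"])

# True if every timestamp is greater than or equal to the one before it.
is_ordered = ts.is_monotonic_increasing

# Gap between each row and the previous one (the first row has no predecessor).
intervals = ts.diff().dropna()

print("ordered:", is_ordered)
print(intervals.value_counts())
```

Even in these five rows the spacing is not uniform: most gaps are five minutes, but the last one is ten, so the series should not be assumed to be regularly sampled.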


## Feature Engineering
The data does not seem to contain any NaN or odd-looking values that need cleaning. We are simply given the timestamp and value columns, which should be good to work with.

In [25]:
key_hold_df.isna().sum()

timestamp    0
value        0
dtype: int64
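`isna()` only catches missing values, so "odd-looking" values need a separate check. The sketch below is one possible way to do that and is not taken from the original notebook. The helper name `basic_quality_report` and the stand-in frame `sample_df` are hypothetical, and treating negative values as suspicious is my assumption, based on the values being key-hold durations.

```python
import pandas as pd

# Stand-in for key_hold_df, built from the five rows displayed earlier.
sample_df = pd.DataFrame({
    "timestamp": [
        "2014-07-06 20:10:00",
        "2014-07-06 20:15:00",
        "2014-07-06 20:20:00",
        "2014-07-06 20:25:00",
        "2014-07-06 20:35:00",
    ],
    "value": [0.064535, 0.064295, 0.06388, 0.065692, 0.056301],
})


def basic_quality_report(df):
    """Hypothetical helper: count a few kinds of suspicious rows."""
    # Timestamps that cannot be parsed become NaT instead of raising.
    parsed = pd.to_datetime(df["timestamp"], errors="coerce")
    return {
        "missing_values": int(df[["timestamp", "value"]].isna().sum().sum()),
        "unparseable_timestamps": int(parsed.isna().sum()),
        "duplicate_timestamps": int(df["timestamp"].duplicated().sum()),
        # Assumption: a hold duration should never be negative.
        "negative_values": int((df["value"] < 0).sum()),
    }


report = basic_quality_report(sample_df)
print(report)

# Summary statistics help spot implausible minimum or maximum values.
print(sample_df["value"].describe())
```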

I use the timestamp column to create Year, Month, Date and Time columns. These provide more features to feed into the model and help build visualizations at different scales.

In [26]:
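# Split "YYYY-MM-DD HH:MM:SS" on "-" into year, month and "DD HH:MM:SS"
key_hold_df[['Year', 'Month', 'Date']] = key_hold_df['timestamp'].str.split('-', expand=True)
# Split the remainder on the first space into day of month and time of day
key_hold_df[['Date', 'Time']] = key_hold_df['Date'].str.split(' ', n=1, expand=True)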
key_hold_df.head()

Unnamed: 0,timestamp,value,Year,Month,Date,Time
0,2014-07-06 20:10:00,0.064535,2014,7,6,20:10:00
1,2014-07-06 20:15:00,0.064295,2014,7,6,20:15:00
2,2014-07-06 20:20:00,0.06388,2014,7,6,20:20:00
3,2014-07-06 20:25:00,0.065692,2014,7,6,20:25:00
4,2014-07-06 20:35:00,0.056301,2014,7,6,20:35:00
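As an alternative to string splitting, the same columns could be derived by parsing the timestamps and using the pandas `.dt` accessor. This is a sketch of a different technique, not the approach used in this notebook. It runs on a hypothetical stand-in frame `sample_df` built from the rows above, and it assumes every timestamp follows the `YYYY-MM-DD HH:MM:SS` format seen in those rows.

```python
import pandas as pd

# Stand-in for key_hold_df, built from the five rows displayed above.
sample_df = pd.DataFrame({
    "timestamp": [
        "2014-07-06 20:10:00",
        "2014-07-06 20:15:00",
        "2014-07-06 20:20:00",
        "2014-07-06 20:25:00",
        "2014-07-06 20:35:00",
    ],
    "value": [0.064535, 0.064295, 0.06388, 0.065692, 0.056301],
})

# Parse once; an explicit format fails loudly on malformed rows.
ts = pd.to_datetime(sample_df["timestamp"], format="%Y-%m-%d %H:%M:%S")

# The .dt accessor returns integer columns directly, so no cast is needed.
sample_df["Year"] = ts.dt.year
sample_df["Month"] = ts.dt.month
sample_df["Date"] = ts.dt.day

# Keep the time of day as a string, matching the Time column above.
sample_df["Time"] = ts.dt.strftime("%H:%M:%S")

print(sample_df.head())
```

One advantage of this route is that the parsed `ts` series can also yield other features, such as `ts.dt.hour` or `ts.dt.dayofweek`, without further string handling.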


I then cast the Year, Month and Date columns to integers so they can be used in visualizations.

In [32]:
# The split produces strings, so convert the date parts to integers
date_cols = ['Year', 'Month', 'Date']
for col in date_cols:
    key_hold_df[col] = key_hold_df[col].astype(int)

## Visualizations

## Anomaly Detection Models