# What is it

Breakout detection looks for breakouts in time series data. A breakout is characterized by two steady states and an intermediate transition period. Breakouts take two general forms.

* __Mean shift:__ A sudden jump in the time series corresponds to a mean shift. A sudden jump in CPU utilization from 40% to 60% would exemplify a mean shift.
* __Ramp up:__ A gradual increase in the value of the metric from one steady state to another constitutes a ramp up. A gradual increase in CPU utilization from 40% to 60% would exemplify a ramp up.

<img src="images/breakout_detection_mean_shift.png" alt="Drawing" style="width: 500px;"/>
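
To make the two shapes concrete, the sketch below simulates each one with base R. The series length, noise level, and ramp length are arbitrary assumptions chosen for illustration; only the 40% and 60% levels come from the examples above.

```r
# Simulate the two breakout shapes with synthetic data (base R only).
set.seed(42)
n <- 100
noise <- rnorm(n, mean = 0, sd = 1)

# Mean shift: utilization jumps from 40 to 60 at the midpoint.
mean_shift <- c(rep(40, n / 2), rep(60, n / 2)) + noise

# Ramp up: steady at 40, a linear climb over 20 points, then steady at 60.
ramp_up <- c(rep(40, 40), seq(40, 60, length.out = 20), rep(60, 40)) + noise

# Compare the level of each steady state using the median.
shift_before <- median(mean_shift[1:50])
shift_after  <- median(mean_shift[51:100])
ramp_before  <- median(ramp_up[1:40])
ramp_after   <- median(ramp_up[61:100])
```

Calling `plot(mean_shift, type = "l")` and `plot(ramp_up, type = "l")` shows the difference: both series start and end at the same levels, but only the second has a visible transition period.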

# Existing Packages

## Twitter
Twitter provides `BreakoutDetection`, a breakout detection package for R.

"Our main motivation behind creating the package has been to develop a technique to detect breakouts which are robust, from a statistical standpoint, in the presence of anomalies." - *Twitter*

### How it works
The underlying algorithm is **E-Divisive with Medians (EDM)**. EDM can also be used to detect a change in distribution in a given time series. It relies on robust statistics, namely the median, and estimates the statistical significance of a breakout through a permutation test.
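
The idea of combining a median-based statistic with a permutation test can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's EDM implementation: the `divergence` statistic, the helper names, and the synthetic data are all hypothetical. It compares the median distance between two segments with the median distance within each segment, then checks how often shuffled data produces an equally large value.

```r
# Simplified sketch only; NOT the BreakoutDetection implementation.

# Median absolute distance between all pairs of points within one segment.
pairwise_median <- function(a) {
  d <- abs(outer(a, a, "-"))
  median(d[upper.tri(d)])
}

# Hypothetical divergence: large when the two segments differ in location.
divergence <- function(x, split) {
  left  <- x[1:split]
  right <- x[(split + 1):length(x)]
  between <- median(abs(outer(left, right, "-")))
  weight  <- length(left) * length(right) / length(x)
  weight * (2 * between - pairwise_median(left) - pairwise_median(right))
}

# Try every split that leaves at least min_size points on each side.
best_split <- function(x, min_size) {
  candidates <- min_size:(length(x) - min_size)
  stats <- vapply(candidates, function(s) divergence(x, s), numeric(1))
  list(loc = candidates[which.max(stats)], stat = max(stats))
}

# Permutation test: shuffle the series and recompute the best statistic.
permutation_test <- function(x, min_size, n_perm = 99) {
  observed   <- best_split(x, min_size)
  perm_stats <- replicate(n_perm, best_split(sample(x), min_size)$stat)
  # Add-one correction so the estimated p-value is never exactly zero.
  pval <- (sum(perm_stats >= observed$stat) + 1) / (n_perm + 1)
  list(loc = observed$loc, stat = observed$stat, pval = pval)
}

set.seed(1)
x <- c(rnorm(40, mean = 40), rnorm(40, mean = 60))  # shift after point 40
x[c(5, 70)] <- c(90, 10)                            # two injected anomalies
result <- permutation_test(x, min_size = 20)
```

Because every quantity is a median, the two injected anomalies barely move the statistic, which is the robustness property the Twitter quote refers to. `result$loc` holds the estimated breakout location and `result$pval` the permutation p-value.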

EDM uses **Behavioral Change Point Analysis (BCPA)**, a method for identifying hidden shifts in the underlying parameters of a time series. BCPA was developed specifically for animal movement data, which is irregularly sampled. Its purpose is to identify the locations where changes are abrupt (assumed to correspond to discrete changes in an animal's behavior).

"The most significant drawback of the BCPA as implemented here is that the parameter values themselves are somewhat difficult to interpret. The most satisfying development would be to estimate meaningful parameters, for example the mean true velocity and characteristic time scale of auto-correlation, directly from the data. This is the focus of ongoing research." - eliezg@uw.edu.

For more information on BCPA: http://wiki.cbr.washington.edu/qerm/index.php/Behavioral_Change_Point_Analysis

### How to install / basic commands
The package must be installed from within R.

__Installation:__  
`install.packages("devtools")`  
`devtools::install_github("twitter/BreakoutDetection")`  
`library(BreakoutDetection)`

__For documentation:__  
`help(breakout)`

__To try it with the bundled example data:__  
`data(Scribe)`  
`res = breakout(Scribe, min.size=24, method='multi', beta=.001, degree=1, plot=TRUE)`  
`res$plot`

The `breakout` function takes a data.frame with a column named `timestamp` and a column named `count`, and it only looks at these two columns. You may therefore need to group your data by the report date before using it.

__Group_by:__  
`data_grouped <- aggregate(colSUMMED_COL ~ colTIME, data, sum)`
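
As a worked example of the grouping step, the sketch below aggregates a small made-up data set by date and renames the result to the `timestamp` and `count` columns described above. The `raw` data frame and its column names are hypothetical. In `aggregate`'s formula interface, the column being summed goes to the left of `~` and the grouping column to the right.

```r
# Hypothetical raw data with several rows per report date.
raw <- data.frame(
  report_date = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02",
                          "2020-01-02", "2020-01-03")),
  requests    = c(10, 5, 7, 8, 20)
)

# Sum the metric (left of ~) within each report date (right of ~).
daily <- aggregate(requests ~ report_date, data = raw, FUN = sum)

# The result has the grouping column first, then the summed column.
# Rename them to the names the breakout function looks for.
names(daily) <- c("timestamp", "count")
```

After this step `daily` has one row per date and can be checked with `str(daily)` before it is passed on.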