# Changepoints.jl

This is a Julia package for the detection of multiple changepoints in time series.

- Detection is based on optimising a cost function over segments of the data.
- Implementations of efficient search algorithms (PELT, Binary Segmentation).
- A wide choice of parametric cost functions is already implemented, such as a change in mean, a change in variance, or a change in both mean and variance for Normal errors.
- Changepoint algorithms have an interface that allows users to supply their own cost functions.

## Installation and Loading
Changepoints requires Julia version 0.4. To install Changepoints run the following command inside a Julia session:

julia> Pkg.add("Changepoints")

The package is usually used in conjunction with the Distributions and Gadfly packages.

In [None]:
using Gadfly, Changepoints, Distributions

## Documentation

Most of the functionality of Changepoints is documented and can be viewed through Julia's help mode:

In [None]:
?@PELT

## Simulation of Changepoints

This code simulates a time series of length `n` whose segment lengths are drawn from a Poisson distribution with mean `λ`. The variance is fixed at 1.0, and a new mean is drawn from a Uniform(0, 5) distribution for each segment.

In [None]:
n = 1000                   # Sample size
λ = 70                     # Mean segment length
μ, σ = Uniform(0,5), 1.0   # Distribution of segment means, fixed std. dev.
data, cps = @changepoint_sampler n λ Normal(μ, σ)
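To see what such a sampler does step by step, here is a minimal sketch in plain Julia that does not use the package. It assumes current Julia (1.x) syntax rather than the version named above, and the helpers `rand_poisson` and `simulate_mean_changes` are hypothetical names introduced for illustration; the package's macro may differ in details such as how changepoints are indexed.

```julia
using Random

# Knuth's multiplication method for one Poisson(λ) draw; adequate for moderate λ.
function rand_poisson(rng::AbstractRNG, λ::Real)
    limit = exp(-λ)
    k = 0
    p = rand(rng)
    while p > limit
        k += 1
        p *= rand(rng)
    end
    return k
end

# Hypothetical helper: n Normal observations with standard deviation σ whose
# mean is redrawn from Uniform(lo, hi) at the start of every segment.
# Segment lengths are Poisson(λ) draws (assumption: at least 1 observation each).
function simulate_mean_changes(rng::AbstractRNG, n::Integer, λ::Real;
                               lo = 0.0, hi = 5.0, σ = 1.0)
    data = zeros(n)
    cps = Int[]                  # index of the last observation of each segment but the final one
    start = 1
    while start <= n
        len = max(1, rand_poisson(rng, λ))
        stop = min(start + len - 1, n)
        μ = lo + (hi - lo) * rand(rng)
        for i in start:stop
            data[i] = μ + σ * randn(rng)
        end
        stop < n && push!(cps, stop)
        start = stop + 1
    end
    return data, cps
end

data_sim, cps_sim = simulate_mean_changes(MersenneTwister(1), 1000, 70)
```

The names `data_sim` and `cps_sim` are chosen so that the `data` and `cps` produced by the package macro are not overwritten.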

The package supports Gadfly for convenient plotting of the results. Gadfly is an optional dependency of the package and must be loaded explicitly. If Gadfly is loaded after the Changepoints package, the user must run `Changepoints.Gadfly_init()` to load the extra plotting functionality.

In [None]:
plot(data, cps)

### Exercise

Using the above code as a template, try simulating and plotting the following time series:

 1. Normal distribution with changing variance and mean
 2. Poisson distribution with changing frequency

Give each of the resulting time series a unique name so they can be reused.

## Finding Changepoints

The package currently implements the Binary Segmentation and PELT algorithms. Both take a segment cost function as input. The package contains many different segment cost models; see, for example, `?NormalMeanSegment` for a full list. The following code constructs a cost function from data assuming Normally distributed variates with changing mean and fixed variance.

In [None]:
seg_cost = NormalMeanSegment(data, σ)

In [None]:
seg_cost(1, 5)
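To give a feel for what a segment cost of this kind computes, the sketch below builds a change-in-mean cost in plain Julia (1.x syntax assumed). `normal_mean_cost` is a hypothetical stand-in, not the package's function. It assumes the cost is the residual sum of squares around the segment mean divided by σ², with additive constants dropped, and that `cost(s, t)` covers observations `s+1` to `t`; the package's constants and indexing convention may differ.

```julia
# Hypothetical stand-in for a Normal change-in-mean cost with known σ.
# Returns a closure cost(s, t) for observations s+1, ..., t (0 <= s < t <= n).
function normal_mean_cost(x::AbstractVector{<:Real}, σ::Real = 1.0)
    S1 = [0.0; cumsum(x)]           # S1[k + 1] = x[1] + ... + x[k]
    S2 = [0.0; cumsum(abs2.(x))]    # S2[k + 1] = x[1]^2 + ... + x[k]^2
    function cost(s::Integer, t::Integer)
        m = t - s                           # number of observations in the segment
        sum1 = S1[t + 1] - S1[s + 1]        # sum of the segment
        sum2 = S2[t + 1] - S2[s + 1]        # sum of squares of the segment
        return (sum2 - sum1^2 / m) / σ^2    # residual sum of squares over σ²
    end
    return cost
end

toy = [0.0, 0.0, 0.0, 4.0, 4.0, 4.0]
toy_cost = normal_mean_cost(toy, 1.0)
toy_cost(0, 3)   # 0.0: the segment is constant
toy_cost(0, 6)   # 24.0: the segment spans the jump in mean
```

Precomputing the cumulative sums makes each cost evaluation constant time, which matters because the search algorithms evaluate the cost for many `(s, t)` pairs.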

Rather than having to remember the name of the function to construct the appropriate cost function, the package provides a macro which creates segment costs in a much more intuitive way:

In [None]:
seg_cost = @segment_cost data Normal(?, σ)

The `?` above denotes a parameter whose value changes.

Once a segment cost function has been constructed, we can run a changepoint algorithm with a specified penalty.

In [None]:
pen = 3.0
pelt_cps, pelt_cost = PELT(seg_cost, n, pen);
bs_cps, bs_cost = BS(seg_cost, n, pen);
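The following is a minimal sketch of the PELT recursion in plain Julia (1.x syntax assumed), intended only to show how a cost function, the series length and a penalty interact. `mean_cost` and `pelt_sketch` are hypothetical names, not the package's implementation. The sketch assumes a cost for which splitting a segment never increases the total cost (pruning constant 0), charges the penalty once per changepoint, and returns interior changepoints only; the package may use different conventions.

```julia
# Compact change-in-mean cost for observations s+1, ..., t (unit variance assumed).
function mean_cost(x::AbstractVector{<:Real})
    S1 = [0.0; cumsum(x)]
    S2 = [0.0; cumsum(abs2.(x))]
    return (s, t) -> (S2[t + 1] - S2[s + 1]) - (S1[t + 1] - S1[s + 1])^2 / (t - s)
end

# Hypothetical sketch of PELT: minimise total segment cost plus pen per changepoint.
function pelt_sketch(cost, n::Integer, pen::Real)
    F = fill(Inf, n + 1)     # F[t + 1] = best penalised cost of the first t observations
    F[1] = -pen              # offsets the penalty added for the first segment
    prev = zeros(Int, n)     # prev[t] = start boundary of the last segment ending at t
    candidates = [0]         # boundaries that can still start the last segment
    for t in 1:n
        best, best_s = Inf, 0
        for s in candidates
            val = F[s + 1] + cost(s, t) + pen
            if val < best
                best, best_s = val, s
            end
        end
        F[t + 1] = best
        prev[t] = best_s
        # Pruning step: discard boundaries that cannot be optimal for any later t.
        filter!(s -> F[s + 1] + cost(s, t) <= F[t + 1], candidates)
        push!(candidates, t)
    end
    # Backtrack through prev to recover the changepoints in increasing order.
    cps = Int[]
    t = n
    while t > 0
        s = prev[t]
        s > 0 && pushfirst!(cps, s)
        t = s
    end
    return cps, F[n + 1]
end

x = [0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 4.0, 4.0, 4.0, 4.0]
found_cps, total_cost = pelt_sketch(mean_cost(x), length(x), 3.0)
# found_cps == [5] and total_cost == 3.0 (two zero-cost segments plus one penalty)
```

A larger penalty makes every extra changepoint more expensive and so yields fewer segments; in this toy series, leaving out the changepoint would cost 40.0, so any penalty below that value recovers the split at 5.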

More macros are provided to cut out the step of explicitly constructing segment costs:

In [None]:
pelt_cps, pelt_cost = @PELT data Normal(?, σ) pen
bs_cps, bs_cost = @BS data Normal(?, σ) pen

### Exercise
For each of the time series you constructed above, run PELT. Try constructing the segment cost explicitly as well as using the convenience macros.