# Streaming Metrics

This notebook demonstrates the usage of some of goetia's internal streaming/rolling metrics over sliding windows. These are used internally for saturation detection, and offer some generalized functionality for computing live statistics over streams.

In [2]:
from goetia.saturation import SlidingWindow, RollingPairwise, SlidingCutoff
import numpy as np

## The `SlidingWindow` Class

`SlidingWindow` is the base class for running functions on fixed-length windows over streaming data. The user provides a `window_size` (the number of most-recent observations to keep) and a function `func` to apply to the window. The `push` method adds an observed value and, once `window_size` values have been added, runs the provided function on the window.

`func` should take an iterable. For example, we can create a `SlidingWindow` of length 5 that applies `np.mean` to the window. `push` returns a tuple containing the result of the function applied to the window and the time; in the simplest case here, the time is just the index of the value. Note that `nan` is returned until the initial window is filled.

In [3]:
W = SlidingWindow(5, np.mean)

In [4]:
for i in range(10):
    print(W.push(i))

(nan, 0)
(nan, 1)
(nan, 2)
(nan, 3)
(2.0, 4)
(3.0, 5)
(4.0, 6)
(5.0, 7)
(6.0, 8)
(7.0, 9)
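
To make the mechanics concrete, here is a minimal pure-Python sketch of the windowing behavior described above. `MiniSlidingWindow` is a hypothetical stand-in written for illustration, not goetia's implementation; it assumes the time is simply the index of the pushed value and uses a plain arithmetic mean in place of `np.mean`.

```python
import math
from collections import deque


class MiniSlidingWindow:
    """Hypothetical illustration of a fixed-length sliding window (not goetia's code)."""

    def __init__(self, window_size, func):
        self.func = func
        # A deque with maxlen discards the oldest value automatically.
        self.window = deque(maxlen=window_size)
        self.time = 0

    def push(self, value):
        self.window.append(value)
        t = self.time
        self.time += 1
        # Until the window is full, there is nothing meaningful to compute.
        if len(self.window) < self.window.maxlen:
            return math.nan, t
        return self.func(self.window), t


W_sketch = MiniSlidingWindow(5, lambda w: sum(w) / len(w))
results = [W_sketch.push(i) for i in range(10)]
```

Under these assumptions, `results` holds `nan` for the first four pushes and the window mean from the fifth push onward, mirroring the output shown above.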


### Windows of Size 2

Often, we simply want to compare each value to the previous one, i.e., use a window of size 2. `RollingPairwise` is a convenience class that configures `SlidingWindow` with the correct parameters.

In [12]:
W = RollingPairwise(lambda tup: tup[1] - tup[0])

In [13]:
for v in range(0, 20, 2):
    print(W.push(v))

(nan, 0)
(2, 1)
(2, 2)
(2, 3)
(2, 4)
(2, 5)
(2, 6)
(2, 7)
(2, 8)
(2, 9)


### Window Functions Using Time

We can also supply a window function that makes use of time. A time value can be provided by calling `push` with an `(observation, time)` tuple; otherwise, the default behavior of using the index applies. We pass `uses_time=True` if our window function expects time values. For example, we can calculate the change in value per unit of time:

In [17]:
def diff_per_time(vals):
    # unpack the two observations and their times
    ((xv, xt), (yv, yt)) = vals
    elapsed = yt - xt
    diff = yv - xv
    return diff / elapsed

W = RollingPairwise(diff_per_time, uses_time=True)

In [18]:
obs = [(10, 0), (15, 2), (17, 5), (20, 10)]
for ob in obs:
    print(W.push(ob))

(nan, 0)
(2.5, 2)
(0.6666666666666666, 5)
(0.6, 10)
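
As a sanity check on the numbers above, the same rates can be reproduced without goetia by pairing each observation with its predecessor. This is only a sketch in plain Python; `rate` and `obs_check` are names introduced here for illustration.

```python
# Same (observation, time) pairs as in the cell above.
obs_check = [(10, 0), (15, 2), (17, 5), (20, 10)]


def rate(prev, curr):
    """Change in value divided by elapsed time between two observations."""
    (xv, xt), (yv, yt) = prev, curr
    return (yv - xv) / (yt - xt)


# zip(obs, obs[1:]) yields each consecutive pair, i.e. a window of size 2.
rates = [rate(p, c) for p, c in zip(obs_check, obs_check[1:])]
```

Note that a function like this divides by the elapsed time, so two observations with the same time value would raise `ZeroDivisionError`; a window function used on real streams may need to guard against that.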


## The `SlidingCutoff` Class

In some cases, we want 