# Smoothing - Filtering - Forecasting: Looking at moving averages

## Intro
In this Notebook we will take a look at different ways to compute moving averages (aka rolling means aka rolling averages aka ...) for a simple dataset.

The goal is to get a visual understanding of:
- The general idea behind MA
- How MA are computed
- The differences between
    - moving average
    - weighted moving average
    - exponentially weighted moving average 

Another aim of this Notebook is to show why moving averages are important and to give some application examples.

This Notebook is more "follow along and watch it unfold" than do-it-yourself, but feel free to try out and adjust the code to your liking.

A large part of this Notebook is code used to generate the animations. It's not necessary to go through that in detail.

In [None]:
#Only run if necessary:
#!brew install ffmpeg

## Let's get started!



First we start by importing the relevant libraries. Here we mostly use basic stuff; additionally we use matplotlib.animation to generate animations and IPython.display.Video to show the generated animation.

When working with time series / signals, statsmodels and SciPy are generally good starting points. Here, we only import statsmodels.api to generate the dataset we are using.

In [None]:
# Importing required libraries

#Data handling
import numpy as np
import pandas as pd

#Visualisation
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import Video

#Statistics
import statsmodels.api as sm

#General
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline


## Helper functions: the backbone of this notebook
As mentioned in the Intro, you don't have to go through them in detail.

### Colors

Most important for this is, obviously, getting the right colors. So let's start by setting them up!

In [None]:
#neuefische colors
NF_ORANGE = '#ff5a36'
NF_BLUE = '#163251'

def _color_palette(n_cols):
    """This generates a palette with N colors, to ensure every relevant point has its specific color.
    The colors are lighter shades of the neuefische blue.

    Args:
        n_cols (int): number of returned colors. If more than 20 colors are requested, the first ones will be the same grey.

    Returns:
        List: List of Colors in Hex
    """
    dark_colors=sns.color_palette("light:#163251", as_cmap=False,n_colors=9).as_hex() #NF_blue basis, palette used for the darker shades
    bright_colors = sns.color_palette("light:#9ea9b6", as_cmap=False,n_colors=15).as_hex() #Add some lighter shades, start shade is overlapping with dark
    colors=bright_colors+dark_colors[4:]
    if(n_cols>20):
        colors=["#f0f1f2"]*(n_cols-20)+colors
    return colors[-n_cols:]

What we do here is set up a function that returns a list of hex codes for as many colors as we want. To keep things elegant, we are using lighter and darker shades of blue.

>__Exercise__: Generate color palettes with 3, 5, 10 and 30 colors and visualize them with `sns.color_palette()`

In [None]:
for i in [1,3,5,10,30]:
    display(sns.color_palette(_color_palette(i)))

### Helper functions


Next, let's define some helpful functions:

In [None]:
def _slicer(df):
    """Generator function to slice DataFrames

    Args:
        df (DataFrame): Given Data

    Yields:
        IteratorObject: The slices of the input data. First slice is the first row, second slice the first and second row third slice the first three rows etc.
    """
    for i in range(len(df)):
        yield df.iloc[:i+1]

`_slicer()` yields cumulative slices of the input data: the first slice is the first row, the second slice the first two rows, the third slice the first three rows, etc.

>__Exercise__: Create a simple dataframe and use _slicer() to generate one slice (tip: use next()) and all slices

In [None]:
df=pd.DataFrame({"I packed my bag and in it I put:":["a fish","some data","a coach","a PC","and a python"]})
next(_slicer(df))

In [None]:
for s in _slicer(df):  # avoid shadowing the built-in `slice`
    display(s)

### Computing the rolling mean

Here it gets more interesting: the `_weights()` function creates the weights used to compute a mean for different objectives (but we'll get to that later).
The weights are already normalized here, so that the mean functions can directly apply and sum them.

In [None]:
def _weights(df,dist,alpha=0.6):
    """Function generates the weights for computing the averages

    Args:
        df (pd.DataFrame): The data for which the weights are generated (number of weights = length of data)
        dist (string): The distribution of weights. Possibilities are ["unif","triang","exp"] for uniform, triangular and exponential distributions.
        alpha (float, optional): Exponential smoothing factor in case the "exp" dist is used. Must be between 0 and 1. Defaults to 0.6.

    Returns:
        np.array: normalized weights (sum to one), ordered from oldest to most recent point
    """
    if dist=="unif":
        weights=np.array([1]*len(df))  #1 1 1 1

    elif dist=="triang":
        weights=np.arange(1,len(df)+1) #1 2 3 4

    elif dist=="exp":
        # np.flip puts the largest weight last, i.e. on the most recent point
        weights=np.flip(np.array([alpha*(1-alpha)**i for i in range(len(df))]))

    else:
        raise ValueError(f"Unknown dist '{dist}', expected one of 'unif', 'triang', 'exp'")

    weights=weights/np.sum(weights)
    return weights

def _mean_unif(df):
    weights=_weights(df,"unif")
    return np.sum(np.array(df)*weights)

def _mean_triang(df):
    weights=_weights(df,"triang")
    return np.sum(np.array(df)*weights)

def _mean_exp(df,exp_alpha):
    weights=_weights(df,"exp",exp_alpha)
    return float(np.sum(np.array(df).ravel()*weights))
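
To make the three weight distributions concrete, here is a small standalone sketch that recomputes them for a window of four points. It does not call `_weights()`; the helper name `demo_weights` is hypothetical and only mirrors the logic above for illustration.

```python
import numpy as np

def demo_weights(n, dist, alpha=0.6):
    """Hypothetical helper mirroring _weights(): n normalized weights, oldest first."""
    if dist == "unif":
        raw = np.ones(n)
    elif dist == "triang":
        raw = np.arange(1, n + 1)
    elif dist == "exp":
        # largest weight last, i.e. on the most recent point
        raw = np.flip(alpha * (1 - alpha) ** np.arange(n))
    else:
        raise ValueError(f"unknown dist: {dist!r}")
    return raw / raw.sum()

for dist in ["unif", "triang", "exp"]:
    print(dist, np.round(demo_weights(4, dist), 3))
```

In all three cases the weights sum to one, which is why the mean functions can simply multiply and sum.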



### Metric

To evaluate the performance of the moving averages we are going to compute, we use the root mean squared error (RMSE). For this we use the following function, which deals well with NA values when it is given pandas Series: `.mean()` simply skips them.

In [None]:
def _rmse(y_true,y_pred):
    return(np.sqrt(np.square(y_true - y_pred).mean()))
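
The NaN handling comes from pandas: `Series.mean()` skips missing values by default. A minimal sketch with made-up numbers, where the first two predictions are NaN as they would be at the start of a rolling window (the function name `rmse` is used here only to keep the example self-contained):

```python
import numpy as np
import pandas as pd

def rmse(y_true, y_pred):
    # Same formula as _rmse() above; with pandas Series, .mean() skips NaN
    return np.sqrt(np.square(y_true - y_pred).mean())

y_true = pd.Series([1.0, 2.0, 3.0, 4.0])
y_pred = pd.Series([np.nan, np.nan, 2.0, 6.0])  # illustrative values only

# Only the last two points count: errors 1 and -2, so sqrt((1 + 4) / 2)
print(rmse(y_true, y_pred))
```

With plain NumPy arrays instead of Series, the NaN values would propagate and the result would be NaN.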

### Precompute all required data



In the next function, all the rolling means and the respective RMSE values are computed. It will be much easier to understand how this is done after you have gone through the whole notebook. For those of you who can't wait:
All you will have to do in the end is to use:
```python
    df.value.rolling(window_size).mean() #compute rolling mean
    df.value.ewm(alpha=exp_alpha).mean() #compute exponentially weighted mean
```                        
Everything else is only for visualisation purposes and to provide an educational example of how different customized weight distributions can be used.
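
As a quick sanity check, the sketch below compares pandas' built-in rolling mean with a hand-rolled uniformly weighted window on a short made-up series. The variable names are chosen for this example only.

```python
import numpy as np
import pandas as pd

values = pd.Series([2.0, 4.0, 6.0, 8.0, 10.0])  # made-up data
window_size = 3

# Built-in: NaN until the window is full
builtin = values.rolling(window_size).mean()

# By hand: equal weights 1/window_size applied to each full window
weights = np.ones(window_size) / window_size
manual = values.rolling(window_size).apply(lambda w: np.sum(w * weights))

print(pd.DataFrame({"builtin": builtin, "manual": manual}))
```

Both columns agree, and the first `window_size - 1` entries are NaN because the window is not yet full.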

In [None]:
def compute_rolling_data(df,window_size,exp_alpha):

    df_precomp=df.assign(noiseless=df.ts*10/len(df),
                        pandas_rollmean=df.value.rolling(window_size).mean(),
                        pandas_ewma=df.value.ewm(alpha=exp_alpha).mean(),
                        custom_rollmean_unif=df.value.rolling(window_size).apply(lambda x: _mean_unif(x)),
                        custom_rollmean_triang=df.value.rolling(window_size).apply(lambda x: _mean_triang(x)))

    
    #compute the custom rolling ewma
    ewma=[]
    for s in _slicer(df_precomp):
        ewma.append(_mean_exp(s.value,exp_alpha))
    df_precomp=df_precomp.assign(custom_ewma=ewma)

    # prepare storage for metrics
    metric_store={"rmse_signal":[],"rmse_pandas_rollmean":[],"rmse_pandas_ewma":[],"rmse_custom_rollmean_unif":[],"rmse_custom_rollmean_unif_noiseless":[],"rmse_custom_rollmean_triang":[],"rmse_custom_rollmean_triang_noiseless":[],"rmse_custom_ewma":[],"rmse_custom_ewma_noiseless":[]}

    # compute the metrics for each cumulative slice (basically the metric "up to ") and store them in the metric_store
    for s in _slicer(df_precomp):
        metric_store["rmse_signal"].append(                             _rmse(s.noiseless,s.value))
        metric_store["rmse_pandas_rollmean"].append(                    _rmse(s.value,s.pandas_rollmean))
        metric_store["rmse_pandas_ewma"].append(                        _rmse(s.value,s.pandas_ewma))
        metric_store["rmse_custom_rollmean_unif"].append(               _rmse(s.value, s.custom_rollmean_unif))
        metric_store["rmse_custom_rollmean_unif_noiseless"].append(     _rmse(s.noiseless, s.custom_rollmean_unif))
        metric_store["rmse_custom_rollmean_triang"].append(             _rmse(s.value, s.custom_rollmean_triang))
        metric_store["rmse_custom_rollmean_triang_noiseless"].append(   _rmse(s.noiseless, s.custom_rollmean_triang))
        metric_store["rmse_custom_ewma"].append(                        _rmse(s.value, s.custom_ewma))
        metric_store["rmse_custom_ewma_noiseless"].append(              _rmse(s.noiseless, s.custom_ewma))

    # turn metric_store into df and append its columns 
    df_precomp=pd.concat([df_precomp,pd.DataFrame(metric_store)]
                         ,axis=1)
    
    # return result
    return df_precomp



### Animation time!

The animations are generated in 4 *easy* steps:
1. Create an ```animate()``` function that draws the frame you want to show in the animation. The function should take at least a number as an argument (here `i`). This number is the frame of the animation, so calls with different numbers will successively build up the animation.
2. Use ```animation.FuncAnimation()```. This will call the animate function once for each frame in ```frames``` (actually twice for the first frame, but never mind) with the additional arguments given in ```fargs```.
3. Use the ```animation.FFMpegWriter()``` class to set things like fps, bitrate, etc.
4. Use ```ani.save()``` to write the animation to file.

Steps 2-4 are performed within ```generate_animation()```


In [None]:
def animate(i, ax, data, smooth_type, size, exp_alpha,noiseless=False):
    """The main animation function. Each frame is drawn individually with this function.

    Args:
        i (int): Number of frame in the animation
        ax (matplotlib.axes): The ax object used to draw the frame
        data (DataFrame): The Data to draw, ie. the timeseries
        smooth_type (string): One of ['unif','triang','exp']. If 'unif' is chosen, only that line is drawn. For 'triang' it's 'unif' and 'triang', and for 'exp' it's all three
        size (int): Window-size used for the rolling average
        exp_alpha (float): the exponential smoothing factor for smooth_type 'exp'
        noiseless (bool, optional): if True, also draw the noiseless signal and the RMSE values computed against it. Defaults to False.
    """
    
    # compute the relevant means
    df_precomp=data
    df_precomp_up_to=df_precomp[:i+size]
    
    if(smooth_type=="unif"):
        precision=1
        df_precomp_window=df_precomp_up_to.iloc[-size:]
        current="custom_rollmean_unif"
    if(smooth_type=="triang"):
        precision=1
        df_precomp_window=df_precomp_up_to.iloc[-size:]
        current="custom_rollmean_triang"
    if(smooth_type=="exp"):
        precision=2
        df_precomp_window=df_precomp_up_to
        current="custom_ewma"
    
    weights=_weights(df_precomp_window,smooth_type,exp_alpha)
    
    colors=_color_palette(len(df_precomp_window))
    ax.clear()

    font = {'family': 'sans-serif',
            'color':  'black',
            'weight': 'normal',
            'size': 16}

    ax.xaxis.label.set_size(14)
    ax.yaxis.label.set_size(14)

    ax.grid(True)
    ax.tick_params(labelcolor='dimgrey',
                   labelsize='medium',
                   length=7,
                   color='lightgrey',
                   direction="out",
                   left=True,
                   bottom=True)

    ax.set_title('Rolling mean',
                 pad=30,
                 fontdict=font)

    ax.set_xlabel('Timestep',
                  labelpad=20,
                  color="dimgrey")

    ax.set_ylabel('Signal',
                  labelpad=20,
                  color="dimgrey")
    ax.set_xlim(-1,len(df_precomp) +5)
    ax.set_ylim(-3,12)
    
    #Original data
    sns.lineplot(x=df_precomp.ts,
                 y=df_precomp.value,
                 ax=ax,
                 ci=None,
                 color=NF_BLUE,
                 label="Original Data")
    if noiseless:
        plt.plot(df_precomp.ts,
                 df_precomp.noiseless,
                 "--",
                 color=NF_BLUE,)        
        
        
    # Output Rolling Error + Mean line
    # stacked lines!

    plt.text(x=3,y=7.5,s="RMSE unif:        {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_custom_rollmean_unif),rotation=00,ha="left",size=10,color=NF_ORANGE,alpha=0.5,family="monospace")
    if(smooth_type in ["exp","triang"]):
        plt.text(x=3,y=7.0,s="RMSE triang:      {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_custom_rollmean_triang),rotation=00,ha="left",size=10,color=NF_ORANGE,alpha=0.8,family="monospace")
    if(smooth_type == "exp"):
        plt.text(x=3,y=6.5,s="RMSE exp:         {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_custom_ewma),rotation=00,ha="left",size=10,color=NF_ORANGE,alpha=1,family="monospace")
    
    if noiseless:
        plt.text(x=3,y=8.5,s="RMSE signal:      {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_signal),rotation=00,ha="left",size=10,color=NF_BLUE,alpha=1,family="monospace")
        plt.text(x=3,y=5.5,s="RMSE unif true:   {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_custom_rollmean_unif_noiseless),rotation=00,ha="left",size=10,color=NF_ORANGE,alpha=0.5,family="monospace")
        if(smooth_type in ["exp","triang"]):
            plt.text(x=3,y=5.0,s="RMSE triang true: {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_custom_rollmean_triang_noiseless),rotation=00,ha="left",size=10,color=NF_ORANGE,alpha=0.8,family="monospace")
        if(smooth_type == "exp"):
            plt.text(x=3,y=4.5,s="RMSE exp true:    {:.0%}".format(df_precomp_up_to.iloc[-1].rmse_custom_ewma_noiseless),rotation=00,ha="left",size=10,color=NF_ORANGE,alpha=1,family="monospace")
    
    # Rolling mean line
    sns.lineplot(x=df_precomp_up_to.ts,
                 y=df_precomp_up_to.custom_rollmean_unif,
                 ax=ax,
                 ci=None,
                 color=NF_ORANGE,
                 alpha=0.5,
                 label="Uniformly weighted mean")
    if(smooth_type in ["exp","triang"]):
        sns.lineplot(x=df_precomp_up_to.ts,
                    y=df_precomp_up_to.custom_rollmean_triang,
                    ax=ax,
                    ci=None,
                    color=NF_ORANGE,
                    alpha=0.8,
                    label="triangular weighted mean")
    
    if(smooth_type == "exp"):
        sns.lineplot(x=df_precomp_up_to.ts,
                 y=df_precomp_up_to.custom_ewma,
                 ax=ax,
                 ci=None,
                 color=NF_ORANGE,
                 alpha=1,
                 label="exponentially weighted mean")
    
    ## Plot vertical lines for window points
    plt.vlines(x=df_precomp_window.ts,
                ymin=0,
                ymax=df_precomp_window.value,
                color=colors,
                linestyles="dotted")

    ## Rolling mean current value
    sns.scatterplot(data=df_precomp_up_to.iloc[-1:],
                    x="ts",
                    y=current,
                    color=NF_ORANGE,
                    marker="X",
                    s=300,
                    alpha=1)                

    # Little horizontal helper line
    plt.hlines(y=df_precomp_up_to[current].iloc[-1],
                xmin=df_precomp_up_to.ts.iloc[-1],
                xmax=df_precomp_up_to.ts.iloc[-1]+4,
                color=NF_BLUE)            
    
    # Points used for smoothing
    plt.scatter(x=df_precomp_window.ts,
                y=df_precomp_window.value,
                #color=NF_BLUE,
                color=colors,
                s=400,
                marker="."
                )

    # lines and points that make up the mean
    plt.bar(df_precomp_window.ts,
            (df_precomp_window.value*weights),
            color=colors,
            )
                
    # weights distribution under points
    plt.bar(df_precomp_window.ts, 
            [-10*i for i in weights],
            color="grey", alpha=0.2)
    

    # weight labels in perc
    for i,v in enumerate(df_precomp_window.ts):
        #plt.text(x=v,y=-1,s="{:.3%}".format(weights[i]),rotation=90,ha="center",va="top",size=8)
        plt.text(x=v,y=-1,s=f"{round(100*weights[i],precision)}%",rotation=90,ha="center",va="top",size=8)
        
    
    ## stacked Bar Plot summary distribution
    for i in range(len(df_precomp_window)):
        plt.bar(x=df_precomp_window.ts.iloc[-1]+3,
                height=(df_precomp_window.value*weights).iloc[i],
                bottom=(df_precomp_window.value*weights)[:i].sum(),
                edgecolor="lightgrey",
                linewidth=1,
                width=2,
                color=colors[i])
        
    ## Plot legend
    plt.legend(loc=(0.03,0.8), frameon=False)

In [None]:
def generate_animation(name,data,smooth_type,window_size,alpha,frames,fps,noiseless=False):
    fig, ax = plt.subplots(figsize=(16, 8))
    
    ani = animation.FuncAnimation(fig,
                        animate,
                        fargs=[ax,data,smooth_type,window_size,alpha,noiseless],
                        frames=np.arange(0, frames-window_size+1, 1),
                        interval=1
                        )

    writer = animation.FFMpegWriter(
        fps=fps, metadata=dict(artist='neuefische'), bitrate=1800)

    ani.save(name,
            writer=writer,
            dpi=200)
    plt.close()


# Setup done, let's use it!

First we need some data, so let's generate a beautiful test signal and create a function to plot it.

In [None]:
def signal_plot(x,y,z=None):
    fig, ax = plt.subplots(figsize=(16, 8))

    font = {'family': 'sans-serif',
            'color':  'black',
            'weight': 'normal',
            'size': 16}

    ax.xaxis.label.set_size(14)
    ax.yaxis.label.set_size(14)

    ax.grid(True)
    ax.tick_params(labelcolor='dimgrey',
                    labelsize='medium',
                    length=7,
                    color='lightgrey',
                    direction="out",
                    left=True,
                    bottom=True)

    ax.set_title('Generate a test data Signal',
                    pad=30,
                    fontdict=font)

    ax.set_xlabel('timestamp',
                    labelpad=20,
                    color="dimgrey")

    ax.set_ylabel('Signal',
                    labelpad=20,
                    color="dimgrey")

    ax.set_xlim(-0,60)
    ax.set_ylim(0,12)
    # keyword arguments: newer seaborn versions no longer accept x and y positionally
    sns.lineplot(x=x,y=y,color=NF_BLUE,ax=ax)
    sns.scatterplot(x=x,y=y,color=NF_BLUE,marker="o",s=150,ax=ax)
    if z is not None:
        sns.lineplot(x=x,y=z,color=NF_ORANGE,ax=ax)
    return ax

In [None]:
#Generate random data
samples=60
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

plt_ax=signal_plot(x,y)


Wait a second. That's not a beautiful signal, that's wiggly! Let's use something to smooth things over a little.

How about we add to the plot... the average?

In [None]:
#plt_ax=signal_plot(x,y,pd.DataFrame(y).mean().tolist()*len(x))
plt_ax=signal_plot(x,y,[y.mean()]*len(x))

No. How about we add to the plot... the moving average?

## Moving average

In [None]:
plt_ax=signal_plot(x,y,pd.DataFrame(y).rolling(9).mean()[0].tolist())

That looks much better! But what is this orange line? Let's use our animation to find out what's behind it.

In [None]:
#MA and Animation settings
name="visualisations/MA_uniform.mp4"     # name of generated animation
smooth_type="unif"  # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.4           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second


#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#process data
data = pd.DataFrame({"ts":x,"value":y})
df_precomputed_data=compute_rolling_data(data,window_size,alpha)

#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,alpha,samples,fps)

#show the result
Video(name, width=1024)

Yes! It worked :) 

Now, let's have a closer look. First, the blue line: this is the input data series we are analyzing. Next, look at the moving window: we selected a size of 9, so it's 9 points that are moving through the data to compute the mean at each step (hence, moving average). Those nine points are marked with big colored dots and a matching dotted line to the x-axis. Below the axis you see the weight that each point is given. As we are just using a mean here, each of the 9 points has a uniform weight of 1/9, as shown by the grey bars and the percentage number (1/9 = 11.1%).

The colored bars above the x-axis show the contribution of the window points (weight multiplied with their value). On the right side, next to the distributions, those contributions are stacked. If you add all of them up, you end up with the orange x, the computed rolling average. Easy, right?
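
The stacking of contributions can be checked with a few lines of NumPy. The nine window values below are made up for illustration:

```python
import numpy as np

window = np.array([3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0, 10.0])  # 9 made-up points
weights = np.ones(len(window)) / len(window)   # uniform: 1/9 each

contributions = window * weights               # the colored bars
stacked = np.cumsum(contributions)             # the stacked bar on the right

print(np.round(weights[0], 3))                 # weight of each point
print(stacked[-1], window.mean())              # top of the stack equals the plain mean
```

The top of the stack is the value marked by the orange x in the animation.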


## Triangular weighted moving average

In the last example, we used the mean on the window points, so all points had the same weight. But in time series it often makes more sense to put more weight on the more recent points and less weight on the ones further in the past. For example, take a time series of your daily measurements on the scale: it makes sense that today's measurement is closer to yesterday's measurement than to the one from the day before that (unless you take seasonality into account, because we typically eat more on weekends, but we'll get to that later).

So let's change the weight distribution from uniform to triangular: the oldest of the window points gets the weight one, the second oldest two, etc.

Let's see how this affects the moving average (and for a better comparison, we just stack it on top of the previous animation).

In [None]:
name="visualisations/MA_triang.mp4"     # name of generated animation
smooth_type="triang"   # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.4           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second


#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#process data
data = pd.DataFrame({"ts":x,"value":y})
df_precomputed_data=compute_rolling_data(data,window_size,alpha)

#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,alpha,samples,fps)

#show the result
Video(name, width=1024)

In general it looks pretty similar. But we can clearly see the different distribution of the weights by looking at the grey bars and the attached percentage numbers (again, the numbers are scaled down from [1...9] to ensure that their sum is one).
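
The scaling is just a division by the sum of the raw weights (1 + 2 + ... + 9 = 45). A short sketch, with made-up window values, comparing the uniform and the triangular result:

```python
import numpy as np

window = np.array([3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0, 10.0])  # made-up, oldest first

raw = np.arange(1, len(window) + 1)      # 1, 2, ..., 9
tri_weights = raw / raw.sum()            # divide by 45 so the weights sum to one

print(np.round(100 * tri_weights, 1))    # percentages as shown under the axis
print(window.mean())                     # uniform mean
print(np.sum(window * tri_weights))      # triangular mean, pulled towards recent points
```

Because the made-up values rise towards the end of the window, the triangular mean ends up above the uniform mean.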

As we put more weight on the most recent data points, this triangular weighted moving average follows the original signal more closely, but still shows good smoothing. We can see that not just by eyeballing it, but also by looking at the RMSE for both lines, which is shown in the plot as well. The number given in the animation keeps updating, so it's always the RMSE up to the current point. As the RMSE is smaller for the triangular line, we can see that it is a closer fit to the original signal. Nice!




## Exponentially weighted moving average

What else can we do? In the previous examples, we used a fixed window size: we always used nine points to compute the average. But if we adjust the weights anyway, we don't have to restrict ourselves to fixed windows.

Let's see how that looks and what it means!

In [None]:
name="visualisations/MA_exp.mp4"     # name of generated animation
smooth_type="exp"   # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.4           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second

#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#process data
data = pd.DataFrame({"ts":x,"value":y})
df_precomputed_data=compute_rolling_data(data,window_size,alpha)

#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,alpha,samples,fps)

#show the result
Video(name, width=1024)

As you can see, all the points from the history are kept within the window! But to avoid an overwhelming impact of the past values, we make sure that the weight we put on points further in the past decays quite fast. For example, the weight of the 15th most recent point is already down to 0.03\%. But it's still (slightly) contributing!

What we used here is called: exponentially weighted moving average (EWMA). For most applications where a moving average can be used, this yields very good results.

But why is it called **exponentially** weighted? The name stems from the way the decay in the weights is computed: it's an exponential decay that's controlled by the parameter "alpha" (in the example we use alpha=0.4).

Let's look at how we computed the weights:

$w_i\approx \alpha \cdot (1-\alpha)^i$ or in our implementation (the `np.flip` puts the largest weight last, on the most recent point)
```python
weights = np.flip(np.array([alpha*(1-alpha)**i for i in range(len(df))]))
```
So the most recent point ($x_t$, i=0) gets the weight:

$w_0=\alpha \cdot (1-\alpha)^0 = \alpha=0.4$

The point prior to that ($x_{t-1}$, i=1) gets the weight:

$w_1=\alpha \cdot (1-\alpha)^1 = 0.4 \cdot 0.6=0.24$

The next point ($x_{t-2}$, i=2) gets the weight:

$w_2=\alpha \cdot (1-\alpha)^2 = 0.4 \cdot 0.6^2=0.144$

and so on. 


And for those who watched very carefully:
Yes, for the first points the displayed weight is slightly different, because this exponential series only sums to one with infinitely many members. Hence, to compute the mean we actually rescale the weights so that their sum is one. E.g. for the second point we don't use the weights 0.4 and 0.24 but instead 0.4/(0.4+0.24)=0.625 and 0.24/(0.4+0.24)=0.375. As the series gets longer, the weights get closer to $w_i\approx \alpha \cdot (1-\alpha)^i$.
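
The rescaling can also be written in closed form. For a series of $n$ points, the raw weights form a finite geometric series:

$\sum_{i=0}^{n-1} \alpha \cdot (1-\alpha)^i = 1-(1-\alpha)^n$

Dividing by this sum gives the weights that are actually applied (written $\tilde{w}_i$ here, a symbol introduced only for this note):

$\tilde{w}_i = \frac{\alpha \cdot (1-\alpha)^i}{1-(1-\alpha)^n}$

For $n=2$ and $\alpha=0.4$ the denominator is $1-0.6^2=0.64$, which gives $\tilde{w}_0=0.4/0.64=0.625$ and $\tilde{w}_1=0.24/0.64=0.375$. As $n$ grows, $(1-\alpha)^n$ goes to zero and $\tilde{w}_i$ approaches $\alpha \cdot (1-\alpha)^i$.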

To see this look at the 3 most recent weights for series of increasing length with:
```python
for i in [1,2,5,10,100]:
    display(_weights([1]*i,"exp",0.4)[-3:])
```

In [None]:
for i in [1,2,5,10,100,1000]:
    display(_weights([1]*i,"exp",0.4)[-3:])
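
As a cross-check, the normalized weights should reproduce what pandas computes with `ewm(alpha=...).mean()` and its default `adjust=True`. The sketch below re-implements the weighting inline on a short made-up series rather than calling `_weights()`:

```python
import numpy as np
import pandas as pd

alpha = 0.4
values = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0])  # made-up data

manual = []
for n in range(1, len(values) + 1):
    # weights for the first n points, oldest first, normalized to sum to one
    w = np.flip(alpha * (1 - alpha) ** np.arange(n))
    w = w / w.sum()
    manual.append(np.sum(values.iloc[:n].to_numpy() * w))

pandas_ewma = values.ewm(alpha=alpha).mean()  # adjust=True is the default

print(np.round(manual, 4))
print(np.round(pandas_ewma.to_numpy(), 4))
```

The second value, for example, is 0.625 * 3 + 0.375 * 1 = 2.25 in both computations.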

## Outliers

One of the cases where EWMA (or any of the MAs) is used is smoothing data. So let's see how the different approaches react to outliers!

In [None]:
name="visualisations/MA_exp_outlier.mp4"     # name of generated animation
smooth_type="exp"   # one of: unif, triang, exp
window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.4           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second

#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#Introduce "outlier"
y[15]=20

#process data
data = pd.DataFrame({"ts":x,"value":y})
df_precomputed_data=compute_rolling_data(data,window_size,alpha)

#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,alpha,samples,fps)

#show the result
Video(name, width=1024)

As you can see, all of the rolling average models are influenced by the outlier. So if it really is an outlier, it might be preferable to replace it with an imputed value (surprise: moving averages are ideal for that as well!) before computing the MA.

But if that is not desired, the models still strongly smooth the effect of the outlier. As you can see, EWMA has the smallest RMSE and is overall closest to the signal. However, unsurprisingly, this means it's also the one most affected by the outlier!
If, on the other hand, you look at the uniform mean, the impact directly at the jump is the smallest, but it lingers for the whole window duration.
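
The different reactions to a single spike can be reproduced on a toy series: zeros with one outlier. This is a sketch with made-up numbers and a window of three, using the pandas built-ins rather than the custom helpers:

```python
import pandas as pd

values = pd.Series([0.0] * 10)
values.iloc[4] = 9.0                       # one made-up outlier

rolling = values.rolling(3).mean()         # uniform window of 3
ewma = values.ewm(alpha=0.4).mean()        # exponentially weighted

print(pd.DataFrame({"value": values, "rolling": rolling, "ewma": ewma}).round(2))
```

The rolling mean sits at 3.0 for exactly three steps and then drops back to zero, while the EWMA jumps higher at the outlier and then decays gradually without ever returning exactly to zero.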


## Hyperparameter tuning

Next let's play a little with the smoothing parameter (i.e. alpha). We can select values ranging from a lot of smoothing (alpha close to zero) to barely any smoothing (alpha close to one).
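
One way to get a feeling for alpha is to ask how many of the most recent points carry a given share of the total weight. For a long series, the weights of the $n$ most recent points sum to $1-(1-\alpha)^n$, so the sketch below solves for $n$. The function name and the 95% threshold are choices made for this illustration only:

```python
import math

def points_for_share(alpha, share=0.95):
    """Smallest n with 1 - (1 - alpha)**n >= share (hypothetical helper)."""
    return math.ceil(math.log(1 - share) / math.log(1 - alpha))

for alpha in [0.05, 0.4, 0.75]:
    print(alpha, points_for_share(alpha))
```

With these numbers, alpha=0.05 needs 59 points to reach 95% of the weight, alpha=0.4 needs 6 and alpha=0.75 only 3. For a series of 60 samples, alpha=0.05 therefore averages over almost the whole history.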

In [None]:
name="visualisations/MA_exp_outlier_small.mp4"     # name of generated animation
smooth_type="exp"   # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.05           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second


#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#Introduce "outlier"
y[15]=20

#process data
data = pd.DataFrame({"ts":x,"value":y})
df_precomputed_data=compute_rolling_data(data,window_size,alpha)

#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,alpha,samples,fps)

#show the result
Video(name, width=1024)

It's much smoother, but then again the error increased quite a bit, and especially at the end we see how the moving average just lags behind the signal.

In [None]:
name="visualisations/MA_exp_outlier_large.mp4"     # name of generated animation
smooth_type="exp"   # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.75           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second


#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#Introduce "outlier"
y[15]=20

#process data
data = pd.DataFrame({"ts":x,"value":y})
df_precomputed_data=compute_rolling_data(data,window_size,alpha)

#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,alpha,samples,fps)

#show the result
Video(name, width=1024)

The other extreme: the error is much smaller, but the line isn't much smoother than the signal.

# What's the time? Gridsearch time!


Up to now, we considered the dataset to be the true data, and the RMSE as an error introduced by deviating from the true data.

Let's just assume we know the "true" value of the data we are looking at, and we think that those wiggles are just noise.
Let's face it: this line
```
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 
```
looked fishy from the beginning, right?!


Let's say the latter part is the actual signal and the ARMA process part is just noise on top of that.

For example, this could be the situation in a laboratory, where you are testing a new sensor you are developing in a controlled environment. The "data" would be the measured signal from the sensor and the "true data" would be the controlled load on the sensor.

Now to develop a good driver for our sensor, we can investigate the optimal smoothing for our EWMA on the laboratory data and compute the RMSE as an error between the smoothing model and the actual true data. 

How do we optimize this? We use a gridsearch on the parameters! 
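
Stripped of all plotting, the grid search is a loop over candidate alphas that keeps the one with the smallest RMSE against the true signal. The sketch below uses pandas' `ewm` directly and a made-up noisy ramp. As a simplifying assumption it uses white noise from NumPy instead of the ARMA process and has no outlier, so the resulting alpha will differ from the notebook's result.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(13)
n = 60
truth = np.arange(n) * 10 / n                    # the noiseless ramp
noisy = pd.Series(truth + rng.normal(0, 1, n))   # assumption: white noise, not ARMA

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# try every alpha on the grid and score it against the true signal
alphas = np.arange(0.01, 1, 0.01)
scores = [rmse(truth, noisy.ewm(alpha=a).mean().to_numpy()) for a in alphas]

best = int(np.argmin(scores))
print(f"best alpha={alphas[best]:.2f}, rmse={scores[best]:.2f}")
```

The same pattern works for any single hyperparameter, as long as a reference signal is available to score against.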


In [None]:
name=""     # name of generated animation
smooth_type="exp"   # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. numbers of points used
alpha=0.75           # smoothing factor for exp (0,1)
samples=60          # numbers of sample in test frame
fps=3               # animation frames per second

#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#Introduce "outlier"
y[15]=20

#process data
data = pd.DataFrame({"ts":x,"value":y})

alpha_store=[]
rmse_store=[]

for alpha in np.arange(0.01,1,0.01):
    df_precomputed_data=compute_rolling_data(data,window_size,alpha)
    alpha_store.append(alpha)
    rmse_store.append(df_precomputed_data.rmse_custom_ewma_noiseless.iloc[-1])

arg=np.argmin(np.abs(rmse_store))    
best_alpha=alpha_store[arg]
best_rmse=rmse_store[arg]

fig, ax = plt.subplots(figsize=(16, 8))

font = {'family': 'sans-serif',
        'color':  'black',
        'weight': 'normal',
        'size': 16}

ax.xaxis.label.set_size(14)
ax.yaxis.label.set_size(14)

ax.grid(True)
ax.tick_params(labelcolor='dimgrey',
                labelsize='medium',
                length=7,
                color='lightgrey',
                direction="out",
                left=True,
                bottom=True)

ax.set_title('Gridsearch for optimal smoothing',
                pad=30,
                fontdict=font)

ax.set_xlabel('Alpha',
                labelpad=20,
                color="dimgrey")

ax.set_ylabel('RMSE',
                labelpad=20,
                color="dimgrey")

ax.set_xlim(-0.2,1.2)
ax.set_ylim(0,3)

# plot all grid points in one call instead of one scatter call per point
plt.scatter(alpha_store,rmse_store,color=NF_BLUE,s=10)
plt.scatter(alpha_store[arg],rmse_store[arg],color=NF_ORANGE,label="exponential smoothing")

plt.scatter(alpha_store[arg],rmse_store[arg],s=100,marker="X",color=NF_ORANGE)

plt.axhline(df_precomputed_data["rmse_custom_rollmean_unif_noiseless"].iloc[-1],color=NF_ORANGE,alpha=0.5, label="uniform smoothing")
plt.axhline(df_precomputed_data["rmse_custom_rollmean_triang_noiseless"].iloc[-1],color=NF_ORANGE,alpha=0.8, label="triangular smoothing")
plt.axvline(alpha_store[arg],color="grey")
plt.axhline(rmse_store[arg],color="grey")
plt.legend(frameon=False)
print(f"Best fit with exponential smoothing: alpha={round(alpha_store[arg],2)} and rmse={round(rmse_store[arg],2)}")


As we can see from the plot, we get a nice minimum in the error function at alpha $\approx$ 0.11.

Now let's use this to plot our optimized smoothing model for this application.

In [None]:
name="visualisations/MA_uniformexp_opti.mp4"     # name of generated animation
smooth_type="exp"   # one of: unif, triang, exp

window_size=9       # size of rolling window, i.e. number of points used
alpha=best_alpha    # smoothing factor for exp (0,1)
samples=60          # number of samples in test frame
fps=3               # animation frames per second

#Generate random data
np.random.seed(13)
x=np.arange(samples)
y=sm.tsa.ArmaProcess(ar=[1,-0.8]).generate_sample(nsample=samples) + x*10/samples 

#Introduce "outlier"
y[15]=20

#process data
data = pd.DataFrame({"ts":x,"value":y})

#recompute data with best alpha
df_precomputed_data=compute_rolling_data(data,window_size,best_alpha)
    
#make the animation
generate_animation(name,df_precomputed_data,smooth_type,window_size,best_alpha,samples,fps,noiseless=True)


#show the result
Video(name, width=1024)

The value of "RMSE signal" is the RMSE between the true data (dashed blue line) and the signal (blue line).

The first set of RMSEs is the error between the smoothed data and the signal (blue line).

The second set of RMSEs is the error between the smoothed data and the true data (dashed line). This is what we were optimizing the model for in the gridsearch.

To be fair, we could also include different window sizes for the uniform and triangular smoothing in the gridsearch. You can implement that if you want.
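As a starting point, here is a minimal sketch of such a window-size search. It assumes trailing windows, Gaussian noise on a known linear trend, and a hand-written weighting helper; `triangular_weights` and `rolling_rmse` are hypothetical names for this illustration and are not the notebook's own functions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: known linear trend plus Gaussian noise.
rng = np.random.default_rng(0)
n = 60
true_signal = np.arange(n) * 10 / n
measured = pd.Series(true_signal + rng.normal(scale=1.0, size=n))

def triangular_weights(window):
    """Symmetric weights that rise linearly to the centre and fall again."""
    half = np.arange(1, (window + 1) // 2 + 1)
    return np.concatenate([half, half[::-1][window % 2:]]).astype(float)

def rolling_rmse(series, truth, weights):
    """RMSE of a trailing weighted moving average, skipping warm-up NaNs."""
    smoothed = series.rolling(len(weights)).apply(
        lambda v: np.dot(v, weights) / weights.sum(), raw=True
    )
    valid = smoothed.notna().to_numpy()
    diff = smoothed.to_numpy()[valid] - truth[valid]
    return float(np.sqrt(np.mean(diff ** 2)))

# Evaluate both weighting schemes for every candidate window size.
results = []
for window in range(2, 21):
    results.append({
        "window": window,
        "uniform": rolling_rmse(measured, true_signal, np.ones(window)),
        "triangular": rolling_rmse(measured, true_signal, triangular_weights(window)),
    })

scores = pd.DataFrame(results).set_index("window")
print(scores.idxmin())  # best window size per weighting scheme
```

Note that larger windows have a longer warm-up phase, so their RMSE is computed on fewer points. For a strict comparison you could evaluate all window sizes on the same range of timestamps.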

# Outlook
MAs can also be used to forecast data (just leave out the current point), or to impute missing values / outliers, which is quite relevant when processing time series with methods that don't allow gaps in the data (e.g. seasonal decomposition). For imputation it's often beneficial to use symmetric windows around the missing value (e.g. the average of the 4 points before and the 4 points after the gap). 
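Both ideas fit in a few lines of pandas. This is a minimal sketch on a made-up linear series with one artificial gap; the variable names are hypothetical and the window sizes are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Made-up example series 0, 1, ..., 19 with one artificial gap.
values = pd.Series(np.arange(20, dtype=float))
values.iloc[10] = np.nan

# Imputation: a centred window of 9 covers 4 points before the gap,
# the gap itself and 4 points after it. NaNs are skipped in the mean,
# so only the 8 neighbours contribute.
centred_mean = values.rolling(window=9, center=True, min_periods=1).mean()
imputed = values.fillna(centred_mean)
print(imputed.iloc[10])   # 10.0 for this linear example

# Forecast: shift by one step so the average at time t only uses
# points strictly before t (the current point is left out).
forecast = imputed.shift(1).rolling(window=4).mean()
print(forecast.iloc[5])   # mean of the values at t=1..4, i.e. 2.5
```

The centred window looks into the future, so it only works for imputing historical data. For forecasting, the shifted trailing window is the variant to use.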

In this Notebook three different weight distributions were shown, but since the weights can be customised, the available options are limitless. For the most used ones see [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html) and the built-in window functions from [scipy](https://docs.scipy.org/doc/scipy/reference/signal.windows.html#module-scipy.signal.windows).

**Don't forget to clear outputs before you commit a Notebook**