# Applying a linear filter to a digital signal

Now that we have looked at FFTs, let's consider the simple linear filter.  Linear filters work well on simple data from which we want to remove noise.  Later we will look at the Kalman filter, but it requires some knowledge of the statistics of the data, whereas a linear filter does not.

We begin with the standard imports.

In [None]:
import numpy as np
import scipy as sp
import scipy.signal as sg
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In this example, let's look at NASDAQ closing prices over a period of a few years.

In [None]:
nasdaq_df = pd.read_csv(
    'data/nasdaq.csv',
    index_col='Date',
    parse_dates=['Date'])

In [None]:
nasdaq_df.head()

In [None]:
date = nasdaq_df.index
nasdaq = nasdaq_df['Close']

Plotting the closing prices, we find a definite trend, with a peak and lots of noise.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
nasdaq.plot(ax=ax, lw=1)

Using a linear filter, we can remove much of the noise, while preserving the important information like the peak.

Notice that if we simply fit a line, e.g. with linear regression, we would get a trend, but we would eliminate the peak around the year 2000.  Historically this peak is very important: it marks the top of the dot-com bubble, and the stock market dropped significantly afterwards, including after the 9/11 attacks in 2001.
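
To make the contrast between a line fit and a linear filter concrete, here is a minimal sketch on synthetic data (a made-up trend with a bump and noise, not the NASDAQ series). The variable names and the shape of the bump are assumptions chosen for illustration only.

```python
import numpy as np
import scipy.signal as sg

# Synthetic stand-in for a price series: slow trend + bump + noise.
rng = np.random.default_rng(0)
t = np.arange(1000)
trend = 0.5 * t
bump = 300.0 * np.exp(-0.5 * ((t - 500) / 50.0) ** 2)
x = trend + bump + rng.normal(scale=20.0, size=t.size)

# Straight-line fit (least squares with a degree-1 polynomial).
slope, intercept = np.polyfit(t, x, 1)
line = slope * t + intercept

# Smoothing with a 60-sample triangular window, normalized to unit sum.
h = sg.get_window('triang', 60)
smooth = sg.convolve(x, h / h.sum(), mode='same')

# At the centre of the bump, the smoothed curve sits well above the line.
print(smooth[500] - line[500])
```

The line fit can only report an average slope, so the bump is absorbed into the intercept, while the smoothed curve still follows it.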

In [None]:
# We get a triangular window with 60 samples.
h = sg.get_window('triang', 60)
# We convolve the signal with this window, normalized to unit sum
# so that the overall level of the signal is preserved.
fil = sg.convolve(nasdaq, h / h.sum())
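
One detail worth checking is how the convolution output lines up with the input. `sg.convolve` defaults to `mode='full'`, whose output is longer than the input, and keeping only the first `len(x)` samples leaves the smoothed curve lagging by roughly half the window. The sketch below, on a synthetic spike rather than the NASDAQ data, compares this with `mode='same'`, which is one possible way to keep the output centred.

```python
import numpy as np
import scipy.signal as sg

# Synthetic test signal: a single spike at index 200.
x = np.zeros(500)
x[200] = 1.0

h = sg.get_window('triang', 60)
h = h / h.sum()  # unit sum, so the mean level is preserved

full = sg.convolve(x, h)               # mode='full': 500 + 60 - 1 samples
same = sg.convolve(x, h, mode='same')  # centred output: 500 samples

print(len(full), len(same))
# Position of the smoothed spike in the truncated and centred outputs.
print(np.argmax(full[:len(x)]), np.argmax(same))
```

With the truncated `'full'` output the smoothed spike appears about 30 samples late, which on daily data is about a month of lag.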

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
# We plot the original signal...
nasdaq.plot(ax=ax, lw=3)
# ... and the filtered signal (the default 'full' convolution is
# longer than the input, so we keep the first len(nasdaq) samples).
ax.plot(date, fil[:len(nasdaq)],
        '-w', lw=2)

Going one step further, we can apply a Butterworth low-pass filter to the original signal, which gives an even smoother result.

Can you see the difference?

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
nasdaq.plot(ax=ax, lw=3)
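# We create a 4th-order Butterworth low-pass filter. The cutoff is
# normalized to the Nyquist frequency: 2. / 365 is one cycle per 365 samples.
b, a = sg.butter(4, 2. / 365)
# We apply this filter forward and backward (zero phase shift).
ax.plot(date, sg.filtfilt(b, a, nasdaq),
        '-w', lw=2)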
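
The cutoff argument of `sg.butter` is easy to misread, so here is a small sketch of what `2. / 365` means. It assumes one sample per day (`fs = 1.0`), which is an assumption about the data; if the series contains only trading days, the cutoff in calendar time is somewhat different. The two test sinusoids are synthetic.

```python
import numpy as np
import scipy.signal as sg

fs = 1.0            # assumed sampling rate: one sample per day
cutoff = 1.0 / 365  # cutoff frequency: one cycle per 365 samples

# Normalized form: Wn = cutoff / Nyquist = cutoff / (fs / 2) = 2 / 365.
b1, a1 = sg.butter(4, cutoff / (fs / 2))
# Same filter, with the cutoff given in the units of fs.
b2, a2 = sg.butter(4, cutoff, fs=fs)
print(np.allclose(b1, b2), np.allclose(a1, a2))

# A slow component (below the cutoff) plus a fast one (above it).
t = np.arange(5 * 365)
slow = np.sin(2 * np.pi * t / (4 * 365))  # four-year period
fast = np.sin(2 * np.pi * t / 30)         # 30-day period
y = sg.filtfilt(b1, a1, slow + fast)

# Away from the edges, the output should follow the slow component.
print(np.max(np.abs(y[800:1000] - slow[800:1000])))
```

`sg.filtfilt` runs the filter forward and then backward, so the smoothed curve is not delayed relative to the input, unlike a single causal pass.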

We can also use a Butterworth filter the other way around: instead of plotting the smooth line, we apply a high-pass filter, which keeps only the fast variations, roughly the delta between the raw data and a smooth line.  This can reveal important underlying features in the data for further analysis.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
nasdaq.plot(ax=ax, lw=1)
# We create a 4th-order Butterworth high-pass filter
# (cutoff: one cycle per 365 / 5 = 73 samples).
b, a = sg.butter(4, 2 * 5. / 365, btype='high')
ax.plot(date, sg.filtfilt(b, a, nasdaq),
        '-', lw=1)
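
The high-pass output is closely related to, but not identical with, the literal difference between the raw data and a low-pass trend. The sketch below computes both on a synthetic series (a slow swell plus noise, assumed for illustration) so the two can be compared; the cutoffs are the ones used in the cells above.

```python
import numpy as np
import scipy.signal as sg

# Synthetic series, not real prices: constant level + slow swell + noise.
rng = np.random.default_rng(1)
t = np.arange(3000)
x = (100.0 + 20.0 * np.sin(2 * np.pi * t / 3000)
     + rng.normal(scale=2.0, size=t.size))

# Literal "delta": raw data minus the low-pass trend.
b, a = sg.butter(4, 2.0 / 365)
residual = x - sg.filtfilt(b, a, x)

# High-pass filter with the higher cutoff used above.
bh, ah = sg.butter(4, 2 * 5.0 / 365, btype='high')
highpass = sg.filtfilt(bh, ah, x)

# Compare the two away from the edges of the series.
mid = slice(1000, 2000)
print(residual[mid].mean(), residual[mid].std())
print(np.corrcoef(residual[mid], highpass[mid])[0, 1])
```

Both results are centred on zero and strongly correlated; the high-pass version additionally removes fluctuations between the two cutoff frequencies.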