# Lab 1 - Time Domain Filtering and Music Synthesis

Instructor: Prof. Lillian Jane Ratliff

Teaching Assistants: Ashwin Srinivas Badrinath and Ellory Freneau

Team Members: 

In [None]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import IPython
from scipy.io import wavfile
import scipy.signal
from scipy import *
import copy
import pylab as pl
from scipy import signal
import time as t
from IPython import display

# Implementing Discrete Time Filters to Filter Time-Series Data

In this part, we will be looking at various discrete time filters and how they are used to make more sense of time-series data. These are very common, basic and helpful operations that one encounters in anything related to signal processing.

## Implementing a Mean Filter

In [None]:
# choose relevant parameters
srate = 1000 # sampling rate in Hz
time  = np.arange(0,3,1/srate) # associated time vector that corresponds to 3 seconds
n     = len(time) # length of the time vector
p     = 15 # poles for random interpolation
pi = np.pi # value of pi

# here are some base signals to work with
base1 = np.interp(np.linspace(0,p,n),np.arange(0,p),np.random.rand(p)*30)
base2 = 5*np.sin(2*pi*5*time)

# create some random noise to be added to the above base signals

noiseamp = 1
noise  = noiseamp * np.random.randn(n)

# add noise to the base signals to create new noisy signals
signal1 = base1 + noise
signal2 = base2 + noise

# implement the running mean filter


filtsig1 = np.zeros(n) # initialize filtered signal vector for signal 1
filtsig2 = np.zeros(n) # initialize filtered signal vector for signal 2

k = 20 # half-width; the full filter window is k*2+1 samples
for i in range(k,n-k):
    # each point is the average of the 2k+1 points centred on it
    filtsig1[i] = np.mean(signal1[i-k:i+k+1])
    filtsig2[i] = np.mean(signal2[i-k:i+k+1])

# compute the time window size in ms and print it
windowsize = 1000*(k*2+1) / srate
print("The time window size used was ",windowsize,"ms")

# For base signal 1:
# In a single figure with three subplots, plot the original signal, the noisy signal and
# the filtered signal overlaid on the noisy signal to see the difference


fig1=plt.figure(1)
fig1.subplots_adjust(hspace=1,wspace=1,left=0.1)

plt.subplot(3,1,1)
plt.plot(time,base1,label='orig')
plt.title('Original Signal')

plt.subplot(3,1,2)
plt.plot(time,signal1,label='noisy')
plt.title('Noisy Signal')

plt.subplot(3, 1, 3)
plt.plot(time,signal1,label='noisy')
plt.plot(time,filtsig1,label='filtered')
plt.title('Noisy and Filtered Signal')

plt.legend()

# For base signal 2:
# In a single figure with three subplots, plot the original signal, the noisy signal and
# the filtered signal overlaid on the noisy signal to see the difference

fig2=plt.figure(2)
fig2.subplots_adjust(hspace=1,wspace=1,left=0.1)

plt.subplot(3,1,1)
plt.plot(time,base2,label='orig')
plt.title('Original Signal')

plt.subplot(3,1,2)
plt.plot(time,signal2,label='noisy')
plt.title('Noisy Signal')

plt.subplot(3, 1, 3)
plt.plot(time,signal2,label='noisy')
plt.plot(time,filtsig2,label='filtered')
plt.title('Noisy and Filtered Signal')

plt.legend()

## Discussion

**Comment on how the results and plots change when you amplify the noise more and also change the value of k.**

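When the noise amplitude is increased, more residual noise survives in the filtered signal: for uncorrelated noise, averaging over a window reduces the noise standard deviation only by about the square root of the window length. If k is too small, too little averaging takes place and the output remains noisy. If k is too large, the filter starts to attenuate the signal itself: the 5 Hz sinusoid shrinks as the window approaches its 200-sample period, and sharp features of the interpolated signal are rounded off, so the shape of the filtered signal deviates from the shape of the true signal. Because the window is essentially symmetric about each sample, the filter introduces no appreciable phase lag; the samples near the two ends, which the loop never fills, stay at zero.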
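The trade-off described above can be checked numerically. The following is a minimal sketch, assuming a centred window of 2k+1 samples implemented with `np.convolve` as a stand-in for the loop in the cell above. The helper names `running_mean` and `sine_gain` are hypothetical and are not part of the lab code, and the gain expression is the standard frequency response of a centred moving average.

```python
import numpy as np

def running_mean(x, k):
    """Centred moving average with a window of 2*k + 1 samples.

    Only the fully overlapping ('valid') region is returned,
    so the output has len(x) - 2*k samples.
    """
    window = np.ones(2 * k + 1) / (2 * k + 1)
    return np.convolve(x, window, mode="valid")

def sine_gain(freq, k, srate):
    """Amplitude gain of the centred mean filter for a sinusoid at `freq` Hz."""
    width = 2 * k + 1
    return np.sin(np.pi * freq * width / srate) / (width * np.sin(np.pi * freq / srate))

srate = 1000                       # sampling rate in Hz, as in the lab
rng = np.random.default_rng(0)     # fixed seed so the run is repeatable
noise = rng.standard_normal(30000) # unit-variance white noise

for k in (2, 20, 100):
    residual = np.std(running_mean(noise, k))   # noise left after filtering
    predicted = 1 / np.sqrt(2 * k + 1)          # expected value for white noise
    gain = sine_gain(5, k, srate)               # what happens to the 5 Hz sinusoid
    print(f"k={k:3d}  residual noise={residual:.3f}  predicted={predicted:.3f}  5 Hz gain={gain:.3f}")
```

Larger k leaves less noise but also lowers the gain applied to the 5 Hz component, which is the attenuation visible in the plots for signal 2.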



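**Mention and explain any ONE of many possible drawbacks of the mean filter in analysing noisy time-series.**

It is sensitive to outliers in the data: a single large spike is averaged into every window that contains it, so the spike is smeared across roughly 2k+1 output samples instead of being removed.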
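To see how a single outlier affects a mean filter compared with a median filter, here is a minimal sketch on a toy signal. The signal (all zeros with one spike of 100) and the window half-width `k = 2` are illustrative assumptions, not values from the lab.

```python
import numpy as np

x = np.zeros(11)
x[5] = 100.0   # one artificial outlier in an otherwise flat signal
k = 2          # half-width, so each window holds 2*k + 1 = 5 samples

# Apply both filters only where the full window fits inside the signal.
centres = range(k, len(x) - k)
mean_out = np.array([np.mean(x[i - k:i + k + 1]) for i in centres])
median_out = np.array([np.median(x[i - k:i + k + 1]) for i in centres])

print("mean filter:  ", mean_out)
print("median filter:", median_out)
```

Here the mean output equals 20 at the five positions whose window contains the spike, while the median output stays at 0 everywhere.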

## Implementing a Median Filter to Remove Spikes

In [None]:
# create signal
n = 2000
signal = np.cumsum(np.random.randn(n))

# proportion of time points to replace with noise
propnoise = .05

# find noise points
noisepnts = np.random.permutation(n)
noisepnts = noisepnts[0:int(n*propnoise)]

# generate signal and replace points with noise
signal[noisepnts] = 50+np.random.rand(len(noisepnts))*100

fig3=plt.figure(3)
plt.plot(range(0,n),signal)
plt.title('Noisy Signal')
plt.xlabel('time')
plt.ylabel('Signal')

# use hist to pick threshold
fig4=plt.figure(4)
plt.hist(signal,100)
plt.title('Histogram of Noisy Signal')

# visual-picked threshold
threshold = 40

# find data values above the threshold
suprathresh = np.where( signal>threshold )[0]

# initialize filtered signal
filtsig = copy.deepcopy(signal)

# loop through suprathreshold points and set each to the median of its neighbourhood
k = 20 # actual window is k*2+1
for ti in range(0,len(suprathresh)):
    
    # lower and upper bounds, clipped to the signal edges (upper bound is exclusive)
    lowbnd = np.max((0,suprathresh[ti]-k))
    uppbnd = np.min((suprathresh[ti]+k+1,n))
    
    # compute median of surrounding points
    filtsig[suprathresh[ti]] = np.median(signal[lowbnd:uppbnd])

# plot
fig5=plt.figure(5)
plt.plot(range(0,n),signal,label='noisy')
plt.plot(range(0,n),filtsig,label='filtered')
plt.title('Noisy Signal and Filtered Signal')
plt.xlabel('time')
plt.ylabel('Signal')
plt.legend()

## Discussion

**Compare the mean and median filters in terms of their uses and one advantage and disadvantage one has over the other.**

Uses: The mean filter is used to smooth a time series and make changes less abrupt. The median filter is used to remove outliers (spikes) from the data.

Advantages and Disadvantages: The median filter is better at dealing with outliers, because an isolated spike rarely changes the median of its window, whereas it always shifts the mean. The mean filter is a linear filter (a convolution with a flat kernel), which makes it easier to implement and analyse than the median filter, which is nonlinear and needs a sort or selection in every window.
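The linear versus nonlinear distinction can be tested directly with the superposition property: a linear filter applied to `a + b` gives the same result as filtering `a` and `b` separately and adding the outputs. The sketch below assumes random test inputs and two hypothetical helpers, `mean_filter` and `median_filter`, written only for this check. They pad the edges differently from the lab code (zero padding and edge replication respectively), which does not affect the linearity test.

```python
import numpy as np

def mean_filter(x, width):
    """Moving average as a convolution with a flat kernel (zero-padded edges)."""
    return np.convolve(x, np.ones(width) / width, mode="same")

def median_filter(x, width):
    """Sliding median with edge replication; `width` is assumed to be odd."""
    half = width // 2
    padded = np.pad(x, half, mode="edge")
    return np.array([np.median(padded[i:i + width]) for i in range(len(x))])

rng = np.random.default_rng(1)   # fixed seed so the run is repeatable
a = rng.standard_normal(200)
b = rng.standard_normal(200)
width = 5

# Superposition: filter(a + b) == filter(a) + filter(b) holds only for linear filters.
mean_is_linear = np.allclose(mean_filter(a + b, width),
                             mean_filter(a, width) + mean_filter(b, width))
median_is_linear = np.allclose(median_filter(a + b, width),
                               median_filter(a, width) + median_filter(b, width))

print("mean filter obeys superposition:  ", mean_is_linear)
print("median filter obeys superposition:", median_is_linear)
```

For inputs like these the mean filter is expected to pass the check and the median filter to fail it.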

## Denoising an EMG signal

In [None]:
# import data
emgdata = scipy.io.loadmat('EMG.mat')

# extract needed variables
emgtime = emgdata['emgtime'][0]
emg  = emgdata['emg'][0]

# initialize filtered signal
emgf = copy.deepcopy(emg)

# the loop version for interpretability
for i in range(1,len(emgf)-1):
    emgf[i] = emg[i]**2 - emg[i-1]*emg[i+1]

# the vectorized version for speed and elegance
emgf = copy.deepcopy(emg)
emgf[1:-1] = emg[1:-1]**2 - emg[0:-2]*emg[2:]

## convert both signals to zscore

# find timepoint zero
time0 = np.argmin(emgtime**2)

# convert original EMG to z-score from time-zero
emgZ = (emg-np.mean(emg[0:time0])) / np.std(emg[0:time0])

# same for filtered EMG energy
emgZf = (emgf-np.mean(emgf[0:time0])) / np.std(emgf[0:time0])


## plot
# plot "raw" signals (each normalized to a maximum of 1)
plt.plot(emgtime,emg/np.max(emg),'b',label='EMG')
plt.plot(emgtime,emgf/np.max(emgf),'m',label='TKEO energy')
plt.xlabel('Time (ms)')
plt.ylabel('Amplitude or energy')
plt.legend()

plt.show()

# plot zscored
plt.plot(emgtime,emgZ,'b',label='EMG')
plt.plot(emgtime,emgZf,'m',label='TKEO energy')

plt.xlabel('Time (ms)')
plt.ylabel('Zscore relative to pre-stimulus')
plt.legend()
plt.show()

## Discussion

**How would the other two filters implemented, i.e., the running mean and median filters, fare against the TKEO method in analysing the EMG signal in this fashion?**

The TKEO is the better choice over the other two filters because it accentuates differences in signal level, and that is exactly what is needed to detect muscle activity. The running mean and median filters only smooth the signal and have no such difference-accentuating property.
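A short sketch of why the operator used in the cell above, `x[n]**2 - x[n-1]*x[n+1]`, accentuates level differences: for a pure sinusoid $A\cos(\Omega n + \varphi)$ it returns the constant $A^2\sin^2(\Omega)$, so its output grows with the square of the amplitude and also with frequency. The test tones below are assumptions chosen for illustration, and the hypothetical helper `tkeo` sets the two end samples to zero, whereas the lab code leaves them at their raw values.

```python
import numpy as np

def tkeo(x):
    """Teager-Kaiser energy operator: x[n]^2 - x[n-1]*x[n+1].

    The first and last samples have no two-sided neighbours and are set to 0 here.
    """
    x = np.asarray(x, dtype=float)
    energy = np.zeros_like(x)
    energy[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return energy

n = np.arange(500)
for amp, omega in [(1.0, 0.1), (2.0, 0.1), (1.0, 0.2)]:
    tone = amp * np.cos(omega * n + 0.7)       # omega is in radians per sample
    measured = tkeo(tone)[1:-1].mean()         # interior samples only
    predicted = amp ** 2 * np.sin(omega) ** 2  # closed-form value for a sinusoid
    print(f"A={amp}, omega={omega}: measured={measured:.5f}, predicted={predicted:.5f}")
```

Doubling the amplitude quadruples the output, and raising the frequency also raises it, which is consistent with a burst of muscle activity standing out from the baseline after the transform.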

**If you had to use a running mean filter or a median filter to analyse the EMG signal to detect muscle activity, which one would you prefer and why?**

The running mean would be the better choice, because the median filter removes the spikes in the signal, and those spikes are exactly what we need in order to detect muscle activity. However, neither filter is a good candidate for EMG-based muscle activity analysis.