# Respiratory rate estimation

### Description

Seismocardiography ([SCG](https://www.ncbi.nlm.nih.gov/pubmed/24111357)) is a very promising technique for measuring Heart Rate (HR) and Respiratory Rate (RR) with the detector positioned on the sternum. It is generally based on accelerometer readings, gyroscope readings, or a combination of the two.
Ballistocardiography ([BCG](https://en.wikipedia.org/wiki/Ballistocardiography)) is another technique for estimating heart and respiratory rate from a combination of accelerometer and gyroscope readings. It is an indirect evaluation of HR and RR, since contact between the device and the body of the subject is not required (e.g., an accelerometer platform mounted under the slats of the bed).
MuSe (Multi-Sensor miniaturized, low-power, wireless [IMU](https://en.wikipedia.org/wiki/Inertial_measurement_unit)) is an Inertial Measurement Unit (IMU) provided by [221e](https://www.221e.com). In the context of this project, it records the inertial data needed to estimate SCG and BCG.
The goal of this assignment is to estimate the respiratory rate of a healthy subject, given linear acceleration and angular velocity measurements recorded with the aforementioned MuSe platform. The study must be performed on two datasets: the first is the compulsory one (center_sternum.txt), while the second is left to the discretion of the group, among those made available for the assignment.
N.B.: Remember that a normal heart beat rate is around 40-100 bpm.

[Actigraphy](https://en.wikipedia.org/wiki/Actigraphy) is a non-invasive method of monitoring human rest/activity cycles. Data will be provided by sensors recording humans during their day/night activities.

### Datasets

The data are provided as .txt files. Two healthy subjects were involved in this study, with their informed consent. The first dataset was recorded on one subject, while all the other datasets were recorded on the second subject.

This is the first mandatory file:
* center_sternum.txt: MuSe placed on the center of the sternum. The subject was lying supine on his left and right side, respectively.

Choose one of the following files in order to complete the task.
1. 1_Stave_supine_static.txt: Sensor placed on a bed stave, under the mattress at the level of the chest. The subject was lying supine on his left and right side.
2. 2_Mattress_supine.txt: Sensor placed on the mattress, near one corner but not under the pillow. The subject lay in the same position as above.
3. 3_Subject_sitting_chair.txt: Sensor placed on the desk: the subject, sitting on a chair, leaned forearms and hands on the desk.
4. 4_Chest_sweater.txt: Sensor placed on the subject chest directly on a sweater.
5. 5_Under_chair.txt: Subject sitting on a chair, sensor placed under the seat of the chair.

All .txt files contain 16 columns, in particular:
* Log Freq is the acquisition frequency in Hz (i.e., the sampling interval is constant).
* AccX, AccY, AccZ are the measured magnitude of linear acceleration along each axis.
* GyroX, GyroY, GyroZ are the measured magnitude of angular velocity along each axis.
* MagnX, MagnY, MagnZ are the measured magnitude of magnetic field along each axis.
* qw, qi, qj, qk are the quaternion components, representing the spatial orientation of the MuSe system.

Each dataset includes, in addition to the data, one file containing the adopted configuration of the MuSe (README1.txt for the first measurement, and README_5.txt for the other measurement).
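As an illustration of the column layout described above, the following minimal sketch loads a tab-separated file and derives the time axis from the `Log Freq` column rather than from a hard-coded rate. The helper name `load_with_time` and the three-row toy file are hypothetical, and the sketch assumes that `Log Freq` holds the same sampling frequency (in Hz) on every row.

```python
import io

import numpy as np
import pandas as pd


def load_with_time(source, time_col="new_seconds"):
    """Read a tab-separated recording and prepend a time column in seconds.

    Assumption: the 'Log Freq' column stores a constant sampling frequency in Hz.
    """
    df = pd.read_csv(source, delimiter="\t")
    fs = float(df["Log Freq"].iloc[0])
    # Sample index divided by the sampling frequency gives the elapsed time
    df.insert(0, time_col, np.arange(len(df)) / fs)
    return df, fs


# Toy stand-in for a real file (synthetic values, only three columns)
toy_file = io.StringIO(
    "Log Freq\tAccX\tGyroX\n"
    "200\t1.0\t0.1\n"
    "200\t2.0\t0.2\n"
    "200\t3.0\t0.3\n"
)
toy, fs = load_with_time(toy_file)
```

Building the time vector from the sample index guarantees that it has exactly one entry per row.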
 
### Assignments

Data preparation:

1.1. Load the txt file and select only the columns you are interested in, in order to do a complete data analysis (e.g. Log Freq, AccX, ... )

1.2. Plot the selected data as a function of time and choose a proper time window over which to perform the analysis. Pay attention to the time representation and the measurement unit.

1.3. In order to do appropriate work, decide whether to focus on some particular axis or some combination of axes, as well as on derived features, for the next step of the task. Motivate your choice.

Time and frequency analysis:

2.1. Statistical analysis: provide a statistical description of the chosen dataset. Statistical descriptors include, for example, mean, median, variance, standard deviation, 25th and 75th percentiles, and correlation coefficients. Investigate what could be the most interesting descriptors for this type of data, motivating the choices.

2.2. Fourier Analysis: Perform a frequency analysis of the data. Look at the spectrum and explain what you see. Use this step in order to properly design the filters in the following step.

Filter:

Implement your own filter, trying to extract the respiratory rate signal. Hints:

(a) Directly from the Fourier analysis, inverse-transform the data, looking for the most interesting frequency band.

(b) Choose the appropriate Lowpass/Bandpass/Highpass filter.

(c) Wavelet transform (a powerful instrument that performs a time and frequency analysis of the signal).

(d) Find another method by yourselves.

Motivate your choice.

Metrics:

4.1. Respiratory Rate Per Minute (RPM): extract the RPM and make a histogram of the result. Does it follow a particular distribution?

4.2. Respiratory Rate Variability (RRV): extract the RRV, explain why this parameter is important, and plot the results.

(OPTIONAL) Algorithm: Elaborate a simple algorithm to extract the respiratory rate even when the filter fails (e.g. look at a particular threshold...).

Conclusion:
 Summarise the obtained results, in particular making a comparison between the two files analysed. Highlight the limitations and critical issues encountered during the work, motivating the most relevant contribution given by your solution.

N.B.: Indicate the contribution of each member of the group to achieving the result.

 

### Contacts

* Marco Zanetti <marco.zanetti@unipd.it>
* Marco Signorelli <signo@221e.com>


In [None]:
import scipy as sp
from scipy import ndimage
import statistics
import math
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.fft import fft, fftfreq
from scipy.signal import butter, lfilter
from scipy.signal import freqz
import heartpy as hp
import scipy.stats as stats
from scipy.signal import find_peaks
%matplotlib inline
import pywt
import scaleogram as scg 
import matplotlib.gridspec as GridSpec
from mat4py import loadmat
from skimage.restoration import denoise_wavelet
from scipy import fftpack

In [None]:
# Load mandatory dataset
file_name="center_sternum.txt"
data=pd.read_csv(file_name, delimiter="\t")
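time_step1 = 1/200  # sampling interval in seconds (200 Hz acquisition)
index = data.index
number_of_rows1 = len(index)
# Scale the sample indices by the step: np.arange with a float step can return one element too many
time_vec1 = np.arange(number_of_rows1) * time_step1
data.insert(0, 'new_seconds', time_vec1)
data = data.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])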
data

In [None]:
# Plotting the raw data
%matplotlib inline
plt.rcParams.update({'font.size': 30})
data.plot(subplots=True, layout=(len(data.columns),1),figsize=(40,100))

plt.tight_layout()
plt.show()


At the beginning and end of the recording there is a lot of noise, probably due to the movements needed to start and stop the experiment. Discarding those portions is advisable, but they are kept here so that the code stays general.

In some columns, especially GyroX, the individual beats are practically visible.
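Discarding the noisy start and end of a recording can be sketched as follows. This is a minimal illustration on synthetic data, not a step the notebook performs; the helper name `trim_window` and the 2 s and 8 s bounds are hypothetical choices.

```python
import numpy as np
import pandas as pd


def trim_window(df, t_start, t_end, time_col="new_seconds"):
    """Return the rows whose time stamp lies in the closed interval [t_start, t_end]."""
    mask = (df[time_col] >= t_start) & (df[time_col] <= t_end)
    return df.loc[mask].reset_index(drop=True)


# Toy example: 10 s of samples at 200 Hz (synthetic, not the real recording)
fs = 200
toy = pd.DataFrame({"new_seconds": np.arange(10 * fs) / fs})
toy["GyroX"] = np.sin(2 * np.pi * 1.0 * toy["new_seconds"])

# Keep only the central part of the recording
trimmed = trim_window(toy, t_start=2.0, t_end=8.0)
```

In practice the bounds would be read off the raw-data plot, where the start and stop movements are visible.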

STATISTICAL ANALYSIS <br>
Performed by Selen Arsalan and Dyutideepta Banerjee <br>
Mean: the sum of all samples divided by the total number of samples. <br>
Median: the middle value of a dataset when its values are ordered from smallest to largest. <br>
Variance: a measure of how far the individual (numeric) values in a dataset are from the mean. <br>
The variance is often used to quantify spread or dispersion, i.e. how much variability there is in a sample or population. <br>
Standard deviation: the square root of the variance. It measures the dispersion of a dataset around its mean in the same units as the data; the higher the dispersion, the greater the standard deviation.
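The descriptors defined above can be computed for several columns at once. The sketch below is one compact way of doing so with pandas on a small synthetic table; the helper name `describe_columns` and the toy values are hypothetical and are not taken from the recordings.

```python
import pandas as pd


def describe_columns(df, columns):
    """Return mean, median, variance, standard deviation and quartiles per column."""
    sub = df[columns]
    summary = sub.agg(["mean", "median", "var", "std"])
    # Quantiles are added as extra rows of the summary table
    summary.loc["q25"] = sub.quantile(0.25)
    summary.loc["q75"] = sub.quantile(0.75)
    return summary


# Toy data: AccX has one large value, so its mean exceeds its median
toy = pd.DataFrame({
    "AccX": [1.0, 2.0, 3.0, 4.0, 10.0],
    "AccY": [5.0, 5.0, 5.0, 5.0, 5.0],
})
summary = describe_columns(toy, ["AccX", "AccY"])
```

For the toy `AccX` column the mean (4.0) is larger than the median (3.0), which is the pattern read as positive skew in this analysis.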

STATISTICAL ANALYSIS FOR MANDATORY DATASET

In [None]:
ax_median=(data.loc[:,"AccX"]).median(axis = 0)
ax_mean=(data.loc[:,"AccX"]).mean(axis = 0)
ax_var=(data.loc[:,"AccX"]).var(axis = 0)
ax_std=(data.loc[:,"AccX"]).std(axis = 0)
ax_25=(data.loc[:,"AccX"]).quantile(0.25)
ax_75=(data.loc[:,"AccX"]).quantile(0.75)

ay_median=(data.loc[:,"AccY"]).median(axis = 0)
ay_mean=(data.loc[:,"AccY"]).mean(axis = 0)
ay_var=(data.loc[:,"AccY"]).var(axis = 0)
ay_std=(data.loc[:,"AccY"]).std(axis = 0)
ay_25=(data.loc[:,"AccY"]).quantile(0.25)
ay_75=(data.loc[:,"AccY"]).quantile(0.75)

az_median=(data.loc[:,"AccZ"]).median(axis = 0)
az_mean=(data.loc[:,"AccZ"]).mean(axis = 0)
az_var=(data.loc[:,"AccZ"]).var(axis = 0)
az_std=(data.loc[:,"AccZ"]).std(axis = 0)
az_25=(data.loc[:,"AccZ"]).quantile(0.25)
az_75=(data.loc[:,"AccZ"]).quantile(0.75)
print("AccX mean=",ax_mean,"AccY mean=",ay_mean,"AccZ mean=",az_mean,'\n',
      "AccX median=",ax_median,"AccY median=",ay_median,"AccZ median=",az_median,'\n',
      "AccX variance=",ax_var,"AccY variance=",ay_var,"AccZ variance=",az_var,'\n',
      "AccX standard dev.=",ax_std,"AccY standard dev.=",ay_std,"AccZ standard dev.=",az_std,'\n'
      ,"AccX 25 percentile=",ax_25,"AccY 25 percentile=",ay_25,"AccZ 25 percentile=",az_25,
      "AccX 75 percentile=",ax_75,"AccY 75 percentile=",ay_75,"AccZ 75 percentile=",az_75,
      )

In [None]:
gx_median=(data.loc[:,"GyroX"]).median(axis = 0)
gx_mean=(data.loc[:,"GyroX"]).mean(axis = 0)
gx_var=(data.loc[:,"GyroX"]).var(axis = 0)
gx_std=(data.loc[:,"GyroX"]).std(axis = 0)
gx_25=(data.loc[:,"GyroX"]).quantile(0.25)
gx_75=(data.loc[:,"GyroX"]).quantile(0.75)

gy_median=(data.loc[:,"GyroY"]).median(axis = 0)
gy_mean=(data.loc[:,"GyroY"]).mean(axis = 0)
gy_var=(data.loc[:,"GyroY"]).var(axis = 0)
gy_std=(data.loc[:,"GyroY"]).std(axis = 0)
gy_25=(data.loc[:,"GyroY"]).quantile(0.25)
gy_75=(data.loc[:,"GyroY"]).quantile(0.75)

gz_median=(data.loc[:,"GyroZ"]).median(axis = 0)
gz_mean=(data.loc[:,"GyroZ"]).mean(axis = 0)
gz_var=(data.loc[:,"GyroZ"]).var(axis = 0)
gz_std=(data.loc[:,"GyroZ"]).std(axis = 0)
gz_25=(data.loc[:,"GyroZ"]).quantile(0.25)
gz_75=(data.loc[:,"GyroZ"]).quantile(0.75)
print("GyroX mean=",gx_mean,"GyroY mean=",gy_mean,"GyroZ mean=",gz_mean,'\n',
      "GyroX median=",gx_median,"GyroY median=",gy_median,"GyroZ median=",gz_median,'\n',
      "GyroX variance=",gx_var,"GyroY variance=",gy_var,"GyroZ variance=",gz_var,'\n',
      "GyroX standard dev.=",gx_std,"GyroY standard dev.=",gy_std,"GyroZ standard dev.=",gz_std,'\n'
      ,"GyroX 25 percentile=",gx_25,"GyroY 25 percentile=",gy_25,"GyroZ 25 percentile=",gz_25,
      "GyroX 75 percentile=",gx_75,"GyroY 75 percentile=",gy_75,"GyroZ 75 percentile=",gz_75,
      )

In [None]:
mx_median=(data.loc[:,"MagnX"]).median(axis = 0)
mx_mean=(data.loc[:,"MagnX"]).mean(axis = 0)
mx_var=(data.loc[:,"MagnX"]).var(axis = 0)
mx_std=(data.loc[:,"MagnX"]).std(axis = 0)
mx_25=(data.loc[:,"MagnX"]).quantile(0.25)
mx_75=(data.loc[:,"MagnX"]).quantile(0.75)

my_median=(data.loc[:,"MagnY"]).median(axis = 0)
my_mean=(data.loc[:,"MagnY"]).mean(axis = 0)
my_var=(data.loc[:,"MagnY"]).var(axis = 0)
my_std=(data.loc[:,"MagnY"]).std(axis = 0)
my_25=(data.loc[:,"MagnY"]).quantile(0.25)
my_75=(data.loc[:,"MagnY"]).quantile(0.75)

mz_median=(data.loc[:,"MagnZ"]).median(axis = 0)
mz_mean=(data.loc[:,"MagnZ"]).mean(axis = 0)
mz_var=(data.loc[:,"MagnZ"]).var(axis = 0)
mz_std=(data.loc[:,"MagnZ"]).std(axis = 0)
mz_25=(data.loc[:,"MagnZ"]).quantile(0.25)
mz_75=(data.loc[:,"MagnZ"]).quantile(0.75)

print("MagnX mean=",mx_mean,"MagnY mean=",my_mean,"MagnZ mean=",mz_mean,'\n',
      "MagnX median=",mx_median,"MagnY median=",my_median,"MagnZ median=",mz_median,'\n',
      "MagnX variance=",mx_var,"MagnY variance=",my_var,"MagnZ variance=",mz_var,'\n',
      "MagnX standard dev.=",mx_std,"MagnY standard dev.=",my_std,"MagnZ standard dev.=",mz_std,'\n'
      ,"MagnX 25 percentile=",mx_25,"MagnY 25 percentile=",my_25,"MagnZ 25 percentile=",mz_25,
      "MagnX 75 percentile=",mx_75,"MagnY 75 percentile=",my_75,"MagnZ 75 percentile=",mz_75,
      )

In [None]:
t = time_vec1
plt.figure(figsize=(100,40))
plt.subplot(131)
plt.title('Mean and Median of AccX/Y/Z', size=100)
plt.xlabel('Value', size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.axvline(ax_mean, color='g', linestyle="-",linewidth=5.0)
plt.axvline(ay_mean, color='b', linestyle='-',linewidth=5.0)
plt.axvline(az_mean, color='r', linestyle='-',linewidth=5.0)
plt.axvline(ax_median, color='g', linestyle="--",linewidth=5.0)
plt.axvline(ay_median, color='b', linestyle='--',linewidth=5.0)
plt.axvline(az_median, color='r', linestyle='--',linewidth=5.0)
print('Outcome: mean>median :positively skewed distribution.')
#A positively skewed distribution means that a majority of the observations are rather small relative to the rest of the distribution. 
plt.subplot(132)
# plt.figure(figsize=(60,40))
plt.title('Mean and Median of GyroX/Y/Z', size=100)
plt.xlabel('Value', size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.axvline(gx_mean, color='g', linestyle="-",linewidth=5.0)
plt.axvline(gy_mean, color='b', linestyle='-',linewidth=5.0)
plt.axvline(gz_mean, color='r', linestyle='-',linewidth=5.0)
plt.axvline(gx_median, color='g', linestyle="--",linewidth=5.0)
plt.axvline(gy_median, color='b', linestyle='--',linewidth=5.0)
plt.axvline(gz_median, color='r', linestyle='--',linewidth=5.0)
print('Outcome: means and medians are similar, so the distributions are roughly symmetric.')
plt.subplot(133)
# plt.figure(figsize=(60,40))
plt.title('Mean and Median of MagnX/Y/Z', size=100)
plt.xlabel('Value', size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.axvline(mx_mean, color='g', linestyle="-",linewidth=5.0)
plt.axvline(my_mean, color='b', linestyle='-',linewidth=5.0)
plt.axvline(mz_mean, color='r', linestyle='-',linewidth=5.0)
plt.axvline(mx_median, color='g', linestyle="--",linewidth=5.0)
plt.axvline(my_median, color='b', linestyle='--',linewidth=5.0)
plt.axvline(mz_median, color='r', linestyle='--',linewidth=5.0)
print('Outcome: median>mean : negatively skewed distribution')


#The standard deviation is a measure of spread. 
#We use it as a measure of spread when we use the mean as a measure of center.
x = data['AccX'] 
plt.figure(figsize=(100,40))
plt.title('AccX',size=100)
plt.xlabel('time',size=100)
plt.axhline(ax_mean, color='g', linestyle="-",linewidth=5.0)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is big (232) and the spread is big.')


x = data['AccY'] 
plt.figure(figsize=(100,40))
plt.title('AccY',size=100)
plt.xlabel('time',size=100)
plt.axhline(ay_mean, color='b', linestyle='-',linewidth=5.0)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is not too big (62) and the spread is big.')

x = data['AccZ'] 
plt.figure(figsize=(100,40))
plt.title('AccZ',size=100)
plt.axhline(az_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is big (215) and the spread is big.')

x = data['GyroX'] 
plt.figure(figsize=(100,40))
plt.title('GyroX',size=100)
plt.axhline(gx_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is low (13) and the spread is big. Mean and Median are similar.')

x = data['GyroY'] 
plt.figure(figsize=(100,40))
plt.title('GyroY',size=100)
plt.axhline(gy_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is low (17) and the spread is big.')


x = data['GyroZ'] 
plt.figure(figsize=(100,40))
plt.title('GyroZ',size=100)
plt.axhline(gz_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is low (10) and the spread is big.')


x = data['MagnX'] 
plt.figure(figsize=(100,40))
plt.title('MagnX',size=100)
plt.axhline(mx_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.plot(t,x)
plt.show()
print('standard deviation is low (67) and the spread is big.')

x = data['MagnY'] 
plt.figure(figsize=(100,40))
plt.title('MagnY',size=100)
plt.axhline(my_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is low (16) and the spread is big.')

x = data['MagnZ'] 
plt.figure(figsize=(100,40))
plt.title('MagnZ',size=100)
plt.axhline(mz_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is low (123) and the spread is big.')

Correlation:<br>
Correlation analysis gives us the linear relationship between two variables. <br>
The correlation coefficient between two variables is a statistical measure of the strength of the relationship between the relative movements of two variables.
A correlation of -1.0 shows a perfect negative correlation, while a correlation of 1.0 shows a perfect positive correlation. A correlation of 0.0 shows no linear relationship between the movement of the two variables.
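The three reference values of the Pearson coefficient can be checked on synthetic data. The toy columns `a`, `b`, `c` and `d` below are hypothetical; the last one is included because a coefficient close to zero only rules out a linear relationship, not dependence in general.

```python
import numpy as np
import pandas as pd

t = np.linspace(0.0, 1.0, 101)
toy = pd.DataFrame({
    "a": t,
    "b": 2.0 * t + 1.0,    # increasing linear function of a: r = +1
    "c": -3.0 * t,         # decreasing linear function of a: r = -1
    "d": (t - 0.5) ** 2,   # symmetric parabola: depends on a, yet r is about 0
})

# Pairwise Pearson correlation matrix, as used on the recordings
corr = toy.corr(method="pearson")
```

Column `d` is completely determined by `a`, but its coefficient with `a` is still about zero because the relationship is not linear.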

In [None]:
data1_corr = data.loc[:,["AccX","AccY","AccZ","GyroX","GyroY","GyroZ","MagnX","MagnY","MagnZ"]]
data1_corr.corr(method='pearson', min_periods=1)

For this dataset, the GyroX, GyroY and GyroZ columns have correlation coefficients close to zero with the other variables. 
This means that the gyroscope readings show little linear dependence on the other variables, which we take as a sign that they can be analysed on their own. 
Given these low correlation values and the small standard deviations, we assume that the gyroscope columns are the best suited to our purpose.

STATISTICAL ANALYSIS FOR OPTIONAL DATASET 

In [None]:
file_name="1_Stave_supine_static.txt"
data2=pd.read_csv(file_name, delimiter="\t")
time_step2 = 1/100  # sampling interval in seconds (100 Hz acquisition)
index = data2.index
number_of_rows2 = len(index)
# Scale the sample indices by the step: np.arange with a float step can return one element too many
time_vec2 = np.arange(number_of_rows2) * time_step2
data2.insert(0, 'new_seconds', time_vec2)
data2 = data2.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])
data2

In [None]:
ax_median=(data2.loc[:,"AccX"]).median(axis = 0)
ax_mean=(data2.loc[:,"AccX"]).mean(axis = 0)
ax_var=(data2.loc[:,"AccX"]).var(axis = 0)
ax_std=(data2.loc[:,"AccX"]).std(axis = 0)
ax_25=(data2.loc[:,"AccX"]).quantile(0.25)
ax_75=(data2.loc[:,"AccX"]).quantile(0.75)

ay_median=(data2.loc[:,"AccY"]).median(axis = 0)
ay_mean=(data2.loc[:,"AccY"]).mean(axis = 0)
ay_var=(data2.loc[:,"AccY"]).var(axis = 0)
ay_std=(data2.loc[:,"AccY"]).std(axis = 0)
ay_25=(data2.loc[:,"AccY"]).quantile(0.25)
ay_75=(data2.loc[:,"AccY"]).quantile(0.75)

az_median=(data2.loc[:,"AccZ"]).median(axis = 0)
az_mean=(data2.loc[:,"AccZ"]).mean(axis = 0)
az_var=(data2.loc[:,"AccZ"]).var(axis = 0)
az_std=(data2.loc[:,"AccZ"]).std(axis = 0)
az_25=(data2.loc[:,"AccZ"]).quantile(0.25)
az_75=(data2.loc[:,"AccZ"]).quantile(0.75)
print("AccX mean=",ax_mean,"AccY mean=",ay_mean,"AccZ mean=",az_mean,'\n',
      "AccX median=",ax_median,"AccY median=",ay_median,"AccZ median=",az_median,'\n',
      "AccX variance=",ax_var,"AccY variance=",ay_var,"AccZ variance=",az_var,'\n',
      "AccX standard dev.=",ax_std,"AccY standard dev.=",ay_std,"AccZ standard dev.=",az_std,'\n'
      ,"AccX 25 percentile=",ax_25,"AccY 25 percentile=",ay_25,"AccZ 25 percentile=",az_25,
      "AccX 75 percentile=",ax_75,"AccY 75 percentile=",ay_75,"AccZ 75 percentile=",az_75,
      )

In [None]:
gx_median=(data2.loc[:,"GyroX"]).median(axis = 0)
gx_mean=(data2.loc[:,"GyroX"]).mean(axis = 0)
gx_var=(data2.loc[:,"GyroX"]).var(axis = 0)
gx_std=(data2.loc[:,"GyroX"]).std(axis = 0)
gx_25=(data2.loc[:,"GyroX"]).quantile(0.25)
gx_75=(data2.loc[:,"GyroX"]).quantile(0.75)

gy_median=(data2.loc[:,"GyroY"]).median(axis = 0)
gy_mean=(data2.loc[:,"GyroY"]).mean(axis = 0)
gy_var=(data2.loc[:,"GyroY"]).var(axis = 0)
gy_std=(data2.loc[:,"GyroY"]).std(axis = 0)
gy_25=(data2.loc[:,"GyroY"]).quantile(0.25)
gy_75=(data2.loc[:,"GyroY"]).quantile(0.75)

gz_median=(data2.loc[:,"GyroZ"]).median(axis = 0)
gz_mean=(data2.loc[:,"GyroZ"]).mean(axis = 0)
gz_var=(data2.loc[:,"GyroZ"]).var(axis = 0)
gz_std=(data2.loc[:,"GyroZ"]).std(axis = 0)
gz_25=(data2.loc[:,"GyroZ"]).quantile(0.25)
gz_75=(data2.loc[:,"GyroZ"]).quantile(0.75)
print("GyroX mean=",gx_mean,"GyroY mean=",gy_mean,"GyroZ mean=",gz_mean,'\n',
      "GyroX median=",gx_median,"GyroY median=",gy_median,"GyroZ median=",gz_median,'\n',
      "GyroX variance=",gx_var,"GyroY variance=",gy_var,"GyroZ variance=",gz_var,'\n',
      "GyroX standard dev.=",gx_std,"GyroY standard dev.=",gy_std,"GyroZ standard dev.=",gz_std,'\n'
      ,"GyroX 25 percentile=",gx_25,"GyroY 25 percentile=",gy_25,"GyroZ 25 percentile=",gz_25,
      "GyroX 75 percentile=",gx_75,"GyroY 75 percentile=",gy_75,"GyroZ 75 percentile=",gz_75,
      )

In [None]:
mx_median=(data2.loc[:,"MagnX"]).median(axis = 0)
mx_mean=(data2.loc[:,"MagnX"]).mean(axis = 0)
mx_var=(data2.loc[:,"MagnX"]).var(axis = 0)
mx_std=(data2.loc[:,"MagnX"]).std(axis = 0)
mx_25=(data2.loc[:,"MagnX"]).quantile(0.25)
mx_75=(data2.loc[:,"MagnX"]).quantile(0.75)

my_median=(data2.loc[:,"MagnY"]).median(axis = 0)
my_mean=(data2.loc[:,"MagnY"]).mean(axis = 0)
my_var=(data2.loc[:,"MagnY"]).var(axis = 0)
my_std=(data2.loc[:,"MagnY"]).std(axis = 0)
my_25=(data2.loc[:,"MagnY"]).quantile(0.25)
my_75=(data2.loc[:,"MagnY"]).quantile(0.75)

mz_median=(data2.loc[:,"MagnZ"]).median(axis = 0)
mz_mean=(data2.loc[:,"MagnZ"]).mean(axis = 0)
mz_var=(data2.loc[:,"MagnZ"]).var(axis = 0)
mz_std=(data2.loc[:,"MagnZ"]).std(axis = 0)
mz_25=(data2.loc[:,"MagnZ"]).quantile(0.25)
mz_75=(data2.loc[:,"MagnZ"]).quantile(0.75)

print("MagnX mean=",mx_mean,"MagnY mean=",my_mean,"MagnZ mean=",mz_mean,'\n',
      "MagnX median=",mx_median,"MagnY median=",my_median,"MagnZ median=",mz_median,'\n',
      "MagnX variance=",mx_var,"MagnY variance=",my_var,"MagnZ variance=",mz_var,'\n',
      "MagnX standard dev.=",mx_std,"MagnY standard dev.=",my_std,"MagnZ standard dev.=",mz_std,'\n'
      ,"MagnX 25 percentile=",mx_25,"MagnY 25 percentile=",my_25,"MagnZ 25 percentile=",mz_25,
      "MagnX 75 percentile=",mx_75,"MagnY 75 percentile=",my_75,"MagnZ 75 percentile=",mz_75,
      
      )

In [None]:
t = time_vec2
plt.figure(figsize=(100,40))
plt.subplot(131)
plt.title('Mean and Median of AccX/Y/Z', size=100)
plt.xlabel('Value', size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.axvline(ax_mean, color='g', linestyle="-",linewidth=5.0)
plt.axvline(ay_mean, color='b', linestyle='-',linewidth=5.0)
plt.axvline(az_mean, color='r', linestyle='-',linewidth=5.0)
plt.axvline(ax_median, color='g', linestyle="--",linewidth=5.0)
plt.axvline(ay_median, color='b', linestyle='--',linewidth=5.0)
plt.axvline(az_median, color='r', linestyle='--',linewidth=5.0)
print('Outcome: mean<median: negatively skewed distribution')

plt.subplot(132)
plt.title('Mean and Median of GyroX/Y/Z', size=100)
plt.xlabel('Value', size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.axvline(gx_mean, color='g', linestyle="-",linewidth=5.0)
plt.axvline(gy_mean, color='b', linestyle='-',linewidth=5.0)
plt.axvline(gz_mean, color='r', linestyle='-',linewidth=5.0)
plt.axvline(gx_median, color='g', linestyle="--",linewidth=5.0)
plt.axvline(gy_median, color='b', linestyle='--',linewidth=5.0)
plt.axvline(gz_median, color='r', linestyle='--',linewidth=5.0)
print('Outcome: means and medians are similar, so the distributions are roughly symmetric.')

plt.subplot(133)
plt.title('Mean and Median of MagnX/Y/Z', size=100)
plt.xlabel('Value', size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.axvline(mx_mean, color='g', linestyle="-",linewidth=5.0)
plt.axvline(my_mean, color='b', linestyle='-',linewidth=5.0)
plt.axvline(mz_mean, color='r', linestyle='-',linewidth=5.0)
plt.axvline(mx_median, color='g', linestyle="--",linewidth=5.0)
plt.axvline(my_median, color='b', linestyle='--',linewidth=5.0)
plt.axvline(mz_median, color='r', linestyle='--',linewidth=5.0)
print('Outcome: mean<median: negatively skewed distribution')

x = data2['AccX'] 
plt.figure(figsize=(100,40))
plt.title('AccX',size=100)
plt.xlabel('time',size=100)
plt.axhline(ax_mean, color='g', linestyle="-",linewidth=5.0)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is big (172) and the spread is big.')

x = data2['AccY'] 
plt.figure(figsize=(100,40))
plt.title('AccY',size=100)
plt.xlabel('time',size=100)
plt.axhline(ay_mean, color='b', linestyle='-',linewidth=5.0)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is big (182) and the spread is big.')

x = data2['AccZ'] 
plt.figure(figsize=(100,40))
plt.title('AccZ',size=100)
plt.axhline(az_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is big (485) and the spread is big.')

x = data2['GyroX'] 
plt.figure(figsize=(100,40))
plt.title('GyroX',size=100)
plt.axhline(gx_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is not too  big (81) and the spread is medium size.')

x = data2['GyroY'] 
plt.figure(figsize=(100,40))
plt.title('GyroY',size=100)
plt.axhline(gy_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is not too big (88) and the spread is medium size.')

x = data2['GyroZ'] 
plt.figure(figsize=(100,40))
plt.title('GyroZ',size=100)
plt.axhline(gz_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is not too big (53) and the spread is medium size.')

x = data2['MagnX'] 
plt.figure(figsize=(100,40))
plt.title('MagnX',size=100)
plt.axhline(mx_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.plot(t,x)
plt.show()
print('standard deviation is not too big (93) and the spread is medium size.')

x = data2['MagnY'] 
plt.figure(figsize=(100,40))
plt.title('MagnY',size=100)
plt.axhline(my_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is not too big (68) and the spread is medium.')

x = data2['MagnZ'] 
plt.figure(figsize=(100,40))
plt.title('MagnZ',size=100)
plt.axhline(mz_mean, color='r', linestyle='-',linewidth=5.0)
plt.xlabel('time',size=100)
plt.xticks(size = 50)
plt.yticks(size = 50)
plt.plot(t,x)
plt.show()
print('standard deviation is not  big (235) and the spread is big.')

In [None]:
#Correlation
data2_corr = data2.loc[:,["AccX","AccY","AccZ","GyroX","GyroY","GyroZ","MagnX","MagnY","MagnZ"]]
data2_corr.corr(method='pearson', min_periods=1)

Here again, given the small correlation coefficients of GyroX, GyroY and GyroZ and the smaller standard deviations of the gyroscope data found in the statistical analysis above, we assume that the gyroscope columns are the most suitable for our analysis.

Back to analysis of Mandatory Dataset

In [None]:
# Simple FFT

def plotFFT(df, T):
    """Plot the one-sided amplitude spectrum of every column of df except the first (time) one.

    T is the sampling interval in seconds.
    """
    N = len(df.index)
    plt.rcParams.update({'font.size': 10})
    f, ax = plt.subplots(int(math.ceil((len(df.columns)-1)/4)),4, figsize=(15,15))
    ax = ax.ravel()
    xf = fftfreq(N, T)[:N//2]  # non-negative frequencies only

    for i in range(0, len(df.columns)-1, 1):
        y = df.iloc[:, i+1].to_numpy()
        ax[i].set_title(df.columns[i+1])
        yf = fft(y)
        # Scale by 2/N so that a sinusoid shows up with its amplitude
        ax[i].plot(xf, 2.0/N * np.abs(yf[0:N//2]))
        ax[i].set_xlim(-1,30)
        ax[i].grid()

# Made by Xabier Galar

In [None]:
plotFFT(data, time_step1)

# Made by Xabier Galar

Some columns show a very large peak at 0 Hz caused by a constant offset; it must be removed to better appreciate the spectrum.

In [None]:
# Remove offset from the signals
def offsetRemove(df):
    # Work on a copy so that the caller's DataFrame is left untouched
    df2 = df.copy()
    # Subtract each signal's mean (its 0 Hz component); column 0 is the time axis
    cols = df2.columns[1:]
    df2[cols] = df2[cols] - df2[cols].mean()
    return df2

# Made by Xabier Galar

In [None]:
plotFFT(offsetRemove(data), time_step1)

# Made by Xabier Galar

With the offset removed, all signals show a peak at low frequency, very close to 0 Hz on this scale, probably due to heart beats. Remember that the usual heart rate lies between 40 and 100 bpm (0.67 - 1.67 Hz).

To keep this range of frequencies and discard the others, we proceed with filtering.
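Before designing the filter, it can help to read the dominant frequency inside the 40-100 bpm band directly from the spectrum. The sketch below does this on a synthetic signal. The helper name `dominant_frequency`, the 1.2 Hz and 0.25 Hz test tones and the 60 s duration are all hypothetical choices, and the sketch uses NumPy's real FFT rather than the `scipy.fft` functions used in the notebook.

```python
import numpy as np


def dominant_frequency(signal, fs, f_lo, f_hi):
    """Return the frequency (Hz) of the largest spectral peak inside [f_lo, f_hi]."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()              # drop the 0 Hz offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    if not band.any():
        raise ValueError("no FFT bin falls inside the requested band")
    return freqs[band][np.argmax(spectrum[band])]


# Synthetic test signal: offset + 1.2 Hz tone (72 bpm) + slower 0.25 Hz tone
fs = 200
t = np.arange(60 * fs) / fs
toy = 5.0 + np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.sin(2 * np.pi * 0.25 * t)

f_peak = dominant_frequency(toy, fs, 40 / 60, 100 / 60)
bpm = 60.0 * f_peak
```

The frequency resolution is the inverse of the record length (1/60 Hz here, i.e. 1 bpm), so longer windows give a finer estimate.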

In [None]:
# Some functions to perform the filtering

# Obtain the parameters of Butterworth Bandpass filter
def butter_bandpass(lowcut, highcut, fs, order=3):
    nyq = 0.5 * fs
    low = lowcut / nyq
    high = highcut / nyq
    b, a = butter(order, [low, high], btype='band')
    return b, a

# Apply the filter
def butter_bandpass_filter(data, lowcut, highcut, fs, order=5):
    b, a = butter_bandpass(lowcut, highcut, fs, order=order)
    y = lfilter(b, a, data)
    return y

# Apply the filter and plot the original and the filtered data
def butterBandFilterAndPlot(data, lowcut, highcut, fs, order,t):
    plt.figure(figsize=(20,10))
    plt.plot(t, data, label='Noisy signal')
    plt.legend(loc='upper left')
    plt.show()
    plt.figure(figsize=(20,10))
    y = butter_bandpass_filter(data, lowcut, highcut, fs, order)
    plt.plot(t, y, label='Filtered signal')
    plt.legend(loc='upper left')
    plt.show()
    return y

# Obtain the frequency response of a Butterworth Bandpass filter
def bandFreqResponse(lowcut, highcut, fs, order):    
    plt.figure()
    b, a = butter_bandpass(lowcut, highcut, fs, order=order)
    w, h = freqz(b, a, worN=2000)
    plt.plot((fs * 0.5 / np.pi) * w, abs(h), 'b')

    plt.plot([0, 0.5 * fs], [np.sqrt(0.5), np.sqrt(0.5)],
             '--')
    plt.xlabel('Frequency (Hz)')
    plt.ylabel('Gain')
    plt.title("Butterworth Bandpass order %d FR" % order)
    plt.grid(True)
    plt.xlim(0,highcut*3)
    plt.xticks(np.arange(0, int(highcut*3), step=0.5), rotation = 45)
    plt.show()

# Obtain the parameters of the Butterworth Lowpass filter
def butter_lowpass(cutoff, fs, order=5):
    nyq = 0.5 * fs
    normal_cutoff = cutoff / nyq
    b, a = butter(order, normal_cutoff, btype='low', analog=False)
    return b, a

# Apply the filter
def butter_lowpass_filter(data, cutoff, fs, order=5):
    b, a = butter_lowpass(cutoff, fs, order=order)
    y = lfilter(b, a, data)
    return y
    
# Apply the filter and plot the original and the filtered data
def butterLowFilterAndPlot(data, highcut, fs, order,t):
    plt.figure(figsize=(20,10))
    plt.plot(t, data, label='Noisy signal')
    plt.legend(loc='upper left')
    plt.show()
    plt.figure(figsize=(20,10))
    y = butter_lowpass_filter(data, highcut, fs, order)
    plt.plot(t, y, label='Filtered signal')
    plt.legend(loc='upper left')
    plt.show()
    return y

# Obtain the frequency response of the Butterworth Lowpass filter
def lowFreqResponse(cutoff, fs, order):
    # Get the filter coefficients so we can check its frequency response.
    b, a = butter_lowpass(cutoff, fs, order)
    # Plot the frequency response.
    w, h = freqz(b, a, worN=8000)
    plt.figure()
    plt.plot(0.5*fs*w/np.pi, np.abs(h), 'b')
    plt.plot(cutoff, 0.5*np.sqrt(2), 'ko')
    plt.axvline(cutoff, color='k')
    plt.xlim(0, 0.5*fs)
    plt.title("Lowpass order %d FR" %order)
    plt.xlabel('Frequency [Hz]')
    plt.ylabel('Gain')
    plt.grid()
    plt.xlim(0,cutoff*3)
    plt.xticks(np.arange(0, int(cutoff*3), step=0.5), rotation = 45)
    plt.show()
    
# Create a new DataFrame by applying the band-pass filter to every column, then add some combined columns
def createFilteredDF(df,lowcut, highcut, fs, order):

    filtered = pd.DataFrame(columns=df.columns)
    filtered['new_seconds']=df['new_seconds']
    for i in range(0, len(df.columns)-1, 1):
        x = df.iloc[:, i+1].to_numpy()
        y = butter_bandpass_filter(x, lowcut, highcut, fs, order)
        filtered.iloc[:, i+1] = y

    index = filtered.index
    number_of_rows2 = len(index)
#     Combine the Acc columns
    y = np.zeros([number_of_rows2,])
    for i in [1,2,3]:
        x = filtered.iloc[:, i].to_numpy()
        y = y + butter_bandpass_filter(x, lowcut, highcut, fs, order)
    filtered['Acc'] = y
#     Combine the Gyro columns
    y = np.zeros([number_of_rows2,]) 
    for i in [4,5,6]:
        x = filtered.iloc[:, i].to_numpy()
        y = y + butter_bandpass_filter(x, lowcut, highcut, fs, order)
    filtered['Gyro'] = y
#     Combine the Magn columns
    y = np.zeros([number_of_rows2,])
    for i in [7,8,9]:
        x = filtered.iloc[:, i].to_numpy()
        y = y + butter_bandpass_filter(x, lowcut, highcut, fs, order)
    filtered['Magn'] = y
# Combine all the columns
    y = np.zeros([number_of_rows2,])
    for i in range(1, len(df.columns)-1,1):
        x = filtered.iloc[:, i].to_numpy()
        y = y + butter_bandpass_filter(x, lowcut, highcut, fs, order)
    filtered['All'] = y
    return filtered

# Made by Xabier Galar

In [None]:
# Filter requirements.
order = 8
fs = 200       # sample rate, Hz
lowcut = 40/60  
highcut = 100/60
lowFreqResponse(250/60, fs, order)
bandFreqResponse(lowcut,highcut,fs,3)

# Made by Xabier Galar

In [None]:
# Filtering some data, it works pretty well with some axes
fs = 200.0          #Sample frequency
lowcut = 40/60      
highcut = 100/60
order=2
%matplotlib inline
t = time_vec1
y =butterBandFilterAndPlot(data['AccZ'].to_numpy(),lowcut, highcut, fs, 3, t)
_ = butterLowFilterAndPlot(data['AccZ'].to_numpy(), 200/60, fs, 8, t)





# Made by Xabier Galar

In [None]:
# Fourier analysis of the mandatory data (AccZ axis) by Ali

fft_man = fftpack.fft(list(data['AccZ'].to_numpy()))
power_man = np.abs(fft_man)
sample_freq = fftpack.fftfreq(number_of_rows1, time_step1)

pos_mask = np.where(np.logical_and(sample_freq > lowcut, sample_freq < highcut))
# pos_mask = np.where(sample_freq > 0.15)
freqs = sample_freq[pos_mask]
peak_freq = freqs[power_man[pos_mask].argmax()]
spectrum = power_man[pos_mask]

antitransformed_man = fftpack.ifft(spectrum)
# beats_man , _man = find_peaks(antitransformed_man, threshold=24)

plt.figure(figsize=(10, 6))
plt.plot(np.linspace(0,75,len(freqs)), np.abs(antitransformed_man), label='Filtered signal')
# plt.plot(np.linspace(0,75,len(beats_man)), antitransformed_man[beats_man])
plt.xlabel('Time [s]')
plt.ylim(-10,10000)
plt.ylabel('Amplitude')
plt.legend(loc='best')


After a few tests, we concluded that the order-3 Butterworth band-pass filter gives the best results.
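
As a rough check of what such a filter passes, the sketch below designs an order-3 Butterworth band-pass for the 40-100 bpm band and evaluates its gain at three probe frequencies. It is not the notebook's `butter_bandpass` helper: it assumes SciPy 1.2 or later (for the `fs` argument) and uses the second-order-sections form, and the probe frequencies are arbitrary illustration values.

```python
import numpy as np
from scipy.signal import butter, sosfreqz

fs = 200.0                        # sampling rate in Hz, as used for center_sternum.txt
lowcut, highcut = 40/60, 100/60   # heart-rate band in Hz (40-100 bpm)

# Order-3 Butterworth band-pass, expressed as second-order sections
sos = butter(3, [lowcut, highcut], btype='band', fs=fs, output='sos')

# Gain at three probe frequencies in Hz (illustrative choices):
# below the band, inside the band, above the band
probe = np.array([0.1, 1.0, 5.0])
_, h = sosfreqz(sos, worN=probe, fs=fs)
gain = np.abs(h)

for f, g in zip(probe, gain):
    print("%.1f Hz -> gain %.4f" % (f, g))
```

The gain should be close to 1 inside the band and small on both sides of it.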

Next we identify the peaks that correspond to heartbeats.
This is done in two ways: with the dedicated heartpy library and with the SciPy function find_peaks.
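
The `find_peaks` approach can be sketched on synthetic data as follows. The 1.2 Hz sine standing in for a heartbeat, the 20 s duration and the 0.5 s minimum spacing are assumptions made for illustration, not values taken from the recordings.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 200.0                                # sampling rate in Hz
t = np.arange(int(20 * fs)) / fs          # 20 s of synthetic data
beat_hz = 1.2                             # synthetic "heartbeat" at 72 bpm
sig = np.sin(2 * np.pi * beat_hz * t)

# distance is given in samples: 0.5 s here, which caps the
# detectable rate at 120 peaks per minute
min_distance = int(0.5 * fs)
peaks, _ = find_peaks(sig, distance=min_distance)

# N peaks delimit N - 1 beat intervals between the first and the last peak
intervals = len(peaks) - 1
duration_s = (peaks[-1] - peaks[0]) / fs
bpm_est = 60 * intervals / duration_s
print(round(bpm_est, 1))
```

Because `distance` is counted in samples, the same value corresponds to a different time spacing when the sampling rate changes, so it is worth deriving it from `fs`.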

In [None]:
# Processing the data with the heartpy library
wd, m = hp.process(y, sample_rate = 200.0)
plt.figure(figsize=(12,4))
hp.plotter(wd, m)
#display measures computed
for measure in m.keys():
    print('%s: %f' %(measure, m[measure]))
    
# Made by Xabier Galar

In [None]:
# Process the data "by hand"
peaks, _ = find_peaks(y, distance=100)   # distance is in samples: 0.5 s at 200 Hz
plt.figure(figsize=(15,8))
plt.plot(data['new_seconds'], y)
plt.plot(data['new_seconds'].to_numpy()[peaks], y[peaks], "x")
plt.xlabel('Time (s)')
plt.show()
print('Average BPM:')
# len(peaks) peaks delimit len(peaks)-1 beat intervals; 0.005 s is the sampling period
print(60*(len(peaks)-1)/((peaks[-1]-peaks[0])*0.005))

# Made by Xabier Galar

In [None]:
# Given a df, filters and plots all columns finding peaks
def rawPeaks(df, distance, lowcut, highcut, fs, order):
    f, ax = plt.subplots(len(df.columns)-1,1, figsize=(15,40))
    for i in range(0, len(df.columns)-1, 1):

        x = df.iloc[:, i+1].to_numpy()
        ax[i].set_title(df.columns[i+1])
        ax[i].plot(df['new_seconds'],x, 'c', label= 'Original')
        y = butter_bandpass_filter(x, lowcut, highcut, fs, order)
        ax[i].plot(df['new_seconds'], y, 'k', label= 'Filtered')
        peaks, _ = find_peaks(y, distance=distance)
        ax[i].plot(df['new_seconds'],y)
        ax[i].plot(df['new_seconds'].to_numpy()[peaks], y[peaks], "x")
        ax[i].text(0.5, 0.1,'Detected peaks : %d' %len(peaks), horizontalalignment='center', verticalalignment='center', transform = ax[i].transAxes)
        ax[i].legend()

# Made by Xabier Galar

In [None]:
distance = 100
lowcut= 40/60
highcut =100/60 
fs= 200
order = 3
rawPeaks(data, distance, lowcut, highcut, fs, order)

# Made by Xabier Galar

All data columns yield a fairly similar number of peaks, so we keep all of them and also try some combinations of them.

We now create a new DataFrame with the filtered columns and some combinations of them.

In [None]:
lowcut= 40/60
highcut =100/60 
fs= 200
order = 3
filtered1 = createFilteredDF(data, lowcut, highcut,fs, order)

# Made by Xabier Galar

In [None]:
# Given a df, plots all columns and finds peaks
def rawFilteredBPM(df, distance):
    plt.rcParams.update({'font.size': 12})
    plt.rc('xtick', labelsize=14) 
    plt.rc('ytick', labelsize=14) 
    f, ax = plt.subplots(len(df.columns)-1,1, figsize=(15,55))

    for i in range(0, len(df.columns)-1, 1):
        x = df.iloc[:, i+1].to_numpy()
        ax[i].set_title(df.columns[i+1])
        ax[i].plot(df['new_seconds'], x, 'c', label= 'Filtered')
        peaks, _ = find_peaks(x, distance=distance)
        ax[i].plot(df['new_seconds'].to_numpy()[peaks], x[peaks], "xr")
        ax[i].text(0.5, 0.1,'Detected peaks : %d' %len(peaks), horizontalalignment='center', verticalalignment='center', transform = ax[i].transAxes)
        ax[i].legend()

# Made by Xabier Galar

In [None]:
rawFilteredBPM(filtered1, 100)

# Made by Xabier Galar

In [None]:
# rawFilteredBPM(data, 100)

Again, a similar number of peaks is detected in all columns, so we compute the BPM in each of them and take the median to obtain a more stable result.
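
The benefit of taking the median across columns can be illustrated with made-up numbers. This sketch uses `np.median` as a simple stand-in, not the `ndimage.median_filter` call used in this notebook, and the channel values are invented for the example.

```python
import numpy as np

# Rows are channels, columns are time samples (beats per minute, invented values)
bpm_per_channel = np.array([
    [62.0, 63.0, 61.0, 62.0],    # channel A
    [61.0, 62.0, 62.0, 63.0],    # channel B
    [95.0, 30.0, 120.0, 15.0],   # channel C: unreliable peak detection
])

median_bpm = np.median(bpm_per_channel, axis=0)   # robust to the bad channel
mean_bpm = np.mean(bpm_per_channel, axis=0)       # pulled away by the bad channel

print(median_bpm)
print(mean_bpm)
```

A single misbehaving channel shifts the mean noticeably, while the median stays with the majority of the channels.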

The Fourier transform of the filtered data shows that only frequencies within the BPM range remain.
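
One way to check this numerically is to look for the dominant frequency inside the band, and the same band mask can be used to filter directly in the frequency domain. The sketch below works on a synthetic signal and uses NumPy's `rfft`/`irfft` rather than the `fftpack` calls of this notebook; the three signal components are assumptions chosen for illustration.

```python
import numpy as np

fs = 200.0
t = np.arange(int(60 * fs)) / fs      # 60 s of synthetic data

# Slow drift + 1.1 Hz "heartbeat" + fast component (all invented)
sig = (2.0 * np.sin(2 * np.pi * 0.05 * t)
       + np.sin(2 * np.pi * 1.1 * t)
       + 0.5 * np.sin(2 * np.pi * 8.0 * t))

spectrum = np.fft.rfft(sig)
freqs = np.fft.rfftfreq(len(sig), d=1 / fs)

# Keep only the bins inside the 40-100 bpm band, zero the rest, transform back
band = (freqs > 40 / 60) & (freqs < 100 / 60)
filtered = np.fft.irfft(np.where(band, spectrum, 0), n=len(sig))

# Dominant frequency inside the band, converted to beats per minute
peak_freq = freqs[band][np.abs(spectrum[band]).argmax()]
print(peak_freq * 60)
```

Zeroing the out-of-band bins while keeping the full-length spectrum preserves the time axis of the reconstructed signal.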

In [None]:
plotFFT(filtered1, time_step1)

# Made by Xabier Galar

In [None]:
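def measureBeats(x, p, s, T):
    # Given a column x and the sample indices p of its peaks, returns an array
    # of the length of the column holding the rate (per minute) estimated from
    # the mean spacing of the next s peaks; T is the sampling period in seconds
    y = np.zeros_like(x, dtype=float)
    yb = 0
    n = 0
    n_peaks = len(p)
    p = np.pad(p, (0, s), 'constant')
    for i in range(len(x)):
        if i == p[n]:
            # For the last s peaks the padded zeros would give a negative
            # rate, so the previous estimate is kept instead
            if n + s < n_peaks:
                yb = 60/((p[n+s]-i)/s * T)
            n = n+1
        y[i] = yb
    return y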

# Calculates the BPM of all the columns
def BPM(df, distance, s, T, number_of_rows):
    y = np.zeros([len(df.columns)-1, number_of_rows])
    result = np.zeros([number_of_rows,])
    
    for i in range(0, len(df.columns)-1, 1):
        x = df.iloc[:, i+1].to_numpy()
        peaks, _ = find_peaks(x, distance=distance)
        y[i,:] = measureBeats(x, peaks, s, T)
    return y

# Given all BPM of all columns calculates the median BPM and plots
def medianBPMandPlot(bpm, filtered2):
    result = ndimage.median_filter(bpm, size=(bpm.shape[0],1))
    plt.figure(figsize=(20,10))
    for i in range(bpm.shape[0]):
        plt.plot(filtered2['new_seconds'], bpm[i,:], '--', label=filtered2.columns[i+1])
    plt.plot(filtered2['new_seconds'], result[int((bpm.shape[0]-1)/2),:], 'r', linewidth=5, label= 'Median BPM')
#     plt.ylim(30,55)
    plt.title('BPM of all data and Median BPM')
    plt.xlabel('Time (s)')
    plt.legend()
    return result[int((bpm.shape[0]-1)/2),:]

# Given all BPM of all columns calculates the median BPM
def medianBPM(bpm, filtered2):
    result = ndimage.median_filter(bpm, size=(bpm.shape[0],1))
    return result[int((bpm.shape[0]-1)/2),:]

# Made by Xabier Galar

In [None]:
def preProcessDF(df, fs):
    time_step = 1/fs
    index = df.index
    number_of_rows = len(index)
    time_vec = np.arange(0, number_of_rows * time_step, time_step)
    df.insert(0, 'new_seconds', time_vec)
    df =df.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])
    return df

def findBPM(df, fs):
    lowcut= 40/60
    highcut =100/60 
    order = 3
    time_step = 1/fs
    index = df.index
    number_of_rows = len(index)
    filtered = createFilteredDF(df, lowcut, highcut,fs, order)
    bpm =BPM(filtered, 100, 10, time_step, number_of_rows)
    y = medianBPMandPlot(bpm, filtered)
    return y, filtered

def rawVariances(df, bot, top):
    # Print the population variance of every column except the time column,
    # computed over the samples bot..top
    for i in range(0, len(df.columns)-1, 1):
        print(df.columns[i+1],"  variance is %s" %(round(statistics.pvariance(df.iloc[:, i+1][bot:top]),2)))

def histoAndVariances(df, filtered, y, bot, top):
    plt.figure()
    _ = plt.hist(y[bot:top], bins=10)
    print('No filtered')
    rawVariances(df, bot, top)
    print('Filtered')
    rawVariances(filtered, bot, top)
    # Use the same sample range as the histogram above
    print("Median variance is %s" %(statistics.pvariance(y[bot:top])))
    
# Made by Xabier Galar

In [None]:
bpm =BPM(filtered1, 100, 10, time_step1, number_of_rows1)
y = medianBPMandPlot(bpm, filtered1)

# Made by Xabier Galar

In [None]:
_ = plt.hist(y[2000:12000], bins=40)

print('No filtered')
rawVariances(data, 2000, 12000)
print('Filtered')
rawVariances(filtered1, 2000, 12000)
print("Median variance is %s" 
      %(statistics.pvariance(y[2000:12000])))

# Made by Xabier Galar

In [None]:
rawVariances(data, 2000,12000)

# Made by Xabier Galar

In [None]:
file_name="1_Stave_supine_static.txt"
data2=pd.read_csv(file_name, delimiter="\t")
data2
time_step2 = 1/100
index = data2.index
number_of_rows2 = len(index)
time_vec2 = np.arange(0, number_of_rows2 * time_step2, time_step2)
data2.insert(0, 'new_seconds', time_vec2)
data2 =data2.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])
# data2

# Made by Xabier Galar

In [None]:
# Plotting the raw data
%matplotlib inline
plt.rcParams.update({'font.size': 30})
plt.figure()
data2.plot(subplots=True, layout=(len(data2.columns),1),figsize=(40,100))

plt.tight_layout()
plt.show()

# Made by Xabier Galar

In [None]:
plotFFT(data2, time_step2)

# Made by Xabier Galar

In [None]:
plotFFT(offsetRemove(data), time_step1)

# Made by Xabier Galar

In [None]:
distance = 100   # minimum peak spacing in samples (1 s at fs = 100 Hz)
lowcut= 40/60
highcut =100/60 
fs= 100
order = 3
rawPeaks(data2, distance, lowcut, highcut, fs, order)

# Made by Xabier Galar

In [None]:
lowcut= 40/60
highcut =100/60 
fs= 100
order = 3
filtered2 = createFilteredDF(data2, lowcut, highcut,fs, order)

# Made by Xabier Galar

In [None]:
rawFilteredBPM(filtered2,100)

# Made by Xabier Galar

In [None]:
plotFFT(filtered2, time_step2)

# Made by Xabier Galar

In [None]:
bpm =BPM(filtered2, 100, 10, time_step2, number_of_rows2)
y = medianBPMandPlot(bpm, filtered2)

# Made by Xabier Galar

In [None]:
_ = plt.hist(y[500:4500], bins=40)
print("Median BPM variance is %s" 
      %(statistics.pvariance(y[500:4500]))) 
print("AccX filtered variance is %s" 
      %(statistics.pvariance(bpm[0,500:4500]))) 

# Made by Xabier Galar

In [None]:
file_name="2_Mattress_supine.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)

y, filtered = findBPM(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="3_Subject_sitting_chair.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)

y, filtered = findBPM(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="4_Chest_sweater.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)

y, filtered = findBPM(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="5_Under_chair.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)

y, filtered = findBPM(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

# What about the Respiratory Rate?

The same technique is used with different cutoff frequencies: the normal RR of a healthy person is 12-18 breaths per minute.
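
The change of band can be sketched on synthetic data: a slow "breathing" component plus a faster "heartbeat" component, filtered with the 12-18 per minute band and then counted with `find_peaks`. The signal, the 2 s minimum peak spacing and the zero-phase `sosfiltfilt` call are assumptions of this sketch and differ from the `lfilter`-based helpers used in this notebook.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks

fs = 100.0
t = np.arange(int(120 * fs)) / fs        # 120 s of synthetic data
resp_hz, beat_hz = 0.25, 1.2             # 15 breaths/min and 72 beats/min (invented)
sig = np.sin(2 * np.pi * resp_hz * t) + 0.3 * np.sin(2 * np.pi * beat_hz * t)

# Band limits given per minute are divided by 60 to obtain Hz
lowcut, highcut = 12 / 60, 18 / 60
sos = butter(3, [lowcut, highcut], btype='band', fs=fs, output='sos')
resp = sosfiltfilt(sos, sig)             # zero-phase filtering

# At 18 breaths/min the peaks are about 3.3 s apart, so 2 s is a safe minimum
peaks, _ = find_peaks(resp, distance=int(2 * fs))
rr_est = 60 * (len(peaks) - 1) / ((peaks[-1] - peaks[0]) / fs)
print(round(rr_est, 1))
```

Only the band limits and the minimum peak spacing change with respect to the heart-rate case; the rest of the pipeline is the same.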

In [None]:
# Filter requirements.
order = 3
fs = 200       # sample rate, Hz
lowcut = 12/60  
highcut = 18/60
lowFreqResponse(250/60, fs, order)
bandFreqResponse(lowcut,highcut,fs, order)

# Made by Xabier Galar

In [None]:
distance = 100
lowcut= 12/60
highcut =18/60 
fs= 200
order = 3
rawPeaks(data, distance, lowcut, highcut, fs, order)

# Made by Xabier Galar

In [None]:
lowcut= 12/60
highcut =18/60 
fs= 200
order = 3
filtered_1 = createFilteredDF(data, lowcut, highcut,fs, order)

# Made by Xabier Galar

In [None]:
rawFilteredBPM(filtered_1, 100)

# Made by Xabier Galar

In [None]:
plotFFT(filtered_1, time_step1)

# Made by Xabier Galar

In [None]:
# Calculates the RR of all the columns
def RR(df, distance, s, T, number_of_rows):
    y = np.zeros([len(df.columns)-1, number_of_rows])
    result = np.zeros([number_of_rows,])
    
    for i in range(0, len(df.columns)-1, 1):
        x = df.iloc[:, i+1].to_numpy()
        peaks, _ = find_peaks(x, distance=distance)
        y[i,:] = measureBeats(x, peaks, s, T)
    return y

# Given the RR of all columns, calculates the median RR and plots it
def medianRRandPlot(bpm, filtered2):
    result = ndimage.median_filter(bpm, size=(bpm.shape[0],1))
    plt.figure(figsize=(20,10))
    for i in range(bpm.shape[0]):
        plt.plot(filtered2['new_seconds'], bpm[i,:], '--', label=filtered2.columns[i+1])
    plt.plot(filtered2['new_seconds'], result[int((bpm.shape[0]-1)/2),:], 'r', linewidth=5, label= 'Median RR')
#     plt.ylim(30,55)
    plt.title('RR of all data and Median RR')
    plt.xlabel('Time (s)')
    plt.legend()
    return result[int((bpm.shape[0]-1)/2),:]

# Given the RR of all columns, calculates the median RR
def medianRR(bpm, filtered2):
    result = ndimage.median_filter(bpm, size=(bpm.shape[0],1))
    return result[int((bpm.shape[0]-1)/2),:]

def findRR(df, fs):
    lowcut= 12/60
    highcut =18/60 
    order = 3
    time_step = 1/fs
    index = df.index
    number_of_rows = len(index)
    filtered = createFilteredDF(df, lowcut, highcut,fs, order)
    bpm =RR(filtered, 100, 3, time_step, number_of_rows)
    y = medianRRandPlot(bpm, filtered)
    return y, filtered

# Made by Xabier Galar

In [None]:
rr =RR(filtered_1, 100, 3, time_step1, number_of_rows1)
y = medianRRandPlot(rr, filtered_1)

# Made by Xabier Galar

In [None]:
histoAndVariances(data, filtered_1,y, 2000, 12000)

# Made by Xabier Galar

In [None]:
lowcut= 12/60
highcut =18/60 
fs= 100
order = 3
filtered_2 = createFilteredDF(data2, lowcut, highcut,fs, order)

index = data2.index
number_of_rows2 = len(index)

rr =RR(filtered_2, 100, 3, time_step2, number_of_rows2)
y = medianRRandPlot(rr, filtered_2)


plt.figure()
_ = plt.hist(y[500:4500], bins=40)

print('No filtered')
rawVariances(data2, 500, 4500)
print('Filtered')
rawVariances(filtered_2, 500, 4500)
print("Median variance is %s" 
      %(statistics.pvariance(y[500:4500])))

# Made by Xabier Galar

In [None]:
file_name="1_Stave_supine_static.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)
y, filtered = findRR(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="2_Mattress_supine.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)
y, filtered = findRR(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="3_Subject_sitting_chair.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)
y, filtered = findRR(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="4_Chest_sweater.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)
y, filtered = findRR(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
file_name="5_Under_chair.txt"
df=pd.read_csv(file_name, delimiter="\t")
df =preProcessDF(df, 100)
y, filtered = findRR(df, 100)
histoAndVariances(df, filtered, y,500, 4500)

# Made by Xabier Galar

In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

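import scipy as sp
import scipy.signal  # makes sp.signal available on SciPy versions that do not load submodules lazily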


file_name="center_sternum.txt"
data=pd.read_csv(file_name, delimiter="\t")
time_step1 = 1/200
index = data.index
number_of_rows1 = len(index)
time_vec1 = np.arange(0, number_of_rows1 * time_step1, time_step1)
data.insert(0, 'new_seconds', time_vec1)
data =data.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])
data

# Dropping the beginning and the end of the dataset
data = data.drop(data.index[0:2000])
data = data.drop(data.index[10000:16505])

index = data.index
number_of_rows1 = len(index)

T = 0.005
nsamples = 16506
t = np.arange(0, number_of_rows1* T, T)

# Median-filter (kernel of 21 samples) and plot every sensor axis
for col in ['AccX', 'AccY', 'AccZ', 'GyroX', 'GyroY', 'GyroZ', 'MagnX', 'MagnY', 'MagnZ']:
    x = data[col]
    y = sp.signal.medfilt(x, 21)
    print(y.shape)
    plt.figure(figsize=(100,40))
    plt.title('Filtered ' + col, size=100)
    plt.xlabel('time', size=100)
    plt.xticks(size=50)
    plt.yticks(size=50)
    plt.plot(t, x)                        # raw signal
    plt.plot(t, y, c='r', linewidth=7.0)  # median-filtered signal
    plt.show()


In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

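import scipy as sp
import scipy.signal  # makes sp.signal available on SciPy versions that do not load submodules lazily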
file_name="1_Stave_supine_static.txt"
data=pd.read_csv(file_name, delimiter="\t")
data
time_step2 = 1/100
# Dropping the beginning and the end of the dataset


data = data.drop(data.index[0:1500])
data = data.drop(data.index[4500:9170])



index = data.index
number_of_rows2 = len(index)
t= np.arange(0, number_of_rows2 * time_step2, time_step2)
data.insert(0, 'new_seconds', t)
data =data.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])




sample_rate=200
window_length=21



# Savitzky-Golay smoothing (first-order polynomial) and plot of every sensor axis
for col in ['AccX', 'AccY', 'AccZ', 'GyroX', 'GyroY', 'GyroZ', 'MagnX', 'MagnY', 'MagnZ']:
    x = data[col]
    y = sp.signal.savgol_filter(x, window_length, polyorder=1)
    plt.figure(figsize=(100,40))
    plt.title('Filtered ' + col, size=100)
    plt.xlabel('time', size=100)
    plt.xticks(size=50)
    plt.yticks(size=50)
    plt.plot(t, x)                        # raw signal
    plt.plot(t, y, c='r', linewidth=7.0)  # smoothed signal
    plt.show()



WAVELET TRANSFORM done by Dyutideepta Banerjee

The wavelet transform is a tool that offers resolution in both the frequency domain and the time domain: it shows at which frequencies the signal oscillates and at which times these oscillations occur.<br>
A scaleogram is a special kind of spectrogram in which the time resolution varies with the scale value on the Y axis.
In the wavelet formalism, a scaleogram is a 2D representation of 1D data, with time on the X axis and, on the Y axis, the signal periodicity to which the transform is sensitive. Each value is the amplitude of the signal variation located at time X with periodicity Y.

References used <br>
https://paos.colorado.edu/research/wavelets/ <br>
https://pywavelets.readthedocs.io/en/latest/ref/thresholding-functions.html <br>
http://profesores.elo.utfsm.cl/~mzanartu/IPD414/Docs/wavelet_ug.pdf <br>
https://github.com/mnf2014/article_fft_wavelet_ecg/blob/develop/wavelet_article_octo.ipynb <br>
https://dsp.stackexchange.com/questions/15823/feature-extraction-reduction-using-dwt <br>
https://www.researchgate.net/post/How_to_run_Wavelet_analysis_in_Python_or_R_or_Matlab <br>
https://www.kaggle.com/asauve/a-gentle-introduction-to-wavelet-for-data-analysis <br><br>

MANDATORY DATASET

Only a part of the whole DataFrame is used for the wavelet analysis. <br>
I manually sliced 1000 points from the 'good' region of the DataFrame.
I then tried to understand the behaviour of the signal using wavelet coefficients and a scaleogram.
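
To make the notion of wavelet coefficients concrete, here is a hand-written single-level Haar transform with soft thresholding of the detail coefficients. It is a NumPy-only sketch of the idea behind the `pywt.dwt(ts, 'haar')` and `pywt.threshold` calls in the following cells, not a replacement for them; the helper names `haar_dwt`, `haar_idwt` and `soft_threshold` are hypothetical and the noisy sine is synthetic.

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar transform of an even-length 1-D array."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # local averages (low-pass part)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # local differences (high-pass part)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

def soft_threshold(c, thr):
    """Shrink coefficients towards zero by thr; smaller ones become zero."""
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

rng = np.random.default_rng(0)
t = np.arange(1000) / 200.0                      # 1000 samples at 200 Hz
clean = np.sin(2 * np.pi * 1.2 * t)
noisy = clean + 0.2 * rng.standard_normal(t.size)

ca, cd = haar_dwt(noisy)
cdt = soft_threshold(cd, np.std(cd) / 2)         # threshold only the detail part
denoised = haar_idwt(ca, cdt)

print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```

The approximation coefficients carry the slow trend and the detail coefficients carry mostly the sample-to-sample noise, which is why shrinking the latter reduces the error with respect to the clean signal.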

In [None]:
file_name="center_sternum.txt"
data=pd.read_csv(file_name, delimiter="\t")
time_step1 = 1/200
index = data.index
number_of_rows1 = len(index)
time_vec1 = np.arange(0, number_of_rows1 * time_step1, time_step1)
data.insert(0, 'new_seconds', time_vec1)
data =data.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])
#Slicing
data_test = data.iloc[5000:6000,:]
t = data_test.iloc[:,0] 
data_test.head()

GyroX

In [None]:
# cat = pywt.thresholding.soft(cA2, np.std(cA2)/2)
# cdt = pywt.thresholding.soft(cD2, np.std(cD2)/2)
normGX = data_test.loc[:,"GyroX"]/max(data_test.loc[:,"GyroX"])
ts = normGX

(ca, cd) = pywt.dwt(ts,'haar') #Finding coefficients of wavelet transform 

cat = pywt.threshold(ca, np.std(ca)/2, mode='soft')
cdt = pywt.threshold(cd, np.std(cd)/2, mode='soft')  # threshold the detail coefficients

#ts_rec = pywt.idwt(cat, cdt, 'haar') #reconstructing a new signal using transform coefficients
denoise = denoise_wavelet(ts, method = 'BayesShrink', mode = 'soft', wavelet_levels = 3, 
                          wavelet = 'sym8', rescale_sigma = True) #reconstructing a new denoised signal
#plt.close('all')
plt.figure(figsize=(30, 20))
plt.subplot(211)
# Original coefficients
plt.plot(ca, '--*b')
plt.plot(cd, '--*r')
# Thresholded coefficients
plt.plot(cat, '--*c')
plt.plot(cdt, '--*m')
plt.legend(['ca','cd','ca_thresh', 'cd_thresh'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('Coefficient index')
plt.ylabel('Value')
plt.title('Coefficient and Threshold of GyroX')

plt.subplot(212)
#plt.plot(ts_rec, 'r')
plt.plot(ts.to_numpy(), '--*r')
plt.plot(denoise, 'b')
#plt.hold('on')
plt.legend(['original signal', 'reconstructed signal'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Original and Reconstructed Signals of GyroX')


plt.show()

In [None]:
#Continuous wavelet transform

scale = np.arange(1,200)
coef, freqs = pywt.cwt(normGX, scale,'morl') #Finding coefficients using the Morlet wavelet
#Plotting
plt.imshow(abs(coef), interpolation = 'bilinear', cmap='PRGn', aspect = 'auto', 
           vmax = abs(coef).max(), vmin = abs(coef).min()) # doctest: +SKIP
plt.gca().invert_yaxis()
# plt.yticks(np.arange(1,len(normGX),1))
# plt.xticks(np.arange(0,201,10))
plt.show()
# #plt.plot(abs(coef))

scg.set_default_wavelet('morl')

#nn = 33
signal_length = 1200
# range of scales to perform the transform
scales = scg.periods2scales( np.arange(1, signal_length+1) ) #Setting the scale for scaleogram
x_values_wvt_arr = normGX

# plot the signal 
fig1, ax1 = plt.subplots(1, 1, figsize=(9, 3.5));  
ax1.plot(t, x_values_wvt_arr, linewidth=3, color='blue')
#ax1.set_xlim(0, 2)
ax1.set_title("Norm of GyroX signal")

# the scaleogram
scg.cws(x_values_wvt_arr, scales=scales, figsize=(10, 4.0), coi = False, ylabel="Period", xlabel="Time",
        title='Norm of GyroX signal: scaleogram with linear period'); 

print("Default wavelet function used to compute the transform:", scg.get_default_wavelet(), "(",
      pywt.ContinuousWavelet(scg.get_default_wavelet()).family_name, ")")


GyroY

In [None]:
normGY = data_test.loc[:,"GyroY"]/max(data_test.loc[:,"GyroY"])
ts = normGY

(ca, cd) = pywt.dwt(ts,'haar') #Finding coefficients of wavelet transform 

cat = pywt.threshold(ca, np.std(ca)/2, mode='soft')
cdt = pywt.threshold(cd, np.std(cd)/2, mode='soft')  # threshold the detail coefficients

#ts_rec = pywt.idwt(cat, cdt, 'haar') #reconstructing a new signal using transform coefficients
denoise = denoise_wavelet(ts, method = 'BayesShrink', mode = 'soft', wavelet_levels = 3, 
                          wavelet = 'sym8', rescale_sigma = True) #reconstructing a new denoised signal
#plt.close('all')
plt.figure(figsize=(30, 20))
plt.subplot(211)
# Original coefficients
plt.plot(ca, '--*b')
plt.plot(cd, '--*r')
# Thresholded coefficients
plt.plot(cat, '--*c')
plt.plot(cdt, '--*m')
plt.legend(['ca','cd','ca_thresh', 'cd_thresh'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('Coefficient index')
plt.ylabel('Value')
plt.title('Coefficient and Threshold of GyroY')

plt.subplot(212)
#plt.plot(ts_rec, 'r')
plt.plot(ts.to_numpy(), '--*r')
plt.plot(denoise, 'b')
#plt.hold('on')
plt.legend(['original signal', 'reconstructed signal'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Original and Reconstructed Signals of GyroY')


plt.show()

In [None]:
#Continuous wavelet transform

scale = np.arange(1,200)
coef, freqs = pywt.cwt(normGY, scale,'morl')

plt.imshow(abs(coef), interpolation = 'bilinear', cmap='PRGn', aspect = 'auto', 
           vmax = abs(coef).max(), vmin = abs(coef).min()) # doctest: +SKIP
plt.gca().invert_yaxis()
# plt.yticks(np.arange(1,len(normGX),1))
# plt.xticks(np.arange(0,201,10))
plt.show()
# #plt.plot(abs(coef))

scg.set_default_wavelet('morl')

#nn = 33
signal_length = 1200
# range of scales to perform the transform
scales = scg.periods2scales( np.arange(1, signal_length+1) )
x_values_wvt_arr = normGY

# plot the signal 
fig1, ax1 = plt.subplots(1, 1, figsize=(9, 3.5));  
ax1.plot(t, x_values_wvt_arr, linewidth=3, color='blue')
#ax1.set_xlim(0, 2)
ax1.set_title("Norm of GyroY signal")

# the scaleogram
scg.cws(x_values_wvt_arr, scales=scales, figsize=(10, 4.0), coi = False, ylabel="Period", xlabel="Time",
        title='Norm of GyroY signal: scaleogram with linear period'); 

print("Default wavelet function used to compute the transform:", scg.get_default_wavelet(), "(",
      pywt.ContinuousWavelet(scg.get_default_wavelet()).family_name, ")")

GyroZ

In [None]:
normGZ = data_test.loc[:,"GyroZ"]/max(data_test.loc[:,"GyroZ"])
ts = normGZ

(ca, cd) = pywt.dwt(ts,'haar') #Finding coefficients of wavelet transform 

cat = pywt.threshold(ca, np.std(ca)/2, mode='soft')
cdt = pywt.threshold(cd, np.std(cd)/2, mode='soft')  # threshold the detail coefficients

#ts_rec = pywt.idwt(cat, cdt, 'haar') #reconstructing a new signal using transform coefficients
denoise = denoise_wavelet(ts, method = 'BayesShrink', mode = 'soft', wavelet_levels = 3, 
                          wavelet = 'sym8', rescale_sigma = True) #reconstructing a new denoised signal
#plt.close('all')
plt.figure(figsize=(30, 20))
plt.subplot(211)
# Original coefficients
plt.plot(ca, '--*b')
plt.plot(cd, '--*r')
# Thresholded coefficients
plt.plot(cat, '--*c')
plt.plot(cdt, '--*m')
plt.legend(['ca','cd','ca_thresh', 'cd_thresh'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('Coefficient index')
plt.ylabel('Value')
plt.title('Coefficient and Threshold of GyroZ')

plt.subplot(212)
#plt.plot(ts_rec, 'r')
plt.plot(ts.to_numpy(), '--*r')
plt.plot(denoise, 'b')
#plt.hold('on')
plt.legend(['original signal', 'reconstructed signal'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Original and Reconstructed Signals of GyroZ')


plt.show()

In [None]:
#Continuous wavelet transform

scale = np.arange(1,200)
coef, freqs = pywt.cwt(normGZ, scale,'morl')

plt.imshow(abs(coef), interpolation = 'bilinear', cmap='PRGn', aspect = 'auto', 
           vmax = abs(coef).max(), vmin = abs(coef).min()) # doctest: +SKIP
plt.gca().invert_yaxis()
# plt.yticks(np.arange(1,len(normGX),1))
# plt.xticks(np.arange(0,201,10))
plt.show()
# #plt.plot(abs(coef))

scg.set_default_wavelet('morl')

#nn = 33
signal_length = 1200
# range of scales to perform the transform
scales = scg.periods2scales( np.arange(1, signal_length+1) )
x_values_wvt_arr = normGZ

# plot the signal 
fig1, ax1 = plt.subplots(1, 1, figsize=(9, 3.5));  
ax1.plot(t, x_values_wvt_arr, linewidth=3, color='blue')
#ax1.set_xlim(0, 2)
ax1.set_title("Norm of GyroZ signal")

# the scaleogram
scg.cws(x_values_wvt_arr, scales=scales, figsize=(10, 4.0), coi = False, ylabel="Period", xlabel="Time",
        title='Norm of GyroZ signal: scaleogram with linear period'); 

print("Default wavelet function used to compute the transform:", scg.get_default_wavelet(), "(",
      pywt.ContinuousWavelet(scg.get_default_wavelet()).family_name, ")")

OPTIONAL DATASET

Here again, I have manually sliced the data to 500 samples (rows 2000 to 2499, i.e. 5 s at the 100 Hz sampling rate set below) for the wavelet analysis.

In [None]:
file_name = "1_Stave_supine_static.txt"
data2 = pd.read_csv(file_name, delimiter="\t")
time_step2 = 1/100  # sampling period in seconds (100 Hz)
index = data2.index
number_of_rows2 = len(index)
# Time axis in seconds; np.arange(n) * dt always has exactly n elements,
# whereas np.arange(0, n*dt, dt) can gain an extra one through float rounding
time_vec2 = np.arange(number_of_rows2) * time_step2
data2.insert(0, 'new_seconds', time_vec2)
data2 = data2.drop(columns=['Log Mode', 'Log Freq', 'Timestamp'])
# Slicing: keep 500 samples (rows 2000 to 2499)
data_test2 = data2.iloc[2000:2500,:]
t = data_test2.iloc[:,0]  # time column of the slice
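
With the 100 Hz rate set in the cell above, the hard-coded rows 2000:2500 correspond to a 5 s window starting at 20 s. A minimal sketch of choosing the slice by time instead of by row number; `window_rows` is a hypothetical helper, not part of the original notebook, and its default rate is taken from `time_step2`.

```python
def window_rows(start_s, duration_s, fs=100):
    """Return (first_row, stop_row) for a time window, with fs samples per second."""
    first = int(round(start_s * fs))
    stop = first + int(round(duration_s * fs))
    return first, stop

first, stop = window_rows(start_s=20, duration_s=5)
print(first, stop)  # 2000 2500, the bounds used in the cell above
# data_test2 = data2.iloc[first:stop, :]   # equivalent slice, assuming data2 is loaded
```

Changing `start_s` or `duration_s` then moves or resizes the analysis window without recomputing row numbers by hand.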

GyroX

In [None]:
normGX = data_test2.loc[:,"GyroX"]/max(data_test2.loc[:,"GyroX"])
ts = normGX

(ca, cd) = pywt.dwt(ts,'haar') #Finding coefficients of wavelet transform 

cat = pywt.threshold(ca, np.std(ca)/2, mode='soft') #soft-threshold the approximation coefficients
cdt = pywt.threshold(cd, np.std(cd)/2, mode='soft') #soft-threshold the detail coefficients

#ts_rec = pywt.idwt(cat, cdt, 'haar') #reconstructing a new signal using transform coefficients
denoise = denoise_wavelet(ts, method = 'BayesShrink', mode = 'soft', wavelet_levels = 3,
                          wavelet = 'sym8', rescale_sigma = True) #reconstructing a new denoised signal
#plt.close('all')
plt.figure(figsize=(30, 20))
plt.subplot(211)
# Original coefficients
plt.plot(ca, '--*b')
plt.plot(cd, '--*r')
# Thresholded coefficients
plt.plot(cat, '--*c')
plt.plot(cdt, '--*m')
plt.legend(['ca','cd','ca_thresh', 'cd_thresh'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Coefficient and Threshold of GyroX')

plt.subplot(212)
#plt.plot(ts_rec, 'r')
plt.plot(ts.to_numpy(), '--*r')
plt.plot(denoise, 'b')
#plt.hold('on')
plt.legend(['original signal', 'reconstructed signal'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Original and Reconstructed Signals of GyroX')


plt.show()
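
To make the two steps in the cell above concrete, here is a minimal NumPy-only sketch of a one-level Haar transform followed by soft thresholding. It is intended to mirror what `pywt.dwt(ts, 'haar')` and `pywt.threshold(..., mode='soft')` compute for an even-length signal, but the function names are hypothetical and the input is a synthetic stand-in, not the recorded gyroscope data.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT of an even-length 1-D array."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    ca = (even + odd) / np.sqrt(2)   # approximation: scaled pairwise sums
    cd = (even - odd) / np.sqrt(2)   # detail: scaled pairwise differences
    return ca, cd

def haar_idwt(ca, cd):
    """Inverse of haar_dwt."""
    x = np.empty(2 * len(ca))
    x[0::2] = (ca + cd) / np.sqrt(2)
    x[1::2] = (ca - cd) / np.sqrt(2)
    return x

def soft_threshold(c, thr):
    """Shrink coefficients toward zero by thr; those with |c| <= thr become 0."""
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

# Synthetic stand-in for a normalised gyroscope trace (assumption, not real data):
# a 0.25 Hz oscillation sampled at 100 Hz plus a little noise, 500 samples long
rng = np.random.default_rng(0)
n = np.arange(500)
x = np.sin(2 * np.pi * 0.25 * n / 100) + 0.05 * rng.standard_normal(500)

ca, cd = haar_dwt(x)
cat = soft_threshold(ca, np.std(ca) / 2)   # same threshold rule as the notebook
cdt = soft_threshold(cd, np.std(cd) / 2)
x_rec = haar_idwt(cat, cdt)                # reconstruction from thresholded coefficients
```

Soft-thresholding `ca` shrinks the slow part of the signal as well as the noise, so a reconstruction from `cat` and `cdt` has a reduced amplitude compared with one that thresholds only `cd`.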

In [None]:
#Continuous wavelet transform

scale = np.arange(1,200)
coef, freqs = pywt.cwt(normGX, scale,'morl') #finding coefficients of continuous transform
#plotting
plt.imshow(abs(coef), interpolation = 'bilinear', cmap='PRGn', aspect = 'auto', 
           vmax = abs(coef).max(), vmin = abs(coef).min()) # doctest: +SKIP
plt.gca().invert_yaxis()
# plt.yticks(np.arange(1,len(normGX),1))
# plt.xticks(np.arange(0,201,10))
plt.show()
# #plt.plot(abs(coef))

scg.set_default_wavelet('morl')

#nn = 33
signal_length = 2000
# range of scales to perform the transform
scales = scg.periods2scales( np.arange(1, signal_length+1) ) #Setting scale of the scaleogram
x_values_wvt_arr = normGX

# plot the signal 
fig1, ax1 = plt.subplots(1, 1, figsize=(9, 3.5));  
ax1.plot(t, x_values_wvt_arr, linewidth=3, color='blue')
#ax1.set_xlim(0, 2)
ax1.set_title("Norm of GyroX signal")

# the scaleogram
scg.cws(x_values_wvt_arr, scales=scales, figsize=(10, 4.0), coi = False, ylabel="Period", xlabel="Time",
        title='Norm of GyroX signal: scaleogram with linear period'); 

print("Default wavelet function used to compute the transform:", scg.get_default_wavelet(), "(",
      pywt.ContinuousWavelet(scg.get_default_wavelet()).family_name, ")")
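
`pywt.cwt` is called above without a `sampling_period`, so the returned `freqs` are in cycles per sample rather than Hz. The sketch below converts the scales to pseudo-frequencies in Hz by hand. It assumes a Morlet centre frequency of 0.8125 (the value I expect from `pywt.central_frequency('morl')`) and the 100 Hz rate implied by `time_step2`; the helper names are hypothetical.

```python
import numpy as np

MORL_CENTER_FREQ = 0.8125   # assumed value of pywt.central_frequency('morl')
FS = 100.0                  # sampling rate in Hz, from time_step2 = 1/100

def scale_to_hz(scale, fs=FS, fc=MORL_CENTER_FREQ):
    """Pseudo-frequency in Hz of a CWT scale: f = fc * fs / scale."""
    return fc * fs / np.asarray(scale, dtype=float)

def hz_to_scale(freq_hz, fs=FS, fc=MORL_CENTER_FREQ):
    """CWT scale whose pseudo-frequency is freq_hz."""
    return fc * fs / np.asarray(freq_hz, dtype=float)

scale = np.arange(1, 200)                 # same range as in the cell above
freqs_hz = scale_to_hz(scale)
print(freqs_hz.max(), freqs_hz.min())     # 81.25 Hz down to about 0.408 Hz

# Scales that would span an illustrative 0.1-0.5 Hz band (6 to 30 cycles per minute)
print(hz_to_scale(0.5), hz_to_scale(0.1))  # 162.5 and 812.5
```

Under these assumptions the largest scale used above, 199, corresponds to roughly 0.41 Hz, or about 24 cycles per minute. A resting respiratory rate of 12 to 20 breaths per minute (0.2 to 0.33 Hz) would fall at larger scales than this range covers, so the `imshow` plot may mostly show faster components.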

GyroY

In [None]:
normGY = data_test2.loc[:,"GyroY"]/max(data_test2.loc[:,"GyroY"])
ts = normGY

(ca, cd) = pywt.dwt(ts,'haar') #Finding coefficients of wavelet transform 

cat = pywt.threshold(ca, np.std(ca)/2, mode='soft') #soft-threshold the approximation coefficients
cdt = pywt.threshold(cd, np.std(cd)/2, mode='soft') #soft-threshold the detail coefficients

#ts_rec = pywt.idwt(cat, cdt, 'haar') #reconstructing a new signal using transform coefficients
denoise = denoise_wavelet(ts, method = 'BayesShrink', mode = 'soft', wavelet_levels = 3,
                          wavelet = 'sym8', rescale_sigma = True) #reconstructing a new denoised signal
#plt.close('all')
plt.figure(figsize=(30, 20))
plt.subplot(211)
# Original coefficients
plt.plot(ca, '--*b')
plt.plot(cd, '--*r')
# Thresholded coefficients
plt.plot(cat, '--*c')
plt.plot(cdt, '--*m')
plt.legend(['ca','cd','ca_thresh', 'cd_thresh'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Coefficient and Threshold of GyroY')

plt.subplot(212)
#plt.plot(ts_rec, 'r')
plt.plot(ts.to_numpy(), '--*r')
plt.plot(denoise, 'b')
#plt.hold('on')
plt.legend(['original signal', 'reconstructed signal'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Original and Reconstructed Signals of GyroY')


plt.show()

In [None]:
#Continuous wavelet transform

scale = np.arange(1,200)
coef, freqs = pywt.cwt(normGY, scale,'morl')

plt.imshow(abs(coef), interpolation = 'bilinear', cmap='PRGn', aspect = 'auto', 
           vmax = abs(coef).max(), vmin = abs(coef).min()) # doctest: +SKIP
plt.gca().invert_yaxis()
# plt.yticks(np.arange(1,len(normGX),1))
# plt.xticks(np.arange(0,201,10))
plt.show()
# plt.plot(t,normGX)
# #plt.plot(abs(coef))

scg.set_default_wavelet('morl')

#nn = 33
signal_length = 2000
# range of scales to perform the transform
scales = scg.periods2scales( np.arange(1, signal_length+1) )
x_values_wvt_arr = normGY

# plot the signal 
fig1, ax1 = plt.subplots(1, 1, figsize=(9, 3.5));  
ax1.plot(t, x_values_wvt_arr, linewidth=3, color='blue')
#ax1.set_xlim(0, 2)
ax1.set_title("Norm of GyroY signal")

# the scaleogram
scg.cws(x_values_wvt_arr, scales=scales, figsize=(10, 4.0), coi = False, ylabel="Period", xlabel="Time",
        title='Norm of GyroY signal: scaleogram with linear period'); 

print("Default wavelet function used to compute the transform:", scg.get_default_wavelet(), "(",
      pywt.ContinuousWavelet(scg.get_default_wavelet()).family_name, ")")

GyroZ

In [None]:
normGZ = data_test2.loc[:,"GyroZ"]/max(data_test2.loc[:,"GyroZ"])
ts = normGZ

(ca, cd) = pywt.dwt(ts,'haar') #Finding coefficients of wavelet transform 

cat = pywt.threshold(ca, np.std(ca)/2, mode='soft') #soft-threshold the approximation coefficients
cdt = pywt.threshold(cd, np.std(cd)/2, mode='soft') #soft-threshold the detail coefficients

#ts_rec = pywt.idwt(cat, cdt, 'haar') #reconstructing a new signal using transform coefficients
denoise = denoise_wavelet(ts, method = 'BayesShrink', mode = 'soft', wavelet_levels = 3,
                          wavelet = 'sym8', rescale_sigma = True) #reconstructing a new denoised signal
#plt.close('all')
plt.figure(figsize=(30, 20))
plt.subplot(211)
# Original coefficients
plt.plot(ca, '--*b')
plt.plot(cd, '--*r')
# Thresholded coefficients
plt.plot(cat, '--*c')
plt.plot(cdt, '--*m')
plt.legend(['ca','cd','ca_thresh', 'cd_thresh'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Coefficient and Threshold of GyroZ')

plt.subplot(212)
#plt.plot(ts_rec, 'r')
plt.plot(ts.to_numpy(), '--*r')
plt.plot(denoise, 'b')
#plt.hold('on')
plt.legend(['original signal', 'reconstructed signal'], loc = 'best', fontsize=15)
plt.grid('on')
plt.xlabel('time in sec')
plt.ylabel('Value')
plt.title('Original and Reconstructed Signals of GyroZ')


plt.show()

In [None]:
#Continuous wavelet transform

scale = np.arange(1,200)
coef, freqs = pywt.cwt(normGZ, scale,'morl')

plt.imshow(abs(coef), interpolation = 'bilinear', cmap='PRGn', aspect = 'auto', 
           vmax = abs(coef).max(), vmin = abs(coef).min()) # doctest: +SKIP
plt.gca().invert_yaxis()
# plt.yticks(np.arange(1,len(normGX),1))
# plt.xticks(np.arange(0,201,10))
plt.show()
# plt.plot(t,normGX)
# #plt.plot(abs(coef))

scg.set_default_wavelet('morl')

#nn = 33
signal_length = 2000
# range of scales to perform the transform
scales = scg.periods2scales( np.arange(1, signal_length+1) )
x_values_wvt_arr = normGZ

# plot the signal 
fig1, ax1 = plt.subplots(1, 1, figsize=(9, 3.5));  
ax1.plot(t, x_values_wvt_arr, linewidth=3, color='blue')
#ax1.set_xlim(0, 2)
ax1.set_title("Norm of GyroZ signal")

# the scaleogram
scg.cws(x_values_wvt_arr, scales=scales, figsize=(10, 4.0), coi = False, ylabel="Period", xlabel="Time",
        title='Norm of GyroZ signal: scaleogram with linear period'); 

print("Default wavelet function used to compute the transform:", scg.get_default_wavelet(), "(",
      pywt.ContinuousWavelet(scg.get_default_wavelet()).family_name, ")")