# Removing heartbeat from signal
---
**Author:**
- **Carlos Salgado** - Github: **[@socd06](https://github.com/socd06)**
---
## Improving signal-to-noise ratio using Savitzky-Golay filtering
---
This notebook deals specifically with denoising lung sounds using Savitzky-Golay filtering. It is inspired by the paper [Savitzky-Golay Filter for Denoising Lung Sound](http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1516-89132018000100326).

Note that you don't need to run this notebook if you cloned or forked the repository. The filtered 16-bit audio files are already in the `Respiratory_Sound_Database/16b_filtered_9_coeff`, `..11_coeff` and `..15_coeff` folders. You may use this notebook as an example for processing similar audio data in the future.

In [1]:
# Don't Show Warning Messages
import warnings
warnings.filterwarnings('ignore')

# Import main libraries 
import pandas as pd
import numpy as np
import os

In [2]:
from __future__ import print_function

# matplotlib for displaying the output
import matplotlib.pyplot as plt
%matplotlib inline

# and IPython.display for audio output
import IPython.display

# Librosa for audio
import librosa
# And the display module for visualization
import librosa.display

In [3]:
# Using original 24-bit depth files

path = \
'../Respiratory_Sound_Database/audio_and_txt_files'

os.listdir(path)

['141_1b3_Al_mc_LittC2SE.txt',
 '205_1b3_Ll_mc_AKGC417L.txt',
 '140_2b3_Ll_mc_LittC2SE.txt',
 '117_1b2_Tc_mc_LittC2SE.txt',
 '211_1p2_Pr_mc_AKGC417L.wav',
 '163_2b2_Ll_mc_AKGC417L.txt',
 '138_1p4_Tc_mc_AKGC417L.txt',
 '213_1p2_Tc_mc_AKGC417L.wav',
 '204_2b5_Ll_mc_AKGC417L.txt',
 '157_1b1_Pl_sc_Meditron.wav',
 '125_1b1_Tc_sc_Meditron.txt',
 '172_2b5_Tc_mc_AKGC417L.txt',
 '161_1b1_Al_sc_Meditron.txt',
 '211_2p4_Tc_mc_AKGC417L.wav',
 '102_1b1_Ar_sc_Meditron.txt',
 '147_2b4_Ll_mc_AKGC417L.wav',
 '163_8b3_Pl_mc_AKGC417L.wav',
 '130_2b4_Al_mc_AKGC417L.txt',
 '178_1b3_Ar_mc_AKGC417L.wav',
 '140_2b2_Tc_mc_LittC2SE.txt',
 '184_1b1_Ar_sc_Meditron.wav',
 '134_2b3_Ar_mc_LittC2SE.txt',
 '130_2b2_Lr_mc_AKGC417L.txt',
 '157_1b1_Pl_sc_Meditron.txt',
 '130_2p5_Al_mc_AKGC417L.wav',
 '142_1b1_Pl_mc_LittC2SE.txt',
 '160_1b3_Al_mc_AKGC417L.wav',
 '172_2b5_Pr_mc_AKGC417L.wav',
 '170_1b3_Pr_mc_AKGC417L.wav',
 '219_2b1_Tc_mc_LittC2SE.txt',
 '130_1p2_Tc_mc_AKGC417L.wav',
 '213_2p2_Pl_mc_AKGC417L.wav',
 '201_1b

Using `105_1b1_Tc_sc_Meditron` as an example 

In [4]:
audio_extension = ".wav"
#prefix = '16b_'
audio_file = '105_1b1_Tc_sc_Meditron'

audio_path = path + '/' + audio_file + audio_extension

# resample the signal to 22050 Hz (librosa's default)
#y, sr = librosa.load(audio_path)

# uncomment to resample at 44.1 kHz
# y, sr = librosa.load(audio_path, sr=44100)

# Disable resampling (sr=None) since the files are already at 44.1 kHz
y, sr = librosa.load(audio_path, sr=None)

In [5]:
audio_path

'../Respiratory_Sound_Database/audio_and_txt_files/105_1b1_Tc_sc_Meditron.wav'

In [6]:
# Play the file at its native sampling rate
IPython.display.Audio(y, rate=sr)

## Savitzky-Golay filtering
---
We can use Savitzky-Golay filtering to improve the signal-to-noise ratio (SNR) of our recordings at a low computational cost.

In [7]:
y_filtered = librosa.feature.delta(y, width=15, order=1)

In [8]:
# Play the filtered signal
IPython.display.Audio(y_filtered, rate=sr)

Note how there is considerably less noise in the signal and the heartbeat is no longer audible.
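
A note on what the filter is doing here: to the best of our knowledge, `librosa.feature.delta` wraps `scipy.signal.savgol_filter` and, with `order=1`, returns a Savitzky-Golay estimate of the first derivative rather than a smoothed copy of the signal. A derivative estimate scales each component roughly in proportion to its frequency, which would explain why the low-frequency heartbeat fades. The sketch below illustrates this on a synthetic signal. The sampling rate, the two tone frequencies and the `polyorder=1` choice are assumptions made for the illustration, not values taken from the recordings.

```python
import numpy as np
from scipy.signal import savgol_filter

sr = 4000                      # hypothetical sampling rate (Hz)
t = np.arange(sr) / sr         # one second of samples

# Two synthetic components: a slow "heartbeat-like" tone and a faster, quieter tone
low = np.sin(2 * np.pi * 2 * t)
high = 0.2 * np.sin(2 * np.pi * 200 * t)

def savgol_derivative(x, width=15):
    # First-derivative Savitzky-Golay estimate (slope of a local straight-line fit)
    return savgol_filter(x, window_length=width, polyorder=1, deriv=1)

def savgol_smooth(x, width=15):
    # Plain Savitzky-Golay smoothing (deriv=0), shown for comparison
    return savgol_filter(x, window_length=width, polyorder=1, deriv=0)

# The filter is linear, so each component can be filtered separately
ratio_before = np.std(high) / np.std(low)
ratio_after_derivative = np.std(savgol_derivative(high)) / np.std(savgol_derivative(low))
ratio_after_smoothing = np.std(savgol_smooth(high)) / np.std(savgol_smooth(low))

print("high/low amplitude ratio before:          ", ratio_before)
print("high/low amplitude ratio after derivative:", ratio_after_derivative)
print("high/low amplitude ratio after smoothing: ", ratio_after_smoothing)
```

In this synthetic case the derivative variant boosts the fast tone relative to the slow one, while plain smoothing does the opposite. That difference is worth keeping in mind when choosing `order` and `width`.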

In [9]:
# Specify the folder holding the original 24-bit audio files
path = "../Respiratory_Sound_Database/audio_and_txt_files/"
path

'../Respiratory_Sound_Database/audio_and_txt_files/'

In [10]:
# Change the working directory to the audio folder
import glob # We will use glob to work with files
os.chdir(path)

In [11]:
# Files are 24-bit depth, not compatible with the GUI labelling app
# Need to change all files to PCM_16 (16-bit depth)
import soundfile
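
Converting to `PCM_16` means storing each sample as a signed 16-bit integer. `soundfile` takes care of this when `subtype='PCM_16'` is passed to `soundfile.write`. The sketch below is only an illustration of the idea in NumPy. The helper name `to_int16` is hypothetical, and its scaling and rounding are assumptions that may differ from what the underlying library does.

```python
import numpy as np

def to_int16(samples):
    """Map float samples in [-1.0, 1.0] to signed 16-bit integers (illustrative only)."""
    samples = np.asarray(samples, dtype=np.float64)
    scaled = np.round(samples * 32767.0)       # 32767 is the largest int16 value
    clipped = np.clip(scaled, -32768, 32767)   # out-of-range samples saturate
    return clipped.astype(np.int16)

example = to_int16([0.0, 1.0, -1.0, 2.0])
print(example)
```

Because samples outside [-1.0, 1.0] saturate in this scheme, it can be worth checking the peak amplitude of a signal before writing it.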

Now we will run a `for` loop for each of 3 different coefficient (`width`) values: 9, 11 and 15.

In [16]:
# width = 9
# Save the filtered files as 16-bit depth PCM .wav files
# (the output folder must already exist)
for file in glob.glob("*.wav"):
    print(file)

    y, sr = librosa.load(file, sr=None)
    y_filtered = librosa.feature.delta(y, width=9, order=1)

    soundfile.write('../16b_filtered_9_coeff/' + 'fil_' + file, y_filtered, sr, subtype='PCM_16')

211_1p2_Pr_mc_AKGC417L.wav
213_1p2_Tc_mc_AKGC417L.wav
157_1b1_Pl_sc_Meditron.wav
211_2p4_Tc_mc_AKGC417L.wav
147_2b4_Ll_mc_AKGC417L.wav
163_8b3_Pl_mc_AKGC417L.wav
178_1b3_Ar_mc_AKGC417L.wav
184_1b1_Ar_sc_Meditron.wav
130_2p5_Al_mc_AKGC417L.wav
160_1b3_Al_mc_AKGC417L.wav
172_2b5_Pr_mc_AKGC417L.wav
170_1b3_Pr_mc_AKGC417L.wav
130_1p2_Tc_mc_AKGC417L.wav
213_2p2_Pl_mc_AKGC417L.wav
201_1b3_Al_sc_Meditron.wav
198_6p1_Pl_mc_AKGC417L.wav
174_1p2_Ar_mc_AKGC417L.wav
138_1p3_Pl_mc_AKGC417L.wav
201_1b2_Al_sc_Meditron.wav
160_2b4_Pl_mc_AKGC417L.wav
166_1p1_Pr_sc_Meditron.wav
141_1b3_Al_mc_LittC2SE.wav
130_2b4_Al_mc_AKGC417L.wav
160_1b2_Tc_mc_AKGC417L.wav
178_1b2_Al_mc_AKGC417L.wav
203_1p3_Ar_mc_AKGC417L.wav
133_2p4_Pr_mc_AKGC417L.wav
151_2p3_Lr_mc_AKGC417L.wav
129_1b1_Ar_sc_Meditron.wav
160_1b4_Lr_mc_AKGC417L.wav
124_1b1_Ar_sc_Litt3200.wav
178_1b6_Ar_mc_AKGC417L.wav
141_1b2_Tc_mc_LittC2SE.wav
130_3p2_Tc_mc_AKGC417L.wav
162_2b2_Pl_mc_AKGC417L.wav
172_1b4_Tc_mc_AKGC417L.wav
224_1b1_Tc_sc_Meditron.wav

160_1b3_Lr_mc_AKGC417L.wav
163_2b2_Ar_mc_AKGC417L.wav
170_1b2_Lr_mc_AKGC417L.wav
175_1b1_Pl_sc_Litt3200.wav
141_1b3_Ar_mc_LittC2SE.wav
193_7b3_Ar_mc_AKGC417L.wav
171_1b1_Al_sc_Meditron.wav
175_1b1_Pr_sc_Litt3200.wav
218_1p1_Pr_sc_Litt3200.wav
176_2b3_Lr_mc_AKGC417L.wav
151_2p4_Pl_mc_AKGC417L.wav
122_2b3_Al_mc_LittC2SE.wav
198_6p1_Ll_mc_AKGC417L.wav
118_1b1_Pr_sc_Litt3200.wav
207_3b2_Lr_mc_AKGC417L.wav
110_1p1_Ll_sc_Meditron.wav
130_1p2_Pr_mc_AKGC417L.wav
180_1b4_Pl_mc_AKGC417L.wav
162_2b3_Al_mc_AKGC417L.wav
139_1b1_Pr_sc_Litt3200.wav
186_2b3_Tc_mc_AKGC417L.wav
203_1p3_Al_mc_AKGC417L.wav
130_1p3_Pr_mc_AKGC417L.wav
133_2p4_Pl_mc_AKGC417L.wav
158_1p3_Ll_mc_AKGC417L.wav
174_2p3_Tc_mc_AKGC417L.wav
177_1b2_Tc_mc_AKGC417L.wav
138_2p2_Ll_mc_AKGC417L.wav
122_2b2_Ar_mc_LittC2SE.wav
204_2b5_Al_mc_AKGC417L.wav
146_2b2_Pl_mc_AKGC417L.wav
158_1p2_Pl_mc_AKGC417L.wav
133_2p2_Pl_mc_AKGC417L.wav
135_2b1_Pl_mc_LittC2SE.wav
172_2b5_Al_mc_AKGC417L.wav
159_1b1_Ll_sc_Meditron.wav
205_1b3_Al_mc_AKGC417L.wav

151_3p2_Al_mc_AKGC417L.wav
134_2b2_Al_mc_LittC2SE.wav
138_1p4_Lr_mc_AKGC417L.wav
170_1b2_Pl_mc_AKGC417L.wav
198_1b5_Pl_mc_AKGC417L.wav
200_2p3_Pr_mc_AKGC417L.wav
195_1b1_Al_sc_Litt3200.wav
207_2b2_Ar_mc_AKGC417L.wav
213_1p2_Pl_mc_AKGC417L.wav
156_2b3_Ll_mc_AKGC417L.wav
130_1p4_Al_mc_AKGC417L.wav
134_2b1_Ar_mc_LittC2SE.wav
205_1b3_Pl_mc_AKGC417L.wav
151_2p3_Pl_mc_AKGC417L.wav
130_1p2_Pl_mc_AKGC417L.wav
133_2p4_Tc_mc_AKGC417L.wav
174_2p3_Al_mc_AKGC417L.wav
107_3p2_Pr_mc_AKGC417L.wav
205_3b4_Al_mc_AKGC417L.wav
200_2p3_Pl_mc_AKGC417L.wav
133_2p2_Al_mc_AKGC417L.wav
164_1b1_Ll_sc_Meditron.wav
172_1b4_Pl_mc_AKGC417L.wav
174_1p2_Pl_mc_AKGC417L.wav
178_1b2_Pr_mc_AKGC417L.wav
130_3p4_Pr_mc_AKGC417L.wav
195_1b1_Ar_sc_Litt3200.wav
174_1p2_Lr_mc_AKGC417L.wav
172_1b5_Lr_mc_AKGC417L.wav
134_2b1_Al_mc_LittC2SE.wav
160_1b4_Tc_mc_AKGC417L.wav
162_1b2_Lr_mc_AKGC417L.wav
130_1p3_Ar_mc_AKGC417L.wav
130_3p3_Pl_mc_AKGC417L.wav
203_1p4_Al_mc_AKGC417L.wav
156_5b3_Ar_mc_AKGC417L.wav
172_1b4_Pr_mc_AKGC417L.wav

When we listen to them, we can tell that there is still considerable noise in the recordings. We will raise the `width` value to 11 for the next experiment.

In [17]:
# width = 11
# Save the filtered files as 16-bit depth PCM .wav files
# (the output folder must already exist)
for file in glob.glob("*.wav"):
    print(file)

    y, sr = librosa.load(file, sr=None)
    y_filtered = librosa.feature.delta(y, width=11, order=1)

    soundfile.write('../16b_filtered_11_coeff/' + 'fil_' + file, y_filtered, sr, subtype='PCM_16')

211_1p2_Pr_mc_AKGC417L.wav
213_1p2_Tc_mc_AKGC417L.wav
157_1b1_Pl_sc_Meditron.wav
211_2p4_Tc_mc_AKGC417L.wav
147_2b4_Ll_mc_AKGC417L.wav
163_8b3_Pl_mc_AKGC417L.wav
178_1b3_Ar_mc_AKGC417L.wav
184_1b1_Ar_sc_Meditron.wav
130_2p5_Al_mc_AKGC417L.wav
160_1b3_Al_mc_AKGC417L.wav
172_2b5_Pr_mc_AKGC417L.wav
170_1b3_Pr_mc_AKGC417L.wav
130_1p2_Tc_mc_AKGC417L.wav
213_2p2_Pl_mc_AKGC417L.wav
201_1b3_Al_sc_Meditron.wav
198_6p1_Pl_mc_AKGC417L.wav
174_1p2_Ar_mc_AKGC417L.wav
138_1p3_Pl_mc_AKGC417L.wav
201_1b2_Al_sc_Meditron.wav
160_2b4_Pl_mc_AKGC417L.wav
166_1p1_Pr_sc_Meditron.wav
141_1b3_Al_mc_LittC2SE.wav
130_2b4_Al_mc_AKGC417L.wav
160_1b2_Tc_mc_AKGC417L.wav
178_1b2_Al_mc_AKGC417L.wav
203_1p3_Ar_mc_AKGC417L.wav
133_2p4_Pr_mc_AKGC417L.wav
151_2p3_Lr_mc_AKGC417L.wav
129_1b1_Ar_sc_Meditron.wav
160_1b4_Lr_mc_AKGC417L.wav
124_1b1_Ar_sc_Litt3200.wav
178_1b6_Ar_mc_AKGC417L.wav
141_1b2_Tc_mc_LittC2SE.wav
130_3p2_Tc_mc_AKGC417L.wav
162_2b2_Pl_mc_AKGC417L.wav
172_1b4_Tc_mc_AKGC417L.wav
224_1b1_Tc_sc_Meditron.wav

175_1b1_Pl_sc_Litt3200.wav
141_1b3_Ar_mc_LittC2SE.wav
193_7b3_Ar_mc_AKGC417L.wav
171_1b1_Al_sc_Meditron.wav
175_1b1_Pr_sc_Litt3200.wav
218_1p1_Pr_sc_Litt3200.wav
176_2b3_Lr_mc_AKGC417L.wav
151_2p4_Pl_mc_AKGC417L.wav
122_2b3_Al_mc_LittC2SE.wav
198_6p1_Ll_mc_AKGC417L.wav
118_1b1_Pr_sc_Litt3200.wav
207_3b2_Lr_mc_AKGC417L.wav
110_1p1_Ll_sc_Meditron.wav
130_1p2_Pr_mc_AKGC417L.wav
180_1b4_Pl_mc_AKGC417L.wav
162_2b3_Al_mc_AKGC417L.wav
139_1b1_Pr_sc_Litt3200.wav
186_2b3_Tc_mc_AKGC417L.wav
203_1p3_Al_mc_AKGC417L.wav
130_1p3_Pr_mc_AKGC417L.wav
133_2p4_Pl_mc_AKGC417L.wav
158_1p3_Ll_mc_AKGC417L.wav
174_2p3_Tc_mc_AKGC417L.wav
177_1b2_Tc_mc_AKGC417L.wav
138_2p2_Ll_mc_AKGC417L.wav
122_2b2_Ar_mc_LittC2SE.wav
204_2b5_Al_mc_AKGC417L.wav
146_2b2_Pl_mc_AKGC417L.wav
158_1p2_Pl_mc_AKGC417L.wav
133_2p2_Pl_mc_AKGC417L.wav
135_2b1_Pl_mc_LittC2SE.wav
172_2b5_Al_mc_AKGC417L.wav
159_1b1_Ll_sc_Meditron.wav
205_1b3_Al_mc_AKGC417L.wav
138_1p2_Al_mc_AKGC417L.wav
203_1p3_Pr_mc_AKGC417L.wav
197_1b1_Al_sc_Meditron.wav

195_1b1_Al_sc_Litt3200.wav
207_2b2_Ar_mc_AKGC417L.wav
213_1p2_Pl_mc_AKGC417L.wav
156_2b3_Ll_mc_AKGC417L.wav
130_1p4_Al_mc_AKGC417L.wav
134_2b1_Ar_mc_LittC2SE.wav
205_1b3_Pl_mc_AKGC417L.wav
151_2p3_Pl_mc_AKGC417L.wav
130_1p2_Pl_mc_AKGC417L.wav
133_2p4_Tc_mc_AKGC417L.wav
174_2p3_Al_mc_AKGC417L.wav
107_3p2_Pr_mc_AKGC417L.wav
205_3b4_Al_mc_AKGC417L.wav
200_2p3_Pl_mc_AKGC417L.wav
133_2p2_Al_mc_AKGC417L.wav
164_1b1_Ll_sc_Meditron.wav
172_1b4_Pl_mc_AKGC417L.wav
174_1p2_Pl_mc_AKGC417L.wav
178_1b2_Pr_mc_AKGC417L.wav
130_3p4_Pr_mc_AKGC417L.wav
195_1b1_Ar_sc_Litt3200.wav
174_1p2_Lr_mc_AKGC417L.wav
172_1b5_Lr_mc_AKGC417L.wav
134_2b1_Al_mc_LittC2SE.wav
160_1b4_Tc_mc_AKGC417L.wav
162_1b2_Lr_mc_AKGC417L.wav
130_1p3_Ar_mc_AKGC417L.wav
130_3p3_Pl_mc_AKGC417L.wav
203_1p4_Al_mc_AKGC417L.wav
156_5b3_Ar_mc_AKGC417L.wav
172_1b4_Pr_mc_AKGC417L.wav
203_1p2_Lr_mc_AKGC417L.wav
174_1p4_Lr_mc_AKGC417L.wav
191_2b1_Pl_mc_LittC2SE.wav
130_2b2_Ll_mc_AKGC417L.wav
193_1b2_Pr_mc_AKGC417L.wav
146_8p3_Lr_mc_AKGC417L.wav

Going all the way to a `width` value of 15 can produce much cleaner versions of our signals.

In [14]:
# width = 15
# Save the filtered files as 16-bit depth PCM .wav files
# (the output folder must already exist)
for file in glob.glob("*.wav"):
    print(file)

    y, sr = librosa.load(file, sr=None)
    y_filtered = librosa.feature.delta(y, width=15, order=1)

    soundfile.write('../16b_filtered_15_coeff/' + 'fil_' + file, y_filtered, sr, subtype='PCM_16')

211_1p2_Pr_mc_AKGC417L.wav
213_1p2_Tc_mc_AKGC417L.wav
157_1b1_Pl_sc_Meditron.wav
211_2p4_Tc_mc_AKGC417L.wav
147_2b4_Ll_mc_AKGC417L.wav
163_8b3_Pl_mc_AKGC417L.wav
178_1b3_Ar_mc_AKGC417L.wav
184_1b1_Ar_sc_Meditron.wav
130_2p5_Al_mc_AKGC417L.wav
160_1b3_Al_mc_AKGC417L.wav
172_2b5_Pr_mc_AKGC417L.wav
170_1b3_Pr_mc_AKGC417L.wav
130_1p2_Tc_mc_AKGC417L.wav
213_2p2_Pl_mc_AKGC417L.wav
201_1b3_Al_sc_Meditron.wav
198_6p1_Pl_mc_AKGC417L.wav
174_1p2_Ar_mc_AKGC417L.wav
138_1p3_Pl_mc_AKGC417L.wav
201_1b2_Al_sc_Meditron.wav
160_2b4_Pl_mc_AKGC417L.wav
166_1p1_Pr_sc_Meditron.wav
141_1b3_Al_mc_LittC2SE.wav
130_2b4_Al_mc_AKGC417L.wav
160_1b2_Tc_mc_AKGC417L.wav
178_1b2_Al_mc_AKGC417L.wav
203_1p3_Ar_mc_AKGC417L.wav
133_2p4_Pr_mc_AKGC417L.wav
151_2p3_Lr_mc_AKGC417L.wav
129_1b1_Ar_sc_Meditron.wav
160_1b4_Lr_mc_AKGC417L.wav
124_1b1_Ar_sc_Litt3200.wav
178_1b6_Ar_mc_AKGC417L.wav
141_1b2_Tc_mc_LittC2SE.wav
130_3p2_Tc_mc_AKGC417L.wav
162_2b2_Pl_mc_AKGC417L.wav
172_1b4_Tc_mc_AKGC417L.wav
224_1b1_Tc_sc_Meditron.wav

160_1b3_Lr_mc_AKGC417L.wav
163_2b2_Ar_mc_AKGC417L.wav
170_1b2_Lr_mc_AKGC417L.wav
175_1b1_Pl_sc_Litt3200.wav
141_1b3_Ar_mc_LittC2SE.wav
193_7b3_Ar_mc_AKGC417L.wav
171_1b1_Al_sc_Meditron.wav
175_1b1_Pr_sc_Litt3200.wav
218_1p1_Pr_sc_Litt3200.wav
176_2b3_Lr_mc_AKGC417L.wav
151_2p4_Pl_mc_AKGC417L.wav
122_2b3_Al_mc_LittC2SE.wav
198_6p1_Ll_mc_AKGC417L.wav
118_1b1_Pr_sc_Litt3200.wav
207_3b2_Lr_mc_AKGC417L.wav
110_1p1_Ll_sc_Meditron.wav
130_1p2_Pr_mc_AKGC417L.wav
180_1b4_Pl_mc_AKGC417L.wav
162_2b3_Al_mc_AKGC417L.wav
139_1b1_Pr_sc_Litt3200.wav
186_2b3_Tc_mc_AKGC417L.wav
203_1p3_Al_mc_AKGC417L.wav
130_1p3_Pr_mc_AKGC417L.wav
133_2p4_Pl_mc_AKGC417L.wav
158_1p3_Ll_mc_AKGC417L.wav
174_2p3_Tc_mc_AKGC417L.wav
177_1b2_Tc_mc_AKGC417L.wav
138_2p2_Ll_mc_AKGC417L.wav
122_2b2_Ar_mc_LittC2SE.wav
204_2b5_Al_mc_AKGC417L.wav
146_2b2_Pl_mc_AKGC417L.wav
158_1p2_Pl_mc_AKGC417L.wav
133_2p2_Pl_mc_AKGC417L.wav
135_2b1_Pl_mc_LittC2SE.wav
172_2b5_Al_mc_AKGC417L.wav
159_1b1_Ll_sc_Meditron.wav
205_1b3_Al_mc_AKGC417L.wav

151_3p2_Al_mc_AKGC417L.wav
134_2b2_Al_mc_LittC2SE.wav
138_1p4_Lr_mc_AKGC417L.wav
170_1b2_Pl_mc_AKGC417L.wav
198_1b5_Pl_mc_AKGC417L.wav
200_2p3_Pr_mc_AKGC417L.wav
195_1b1_Al_sc_Litt3200.wav
207_2b2_Ar_mc_AKGC417L.wav
213_1p2_Pl_mc_AKGC417L.wav
156_2b3_Ll_mc_AKGC417L.wav
130_1p4_Al_mc_AKGC417L.wav
134_2b1_Ar_mc_LittC2SE.wav
205_1b3_Pl_mc_AKGC417L.wav
151_2p3_Pl_mc_AKGC417L.wav
130_1p2_Pl_mc_AKGC417L.wav
133_2p4_Tc_mc_AKGC417L.wav
174_2p3_Al_mc_AKGC417L.wav
107_3p2_Pr_mc_AKGC417L.wav
205_3b4_Al_mc_AKGC417L.wav
200_2p3_Pl_mc_AKGC417L.wav
133_2p2_Al_mc_AKGC417L.wav
164_1b1_Ll_sc_Meditron.wav
172_1b4_Pl_mc_AKGC417L.wav
174_1p2_Pl_mc_AKGC417L.wav
178_1b2_Pr_mc_AKGC417L.wav
130_3p4_Pr_mc_AKGC417L.wav
195_1b1_Ar_sc_Litt3200.wav
174_1p2_Lr_mc_AKGC417L.wav
172_1b5_Lr_mc_AKGC417L.wav
134_2b1_Al_mc_LittC2SE.wav
160_1b4_Tc_mc_AKGC417L.wav
162_1b2_Lr_mc_AKGC417L.wav
130_1p3_Ar_mc_AKGC417L.wav
130_3p3_Pl_mc_AKGC417L.wav
203_1p4_Al_mc_AKGC417L.wav
156_5b3_Ar_mc_AKGC417L.wav
172_1b4_Pr_mc_AKGC417L.wav
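
The three loops above differ only in the `width` value, so they could be folded into a single function. The following is a sketch under assumptions: the helper names `output_path` and `filter_folder` are hypothetical, the output folders follow the `16b_filtered_<width>_coeff` naming used above, and `os.makedirs` is added so that the folders need not exist beforehand.

```python
import glob
import os

def output_path(dst_root, width, filename):
    # Build <dst_root>/16b_filtered_<width>_coeff/fil_<filename>
    folder = os.path.join(dst_root, '16b_filtered_{}_coeff'.format(width))
    return os.path.join(folder, 'fil_' + os.path.basename(filename))

def filter_folder(src_dir, dst_root, widths=(9, 11, 15)):
    # Audio libraries are imported here so that output_path can be used without them
    import librosa
    import soundfile

    for width in widths:
        # Create the output folder for this width if it is missing
        os.makedirs(os.path.join(dst_root, '16b_filtered_{}_coeff'.format(width)), exist_ok=True)
        for file in sorted(glob.glob(os.path.join(src_dir, '*.wav'))):
            y, sr = librosa.load(file, sr=None)   # keep the native sampling rate
            y_filtered = librosa.feature.delta(y, width=width, order=1)
            soundfile.write(output_path(dst_root, width, file), y_filtered, sr, subtype='PCM_16')

print(output_path('..', 9, '105_1b1_Tc_sc_Meditron.wav'))
```

From the notebook's starting directory, a call such as `filter_folder('../Respiratory_Sound_Database/audio_and_txt_files', '../Respiratory_Sound_Database')` should then write the same three folders without changing the working directory.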

Now it's time to find a way to compare them all, either by listening to every version or by producing signal-to-noise data for each folder.
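
One possible way to produce such signal-to-noise data is the usual power ratio in decibels, `10 * log10(P_signal / P_noise)`. The sketch below assumes that a noise-only reference segment can be picked out of each recording, which this notebook does not establish. The function name `snr_db` and the synthetic arrays are hypothetical.

```python
import numpy as np

def snr_db(signal_segment, noise_segment):
    """Signal-to-noise ratio in dB from the mean power of two segments."""
    p_signal = np.mean(np.square(np.asarray(signal_segment, dtype=np.float64)))
    p_noise = np.mean(np.square(np.asarray(noise_segment, dtype=np.float64)))
    return 10.0 * np.log10(p_signal / p_noise)

# Synthetic check: an amplitude ratio of 10 is a power ratio of 100
signal_segment = np.full(1000, 2.0)
noise_segment = np.full(1000, 0.2)
snr_example = snr_db(signal_segment, noise_segment)
print(snr_example)
```

Averaging such a value over the files in each `16b_filtered_*_coeff` folder would give one number per `width` to set beside the listening impressions.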