# Introduction to Signal Processing for Absolute Beginners

If you have no idea what a spectrogram or frequency is, and the Fourier transform sounds like a magic spell, then you're like me! Let's go on a journey together to find out how to process signals, using G2Net competition data as examples.

I will work on this notebook over the next several weeks, so you may come back when it's ready or join me on the learning journey. If you follow along, I'll appreciate comments with feedback or questions!

![Signal Processing System](https://upload.wikimedia.org/wikipedia/commons/4/46/Signal_processing_system.png)

## Plan of Attack

How do we learn about this from scratch? Here's my plan:

1. Start by tinkering! I already did this, first with a very naive approach, and later with some more informed but still very blind experiments.

2. Learn the basics! Let's understand some of the theory to figure out what we missed in our naive approach from the previous step, and come up with a more informed approach. 

3. Apply the learning in practice! After building some understanding, let's put it to use and verify that it works!

## What is Digital Signal Processing? 

I started by reading the first chapter of *Foundations of Digital Signal Processing* by Patrick Gaydecki (via Google Books). Here are some things I have learned:

* Digital Signal Processing involves manipulation of signals that have their origins in the analog world, such as audio, video, radar, thermal, magnetic or ultrasonic sensor systems
* Signals are originally analog and continuous in nature
* A transducer is a device that converts some form of energy to which it is designed to respond into an electrical signal. This signal is later converted into digital form. 

I also learned about some of the *founding fathers* of signal processing: 
* Jean-Baptiste Joseph Fourier (1768 - 1830) showed how a waveform could be represented as a series of weighted sine and cosine components of ascending frequency, which laid the foundation for modern Fourier analysis and the Fourier transform. This allows us to analyse and process a signal.
* Laplace developed the Laplace transform (not sure yet what that is)
* Claude Shannon developed the science of information theory
* Cooley and Tukey published the FFT (fast Fourier transform) algorithm in 1965, which made it possible to compute spectra efficiently.

My conclusion is that I need to take another step back, to high school physics, to understand more about waves. 

## What About Waves? 

It seems that waves are a core concept necessary to understand signal processing. I can highly recommend Khan Academy videos such as this [introduction to waves](https://www.khanacademy.org/science/high-school-physics/waves-and-sound/introduction-to-waves/v/introduction-to-waves). Here are some things I have learned. 

![basic wave concepts](https://pbs.twimg.com/media/E5lsNiQWQAQ_Fq6?format=jpg&name=4096x4096)

A **wave** is a disturbance propagating through space, usually transferring energy. In the example pictured above, imagine someone jerking a string up, down, and back up again. This is an example of a *transverse* wave, because the medium (the string) moves up and down, perpendicular to the direction in which the wave travels. If we repeat that movement and create a periodic *pulse*, we create a *periodic wave*.

There are a few properties of *periodic waves* that we should know about: 
* **Amplitude** is the height of a wave, the maximum displacement measured from the equilibrium position
* **Period** is the number of seconds it takes to complete one cycle
* **Frequency** is the inverse of period: it counts how many cycles fit in one second. The unit of frequency is the **Hertz**, for example 10 cycles per second = 10 Hertz
* **Wavelength** measures how far a wave has travelled after 1 period (1 cycle). This can be for example the distance between 2 neighboring peaks. 
* **Velocity** tells us how quickly a wave is travelling. Velocity equals distance over time, so we can say V = wavelength / period = wavelength * frequency
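
To make these relationships concrete, here is a tiny sketch in plain Python. The period and wavelength values are made up for illustration and are not taken from any real measurement.

```python
# Hypothetical values for a periodic wave on a string (illustration only).
period_s = 0.1        # seconds per cycle
wavelength_m = 2.0    # metres travelled during one cycle

frequency_hz = 1.0 / period_s                 # cycles per second (Hertz)
velocity_m_s = wavelength_m / period_s        # distance over time
velocity_check = wavelength_m * frequency_hz  # same quantity, using frequency

print(f"frequency: {frequency_hz} Hz")
print(f"velocity:  {velocity_m_s} m/s")
```

Both ways of computing the velocity give the same number, which is just the V = wavelength / period = wavelength * frequency identity from the list above.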

### 1. Sound

The concepts above apply to sound, although it's a different type of wave. Let's start by considering how sound is produced via a loudspeaker...

![sound waves](https://pbs.twimg.com/media/E5nmOMoXoBcnguC?format=jpg&name=large)

A loudspeaker moves a diaphragm back and forth (it oscillates), which pushes the air molecules into a similar back and forth movement. The molecules don't just move forward (like a ball that is kicked); they move forward and backward like a wave. That wave propagates through the air, and we can see the compressed region of molecules moving forward. In contrast to the string example above, the movement is along the direction in which the pulse propagates, so we call this a *longitudinal wave*. It can be mapped on a diagram in a very similar way to *transverse waves*, and the properties described above apply here as well.

* **Amplitude** measures the displacement of air molecules as they oscillate. Higher amplitude means higher volume, and lower amplitude means lower volume.
* **Frequency** corresponds to the sound pitch. Higher frequency corresponds to higher notes. Humans hear frequencies between 20Hz and 20,000Hz. 
* **Wavelength** measures the distance between two compressed regions of air. 

We use the **decibel scale** to measure the loudness of sounds. To calculate the decibel level, we need to know the intensity of the sound, which is power (measured in watts) per unit area (measured in square meters). We divide the intensity by the threshold of hearing (1e-12 W/m2), take the base-10 logarithm of the result, and multiply by 10.
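
Here is the decibel recipe written out as a small Python function. The function name `decibels` and the example power and area are my own choices for illustration.

```python
import math

I0 = 1e-12  # threshold of hearing in W/m^2

def decibels(power_w, area_m2):
    """Sound level in dB for a given power spread over a given area."""
    intensity = power_w / area_m2            # W/m^2
    return 10 * math.log10(intensity / I0)   # divide, take log10, multiply by 10

# Hypothetical example: one microwatt passing through one square meter.
level = decibels(1e-6, 1.0)
print(round(level, 1))
```

A sound right at the threshold of hearing comes out at 0 dB, and every tenfold increase in intensity adds 10 dB.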

#### Ultrasound 

Sound waves can be useful beyond communication and entertainment! For example, sound waves with a frequency higher than human hearing (above 20,000Hz) are used for medical imaging. A device emits high-frequency waves and captures the waves reflected by our internal organs to produce an image of our muscles, tendons and many internal organs.


### 2. Electromagnetic Waves

When an electric field changes in some region, that creates a changing magnetic field. This in turn leads to a change in the electric field. This can result in a chain reaction that propagates like a wave. These waves are called **electromagnetic waves**.

In the image below, you can see how an electromagnetic wave contains both electric and magnetic field vectors, orthogonal to each other, and how we can apply the concepts of amplitude, wavelength and frequency to this type of wave. An interesting fact about electromagnetic waves is that they can propagate in a vacuum and don't require a medium.

![](https://cdn.kastatic.org/ka-perseus-images/9b999f75e599f3ed46c0ed16586410ec1e5ffc34.png)

Because they have a constant speed in a vacuum (the speed of light), there is a direct relationship between wavelength and frequency. If we map all the possible frequencies/wavelengths on a chart, we get the electromagnetic spectrum. Visible light is a small part of that spectrum. Lower frequencies correspond to infrared, microwaves, radio waves etc. Higher frequencies correspond to ultraviolet, X-rays, gamma rays etc.
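
The relationship is the same velocity formula we saw for waves on a string, with the velocity fixed at the speed of light. A minimal sketch, where the helper names and the example frequencies are my own picks for illustration:

```python
C = 299_792_458.0  # speed of light in a vacuum, in m/s

def wavelength_from_frequency(f_hz):
    """Wavelength in meters of an electromagnetic wave in a vacuum."""
    return C / f_hz

def frequency_from_wavelength(lam_m):
    """Frequency in Hertz of an electromagnetic wave in a vacuum."""
    return C / lam_m

# A 100 MHz radio wave is roughly 3 meters long.
print(wavelength_from_frequency(100e6))

# Green light with a wavelength around 550 nm has a frequency of several hundred THz.
print(frequency_from_wavelength(550e-9))
```

Knowing one of the two quantities is enough to place a wave on the spectrum chart.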

![](https://cdn.kastatic.org/ka-perseus-images/7370593cc71daa2ccaca091cec088fa5fec6ca16.png)

### 3. Gravitational Waves

Finally, time to cover the waves we're supposed to detect in the G2Net competition! According to Wikipedia, "**gravitational waves** are disturbances in the curvature of spacetime, generated by accelerated masses, that propagate as waves outward from their source at the speed of light". 

![](https://upload.wikimedia.org/wikipedia/commons/thumb/2/2f/The_Gravitational_wave_spectrum_Sources_and_Detectors.jpg/2560px-The_Gravitational_wave_spectrum_Sources_and_Detectors.jpg)

### Wavelets

What happens if an oscillation is only temporary and doesn't display a periodic pattern, but still looks like a wave? You might have seen these in a seismograph recording or on a heart monitor. Such a brief oscillation is a **wavelet**, and we can also analyse and use wavelets with our digital signal processing tools. Here is an example of a wavelet.

![wavelet example](https://upload.wikimedia.org/wikipedia/commons/thumb/0/0a/MorletWaveletMathematica.svg/1920px-MorletWaveletMathematica.svg.png)
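
One simple way to build a shape like this yourself is to multiply an ordinary cosine by a bell-shaped envelope. The sketch below is only a rough, real-valued Morlet-style wavelet; the constants (the factor 5 and the time range) are arbitrary choices of mine.

```python
import numpy as np

t = np.linspace(-4, 4, 801)       # time axis, arbitrary units
envelope = np.exp(-t**2 / 2)      # Gaussian bump that fades out on both sides
carrier = np.cos(5 * t)           # ordinary cosine oscillation
wavelet = envelope * carrier      # oscillation that only lives near t = 0

print(wavelet.max(), abs(wavelet[0]))
```

Plotting it with `plt.plot(t, wavelet)` should show a few wiggles in the middle that die away towards the edges.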


## Frequency Domain

I've been struggling to understand the concept of the frequency domain - if you're like me, don't worry if this takes time to understand! Let's start with a rainbow. A rainbow shows us how white light can be decomposed into colors. Each color is associated with a different frequency.

![rainbow](https://upload.wikimedia.org/wikipedia/commons/5/5c/Double-alaskan-rainbow.jpg)

This idea that a **signal can be a mix of different frequencies** is why we need the frequency domain to better understand a signal. And this is where our magic spell - **the Fourier transform** - comes in handy: it allows us to decompose any signal into a set of sine and cosine functions.

Side note: humans can also analyse sounds and images in terms of their sinusoidal components. Our ears and eyes have developed to distinguish different frequencies.

#### Frequency vs. time domain

Since we are dealing with digital representations of signals, we can obtain measurements at discrete points in time. A **sampling rate** tells us the number of samples per second (1/s unit is Hertz so we will use it here). For example, a typical sampling rate for audio signal is 44.1 kHz. 

Side note: if you fancy some more theory, check out the [Nyquist–Shannon sampling theorem](https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem).
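
To see why the sampling rate matters, here is a small experiment with made-up numbers. With a sampling rate of 40 Hz, the highest frequency we can represent faithfully is 20 Hz (half the sampling rate). A 30 Hz sine is too fast, and its samples end up looking exactly like those of a 10 Hz sine with the sign flipped.

```python
import numpy as np

fs = 40                  # sampling rate in Hz (hypothetical)
n = np.arange(fs)        # one second of sample indices
t = n / fs               # sample times in seconds

nyquist = fs / 2         # highest frequency we can represent: 20 Hz

slow = np.sin(2 * np.pi * 10 * t)   # 10 Hz, below the Nyquist frequency
fast = np.sin(2 * np.pi * 30 * t)   # 30 Hz, above the Nyquist frequency

# At these sample times the 30 Hz sine cannot be told apart from a
# sign-flipped 10 Hz sine. This effect is called aliasing.
print(np.allclose(fast, -slow))     # True
```

This is the practical reason behind the theorem: to capture a frequency, we need to sample more than twice as fast as that frequency.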

If we take a single sample (a point in time), it's not possible to estimate the frequency of a sinusoid. The more samples we have, the better the estimate. That's why we talk about moving from the time domain to the frequency domain (Fourier analysis) or from the frequency domain to the time domain (Fourier synthesis). Analysis lets us find the contribution of different frequencies in a given signal. Synthesis helps us create signals with known frequency content.
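
Here is a minimal sketch of both directions using NumPy's FFT functions. The signal is synthetic: I build it from two sine waves at 50 Hz and 120 Hz (values chosen arbitrarily), then check that the analysis finds them again.

```python
import numpy as np

fs = 1000                     # sampling rate in Hz (hypothetical)
t = np.arange(fs) / fs        # one second of sample times

# Synthesis: build a signal from two known sine waves.
signal = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

# Analysis: the FFT tells us how much of each frequency is present.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
amplitudes = 2 * np.abs(spectrum) / len(signal)

# The two largest peaks should sit at the frequencies we put in.
top_two = np.sort(freqs[np.argsort(amplitudes)[-2:]])
print(top_two)

# Back to the time domain: the inverse FFT recovers the original samples.
recovered = np.fft.irfft(spectrum, n=len(signal))
print(np.allclose(recovered, signal))
```

The frequencies here were picked to line up exactly with the FFT's frequency bins. With real data the peaks are usually smeared over neighboring bins, which is where window functions come in.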

### The Spectrogram

Ok, so we just learned that we can move from time domain to frequency domain with the help of Fourier transform. But wouldn't it make sense to see both time and frequency at the same time? Frequency can change over time, right? 

Here comes the **spectrogram**! The way to create it is by performing FFT on segments of the signal with a moving window. This is called **short-time Fourier transform**. The picture below illustrates it pretty well. 

![stft](https://www.researchgate.net/publication/346243843/figure/fig1/AS:961807523000322@1606324191138/Short-time-Fourier-transform-STFT-overview.png)
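
The moving-window idea fits in a few lines of NumPy. This is only a bare-bones sketch to show the mechanics, not how any particular library implements it. The function name `simple_stft`, the Hann window and the test signal (128 Hz for one second, then 320 Hz for one second) are all my own choices.

```python
import numpy as np

def simple_stft(x, nfft, hop):
    """Magnitude STFT: rows are frequency bins, columns are time steps."""
    window = np.hanning(nfft)
    starts = range(0, len(x) - nfft + 1, hop)              # where each segment begins
    frames = np.array([x[s:s + nfft] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T           # one FFT per segment

# Hypothetical test signal whose frequency changes halfway through.
fs = 2048
t = np.arange(2 * fs) / fs
demo = np.where(t < 1, np.sin(2 * np.pi * 128 * t), np.sin(2 * np.pi * 320 * t))

nfft, hop = 256, 64
spec = simple_stft(demo, nfft, hop)
freqs = np.fft.rfftfreq(nfft, d=1 / fs)

print(spec.shape)                       # (frequency bins, time steps)
print(freqs[spec[:, 0].argmax()])       # dominant frequency in the first segment
print(freqs[spec[:, -1].argmax()])      # dominant frequency in the last segment
```

Each column of `spec` is the spectrum of one short segment, so reading the columns left to right shows how the frequency content changes over time.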

Finally, we're at a point where we can try this on the competition data!

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
from scipy.interpolate import interp1d
from scipy.signal import butter, filtfilt, iirdesign, zpk2tf, freqz

def id2path(sample_id, is_test=False):
    # files are nested in folders named after the first three characters of the id
    split = 'test' if is_test else 'train'
    a, b, c = sample_id[0], sample_id[1], sample_id[2]
    return f'../input/g2net-gravitational-wave-detection/{split}/{a}/{b}/{c}/{sample_id}.npy'

Let's first look at our raw signal data. 

In [None]:
_id = '0021f9dd71' # credit for finding example with strong signal: https://www.kaggle.com/mistag/data-preprocessing-with-gwpy/
x = np.load(id2path(_id, is_test=True))
plt.figure(figsize=(12,4))
plt.plot(x[0])
plt.title('Detector 1')
plt.xlabel('Time (samples)')
plt.ylabel('Amplitude')
plt.show()

Now let's create a spectrogram by following [this tutorial](https://www.gw-openscience.org/GW150914data/GW150914_tutorial.html). We will start with a regular version, then we will apply whitening. It looks like whitening might not be helping though ([discussion](https://www.kaggle.com/c/g2net-gravitational-wave-detection/discussion/252138)).

In [None]:
fs = 2048 # sampling rate in Hz
NFFT = int(fs/16) # pick a short FFT time interval, 1/16 of a second
NOVL = int(NFFT*15/16) # use a lot of overlap, to resolve short-time features
# choose a window that minimizes "spectral leakage" (https://en.wikipedia.org/wiki/Spectral_leakage)
window = np.blackman(NFFT)
spec_cmap='viridis'
print(f'fs: {fs}, NFFT: {NFFT}, NOVL: {NOVL}')

plt.figure(figsize=(12,8))
spec_H1, freqs, bins, im = plt.specgram(x[0], NFFT=NFFT, Fs=fs, window=window, noverlap=NOVL, cmap=spec_cmap)
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.axis([0.03, 2-0.03, 30, 500]) # according to discussions, the frequencies that interest us should be in this range
plt.colorbar()
plt.title('Detector 1')
plt.show()

We can now see a gravitational wave signal on the spectrogram!
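
To get a feel for what the `NFFT` and `NOVL` settings above imply, here is the arithmetic written out. I'm assuming each detector recording is 2 seconds long at 2048 Hz (4096 samples), and the segment-count formula is my understanding of how overlapping segments are counted, so treat the numbers as a sanity check rather than a guarantee.

```python
fs = 2048                      # sampling rate in Hz
n_samples = 2 * fs             # assumption: 2 seconds of data per detector
NFFT = int(fs / 16)            # 128 samples per segment
NOVL = int(NFFT * 15 / 16)     # 120 samples shared by neighboring segments

segment_seconds = NFFT / fs              # length of signal covered by each FFT
freq_resolution_hz = fs / NFFT           # spacing between frequency bins
hop = NFFT - NOVL                        # samples between segment starts
n_segments = (n_samples - NOVL) // hop   # number of columns in the spectrogram
n_freq_bins = NFFT // 2 + 1              # number of rows (one-sided spectrum)

print(segment_seconds, freq_resolution_hz, hop, n_segments, n_freq_bins)
```

This shows the trade-off hiding in the parameters: short segments give fine time resolution (a new column every 8 samples) but coarse frequency resolution (bins 16 Hz apart).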

### Sources

1. *Foundations of Digital Signal Processing* by Patrick Gaydecki (via Google Books)
2. Khan Academy videos about waves
3. MOOCs on Coursera: 
    *  https://www.coursera.org/learn/audio-signal-processing
    *  https://www.coursera.org/learn/dsp1
4. Preprocessing tutorials: https://www.kaggle.com/c/g2net-gravitational-wave-detection/discussion/250244