# 01. Basic Signal Processing

This notebook provides a hands-on introduction to fundamental signal processing concepts essential for analyzing electrophysiological recordings (EEG, LFP, EMG). We'll cover sine waves, sampling theory, filtering techniques, and downsampling - all illustrated with both simulated signals and real neural data.

**What you'll learn:**
- **I**: Signal fundamentals (frequency, amplitude, phase)
- **II**: From analog to digital signal (sampling rates and Nyquist frequency)
- **III**: Filtering (types and applications)
- **IV**: Downsampling

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# I. Signal fundamentals

### Generating a Sine Wave

A sine wave is a mathematical curve that describes a smooth periodic oscillation. It is defined by the following formula:

$$
y(t) = A \cdot \sin(2 \pi f t + \phi)
$$

Where:
- $y(t)$ is the value of the sine wave at time $t$,
- $A$ is the amplitude of the wave (the peak value),
- $f$ is the frequency of the wave (in Hz),
- $t$ is the time (in seconds),
- $\phi$ is the phase offset (in radians).
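To make the roles of $f$ and $\phi$ concrete, the short sketch below computes the period $T = 1/f$ and checks that a phase offset $\phi$ is equivalent to shifting the wave earlier in time by $\phi / (2 \pi f)$ seconds. The variable names are illustrative and are not used elsewhere in this notebook.

```python
import numpy as np

f = 10.0          # frequency in Hz
phi = np.pi / 2   # phase offset in radians
A = 1.0           # amplitude

period = 1.0 / f                     # duration of one cycle, in seconds
time_shift = phi / (2 * np.pi * f)   # phase offset expressed as a time advance

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
with_phase = A * np.sin(2 * np.pi * f * t + phi)
shifted_in_time = A * np.sin(2 * np.pi * f * (t + time_shift))

same_waveform = np.allclose(with_phase, shifted_in_time)
print(f"Period: {period:.3f} s, time shift: {time_shift:.3f} s")
print("Same waveform:", same_waveform)
```

A phase of $\pi/2$ is a quarter of a cycle, which at 10 Hz corresponds to 25 ms.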

## I.1 General function

In [None]:
def sinus_function(t, f, phi=0, A=1):
    """
    General sinusoidal function: A * sin(2 * pi * f * t + phi)
    t: time vector
    A: Amplitude
    f: Frequency (Hz)
    phi: Phase (radians)
    """
    return A * np.sin(2 * np.pi * f * t + phi)

## I.2 Example: 10 Hz sine wave

In [None]:
# Define Sampling parameters
fs = 10000  # Sampling rate (10 kHz)
duration = 1.0  # Seconds
t_vect = np.arange(0, duration, 1/fs)
freq_1 = 10
sig_10 = sinus_function(t_vect, f = freq_1)    # Generate sine wave

In [None]:
# Plot
plt.figure(figsize=(12, 4))
plt.plot(t_vect, sig_10)
plt.title(f"Simulated Neural Signal at {freq_1} Hz")
plt.xlabel("Time (s)")
plt.ylabel("Voltage (mV)")
plt.show()

## I.3 Effect of frequency: 10 Hz vs 5 Hz

In [None]:
sig_10 = sinus_function(t_vect, f = 10)    # Generate sine wave
sig_5 = sinus_function(t_vect, f = 5)    # Generate sine wave

# Plot
plt.figure(figsize=(12, 4))
plt.plot(t_vect, sig_10, label="10 Hz")
plt.plot(t_vect, sig_5, label="5 Hz")
plt.xlabel("Time (s)")
plt.ylabel("Voltage (mV)")
plt.legend()
plt.show()

## I.4 Effect of phase

In [None]:
sig_10 = sinus_function(t_vect, f = 10, phi=0)    # Generate sine wave
sig_10_phaseoffset = sinus_function(t_vect, f = 10, phi=np.pi/2)    # Generate sine wave

# Plot
plt.figure(figsize=(12, 4))
plt.plot(t_vect, sig_10, label="phi = 0")
plt.plot(t_vect, sig_10_phaseoffset, label="phi = π/2")
plt.xlabel("Time (s)")
plt.ylabel("Voltage (mV)")
plt.legend()
plt.show()

## I.5 Effect of amplitude

In [None]:
sig_10 = sinus_function(t_vect, f = 10, phi=0, A=1)    # Generate sine wave
sig_10_highamplitude = sinus_function(t_vect, f = 10, phi=0, A=3)    # Same wave with 3x the amplitude

# Plot
plt.figure(figsize=(12, 4))
plt.plot(t_vect, sig_10, label="A = 1")
plt.plot(t_vect, sig_10_highamplitude, label="A = 3")
plt.xlabel("Time (s)")
plt.ylabel("Voltage (mV)")
plt.legend()
plt.show()

# II. From analog to digital signal

In the real world, neural signals are continuous analog waveforms - voltage fluctuations that vary smoothly over time. However, computers can only work with discrete digital values. This section explores how we convert analog signals into digital data through **sampling**.

We'll cover:
- How sampling rate determines what information we can capture
- The **Nyquist theorem**: why you need to sample at more than twice the highest frequency in the signal
- What happens when we violate this rule (**aliasing**) and why it matters

Understanding these concepts is critical because improper sampling can introduce artifacts that cannot be fixed later in your analysis pipeline.

## II.0 What is sampling?

**Sampling** is the process of converting a continuous analog signal into a discrete digital representation by measuring the signal's amplitude at regular time intervals.

**Key concept:** The **sampling rate** (or sampling frequency, *fs*) defines how many measurements we take per second, measured in Hertz (Hz). For example:
- fs = 1000 Hz means we take 1000 samples every second
- The time between samples is the **sampling period**: 1/fs seconds

The fundamental question: *How fast do we need to sample to accurately represent our signal?*
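As a small worked example of these definitions, the sketch below derives the number of samples, the sampling period, and the number of samples per cycle from a sampling rate. The values are chosen for illustration only.

```python
import numpy as np

fs = 1000.0       # sampling rate in Hz
duration = 1.0    # recording length in seconds
f_signal = 10.0   # frequency of the sampled sine wave in Hz

n_samples = int(round(fs * duration))   # total number of samples
dt = 1.0 / fs                           # sampling period in seconds
samples_per_cycle = fs / f_signal       # how densely each oscillation is sampled

# Build the time axis from integer sample indices to avoid
# accumulating floating-point error in the step size
t = np.arange(n_samples) / fs

print(n_samples, dt, samples_per_cycle)
```

Building the time vector from integer indices is one possible design choice; it guarantees exactly `n_samples` points regardless of rounding in `1/fs`.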

## II.1 High sampling

In [None]:
# Define Sampling parameters
freq_sinus = 10
f_analog = 10000  #Rate for plotting analog signal (10 kHz)
f_sampling = 200 # Rate for sampling digital signal, in Hz
duration = 1.0  # Seconds

t_analog = np.arange(0, duration, 1/f_analog)
sig_analog = sinus_function(t_analog, f = freq_sinus)    # Generate sine wave

t_digital = np.arange(0, duration, 1/f_sampling)
sig_digital = sinus_function(t_digital, f = freq_sinus)

plt.figure(figsize=(14, 7))
plt.suptitle(f"Sampling at high frequency", fontsize=16)

ax1 = plt.subplot(2, 1, 1)
ax1.plot(t_analog, sig_analog, 'r-', linewidth=1.5, label=f'True Analog Signal ({freq_sinus} Hz)')
ax1.plot(t_digital, sig_digital, 'bo', markersize=5, label=f'Sampled Points (FS = {f_sampling} Hz)')
ax1.legend()
ax1.grid(True, linestyle='--', alpha=0.6)
ax1.set_ylabel('Amplitude')
ax1.set_xlabel('Time (s)')

ax2 = plt.subplot(2, 1, 2, sharey=ax1)
ax2.plot(t_digital, sig_digital, c='blue', marker = 'o', markersize=5, markerfacecolor = 'k', markeredgecolor = 'k', label=f'Reconstructed signal')
ax2.legend()
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylabel('Amplitude')
ax2.set_xlabel('Time (s)')

When we sample much faster than necessary (here 200 Hz, 20 times the signal frequency), we capture many points per cycle. This provides an excellent digital representation of the original signal, but at the cost of larger files, more computational resources, and slower processing.

**When to use:** When you need maximum accuracy or aren't sure about the highest frequencies in your data.
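The storage cost of a high sampling rate is simple arithmetic. The sketch below assumes uncompressed 32-bit samples (4 bytes each) and a single channel; both are assumptions made for illustration, and `recording_size_mb` is a hypothetical helper, not part of this notebook.

```python
def recording_size_mb(fs, duration_s, n_channels=1, bytes_per_sample=4):
    """Approximate uncompressed size in megabytes (1 MB = 1e6 bytes)."""
    return fs * duration_s * n_channels * bytes_per_sample / 1e6

one_hour = 3600  # seconds
for fs in (25, 200, 10000):
    size = recording_size_mb(fs, one_hour)
    print(f"{fs:>6} Hz -> {size:8.2f} MB per channel-hour")
```

The size grows linearly with the sampling rate, so going from 25 Hz to 200 Hz multiplies the storage by 8.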

## II.2 Sufficient sampling

In [None]:
# Define Sampling parameters
freq_sinus = 10
f_analog = 10000  #Rate for plotting analog signal (10 kHz)
f_sampling = 25 # Rate for sampling digital signal, in Hz
duration = 1.0  # Seconds

t_analog = np.arange(0, duration, 1/f_analog)
sig_analog = sinus_function(t_analog, f = freq_sinus)    # Generate sine wave

t_digital = np.arange(0, duration, 1/f_sampling)
sig_digital = sinus_function(t_digital, f = freq_sinus)

plt.figure(figsize=(14, 7))
plt.suptitle("Sampling at sufficient frequency (> 2 * signal frequency)", fontsize=16)

ax1 = plt.subplot(2, 1, 1)
ax1.plot(t_analog, sig_analog, 'r-', linewidth=1.5, label=f'True Analog Signal ({freq_sinus} Hz)')
ax1.plot(t_digital, sig_digital, 'bo', markersize=5, label=f'Sampled Points (FS = {f_sampling} Hz)')
ax1.legend()
ax1.grid(True, linestyle='--', alpha=0.6)
ax1.set_ylabel('Amplitude')
ax1.set_xlabel('Time (s)')

ax2 = plt.subplot(2, 1, 2, sharey=ax1)
ax2.plot(t_digital, sig_digital, c='blue', marker = 'o', markersize=5, markerfacecolor = 'k', markeredgecolor = 'k', label=f'Reconstructed signal')
ax2.legend()
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylabel('Amplitude')
ax2.set_xlabel('Time (s)')

This is a practical balance - we sample faster than the Nyquist minimum but not excessively. The signal is still well-represented with good fidelity.
- **Pros:** Good signal reconstruction, reasonable file sizes, safe margin above Nyquist
- **Cons:** Slightly larger data than strictly necessary
- **When to use:** Most real-world applications - it's the standard practice

## II.3 Critical sampling rate: Nyquist rate (2x the signal frequency)

In [None]:
# Define Sampling parameters
freq_sinus = 10
f_analog = 10000  #Rate for plotting analog signal (10 kHz)
f_sampling = 20 # Rate for sampling digital signal, in Hz
duration = 1.0  # Seconds

t_analog = np.arange(0, duration, 1/f_analog)
sig_analog = sinus_function(t_analog, f = freq_sinus)    # Generate sine wave

t_digital = np.arange(0, duration, 1/f_sampling)
sig_digital = sinus_function(t_digital, f = freq_sinus)

plt.figure(figsize=(14, 7))
plt.suptitle("Sampling at the critical rate (2 * signal frequency)", fontsize=16)

ax1 = plt.subplot(2, 1, 1)
ax1.plot(t_analog, sig_analog, 'r-', linewidth=1.5, label=f'True Analog Signal ({freq_sinus} Hz)')
ax1.plot(t_digital, sig_digital, 'bo', markersize=5, label=f'Sampled Points (FS = {f_sampling} Hz)')
ax1.legend()
ax1.grid(True, linestyle='--', alpha=0.6)
ax1.set_ylabel('Amplitude')
ax1.set_xlabel('Time (s)')

ax2 = plt.subplot(2, 1, 2, sharey = ax1)
ax2.plot(t_digital, sig_digital, c='blue', marker = 'o', markersize=5, markerfacecolor = 'k', markeredgecolor = 'k', label=f'Reconstructed signal')
ax2.legend()
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylabel('Amplitude')
ax2.set_xlabel('Time (s)')

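**Sampling at exactly 2x the signal frequency**

This is the limiting rate in the **Nyquist-Shannon theorem**: to perfectly reconstruct a signal, you must sample at more than twice its highest frequency component. At exactly 2x, the result depends on the phase of the signal relative to the sampling instants. In the example above, the 10 Hz sine starts at phase 0, so every sample falls on a zero crossing and the sampled signal is essentially flat.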
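The sketch below illustrates why sampling at exactly twice the signal frequency is a borderline case: what the samples capture depends entirely on the phase of the sine wave. The variable names are illustrative.

```python
import numpy as np

f_signal = 10.0             # signal frequency in Hz
fs = 2 * f_signal           # sampling at exactly twice the signal frequency
n = np.arange(int(fs))      # one second of sample indices
t = n / fs

# Phase 0: every sample lands on a zero crossing of the sine wave
samples_phase_0 = np.sin(2 * np.pi * f_signal * t)

# Phase pi/2: every sample lands on a peak or a trough
samples_phase_90 = np.sin(2 * np.pi * f_signal * t + np.pi / 2)

all_zero = np.allclose(samples_phase_0, 0.0)
alternating = np.allclose(samples_phase_90, (-1.0) ** n)
print(all_zero, alternating)
```

With phase 0 the oscillation disappears from the samples entirely, while with phase $\pi/2$ it is captured at full amplitude. Sampling above 2x removes this dependence on phase.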

## II.4 Under Nyquist frequency: aliasing

In [None]:
# Define Sampling parameters
freq_sinus = 10
f_analog = 10000  #Rate for plotting analog signal (10 kHz)
f_sampling = 15 # Rate for sampling digital signal, in Hz
duration = 1.0  # Seconds

t_analog = np.arange(0, duration, 1/f_analog)
sig_analog = sinus_function(t_analog, f = freq_sinus)    # Generate sine wave

t_digital = np.arange(0, duration, 1/f_sampling)
sig_digital = sinus_function(t_digital, f = freq_sinus)

plt.figure(figsize=(14, 7))
plt.suptitle("Sampling below the Nyquist rate (aliasing)", fontsize=16)

ax1 = plt.subplot(2, 1, 1)
ax1.plot(t_analog, sig_analog, 'r-', linewidth=1.5, label=f'True Analog Signal ({freq_sinus} Hz)')
ax1.plot(t_digital, sig_digital, 'bo', markersize=5, label=f'Sampled Points (FS = {f_sampling} Hz)')
ax1.legend()
ax1.grid(True, linestyle='--', alpha=0.6)
ax1.set_ylabel('Amplitude')
ax1.set_xlabel('Time (s)')

ax2 = plt.subplot(2, 1, 2, sharey = ax1)
ax2.plot(t_digital, sig_digital, c='blue', marker = 'o', markersize=5, markerfacecolor = 'k', markeredgecolor = 'k', label=f'Reconstructed signal')
ax2.legend()
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylabel('Amplitude')
ax2.set_xlabel('Time (s)')

**Sampling below the Nyquist rate**

When we sample slower than 2x the signal frequency, we get **aliasing** - the signal appears as a different, lower frequency in our digital data. This is a critical error that cannot be fixed after acquisition!

- **What happens:** High frequencies masquerade as low frequencies
- **Example:** A 10 Hz signal sampled at 15 Hz appears as a 5 Hz signal
- **Danger:** You can't tell aliased signals from real low-frequency signals
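The apparent frequency of an aliased sine wave can be computed by folding the true frequency back into the range from 0 to fs/2. The sketch below uses a hypothetical helper, `alias_frequency`, and checks that the samples of a 10 Hz sine taken at 15 Hz are identical to those of a 5 Hz sine (with inverted sign).

```python
import numpy as np

def alias_frequency(f_signal, fs):
    """Apparent frequency (Hz) of a sine at f_signal when sampled at fs."""
    # Subtract the nearest multiple of fs, which folds the result into [0, fs/2]
    return abs(f_signal - fs * np.round(f_signal / fs))

f_signal, fs = 10.0, 15.0
f_alias = alias_frequency(f_signal, fs)
print(f_alias)

# The sampled 10 Hz sine cannot be told apart from a sign-flipped 5 Hz sine
t = np.arange(15) / fs
samples_true = np.sin(2 * np.pi * f_signal * t)
samples_alias = -np.sin(2 * np.pi * f_alias * t)
indistinguishable = np.allclose(samples_true, samples_alias)
print(indistinguishable)
```

For sampling rates above twice the signal frequency, the helper returns the original frequency unchanged, for example `alias_frequency(10.0, 200.0)` gives 10 Hz.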

# III. Filtering

Filtering is one of the most fundamental operations in signal processing. It allows us to selectively **keep or remove specific frequency components** from our data.

**Why filter?**
- **Remove noise:** Eliminate power line interference (50/60 Hz), high-frequency electrical noise, or slow drifts
- **Isolate signals of interest:** Extract specific brain rhythms (delta, theta, alpha, beta, gamma oscillations)
- **Prevent aliasing:** Remove high frequencies before downsampling
- **Improve signal quality:** Enhance features relevant to your analysis

**Critical concepts we'll cover:**
- Filter parameters: cutoff frequency and filter order
- Forward-backward filtering with `filtfilt` for zero phase distortion
- Edge artifacts 
- Types of filtering and applications

## III.1 Example filtering: low-pass filtering

In [None]:
from scipy.signal import butter, lfilter, freqz # Key library for filtering

# Define signal parameters
freq_low = 10
freq_high = 50
fs = 1000  # Sampling rate (Hz)
duration = 1.0  # Seconds

# Filter parameters
cutoff_lowpass = 25.0  # Hz
order = 5             # Order of the filter (determines steepness)

# Time vector
t_vect = np.arange(0, duration, 1/fs)

sig_low = sinus_function(t_vect,freq_low)
sig_high = sinus_function(t_vect,freq_high)
composite_signal = sig_low + sig_high

# Design the filter coefficients (Wn is normalized frequency: F_c / (Fs/2))
nyq = 0.5 * fs
Wn = cutoff_lowpass / nyq
b, a = butter(order, Wn, btype='low', analog=False)

# Apply the filter
filtered_lowpass = lfilter(b, a, composite_signal)

plt.figure(figsize=(14, 7))
ax1 = plt.subplot(2, 1, 1)
ax1.plot(t_vect, composite_signal, 'r-', linewidth=1.5)
ax1.grid(True, linestyle='--', alpha=0.6)
ax1.set_ylabel('Amplitude')
ax1.set_xlabel('Time (s)')

# --- Visualization ---
ax2 = plt.subplot(2, 1, 2)
ax2.plot(t_vect, composite_signal, 'r-', linewidth=1, alpha=0.5, label='Composite (10Hz + 50Hz)')
ax2.plot(t_vect, filtered_lowpass, 'b-', linewidth=2, label=f'Filtered (LPF < {cutoff_lowpass}Hz)')
ax2.set_xlabel('Time (s)')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True)

## III.2 Zero-lag (zero-phase) filtering

In [None]:
from scipy.signal import butter, lfilter, freqz, filtfilt # Key library for filtering

# Define signal parameters
freq_low = 10
freq_high = 50
fs = 1000  # Sampling rate (Hz)
duration = 1.0  # Seconds

# Filter parameters
cutoff_lowpass = 25.0  # Hz
order = 5             # Order of the filter (determines steepness)

# Time vector
t_vect = np.arange(0, duration, 1/fs)

sig_low = sinus_function(t_vect,freq_low)
sig_high = sinus_function(t_vect,freq_high)
composite_signal = sig_low + sig_high

# Design the filter coefficients (Wn is normalized frequency: F_c / (Fs/2))
nyq = 0.5 * fs
Wn = cutoff_lowpass / nyq
b, a = butter(order, Wn, btype='low', analog=False)

# Apply the filter once (forward pass): this introduces a time lag
filtered_lowpass = lfilter(b, a, composite_signal)

# Manual zero-lag filtering: filter the time-reversed output again, then flip back
reversed_signal = np.flip(filtered_lowpass)
reversed_signal_filtered = lfilter(b, a, reversed_signal)
zero_lag_signal = np.flip(reversed_signal_filtered)


# Zero-phase (non-causal) filtering in a single function call
zero_lag_signal_single_function = filtfilt(b, a, composite_signal)

# --- Visualization ---
plt.figure(figsize=(14, 7))

ax1 = plt.subplot(2, 1, 1)
ax1.plot(t_vect, composite_signal, 'r-', linewidth=1, alpha=0.5, label='Composite (10Hz + 50Hz)')
ax1.plot(t_vect, filtered_lowpass, 'b-', linewidth=2, label=f'Single filtering')
ax1.plot(t_vect, zero_lag_signal, 'g-', linewidth=2, label=f'Back-and-forth filtering')
ax1.set_xlabel('Time (s)')
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True)

ax2 = plt.subplot(2, 1, 2)
ax2.plot(t_vect, composite_signal, 'r-', linewidth=1, alpha=0.5, label='Composite (10Hz + 50Hz)')
ax2.plot(t_vect, zero_lag_signal, 'g-', linewidth=2, label=f'Back-and-forth filtering')
ax2.plot(t_vect, zero_lag_signal_single_function, '-', linewidth=2, c = 'm', label='Computed with filtfilt')
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylabel('Amplitude')
ax2.set_xlabel('Time (s)')
ax2.legend()
ax2.grid(True)

## III.3 Edge effects

In [None]:
# --- Visualization ---
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8), sharey=True)
fig.suptitle('Comparing Edge Effects: Manual vs. filtfilt', fontsize=16)

# --- Plot 1: Zoomed in on the START of the signal ---
ax1.plot(t_vect, composite_signal, 'r-', linewidth=1, alpha=0.5, label='Original Signal')
ax1.plot(t_vect, zero_lag_signal, 'g-', linewidth=2, label='Manual Forward-Backward')
ax1.plot(t_vect, zero_lag_signal_single_function, '-', linewidth=2, c = 'm', label='filtfilt (Corrected Edges)')
ax1.grid(True, linestyle='--', alpha=0.6)
ax1.set_title("Start of Signal (t=0)")
ax1.set_xlabel("Time [s]")
ax1.set_ylabel("Amplitude")
ax1.legend()
ax1.set_xlim(-0.01, 0.1) # Zoom in on the first 100 ms

# --- Plot 2: Zoomed in on the END of the signal ---
ax2.plot(t_vect, composite_signal, 'r-', linewidth=1, alpha=0.5, label='Original Signal')
ax2.plot(t_vect, zero_lag_signal, 'g-', linewidth=2, label='Manual Forward-Backward')
ax2.plot(t_vect, zero_lag_signal_single_function, '-', linewidth=2, c = 'm', label='filtfilt (Corrected Edges)')
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_title("End of Signal")
ax2.set_xlabel("Time [s]")
ax2.set_ylabel("Amplitude")
ax2.legend()
ax2.set_xlim(duration - 0.1, duration + 0.01) # Zoom in on the last 100 ms

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()


**How the Two Methods Work Differently**

**Manual Forward-Backward Method (Green Line):**
1. **First pass**: Filter the signal from start to end (forward)
   - Problem: The filter starts with "empty memory" at t=0, causing a startup transient
2. **Flip it**: Reverse the filtered signal
3. **Second pass**: Filter the reversed signal (backward in time relative to original)
   - Problem: Again starts with "empty memory" at what was the end
4. **Flip again**: Reverse back to original time direction

This method removes time delay but **doesn't fix the edge problems** because the filter always starts fresh at each edge.

**filtfilt Method (Magenta Line):**
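1. **Smart padding**: Before filtering, `filtfilt` extends the signal at both ends (by default with an odd, point-reflected extension of the edge samples) and sets the filter's initial state to match the first padded value
   - This gives the filter some "warm-up" data so it's already in a good state when it reaches the real signal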
2. **First pass**: Filter forward (but now with padded data)
3. **Second pass**: Filter backward (also with padded data)
4. **Remove padding**: Cut off the extra padded sections, keeping only the original signal length

The key difference: `filtfilt` **prepares the filter** before it hits the actual signal edges, resulting in much cleaner boundaries.
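The difference between the two methods can also be measured rather than judged by eye. The sketch below rebuilds the same kind of composite signal (10 Hz + 50 Hz, order-5 lowpass at 25 Hz) and compares the two outputs in the middle of the signal and near its edges. The variable names and the 0.1 s edge window are choices made for this illustration.

```python
import numpy as np
from scipy.signal import butter, lfilter, filtfilt

fs = 1000
t = np.arange(fs) / fs                      # 1 second of data
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
b, a = butter(5, 25, btype='low', fs=fs)

# Manual forward-backward filtering: each pass starts from a zero state
forward = lfilter(b, a, x)
manual = np.flip(lfilter(b, a, np.flip(forward)))

# filtfilt: the same two passes, plus padding and initial-state handling
auto = filtfilt(b, a, x)

difference = np.abs(manual - auto)
interior = (t > 0.3) & (t < 0.7)
edges = (t < 0.1) | (t > 0.9)
interior_diff = difference[interior].max()
edge_diff = difference[edges].max()
print("Max difference, interior:", interior_diff)
print("Max difference, edges:   ", edge_diff)
```

Away from the boundaries the two methods should agree almost exactly, because the startup transients have died out; any disagreement is concentrated near the edges.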

## III.4 Filter order

The **order** of a filter determines how sharply it can separate frequencies. Think of it like the "strength" of the filter.

- **Lower order** (1-3): Gentle, smooth transition between frequencies, but less selective
- **Higher order** (8 and above): Sharp, aggressive cutoff, but can cause numerical instability or ringing

Let's see how different filter orders affect our composite signal (10 Hz + 50 Hz) with a 25 Hz cutoff.

In [None]:
# --- Effect of Filter Order ---
cutoff_lowpass = 25.0  # Hz (between our 10 Hz and 50 Hz components)
nyq = 0.5 * fs
Wn = cutoff_lowpass / nyq

# Create figure with 3 subplots
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(14, 10))
fig.suptitle('Effect of Filter Order on Lowpass Filtering', fontsize=16, fontweight='bold')

# --- Subplot 1: Low orders (1-3) ---
ax1.plot(t_vect, composite_signal, 'k-', linewidth=1.5, alpha=0.3, label='Original (10Hz + 50Hz)')
for order in [1, 2, 3]:
    b, a = butter(order, Wn, btype='low', analog=False)
    filtered = filtfilt(b, a, composite_signal)
    ax1.plot(t_vect, filtered, linewidth=2.5, label=f'Order {order}', alpha=0.8)
ax1.set_title('Low Order (Gentle Filtering)', fontsize=12)
ax1.set_ylabel('Amplitude')
ax1.legend(loc='upper right')
ax1.grid(True, alpha=0.3)
ax1.set_xlim(0, 0.3)

# --- Subplot 2: Medium orders (4-6) ---
ax2.plot(t_vect, composite_signal, 'k-', linewidth=1.5, alpha=0.3, label='Original (10Hz + 50Hz)')
for order in [4, 5, 6]:
    b, a = butter(order, Wn, btype='low', analog=False)
    filtered = filtfilt(b, a, composite_signal)
    ax2.plot(t_vect, filtered, linewidth=2.5, label=f'Order {order}', alpha=0.8)
ax2.set_title('Medium Order (Balanced)', fontsize=12)
ax2.set_ylabel('Amplitude')
ax2.legend(loc='upper right')
ax2.grid(True, alpha=0.3)
ax2.set_xlim(0, 0.3)

# --- Subplot 3: High orders (8, 10, 15) - showing artifacts ---
ax3.plot(t_vect, composite_signal, 'k-', linewidth=1.5, alpha=0.3, label='Original (10Hz + 50Hz)')
for order in [8, 10, 15]:
    b, a = butter(order, Wn, btype='low', analog=False)
    filtered = filtfilt(b, a, composite_signal)
    ax3.plot(t_vect, filtered, linewidth=2.5, label=f'Order {order}', alpha=0.8)
ax3.set_title('High Order (Watch for Overshoot/Ringing)', fontsize=12)
ax3.set_xlabel('Time [s]')
ax3.set_ylabel('Amplitude')
ax3.legend(loc='upper right')
ax3.grid(True, alpha=0.3)
ax3.set_xlim(0, 0.3)

plt.tight_layout()
plt.show()

**Top Plot (Order 1-3):**
- Very gentle filtering
- The 50 Hz component is still quite visible as high-frequency oscillations
- Smooth, no artifacts, but not very selective

**Middle Plot (Order 4-6):**
- Good balance between smoothness and selectivity
- The 50 Hz is well suppressed while keeping the 10 Hz clean

**Bottom Plot (Order 8-15):**
- Very aggressive filtering - the 50 Hz is almost completely gone
- Order 15 shows slight ripples or numerical instability
- The signal might look "too smooth" and lose important features

**Guidelines for practical applications**: Start with order 4-6. Higher orders aren't always better: they can introduce numerical instability, overshoot, or ringing artifacts, especially near sudden changes in the signal.
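The effect of the order can also be read directly from the filter's frequency response. The sketch below uses `scipy.signal.freqz` to evaluate the gain of a 25 Hz Butterworth lowpass at 10 Hz (to keep) and 50 Hz (to remove) for three orders. The chosen orders and variable names are illustrative.

```python
import numpy as np
from scipy.signal import butter, freqz

fs = 1000          # sampling rate in Hz
cutoff = 25.0      # lowpass cutoff in Hz
probe = [10, 50]   # frequencies (Hz) at which to read the gain

gains = {}
for order in (1, 4, 8):
    b, a = butter(order, cutoff, btype='low', fs=fs)
    _, h = freqz(b, a, worN=probe, fs=fs)
    gains[order] = np.abs(h)   # gain of a single pass through the filter
    print(f"order {order}: gain at 10 Hz = {gains[order][0]:.3f}, "
          f"gain at 50 Hz = {gains[order][1]:.4f}")
```

These are single-pass gains. Because `filtfilt` runs the filter twice, the gain it applies at each frequency is the square of the value shown here.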

## III.5 Types of Filters

Filters can be designed to let through or block different frequency ranges. Here are the four main types:

- **Lowpass**: Keeps low frequencies, removes high frequencies
- **Highpass**: Keeps high frequencies, removes low frequencies
- **Bandpass**: Keeps frequencies in a specific range, removes everything else
- **Bandstop (Notch)**: Removes frequencies in a specific range, keeps everything else
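As a compact summary, the sketch below designs one Butterworth filter of each type and reads its gain at 5, 25, and 80 Hz. The cutoffs match the ones used in this section, while the dictionary layout and variable names are choices made for this illustration.

```python
import numpy as np
from scipy.signal import butter, freqz

fs = 1000
order = 4
probe = [5, 25, 80]   # Hz

designs = {
    "lowpass":  butter(order, 15, btype='lowpass', fs=fs),
    "highpass": butter(order, 40, btype='highpass', fs=fs),
    "bandpass": butter(order, [15, 40], btype='bandpass', fs=fs),
    "bandstop": butter(order, [48, 52], btype='bandstop', fs=fs),
}

gain_table = {}
for name, (b, a) in designs.items():
    _, h = freqz(b, a, worN=probe, fs=fs)
    gain_table[name] = np.abs(h)   # gain at 5, 25, and 80 Hz
    print(f"{name:9s}", np.round(gain_table[name], 3))
```

A gain close to 1 means the component passes through, and a gain close to 0 means it is removed. The bandstop filter leaves all three probe frequencies almost untouched because none of them lies in the 48-52 Hz band.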

In [None]:
# --- Setup: Create composite signal with 3 frequencies ---
fs = 1000
duration = 1.0
t_vect = np.arange(0, duration, 1/fs)

# Create composite signal with 3 frequencies for demonstration
freq1 = 5   # Low frequency
freq2 = 25  # Mid frequency
freq3 = 80  # High frequency

sig1 = np.sin(2 * np.pi * freq1 * t_vect)
sig2 = np.sin(2 * np.pi * freq2 * t_vect)
sig3 = np.sin(2 * np.pi * freq3 * t_vect)
composite_signal = sig1 + sig2 + sig3

nyq = 0.5 * fs
order = 4

We'll also apply each filter to a real recording:

In [None]:
# Load recording
rec_50hz = np.load("data/50hz_traces.npz")
traces_50hz = rec_50hz["traces"][0,:] # Import first trace
fs_50hz = rec_50hz["fs"]    # Sampling rate

duration_real = len(traces_50hz) / fs_50hz
t_real = np.arange(len(traces_50hz)) / fs_50hz  # Time stamp of each sample (s)

### Lowpass Filter: Keep Low Frequencies Only

Removes high-frequency components while preserving slow variations.

**Use case**: Smoothing signals, removing high-frequency noise.

In [None]:
# Lowpass: keeps frequencies below 15 Hz
low_cutoff = 15 / nyq
b_low, a_low = butter(order, low_cutoff, btype='low')
filtered_low = filtfilt(b_low, a_low, composite_signal)

# Visualization
plt.figure(figsize=(14, 5))
plt.plot(t_vect, composite_signal, 'k-', linewidth=1, alpha=0.3, label='Original (5Hz + 25Hz + 80Hz)')
plt.plot(t_vect, filtered_low, 'b-', linewidth=2.5, label='Lowpass (< 15 Hz)')
plt.title('Lowpass Filter: Keeps LOW frequencies only', fontsize=14, fontweight='bold')
plt.xlabel('Time [s]')
plt.ylabel('Amplitude')
plt.legend()
plt.grid(True, alpha=0.3)
plt.xlim(0, 0.5)
plt.tight_layout()
plt.show()

In [None]:
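# Redesign the filter for the recording's own sampling rate:
# b_low/a_low above were designed for the simulated signal (fs = 1000 Hz)
b_low_real, a_low_real = butter(order, 15, btype='low', fs=float(fs_50hz))
traces_low = filtfilt(b_low_real, a_low_real, traces_50hz)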

# Visualization
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8), sharey=True)
fig.suptitle('Lowpass Filter: Keeps LOW frequencies only', fontsize=16, fontweight='bold')

# Subplot 1: Full signal
ax1.plot(t_real, traces_50hz, 'k-', linewidth=0.5, alpha=0.3, label='Original')
ax1.plot(t_real, traces_low, 'b-', linewidth=1, label='Lowpass (< 15 Hz)')
ax1.set_title('Full Signal')
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Subplot 2: Zoomed-in view
ax2.plot(t_real, traces_50hz, 'k-', linewidth=1, alpha=0.3, label='Original')
ax2.plot(t_real, traces_low, 'b-', linewidth=2, label='Lowpass (< 15 Hz)')
ax2.set_title('Zoomed-in View (38-40s)')
ax2.set_xlabel('Time [s]')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_xlim(38, 40)

plt.tight_layout(rect=[0, 0.03, 1, 0.95]) # Adjust layout to make room for suptitle
plt.show()

### Highpass Filter: Keep High Frequencies Only

Removes low-frequency components while preserving fast variations.

**Use case**: Removing baseline drift, DC offset, or slow trends.

In [None]:
# Highpass: keeps frequencies above 40 Hz
high_cutoff = 40 / nyq
b_high, a_high = butter(order, high_cutoff, btype='high')
filtered_high = filtfilt(b_high, a_high, composite_signal)

# Visualization
plt.figure(figsize=(14, 5))
plt.plot(t_vect, composite_signal, 'k-', linewidth=1, alpha=0.3, label='Original (5Hz + 25Hz + 80Hz)')
plt.plot(t_vect, filtered_high, 'r-', linewidth=2.5, label='Highpass (> 40 Hz)')
plt.title('Highpass Filter: Keeps HIGH frequencies only', fontsize=14, fontweight='bold')
plt.xlabel('Time [s]')
plt.ylabel('Amplitude')
plt.legend()
plt.grid(True, alpha=0.3)
plt.xlim(0, 0.5)
plt.tight_layout()
plt.show()

In [None]:
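# Redesign the filter for the recording's own sampling rate:
# b_high/a_high above were designed for the simulated signal (fs = 1000 Hz)
b_high_real, a_high_real = butter(order, 40, btype='high', fs=float(fs_50hz))
traces_high = filtfilt(b_high_real, a_high_real, traces_50hz)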

# Visualization
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8), sharey=True)
fig.suptitle('Highpass Filter: Keeps HIGH frequencies only', fontsize=14, fontweight='bold')

# Subplot 1: Full signal
ax1.plot(t_real, traces_50hz, 'k-', linewidth=0.5, alpha=0.3, label='Original')
ax1.plot(t_real, traces_high, 'r-', linewidth=1, label='Highpass (> 40 Hz)')
ax1.set_title('Full Signal')
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Subplot 2: Zoomed-in view
ax2.plot(t_real, traces_50hz, 'k-', linewidth=1, alpha=0.3, label='Original')
ax2.plot(t_real, traces_high, 'r-', linewidth=2, label='Highpass (> 40 Hz)')
ax2.set_title('Zoomed-in View (38-40s)')
ax2.set_xlabel('Time [s]')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_xlim(38, 40)

plt.tight_layout(rect=[0, 0.03, 1, 0.95]) # Adjust layout to make room for suptitle
plt.show()

### Bandpass Filter: Keep a Specific Frequency Range

Keeps only frequencies within a specified range, removes everything else.

**Use case**: Isolating a specific signal component.

In [None]:
# Bandpass: keeps frequencies between 15-40 Hz
band_cutoffs = [15 / nyq, 40 / nyq]
b_band, a_band = butter(order, band_cutoffs, btype='band')
filtered_band = filtfilt(b_band, a_band, composite_signal)

# Visualization
plt.figure(figsize=(14, 5))
plt.plot(t_vect, composite_signal, 'k-', linewidth=1, alpha=0.3, label='Original (5Hz + 25Hz + 80Hz)')
plt.plot(t_vect, filtered_band, 'g-', linewidth=2.5, label='Bandpass (15-40 Hz)')
plt.title('Bandpass Filter: Keeps MIDDLE frequencies only', fontsize=14, fontweight='bold')
plt.xlabel('Time [s]')
plt.ylabel('Amplitude')
plt.legend()
plt.grid(True, alpha=0.3)
plt.xlim(0, 0.5)
plt.tight_layout()
plt.show()

In [None]:
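# Redesign the filter for the recording's own sampling rate:
# b_band/a_band above were designed for the simulated signal (fs = 1000 Hz)
b_band_real, a_band_real = butter(order, [15, 40], btype='band', fs=float(fs_50hz))
traces_band = filtfilt(b_band_real, a_band_real, traces_50hz)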

# Visualization
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8), sharey=True)
fig.suptitle('Bandpass Filter: Keeps MIDDLE frequencies only', fontsize=14, fontweight='bold')

# Subplot 1: Full signal
ax1.plot(t_real, traces_50hz, 'k-', linewidth=0.5, alpha=0.3, label='Original')
ax1.plot(t_real, traces_band, 'g-', linewidth=1, label='Bandpass (15-40 Hz)')
ax1.set_title('Full Signal')
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Subplot 2: Zoomed-in view
ax2.plot(t_real, traces_50hz, 'k-', linewidth=1, alpha=0.3, label='Original')
ax2.plot(t_real, traces_band, 'g-', linewidth=2, label='Bandpass (15-40 Hz)')
ax2.set_title('Zoomed-in View (38-40s)')
ax2.set_xlabel('Time [s]')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_xlim(38, 40)

plt.tight_layout(rect=[0, 0.03, 1, 0.95]) # Adjust layout to make room for suptitle
plt.show()

### Bandstop (Notch) Filter: Remove Specific Frequencies

Removes frequencies within a specified range, keeps everything else.

**Use case**: **Removing 50 Hz or 60 Hz power line interference** - this is extremely common in electrical recordings (EMG, EEG, ECG)!
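Besides a Butterworth bandstop, SciPy also offers a dedicated second-order notch design, `scipy.signal.iirnotch`. The sketch below is an alternative to the approach used in this notebook, not a replacement for it; the quality factor `Q` is an assumed value chosen to give a notch roughly 4 Hz wide.

```python
import numpy as np
from scipy.signal import iirnotch, freqz

fs = 1000
f0 = 50.0        # frequency to remove, in Hz
Q = f0 / 4.0     # quality factor = f0 / bandwidth (assumed 4 Hz bandwidth)

b, a = iirnotch(f0, Q, fs=fs)

# Read the gain at the signal of interest (10 Hz) and at the notch (50 Hz)
_, h = freqz(b, a, worN=[10, 50], fs=fs)
gain_10, gain_50 = np.abs(h)
print(f"gain at 10 Hz: {gain_10:.4f}, gain at 50 Hz: {gain_50:.2e}")
```

A larger `Q` gives a narrower notch, which removes less of the neighbouring signal but takes longer to settle at the edges of the recording.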

Let's demonstrate with a signal contaminated by 50 Hz noise:

In [None]:
# Create a signal with 50 Hz power line interference
freq_signal = 10  # Our signal of interest
freq_noise = 50   # Power line interference

signal_clean = np.sin(2 * np.pi * freq_signal * t_vect)
noise_50hz = 0.5 * np.sin(2 * np.pi * freq_noise * t_vect)  # 50 Hz noise
signal_noisy = signal_clean + noise_50hz

# Notch filter to remove 50 Hz (with a narrow band around it: 48-52 Hz)
notch_cutoffs = [48 / nyq, 52 / nyq]
b_notch, a_notch = butter(order, notch_cutoffs, btype='bandstop')
signal_filtered = filtfilt(b_notch, a_notch, signal_noisy)

# Visualization
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8))
fig.suptitle('Notch Filter: Removing 50 Hz Power Line Interference', fontsize=16, fontweight='bold')

# Before filtering
ax1.plot(t_vect, signal_clean, 'r-', linewidth=1, label='Clean 10 Hz Signal')
ax1.plot(t_vect, signal_noisy, 'k-', linewidth=1, alpha=0.5, label='Noisy (10 Hz + 50 Hz interference)')
ax1.set_title('Before: Signal contaminated with 50 Hz noise', fontsize=12)
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True, alpha=0.3)
ax1.set_xlim(0, 0.3)

# After filtering
ax2.plot(t_vect, signal_clean, 'r-', linewidth=1, alpha=0.5, label='Original Clean Signal')
ax2.plot(t_vect, signal_filtered, 'm-', linewidth=2, label='After Notch Filter (48-52 Hz removed)')
ax2.set_title('After: 50 Hz interference removed', fontsize=12)
ax2.set_xlabel('Time [s]')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_xlim(0, 0.3)

plt.tight_layout()
plt.show()

In [None]:
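# Redesign the filter for the recording's own sampling rate:
# b_notch/a_notch above were designed for the simulated signal (fs = 1000 Hz)
b_notch_real, a_notch_real = butter(order, [48, 52], btype='bandstop', fs=float(fs_50hz))
traces_notch = filtfilt(b_notch_real, a_notch_real, traces_50hz)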

# Visualization
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8), sharey=True)
fig.suptitle('Notch Filter: Removing 50 Hz Power Line Interference', fontsize=16, fontweight='bold')

# Subplot 1: Full signal
ax1.plot(t_real, traces_50hz, 'k-', linewidth=0.5, alpha=0.3, label='Original')
ax1.plot(t_real, traces_notch, 'm-', linewidth=1, label='Notch (50 Hz)')
ax1.set_title('Full Signal')
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Subplot 2: Zoomed-in view
ax2.plot(t_real, traces_50hz, 'k-', linewidth=1, alpha=0.3, label='Original')
ax2.plot(t_real, traces_notch, 'm-', linewidth=2, label='Notch (50 Hz)')
ax2.set_title('Zoomed-in View (38-40s)')
ax2.set_xlabel('Time [s]')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_xlim(38, 40)

plt.tight_layout(rect=[0, 0.03, 1, 0.95]) # Adjust layout to make room for suptitle
plt.show()

# IV. Downsampling

**What is downsampling?**
Downsampling reduces the number of samples in a signal by keeping only every Nth sample. For example, downsampling by a factor of 2 means keeping every 2nd sample (discarding half the data).

**Why do we use it?**
- **Reduce data size**: Smaller files, faster processing, less memory usage
- **Match sampling rates**: When combining signals from different sources
- **Computational efficiency**: Faster analysis when high frequencies aren't needed

**How does it work?**
If we downsample by factor N:
- Original sampling rate: 50 Hz → New sampling rate: 50/N Hz
- Keep samples at indices: 0, N, 2N, 3N, ...
- Example: Factor of 2 means 50 Hz → 25 Hz
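The index arithmetic above maps directly onto NumPy slicing. The sketch below uses the sample indices themselves as a stand-in signal so that the kept positions are easy to see. Plain slicing like this applies no anti-aliasing filter, so on its own it is only safe when the signal has no content above the new Nyquist frequency.

```python
import numpy as np

fs_original = 50                 # Hz
factor = 2                       # keep every 2nd sample
fs_new = fs_original / factor    # 25.0 Hz

x = np.arange(10)                # stand-in signal: the sample indices themselves
x_down = x[::factor]             # keeps indices 0, 2, 4, 6, 8

t = np.arange(len(x)) / fs_original
t_down = t[::factor]             # time stamps of the kept samples

print(x_down)                    # [0 2 4 6 8]
print(fs_new)                    # 25.0
```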

**Critical step: Filter BEFORE downsampling!**
Remember aliasing? If we downsample without filtering first, high-frequency components above the new Nyquist frequency will fold back as artifacts.

**The rule:**
1. **First**: Apply a lowpass filter with cutoff < new_sampling_rate / 2
2. **Then**: Downsample

Example: To downsample from 500 Hz to 50 Hz:
- New Nyquist = 50/2 = 25 Hz
- Filter out everything above ~15-20 Hz
- Then keep every 10th sample

**The golden rule**: Always apply a lowpass filter with a cutoff below the new Nyquist frequency before downsampling to prevent aliasing artifacts.
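SciPy bundles both steps in `scipy.signal.decimate`, which applies a lowpass filter and then keeps every Nth sample. The sketch below compares it with plain slicing on a 5 Hz + 40 Hz test signal by measuring how much energy ends up at 10 Hz, the frequency to which 40 Hz folds at a 50 Hz sampling rate. The helper `amplitude_ratio` is hypothetical and written only for this illustration; `decimate` is used with its default filter settings.

```python
import numpy as np
from scipy.signal import decimate

fs_original = 500
factor = 10
fs_new = fs_original / factor                     # 50 Hz
t = np.arange(2 * fs_original) / fs_original      # 2 seconds
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

naive = x[::factor]           # no filtering: 40 Hz folds down to 10 Hz
safe = decimate(x, factor)    # lowpass filter, then keep every 10th sample

def amplitude_ratio(y, fs, f_num, f_den):
    """Ratio of spectral magnitudes at f_num and f_den (both in Hz)."""
    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1 / fs)
    pick = lambda f: spectrum[np.argmin(np.abs(freqs - f))]
    return pick(f_num) / pick(f_den)

ratio_naive = amplitude_ratio(naive, fs_new, 10, 5)
ratio_safe = amplitude_ratio(safe, fs_new, 10, 5)
print("10 Hz / 5 Hz, plain slicing:", ratio_naive)
print("10 Hz / 5 Hz, decimate:     ", ratio_safe)
```

With plain slicing, the aliased component keeps its original relative amplitude of 0.5. After filtering, it should be reduced to a small fraction of the 5 Hz component.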

In [None]:
# Simulated data example for downsampling
duration = 2  # seconds
fs_original = 500  # Hz
t_sim = np.linspace(0, duration, int(fs_original * duration), endpoint=False)

# Create a signal with multiple frequency components
freq_low = 5   # Hz - low frequency (should survive downsampling)
freq_high = 40  # Hz - high frequency (above new Nyquist, will alias to 10 Hz if not filtered)

signal = (np.sin(2 * np.pi * freq_low * t_sim) + 
          0.5 * np.sin(2 * np.pi * freq_high * t_sim))

# Downsampling setup
downsample_factor = 10
fs_new = fs_original / downsample_factor  # 50 Hz
nyquist_new = fs_new / 2  # 25 Hz

# Method 1: WRONG - Downsample without filtering
downsampled_wrong = signal[::downsample_factor]
t_downsampled = t_sim[::downsample_factor]

# Method 2: CORRECT - Filter first (remove frequencies > 15 Hz), then downsample
b, a = butter(4, 15, btype='low', fs=fs_original)
filtered_signal = filtfilt(b, a, signal)
downsampled_correct = filtered_signal[::downsample_factor]

# Visualization
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8), sharex=True)

# Subplot 1: Wrong method
ax1.plot(t_sim, signal, 'k-', linewidth=1, alpha=0.3, label=f'Original: {freq_low} Hz + {freq_high} Hz (500 Hz sampling)')
ax1.plot(t_downsampled, downsampled_wrong, 'r.-', linewidth=2, markersize=6, label='Downsampled WITHOUT filtering')
ax1.axhline(y=0, color='gray', linestyle='--', alpha=0.3)
ax1.set_title(f'WITHOUT Filtering: {freq_high} Hz aliases (new Nyquist = {nyquist_new} Hz)', fontsize=12, fontweight='bold')
ax1.set_ylabel('Amplitude')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Subplot 2: Correct method
ax2.plot(t_sim, signal, 'k-', linewidth=1, alpha=0.3, label=f'Original: {freq_low} Hz + {freq_high} Hz')
ax2.plot(t_sim, filtered_signal, 'g-', linewidth=1, alpha=0.5, label='After lowpass filter (< 15 Hz)')
ax2.plot(t_downsampled, downsampled_correct, 'b.-', linewidth=2, markersize=6, label='Downsampled after filtering')
ax2.axhline(y=0, color='gray', linestyle='--', alpha=0.3)
ax2.set_title(f'WITH Filtering: {freq_high} Hz removed, only {freq_low} Hz remains', fontsize=12, fontweight='bold')
ax2.set_xlabel('Time [s]')
ax2.set_ylabel('Amplitude')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()