# Class 09: Tutorial

In this tutorial, we will explore the basics of working with audio data in Python. We will rely on a combination of classical modules such as *numpy* and *matplotlib*, but mostly on the signal processing library called *scipy*.

Colab works generally like local notebooks, but its shortcuts are different. Below is a list that you might find helpful:

- Add a text cell: `ctrl + L`
- Add a code cell: `ctrl + K`
- Convert to text cell: `ctrl + M M`
- Convert to code cell: `ctrl + M Y`
- Add code cell above: `ctrl + M A`
- Add code cell below: `ctrl + M B`
- Run cell and select new cell: `shift + enter`
- Run cell and insert new cell: `alt + enter`
- Run selected code: `ctrl + shift + enter`
- Run the focused cell: `ctrl + enter`
- Run all cells in notebook: `ctrl + F9`
- Run all cells before the current: `ctrl + F8`
- Run cell and all cells after: `ctrl + F10`
- Restart runtime: `ctrl + M .`


## 0 Setup

### 0.1 What's Colab?

Google Colab notebooks work just like local notebooks, but they are hosted in the cloud. This means that we get access to GPUs, which most of us don't have on our local computers.

Good things to know:
* Free-of-charge Colab notebooks can run for at most 12 hours
* Each Colab notebook uses a so-called runtime. When you refresh Colab, you start a new runtime and lose *all* things you've created. That is, you must download/upload the data again, import all libraries, and so on.

### 0.2 Accessing GitHub
To access the audio files from GitHub, we will start by cloning the GitHub repo into our local runtime. This can be done with the terminal command:

```
!git clone https://github.com/mraskj/css_fall2023.git
```

The exclamation mark `!` specifies that we execute a terminal command within a notebook, just as we have seen previously. When the repo is cloned, it is located in */content/


In [None]:
# Clone the GitHub repo into the local runtime
!git clone https://github.com/mraskj/css_fall2023.git


### 0.3 Environments and Packages

Unlike local notebooks, which we typically run in local environments to keep package dependencies in check, Colab notebooks run in a fresh environment for each runtime. That means that all packages must be installed again if you, for instance, refresh Colab. Colab does come with a bunch of preinstalled packages, but we still need to install a few on certain occasions. We can do that using


```
!pip install -r /content/css_fall2023/requirements/FILENAME.txt
```

where *FILENAME* is the name of the requirements file without its extension (e.g. *requirements_topic4-class9-colab*).

In [None]:
!pip install -r /content/css_fall2023/requirements/requirements_topic4-class9-colab.txt

In [None]:
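# -y answers any confirmation prompt automatically, since the notebook cannot reply interactively
!sudo apt-get install -y libportaudio2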

### 0.4 Importing Modules

In [None]:
# MODULES

# For file and directory management
import os

# For data handling
import numpy as np

# For plotting
import matplotlib.pyplot as plt

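# For signal processing
import scipy
import librosa
import librosa.display  # explicit import so librosa.display.specshow is available in older librosa versions
from scipy.io import wavfile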

## Waveforms

In [None]:
# Signal specs
length = 2.0
sr = 1000
amplitude = 1.0
f = 1.0

# Generate time values
t = np.linspace(0, length, int(length * sr), endpoint=False)

# Generate sine wave
sine_wave = amplitude * np.sin(2 * np.pi * f * t)

In [None]:
# How np.linspace works: 10 evenly spaced values from 0 up to (but not including) 10
np.linspace(start=0, stop=10, num=10, endpoint=False)
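
The choice of `endpoint=False` matters for the time axis. The sketch below reuses the `sr` and `length` values from the cells above and compares the two settings; the variable names `t_open`, `t_closed` and `t_index` are introduced here for illustration only.

```python
import numpy as np

sr = 1000      # sampling rate in Hz (same value as in the cell above)
length = 2.0   # duration in seconds
n = int(length * sr)

# endpoint=False: n samples spaced exactly 1/sr apart, stopping one step before `length`
t_open = np.linspace(0, length, n, endpoint=False)

# endpoint=True (the default): the last sample equals `length`,
# so the spacing becomes length / (n - 1) instead of 1 / sr
t_closed = np.linspace(0, length, n)

# Equivalent construction from sample indices
t_index = np.arange(n) / sr

print(t_open[1] - t_open[0])
print(t_closed[1] - t_closed[0])
print(np.allclose(t_open, t_index))
```

With `endpoint=False`, the spacing between time values equals one sampling period, which is what we want when the array is meant to represent sample times.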

In [None]:
# Plot waveform
plt.figure(figsize=(12, 8))
plt.plot(t, sine_wave, color='#381a61')
plt.title(f"Frequency = {f} Hz", size=20)
plt.ylabel(f"Amplitude {amplitude}", size=16)
plt.grid(True)
plt.show()

In [None]:
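def generate_sine_signals(f, length, sr, amplitude):
  """Return (specs, signal) for a sine wave.

  f: frequency in Hz, length: duration in seconds,
  sr: sampling rate in Hz, amplitude: peak amplitude.
  """

  # Generate time values
  t = np.linspace(0, length, int(length * sr), endpoint=False)

  # Generate sine wave
  sine_wave = amplitude * np.sin(2 * np.pi * f * t)

  return {'f': f, 'length': length, 'sr': sr, 'amplitude': amplitude}, sine_wave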


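def plot_waveform(x, y, color='#381a61', ylab=None, xlab=None, title=None, show=False):
  # Draw the waveform on the current axes
  plt.plot(x, y, color=color)

  if title:
    plt.title(title, size=20)

  if ylab:
    plt.ylabel(ylab, size=16)

  if xlab:
    plt.xlabel(xlab, size=16)

  plt.grid(True)

  # Only render immediately when asked to, so the function also works inside subplots
  if show:
    plt.show()
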
In [None]:
sine_wave1_specs, sine_wave1_signal = generate_sine_signals(f=1.0, length=2.0, sr=1000, amplitude=1.0)
sine_wave2_specs, sine_wave2_signal = generate_sine_signals(f=3.0, length=2.0, sr=1000, amplitude=1.0)

In [None]:
plt.figure(figsize=(16, 8))  # Adjust the figure size as needed
plt.subplot(1, 2, 1)
plot_waveform(x=t,
              y=sine_wave1_signal,
              ylab='Amplitude',
              xlab='Time (s)',
              title=f"Frequency {sine_wave1_specs['f']} Hz")

plt.subplot(1, 2, 2)  # 1 row, 2 columns, second subplot
plot_waveform(x=t,
              y=sine_wave2_signal,
              xlab='Time (s)',
              title=f"Frequency {sine_wave2_specs['f']} Hz")
plt.yticks([])
plt.show()

### Spectrograms

In [None]:
# Generate sine wave with:
#   - frequency: 50 Hz
#   - duration: 2 seconds
#   - sampling rate: 1000
#   - amplitude: 1.0
sine_wave_specs, sine_wave_signal = generate_sine_signals(f=50.0, length=2.0, sr=1000, amplitude=1.0)

# Plot as spectrogram using matplotlib
Pxx, freqs, spectimes, cax = plt.specgram(sine_wave_signal,
                                          Fs=sine_wave_specs['sr'],
                                          scale='dB',
                                          mode='psd')
plt.title(f"Spectrogram of sine wave with frequency = {sine_wave_specs['f']} Hz", size=20)
plt.colorbar(label='dB')
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (s)')
plt.show()

In [None]:
# Change amplitude to 100.0
sine_wave_specs, sine_wave_signal = generate_sine_signals(f=50.0, length=2.0, sr=1000, amplitude=100.0)

Pxx, freqs, spectimes, cax = plt.specgram(sine_wave_signal,
                                          Fs=sine_wave_specs['sr'],
                                          scale='dB',
                                          mode='psd')
plt.colorbar(label='dB')
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (s)')
plt.show()
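
The cell above raises the amplitude from 1.0 to 100.0. As a rough check of what that should do to the decibel values, the sketch below compares the mean power of the two signals directly with numpy. It assumes that decibels are computed as 10 times the base-10 logarithm of power, which is the usual convention for power spectral densities; `power_db` is a helper defined here for illustration, not part of any library.

```python
import numpy as np

sr = 1000
t = np.arange(int(2.0 * sr)) / sr

# Same 50 Hz sine wave at the two amplitudes used above
quiet = 1.0 * np.sin(2 * np.pi * 50.0 * t)
loud = 100.0 * np.sin(2 * np.pi * 50.0 * t)

def power_db(x):
    """Mean power of a signal in decibels (10 * log10 of the mean square)."""
    return 10 * np.log10(np.mean(x ** 2))

# Power grows with the square of the amplitude: 100x amplitude -> 10,000x power
shift_db = power_db(loud) - power_db(quiet)
print(round(shift_db, 2))
```

A factor of 100 in amplitude corresponds to a shift of about 40 dB in power, so the two spectrograms should look the same apart from the range shown on the colorbar.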

In [None]:
# Generate signal with two different frequencies but same amplitude
sine_wave1_specs, sine_wave1_signal = generate_sine_signals(f=50.0, length=2.0, sr=2000, amplitude=1.0)
sine_wave2_specs, sine_wave2_signal = generate_sine_signals(f=500.0, length=2.0, sr=2000, amplitude=1.0)

# Combine signals
sine_wave = sine_wave1_signal + sine_wave2_signal

# Plot
Pxx, freqs, spectimes, cax = plt.specgram(sine_wave,
                                          Fs=sine_wave1_specs['sr'],
                                          scale='dB',
                                          mode='psd')
plt.colorbar(label='dB')
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (s)')
plt.show()
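
The two horizontal bands in the spectrogram above can be cross-checked numerically. The following is a minimal sketch using numpy's FFT on the same kind of combined signal; the names `mix`, `magnitude` and `top_two` are introduced here for illustration only.

```python
import numpy as np

sr = 2000
t = np.arange(int(2.0 * sr)) / sr

# Sum of a 50 Hz and a 500 Hz sine wave with equal amplitude
mix = np.sin(2 * np.pi * 50.0 * t) + np.sin(2 * np.pi * 500.0 * t)

# Magnitude spectrum of the real-valued signal and the matching frequency axis
magnitude = np.abs(np.fft.rfft(mix))
freqs = np.fft.rfftfreq(len(mix), d=1 / sr)

# The two largest bins should sit at the two component frequencies
top_two = np.sort(freqs[np.argsort(magnitude)[-2:]])
print(top_two)
```

Unlike the spectrogram, a single FFT over the whole signal has no time axis, which is fine here because the frequency content does not change over time.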

In [None]:
# Generate signal with two different frequencies and amplitudes
sine_wave1_specs, sine_wave1_signal = generate_sine_signals(f=50.0, length=2.0, sr=2000, amplitude=100.0)
sine_wave2_specs, sine_wave2_signal = generate_sine_signals(f=500.0, length=2.0, sr=2000, amplitude=1.0)

sine_wave = sine_wave1_signal + sine_wave2_signal

Pxx, freqs, spectimes, cax = plt.specgram(sine_wave,
                                          Fs=sine_wave1_specs['sr'],
                                          scale='dB',
                                          mode='psd')
plt.colorbar(label='dB')
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (s)')
plt.show()

### Mel-Spectrograms

See Zheng, Zhang, and Song 2001: https://link.springer.com/content/pdf/10.1007/BF02943243.pdf
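
To give an intuition for the mel scale before plotting, here is a minimal sketch of one commonly used Hz-to-mel formula, m = 2595 * log10(1 + f / 700). This is an assumption for illustration: several variants of the mel scale exist, and the default in librosa may use a different one, so the numbers need not match librosa's mel axis exactly. The functions `hz_to_mel` and `mel_to_hz` are defined here and are not library calls.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels using m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

# Ten points equally spaced on the mel axis between 0 Hz and 1000 Hz
mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(1000.0), 10)
hz_points = mel_to_hz(mel_points)

# Equal steps in mel correspond to increasingly large steps in Hz
print(np.round(hz_points, 1))
print(np.round(np.diff(hz_points), 1))
```

The growing step size in Hz reflects the idea behind the mel scale: low frequencies are resolved finely, while high frequencies are grouped into wider bands.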

In [None]:
# Generate dummy signal
sample_rate = 2000
duration = 5
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
signal = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 512 * t)

In [None]:
# Plot waveform
plt.figure(figsize=(12, 8))
plt.plot(t, signal, color='#381a61', alpha=.7)
plt.title("Sum of 100 Hz and 512 Hz sine waves", size=20)
plt.xlabel("Time (s)", size=16)
plt.ylabel("Amplitude", size=16)
plt.grid(True)
plt.show()

In [None]:
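# np.abs of the STFT gives a magnitude (amplitude) spectrogram, so convert with amplitude_to_db
spec = np.abs(librosa.stft(signal, hop_length=512))
spec = librosa.amplitude_to_db(spec, ref=np.max)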
plt.figure(figsize=(10, 6))
librosa.display.specshow(spec, sr=sample_rate, x_axis='time', y_axis='hz', cmap='viridis')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectrogram')
plt.show()
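
The size of the spectrogram array follows from the STFT parameters. The arithmetic below is a sketch under two assumptions that are not stated in the cell above: that the window length is librosa's default `n_fft=2048`, and that frames are centered (padded), which is also the default.

```python
# Values taken from the cells above
sample_rate = 2000   # Hz
duration = 5         # seconds
hop_length = 512     # samples between successive frames

# Assumption: default window length of librosa.stft
n_fft = 2048

n_samples = int(sample_rate * duration)
n_freq_bins = 1 + n_fft // 2              # rows of the spectrogram
n_frames = 1 + n_samples // hop_length    # columns, assuming centered frames

freq_resolution_hz = sample_rate / n_fft  # spacing between frequency bins
frame_step_s = hop_length / sample_rate   # time between successive frames

print(n_freq_bins, n_frames)
print(round(freq_resolution_hz, 3), frame_step_s)
```

With a sampling rate this low, a hop of 512 samples means each column covers roughly a quarter of a second, which is why the spectrogram has only a few time steps.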

In [None]:
mel_spectrogram = librosa.feature.melspectrogram(y=signal, sr=sample_rate)
mel_spectrogram_dB = librosa.power_to_db(mel_spectrogram, ref=np.max)

plt.figure(figsize=(10, 6))
librosa.display.specshow(mel_spectrogram_dB, x_axis='time', y_axis='mel', sr=sample_rate, cmap='viridis')
plt.colorbar(format='%+2.0f dB')
plt.title('Mel Spectrogram')
plt.show()

## Reading and Writing

In [None]:
# READING
base_dir = os.path.join(os.getcwd(), 'css_fall2023/data/audio/class09')
fname = 'speaker0_q90'
audio_fpath = os.path.join(base_dir, fname + '.wav')

sr, signal = wavfile.read(audio_fpath)

print(f"Sampling rate: {sr}")
print(f"Number of samples: {len(signal)}")
print(f"Duration (s): {len(signal) / sr}")

In [None]:
# WRITING
wavfile.write(filename='/content/testfile.wav', rate=sr, data=signal)
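
Reading and writing can be tested without the course data. The sketch below writes a synthetic tone to a temporary file and reads it back with `scipy.io.wavfile`. The sampling rate, the 440 Hz tone and the file name are hypothetical values chosen for illustration, and the 16-bit integer format is an assumption; the course files may be stored in a different format.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

sr = 16000                    # hypothetical sampling rate in Hz
t = np.arange(sr) / sr        # one second of sample times
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)

# Scale floats in [-1, 1] to 16-bit integers before writing
pcm = (tone * 32767).astype(np.int16)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "roundtrip.wav")  # hypothetical file name
    wavfile.write(path, sr, pcm)
    sr_read, pcm_read = wavfile.read(path)

# Convert the integers back to floats in [-1, 1) for further processing
tone_read = pcm_read.astype(np.float32) / 32768.0

print(sr_read, pcm_read.dtype, len(pcm_read) / sr_read)
```

Checking the `dtype` of the array returned by `wavfile.read` is worthwhile, because integer and floating-point samples live on very different scales.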