# MIR Week 1 Day 1

CCRMA MIR workshop 2021, Notebook by Elena Georgieva & Iran Roman

### Today's Goal: Review 'basic' concepts from MIR, review Python, and learn about existing MIR tools.

Instructions: Complete the sections below, filling in code or responses where marked.

First, we load our audio files.

In [1]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
!ls drive/MyDrive/CCRMA_MIR_2021/audio

drums = 'drive/MyDrive/CCRMA_MIR_2021/audio/drums.aif'
violin = 'drive/MyDrive/CCRMA_MIR_2021/audio/violin.wav'


Mounted at /content/drive
drums.aif  flute.aif  guitar.aif  piano.wav  trumpet.aif  violin.wav


## Part 1: Reading Audio

Librosa is a Python package for music and audio processing.

1. First, import librosa. Then, use librosa.load to load the 'drums' audio file into an audio array. You'll need a variable to store the audio array and a variable to store the sampling rate (fs or sr are common choices). 
2. Display the length of the signal and the sampling rate.


### Discuss: Sampling Rate
What is sampling rate?
Why do we need a sampling rate for digital audio?
What are common choices for sampling rates?
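
As a starting point for the discussion, here is a minimal sketch of how the sampling rate ties the number of samples to duration and to the highest representable frequency. The values of `fs` and `n_samples` are made up for illustration and are not read from the workshop audio.

```python
# Hypothetical values chosen for illustration (not taken from drums.aif)
fs = 44100            # sampling rate: samples per second
n_samples = 132300    # length of an imaginary audio array

duration_s = n_samples / fs   # how many seconds of audio the array holds
nyquist_hz = fs / 2           # highest frequency representable without aliasing

print(f"Duration: {duration_s:.2f} s")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz")
```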

In [2]:
import ___

x, fs = ____

print('Signal Shape:', ___)
print('Sampling Rate:', ___)



## Part 2: Playing Audio

IPython is another Python package we will use here. 
1. import IPython.display as ipd
2. Use IPython Audio to load and play the 'drums' audio file.

In [None]:
import ___
ipd.Audio(__, ___) 


## Part 2b: Read and Play the violin audio file

In [None]:
## Your Code Here

## Part 3: Visualizing Audio

1. Matplotlib is another Python library used for plotting. Import matplotlib.pyplot as plt.
2. Create a new figure using plt.figure
3. Use librosa.display.waveplot to display our drum signal and our violin signal (newer librosa releases replace it with librosa.display.waveshow). Do they look different?
4. Extra: Try to plot only some of the violin audio file, so it matches the length of the drum audio file. 
5. Extra: Add titles to your plots

Notice, this visualization is in the time domain (i.e. time is on the x axis). 

Discuss: Where have you seen audio represented like this?
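
For the first "Extra" step above, one possible approach is to slice the longer array down to the length of the shorter one and build a time axis in seconds. This is only a sketch on synthetic stand-in signals; `short_sig`, `long_sig`, and the chosen lengths are hypothetical, and `fs = 22050` is assumed because it is the default rate of librosa.load.

```python
import numpy as np

fs = 22050  # assumed sampling rate (default of librosa.load)

# Synthetic stand-ins for the two recordings (hypothetical lengths)
short_sig = np.random.default_rng(0).uniform(-1, 1, size=2 * fs)   # 2 s of noise
long_sig = np.sin(2 * np.pi * 440 * np.arange(5 * fs) / fs)        # 5 s tone

# Keep only as many samples of the long signal as the short one has
long_trimmed = long_sig[:len(short_sig)]

# Time axis in seconds: sample index divided by the sampling rate
t = np.arange(len(long_trimmed)) / fs

print(len(short_sig), len(long_trimmed), t[-1])
```

The trimmed array and `t` can then be passed to plt.plot(t, long_trimmed) so both plots share the same time span.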

In [None]:
## Your Code Here


## Part 4: Audio Features

Now we've plotted our signal, cool! How much information can we get from those plots?

Sometimes a bit of information. Sometimes... not much. We use some **audio features** to help us learn more about our audio file. 

We'll look at Spectral Centroid, RMS, and Zero-Crossing Rate.  


### Spectral Centroid
**Spectral centroid**: the frequency around which the energy of a spectrum is centered, computed as the magnitude-weighted mean of the frequencies present. (Wikipedia: [Spectral centroid](https://en.wikipedia.org/wiki/Spectral_centroid))
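
To make the definition concrete, here is a minimal sketch that computes a single centroid over a whole synthetic signal with NumPy. It is not librosa's implementation (librosa computes one centroid per frame); the 440 Hz test tone and `fs` are assumptions chosen so the answer is easy to check.

```python
import numpy as np

fs = 22050
t = np.arange(fs) / fs                 # 1 second of sample times
x = np.sin(2 * np.pi * 440 * t)        # pure 440 Hz tone

mag = np.abs(np.fft.rfft(x))                   # magnitude spectrum
freqs = np.fft.rfftfreq(len(x), d=1 / fs)      # frequency of each bin in Hz

# Magnitude-weighted mean frequency
centroid = np.sum(freqs * mag) / np.sum(mag)
print(f"Spectral centroid: {centroid:.1f} Hz")
```

For a pure tone the centroid sits at the tone's frequency; brighter sounds with more high-frequency content push it upward.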


### RMS Energy
The **energy** of a signal is the sum of its squared sample magnitudes. For audio, that roughly corresponds to how loud the signal is. The **RMS energy** is the square root of the mean squared sample value over the N samples x(n) of a frame. (Wikipedia: [Audio power](https://en.wikipedia.org/wiki/Audio_power))

\begin{equation}
\sqrt{ \frac{1}{N} \sum_n \left| x(n) \right|^2 }
\end{equation}
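
The equation above can be sketched directly in NumPy. The test signal (a 440 Hz sine with amplitude 0.5) is an assumption for illustration; a sine of amplitude A has an RMS of A divided by the square root of 2.

```python
import numpy as np

fs = 22050
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 440 * t)   # sine with amplitude 0.5

# Root of the mean of the squared magnitudes, as in the equation above
rms = np.sqrt(np.mean(np.abs(x) ** 2))
print(rms)
```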

### Zero Crossing Rate
**Zero Crossing Rate**: the rate at which a signal changes sign, that is, how often it crosses the horizontal axis per unit of time (or per frame). (Wikipedia: [Zero-crossing rate](https://en.wikipedia.org/wiki/Zero-crossing_rate))
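
A rough sketch of counting zero crossings by hand, assuming a synthetic 440 Hz sine (a sine crosses zero twice per cycle, so we expect roughly 2 × 440 crossings per second). The small phase offset is an assumption added so that no sample lands exactly on zero.

```python
import numpy as np

fs = 22050
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t + 0.2)   # 1 second, small phase offset

# A crossing happens wherever consecutive samples have different signs
signs = np.signbit(x)
crossings = int(np.sum(signs[1:] != signs[:-1]))

duration_s = len(x) / fs
print("Crossings per second:", crossings / duration_s)
print("Crossings per sample:", crossings / len(x))
```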

1. Two functions are provided below. What does extract_feature do? What does plot_features do?
2. Use these to learn more about the drums audio file. 

In [None]:
def extract_feature(x, feature, win_length, fs=44100):
    # Return one frame-wise feature ("spec_cent", "rms", or "zcr") for signal x.
    # fs defaults to 44100 Hz, the value this function originally hard-coded;
    # pass the rate returned by librosa.load so the centroid is scaled correctly.
    hop_length = int(win_length/2)  # 50% overlap between frames
    if feature == "spec_cent":
        return librosa.feature.spectral_centroid(
            y=x, sr=fs, n_fft=win_length, hop_length=hop_length)[0]
    if feature == "rms":
        return librosa.feature.rms(
            y=x, frame_length=win_length, hop_length=hop_length)[0]
    if feature == "zcr":
        return librosa.feature.zero_crossing_rate(
            x, frame_length=win_length, hop_length=hop_length)[0]
    raise ValueError("Unknown feature: " + feature)

In [None]:
def plot_features(x, fs, win_length):
    # Requires librosa.display and matplotlib.pyplot (as plt), imported in Part 3.
    spec_cent = extract_feature(x, "spec_cent", win_length, fs) # Calls the function above
    rms = extract_feature(x, "rms", win_length, fs)
    zcr = extract_feature(x, "zcr", win_length, fs)
    hop_length = int(win_length/2)

    def frame_times(feature):
        # change from frame indices to time in seconds
        return librosa.frames_to_time(range(len(feature)), sr=fs, hop_length=hop_length)

    plt.figure(figsize=(15, 17))
    plt.subplot(4, 1, 1)
    # If waveplot is unavailable in your librosa version, try librosa.display.waveshow
    librosa.display.waveplot(x, sr=fs)
    plt.title("Audio File")

    plt.subplot(4, 1, 2)
    plt.plot(frame_times(spec_cent), spec_cent)
    plt.title("Spec_cent")

    plt.subplot(4, 1, 3)
    plt.plot(frame_times(rms), rms)
    plt.title("RMS")

    plt.subplot(4, 1, 4)
    plt.plot(frame_times(zcr), zcr)
    plt.title("ZCR")
    plt.show()

In [None]:
## Your Code Here



## Part 5: Fourier Transform

The Fourier Transform is one of the most fundamental operations in applied mathematics and signal processing.

It transforms our **time-domain signal** into the **frequency domain**. The time domain we have above expresses our signal as a sequence of samples, and the frequency domain expresses our signal as a superposition of sinusoids of varying magnitudes, frequencies, and phase offsets.
([Wikipedia](https://en.wikipedia.org/wiki/Fourier_transform))

1. import numpy as np and import scipy
2. Compute a Fourier Transform
3. Plot the spectrum and play around with the plot ("zooming in") such that the peaks are clear.
4. Do this for both the drums and the violin audio files.

Notice, we are now in the frequency domain (i.e. frequency is on the x axis). 

Discuss: What do you see? When is this visualization more useful than the time-domain visualization in the previous section?
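
Before trying the real recordings, it can help to check the idea on a signal whose spectrum is known in advance. The sketch below uses NumPy's FFT on a synthetic mix of two sinusoids; the frequencies (200 Hz and 1000 Hz), amplitudes, and `fs = 8000` are assumptions chosen for illustration. scipy.fft offers equivalent functions.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs    # 1 second of sample times
x = 1.0 * np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

X = np.fft.rfft(x)                           # FFT of a real-valued signal
freqs = np.fft.rfftfreq(len(x), d=1 / fs)    # frequency of each bin in Hz
mag = np.abs(X) / (len(x) / 2)               # scale so a unit sine has magnitude 1

# The two largest peaks should sit at the two sinusoid frequencies
top2 = np.sort(freqs[np.argsort(mag)[-2:]])
print("Peak frequencies (Hz):", top2)
```

Plotting `freqs` against `mag` with plt.plot and restricting the x axis with plt.xlim is one way to "zoom in" on the peaks.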

In [None]:
## Your Code Here

## Part 6: STFT

Music signals change over time, so a single Fourier Transform over a whole song tells us which frequencies occur but not when they occur.

The short-time Fourier transform (STFT) is obtained by computing the Fourier transform as above, but for successive frames of a signal. [You can read more about the STFT on Wikipedia](https://en.wikipedia.org/wiki/Short-time_Fourier_transform).


1. Use librosa.stft to compute an STFT. Please use a hop length of 512 (hop_length) and a frame size of 2048 (n_fft); these are somewhat standard selections.
2. Print the shape of the STFT. 
3. Do this for both the Drum and Violin signals.
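
To build intuition for the shape you should expect, here is a hand-rolled STFT sketch in NumPy. It is not librosa's implementation: the function name `stft_sketch`, the reflect padding, and the symmetric Hann window from np.hanning are assumptions made for illustration, so values may differ slightly from librosa.stft even though the framing idea is the same.

```python
import numpy as np

def stft_sketch(y, n_fft=2048, hop_length=512):
    # Pad so that frames are centered on multiples of hop_length (assumed choice)
    y_padded = np.pad(y, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(y_padded) - n_fft) // hop_length
    window = np.hanning(n_fft)
    # Cut the signal into overlapping windowed frames, one frame per column
    frames = np.stack(
        [y_padded[i * hop_length : i * hop_length + n_fft] * window
         for i in range(n_frames)],
        axis=1,
    )
    # One Fourier transform per frame: rows are frequency bins, columns are frames
    return np.fft.rfft(frames, axis=0)

fs = 22050
y = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 second synthetic tone
S = stft_sketch(y)
print(S.shape)   # (1 + n_fft // 2 frequency bins, number of frames)
```

With a frame size of 2048 there are 1 + 2048 // 2 = 1025 frequency bins, and with this padding the number of frames is 1 + len(y) // hop_length.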

In [None]:
## Your Code Here

## Part 7: Spectrograms

The STFT we did above is the first step towards making a spectrogram!
What is a spectrogram? Let's try one out! 


[Chrome Music Lab Spectrogram](https://musiclab.chromeexperiments.com/spectrogram/)

A spectrogram shows the intensities of frequencies over time. It is simply the squared magnitude of the STFT.

1. Use librosa.amplitude_to_db on the magnitude (np.abs) of our STFT above to take the log amplitude. We do this because human perception of sound intensity is logarithmic.
2. Use librosa.display.specshow to display our spectrogram.
3. You can also use the following line to add a legend for the plot: plt.colorbar(format='%+2.0f dB')
4. Do this for both the drum and violin signals. 

Discuss: How different do the drum and violin signals look? Can you tell which is which?
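
The decibel conversion itself is a short formula. The sketch below applies it by hand to a few made-up magnitude values, measured relative to the largest one; the `1e-5` and `1e-10` floors are assumptions that only guard against taking the log of zero, and this is not a reproduction of librosa.amplitude_to_db's exact behavior.

```python
import numpy as np

mag = np.array([1.0, 0.5, 0.1, 0.001])   # hypothetical STFT magnitudes
power = mag ** 2                          # squared magnitude (the spectrogram)

# Amplitude to dB: 20 * log10(amplitude / reference)
db = 20 * np.log10(np.maximum(mag, 1e-5) / mag.max())

# Power to dB: 10 * log10(power / reference) gives the same numbers
db_from_power = 10 * np.log10(np.maximum(power, 1e-10) / power.max())

print(np.round(db, 1))
```

Halving the amplitude lowers the level by about 6 dB, and a factor of ten lowers it by 20 dB.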

In [None]:
## Your Code Here

## Part 7b: Mel Spectrogram

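Next, let's do a Mel spectrogram. It maps the frequency axis onto the mel scale, which approximates how humans perceive pitch. Human perception of sound intensity is also logarithmic, so, as with the STFT spectrogram, we are interested in the log amplitude.

1. Use librosa.feature.melspectrogram to create a mel spectrogram.
2. Use librosa.power_to_db to convert the mel spectrogram to decibels.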
3. Display the new spectrogram using librosa.display.specshow. Include a title and the colorbar as above.
4. Do this for both the drums and violin.

What do you see? How does it compare to the Chrome Music Lab spectrogram demo?
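
To see why the mel axis looks "stretched" at low frequencies, here is a sketch of one common Hz-to-mel conversion, the HTK-style formula. This is an assumption for illustration: librosa uses a different (Slaney-style) mel formula by default, so these numbers will not match its defaults exactly. The function names are hypothetical.

```python
import math

def hz_to_mel_htk(f_hz):
    # HTK-style mel formula (assumed variant, not librosa's default)
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz_htk(m):
    # Inverse of the formula above
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

for f in (100, 1000, 4000, 8000):
    print(f, "Hz ->", round(hz_to_mel_htk(f), 1), "mel")
```

Equal steps in mel cover ever wider ranges in Hz as frequency grows, which is why the upper part of a mel spectrogram is compressed compared to a linear-frequency one.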

In [None]:
## Your Code Here

# Great work! 

## Questions??