# Music Synthesis

In this lab, you will apply the skills that you have learned over the past weeks to an interesting signal processing problem. 

We will take a well-known piece of music, provided in a "digital" version of sheet music, and synthesize it. In a sense, we are programming our computers to be a musical instrument.

Don't be concerned if you don't have experience reading or playing music. All you need to complete this lab are your signal processing and Python/Numpy skills.

This lab is split into two parts:
* In the first part, you will learn to synthesize a musical scale and a simple version of the piece of music.
* In the second part, you will improve how the music is synthesized.

Each of these parts will take one week.

In [None]:
# Import the usual suspects, including the Audio widget
%matplotlib inline
import matplotlib.pyplot as plt

import numpy as np

from IPython.display import Audio

## This notebook is incomplete

In this notebook, there are multiple places for you to fill in either code or text.

You should do that directly in this notebook. 

Once you have completed all your work in this notebook, rerun the entire notebook using "Kernel > Restart and Run All" from the menubar. 

Fix any errors, then remove this cell (your notebook is now complete), and submit.

## Background: Piano Keys and frequencies

Our aim will be to synthesize music from sheet music. However, to simplify matters we will use keys on the piano instead of the placement of notes on the sheet of music to designate the frequency of the tones that make up the music. 
Specifically, we number all the keys on the piano from left to right and each key produces one of the tones that the sheet music indicates.

We need to take a quick look at how the number of a piano key relates to the frequency of the tone that it produces. 

On a piano, the keyboard is divided into octaves; each note in one octave has twice the frequency of the corresponding note in the next lower octave.
For example, we will use as our reference note the A-note near the middle of the piano keyboard. This tone is sometimes called A-440 because its frequency is 440 Hz. The piano key to play A-440 is the 49-th key from the left. Therefore, we will designate the tone A-440 by its piano key number 49.

<p style="margin-bottom: 15px; padding: 4px 12px; background-color: #e7f3fe; border-left: 6px solid #2196F3;">
Piano key 49 corresponds to frequency 440 Hz.
</p>

To determine the frequency of other keys on the piano, we need a second important relationship. There are 12 piano keys in each octave and the frequency doubles in each octave. Moreover, the ratio of frequencies produced by adjacent keys is constant. If we call this ratio $r$, then stepping up 12 keys multiplies the frequency by $r^{12} = 2$. It follows that the frequency increases by a factor of $r = 2^{1/12} = \sqrt[12]{2}$ from one key to the next.

Putting all our observations together, we can conclude that the frequency $f_k$ produced by the $k$-th piano key is given by:
$$
f_k = 440 \cdot 2^{\frac{k-49}{12}}.
$$
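As a quick worked example of this formula, consider key 40, the C that starts the scale used later in this lab. It lies 9 keys below the reference key 49, so:
$$
f_{40} = 440 \cdot 2^{\frac{40-49}{12}} = 440 \cdot 2^{-\frac{3}{4}} \approx 261.63 \text{ Hz}.
$$
Similarly, key 61 lies exactly one octave (12 keys) above the reference, which gives $f_{61} = 440 \cdot 2^{1} = 880$ Hz.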

### Task: A function for translating key numbers to frequencies

Write a function `key_to_freq` that computes the frequency of the tone that is produced when the $k$-th key on a piano keyboard is pressed.

The only input to this function is an integer $k$ and the only value returned from this function is the frequency of the tone.

In [None]:
def key_to_freq( k ):
    
    FILL_ME_IN

In [None]:
## checks
assert abs(key_to_freq(49) - 440) < 1e-8
assert abs(key_to_freq(61) - 880) < 1e-8
assert abs(key_to_freq(37) - 220) < 1e-8
assert abs(key_to_freq(40) - 261.6255653) < 1e-8

print('OK')

## Synthesizing a Musical Scale

Towards synthesizing our music, we begin with a musical scale. 

The C-major scale is produced when the notes C, D, E, F, G, A, B, C are played consecutively. The corresponding keys are the white keys in the middle of the piano; they are numbered `[40, 42, 44, 45, 47, 49, 51, 52]`.

Our first goal is to synthesize the sound that is produced by playing these eight notes consecutively for half a second each.

As the first step towards that goal, write a function to synthesize a single tone of a given duration.

### Task: Write a function to synthesize tones

Write a function `key_to_tone` that synthesizes the samples to play the tone indicated by a piano key `k` for `dur` seconds.

Your function must take the following parameters:
* `k` - piano key number
* `dur` - duration of the tone in seconds
* `fs` - sample rate; this parameter is optional and has a default value of 11025 Hz

Your function must return a NumPy vector of samples. When this vector is passed to the `Audio` widget the result must be a tone of the correct frequency and duration.

In [None]:
def key_to_tone(k, dur, fs=11025):
    
    FILL_ME_IN

In [None]:
# play A-440 for one second
fs = 11025
Audio(key_to_tone(49, 1, fs), rate=fs)

### Synthesizing the Scale

Now, synthesizing the scale is fairly straightforward. All we have to do is 
* synthesize each tone
* put the tones one-after-another

For the second step, there are at least two options:
* we can use NumPy's `np.concatenate` function to build up the sequence of tones
* we can allocate all samples (using `np.zeros`) and then insert the samples for each tone in the right locations

Let's look at both of these options, beginning with the one using `np.concatenate`.

In [None]:
## synthesize a musical scale using np.concatenate()
keys = np.array([40, 42, 44, 45, 47, 49, 51, 52])
dur = 0.5

xx = np.array([])  # initialize signal
for k in keys:
    tone = key_to_tone(k, dur, fs)
    xx = np.concatenate((xx, tone))  # the double-parentheses are needed

Audio(xx, rate=fs)

An alternative way to construct our scale is to first allocate all samples and then copy the tones in the right places of the allocated vector.

In this version, the variable `seg` keeps track of the sample number where the generated tone is inserted into the pre-allocated vector `xx`. The `Nt` samples produced by `key_to_tone` are inserted into the slice `xx[seg : seg+Nt]`.

In [None]:
## synthesize scale using pre-allocation
# compute total number of samples and samples per tone
Nt = len( key_to_tone(49, dur, fs))  # determine how long each tone is
Ns = Nt * len(keys)

xx = np.zeros(Ns)  # allocate memory to hold signals
seg = 0

for k in keys:
    xx[seg : seg+Nt] = key_to_tone(k, dur, fs) # insert tone in the right place in `xx`
    seg += Nt

Audio(xx, rate=fs)

The first method appears simpler as we do not need to concern ourselves with any of the sample indices. 

However, that method also has at least two serious drawbacks:
1. it is slow because each call to `np.concatenate` allocates new memory and copies the entire signal built so far to the new location.
  - the computational complexity of this method is $O(N^2)$ where $N$ is the number of samples
2. it is not easy to insert pauses of a given length into the signal
  - doing so will likely require calculation of sample indices

For these reasons, we will adopt the method based on the computation of sample indices going forward.
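To illustrate why sample indices make pauses easy, here is a minimal sketch of the pre-allocation approach with a gap after each tone. The helper `placeholder_tone` and the values of `n_tone` and `n_pause` are assumptions made up for this illustration; the helper returns a constant block instead of a real tone so that the layout of the result is easy to read.

```python
import numpy as np

def placeholder_tone(k, n_samples):
    """Hypothetical stand-in for a synthesized tone: n_samples copies of k."""
    return np.full(n_samples, float(k))

keys = [40, 42, 44]
n_tone = 3    # samples per tone (unrealistically small, for readability)
n_pause = 2   # samples of silence after each tone

# start with silence everywhere; pauses are simply the samples we never overwrite
xx = np.zeros(len(keys) * (n_tone + n_pause))

for n, k in enumerate(keys):
    seg = n * (n_tone + n_pause)    # first sample of the n-th tone
    xx[seg : seg + n_tone] = placeholder_tone(k, n_tone)

print(xx)
```

Because the vector starts out as all zeros, the only bookkeeping needed for a pause is the start index `seg` of each tone.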

#### Task: Spectrogram

Before moving on, plot the spectrogram of the synthesized signal. Make sure to include labels for the axes and a colorbar.

Try different values for the parameter `NFFT` of the spectrogram function. The value you pass should be a power of 2. Reasonable values are between 512 and 2048.
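The choice of `NFFT` is a trade-off: a longer block gives more closely spaced frequency bins but covers a longer stretch of time, so tone changes appear more smeared. The sketch below computes both quantities for the suggested values, assuming the sample rate of 11025 Hz used in this lab. The name `tradeoff` is just an illustrative variable.

```python
fs = 11025  # sample rate used throughout this lab (Hz)

# for each NFFT: (spacing between frequency bins in Hz, block length in seconds)
tradeoff = {nfft: (fs / nfft, nfft / fs) for nfft in (512, 1024, 2048)}

for nfft, (bin_spacing_hz, block_seconds) in tradeoff.items():
    print(f"NFFT={nfft:4d}: {bin_spacing_hz:6.2f} Hz per bin, "
          f"{1000 * block_seconds:6.1f} ms per block")
```

Keep in mind that each tone of the scale lasts only 0.5 seconds, so a block should remain clearly shorter than that if you want to see individual tones.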

In [None]:
# plot the spectrogram of the signal
FILL_ME_IN

plt.show()

### Tempo and time units

Looking ahead, notes in real music are not all of the same duration; there are half notes, quarter notes, eighth notes, etc.

Moreover, the length of these notes is relative to a tempo that is specified in the sheet music. Musicians think of tempo in terms of *beats per minute* (`BPM`) where the duration of a beat usually corresponds to a quarter note.

For example, if `BPM = 120` then there are 2 quarter notes per second and, therefore, each quarter note and beat has a duration of 0.5 seconds. 

Since there are even shorter notes than the quarter notes, we need a time unit that is smaller than a beat. Computer programs that synthesize or manipulate sounds are called *sequencers* and they refer to this smaller time unit as a *pulse*. 
The relation between beats and pulses is given by the parameter *pulses per quarter note* (`PPQ`). We will be using `PPQ = 4` as the shortest note we will be synthesizing is a sixteenth note; commercial synthesizers use much larger values for `PPQ` to achieve very fine time resolution.

In summary, we have two parameters to specify tempo and time resolution:
* `BPM` indicates how many quarter notes per minute can be played
* `PPQ` indicates how many pulses occupy a quarter note; we will use `PPQ = 4` consistently

From these parameters, we can compute the following time-units:
``` python
beats_per_second = BPM / 60
seconds_per_beat = 1 / beats_per_second    # also duration of a quarter note
seconds_per_pulse = seconds_per_beat / PPQ # duration of a pulse

samples_per_pulse = int( seconds_per_pulse * fs )
```

<p style="margin-bottom: 15px; padding: 4px 12px; background-color: #ddffdd; border-left: 6px solid #04AA6D;">
Going forward we will express all durations and time instances in terms of <tt>seconds_per_pulse</tt>.
</p>
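As a concrete check of these time units, the sketch below evaluates them for `BPM = 120`, `PPQ = 4`, and `fs = 11025`. The helper `pulse_timing` is a hypothetical convenience wrapper introduced only for this example; it simply repeats the computation shown above.

```python
def pulse_timing(BPM, PPQ=4, fs=11025):
    """Return (seconds_per_pulse, samples_per_pulse) for the given tempo."""
    beats_per_second = BPM / 60
    seconds_per_beat = 1 / beats_per_second      # duration of a quarter note
    seconds_per_pulse = seconds_per_beat / PPQ   # duration of a pulse
    samples_per_pulse = int(seconds_per_pulse * fs)
    return seconds_per_pulse, samples_per_pulse

seconds_per_pulse, samples_per_pulse = pulse_timing(BPM=120)
print(seconds_per_pulse, samples_per_pulse)   # 0.125 seconds and 1378 samples
```

Note that `int()` truncates: a pulse of 0.125 seconds corresponds to 1378.125 samples, so every pulse is shortened by a fraction of a sample. At this sample rate the effect is far too small to hear.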

### Revisiting the musical scale

To put our discussion of tempo and time-units to use, let's synthesize the musical scale again. This time, we will specify all time units in terms of pulses and we will insert short pauses between notes.

Specifically, we will synthesize the musical scale as follows:
* Notes start at multiples of 4 pulses
* Each note has a duration of 3 pulses

We capture these two requirements via two vectors, `durations` and `startPulses`, that are of the same length as the vector `keys`. According to the specifications above, the vectors `durations` and `startPulses` must be as follows.

In [None]:
## specify `durations` and `startPulses` for musical scale
# all times and duration in pulses
N_notes = len( keys )

durations = 3 * np.ones_like(keys)
startPulses = np.arange(N_notes) * 4

print('keys: ', keys)
print('durations: ', durations)
print('startPulses:', startPulses) 

#### Synthesize the revised musical scale

Before synthesizing the scale, we define the necessary timing parameters.

Specifically, we want 120 beats per minute and set `PPQ = 4`.

With these definitions and specification of the sample rate `fs`, we can compute the following timing values:

In [None]:
## Timing related parameters and time units
# Timing parameters
BPM = 120
PPQ = 4

fs = 11025

# derived parameters
beats_per_second = BPM / 60
seconds_per_beat = 1 / beats_per_second    # also duration of a quarter note
seconds_per_pulse = seconds_per_beat / PPQ # duration of a pulse

samples_per_pulse = int( seconds_per_pulse * fs )


Now, we can synthesize the scale. As before, we proceed as follows:

* allocate a vector `xx` that is long enough to hold all samples
  - the length of the vector can be derived from the provided parameters
    + the start pulse of the last tone lets us compute the number of samples before the last tone
    + we add the number of samples for the last tone 
    + to find the total number of samples `Ns`
* Each tone is synthesized individually
  - the frequency is given by `keys[n]`
  - the duration (in seconds) equals `durations[n] * seconds_per_pulse`
* the starting sample number `seg` for each tone
  - is given by `startPulses[n] * samples_per_pulse`
* the samples for each tone are inserted into the slice `xx[seg : seg+Nt]`
  - where `Nt` is the length of the tone

In [None]:
## Allocate memory to hold the signal
Ns = startPulses[-1] * samples_per_pulse + len(key_to_tone(keys[-1], durations[-1]*seconds_per_pulse, fs)) 
xx = np.zeros(Ns)

# compute samples
for n in range(len(keys)):
    # tone has duration `durations[n]*seconds_per_pulse`
    tone = key_to_tone(keys[n], durations[n] * seconds_per_pulse, fs) 
    # first sample for this tone
    seg = startPulses[n] * samples_per_pulse
    # length of tone in samples
    Nt = len(tone)

    # insert tone
    xx[seg : seg+Nt] = tone

Audio(xx, rate=fs)

Let's again look at the spectrogram of the synthesized signal.

Notice that the pauses between tones are clearly visible.

<p style="margin-bottom: 15px; padding: 4px 12px; background-color: #ffffcc; border-left: 6px solid #ffeb3b;">
You may get a warning about a division by zero. If so, try to add a small constant (e.g., <tt>1e-4</tt>) when you pass the signal to the spectrogram function.
</p>
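The warning most likely originates in the conversion to decibels: by default, `plt.specgram` displays the spectrogram on a dB scale, and the logarithm of zero is not finite. The sketch below reproduces the effect with plain numbers; `silent_power` and `offset_power` are illustrative values, not quantities taken from the spectrogram function itself.

```python
import numpy as np

silent_power = 0.0            # power of an all-zero (silent) block
offset_power = (1e-4) ** 2    # rough power scale once 1e-4 is added to the samples

# suppress the warning here; without errstate NumPy reports a division by zero
with np.errstate(divide="ignore"):
    silent_db = 10 * np.log10(silent_power)

offset_db = 10 * np.log10(offset_power)

print(silent_db, offset_db)
```

The silent block maps to negative infinity, while the small offset yields a finite value that is still far below the level of the tones.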

In [None]:
FILL_ME_IN

plt.show()


### Task: A function to synthesize the scale

Turn the code above into a function `synthesize`. This function takes as parameters:

* a vector `keys` that specifies the piano keys and, thus, the frequencies to be played
* a vector `durations` that specifies the length of each note in pulses
* a vector `startPulses` that provides the start time of each tone in pulses
* a tempo variable `BPM`
* a sample rate `fs` with a default value of 11025
* the number of pulses per beat `PPQ` with a default value of 4.

The function must return a vector of samples. When this vector is passed to the `Audio` widget it must play the "music" specified by the input parameters.

Do not invoke the `Audio` widget from within your function.

In [None]:
def synthesize(keys, durations, startPulses, BPM, fs=11025, PPQ=4):
    """Synthesize a sequence of tones
    
    Parameters:
    -----------
    keys - a vector that specifies the piano keys and, thus, the frequencies to be played
    durations - a vector that specifies the length of each note in pulses
    startPulses - a vector that provides the start time of each tone in pulses
    BPM - tempo variable 
    fs - sample rate with a default value of 11025
    PPQ - the number of pulses per beat with a default value of 4.

    Returns:
    --------
    a vector of samples
    """

    FILL_ME_IN

In the cell below, verify that your function works.

In [None]:
xx = synthesize(keys, durations, startPulses, BPM, fs, PPQ)

Audio(xx, rate=fs)

## Synthesizing Music

The piece of music we want to synthesize may be something that you recognize. It is called *Fugue #2 for the Well-Tempered Clavier* by J.S. Bach. The first few measures of the piece are shown in the image below. Clearly, this is not a format that we can work with directly.

![Sheet Music](SheetMusic.png)



### Data Format

The sheet music has [been transcribed](https://dspfirst.gatech.edu/chapters/DSP1st2eLabs/bach_fugueData.zip) into a format that fits well with what we have been doing so far.

In this piece, up to three notes are being played simultaneously. Therefore, the notes have been organized into three *voices*. In this first part of the lab, we will only synthesize the first of these voices.

To begin, we load the data file. The data are stored in a standard format used in the Python community, called the `pickle` format. The tools to read a file in this format are part of the Python standard library.

The code below loads the data that represent the sheet music into the variable `the_voices`.

In [None]:
## load data from pickle file
# load the pickle module
import pickle

# read data from file
with open('bach_fugue.pkl', 'rb') as f:
    # The protocol version used is detected automatically, so we do not
    # have to specify it.
    the_voices = pickle.load(f)

We can now examine the first voice, `the_voices[0]`, (see below) and recognize that
* it is a dictionary with three key-value pairs
* `startPulses` is the key for an array that indicates the start times (in pulses) for notes to be played
* `durations` is the key for an array that indicates the note durations (in pulses)
* `noteNumbers` is the key for the notes to be played (expressed as piano key numbers)

Thus, we can use the same method that we used for synthesizing the scale to play this music.
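To make this structure concrete, here is a small sketch that builds a dictionary with the same three keys and walks through its notes. The values in `example_voice` are made up for illustration only; they are not taken from the data file.

```python
import numpy as np

# made-up values for illustration only; the real data come from the pickle file
example_voice = {
    "startPulses": np.array([0, 4, 8, 10]),
    "durations":   np.array([4, 4, 2, 6]),
    "noteNumbers": np.array([52, 51, 52, 47]),
}

# the three arrays are aligned: entry n of each array describes the n-th note
for start, dur, key in zip(example_voice["startPulses"],
                           example_voice["durations"],
                           example_voice["noteNumbers"]):
    print(f"key {key} starts at pulse {start} and lasts {dur} pulses")

# pulse at which the last-ending note stops; useful for sizing the output vector
end_pulse = int(np.max(example_voice["startPulses"] + example_voice["durations"]))
```

Computing the end pulse as the maximum of start plus duration, rather than from the last entry alone, stays correct even if an earlier note happens to ring longer than the final one.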

In [None]:
the_voices[0]

On the sheet music shown above, the tempo for this piece is indicated. Specifically, this piece is to be played at 80 beats per minute.

Thus, we can synthesize and play the first voice of this music as follows:

In [None]:
## synthesize music
# which voice to play
n_voice = 0

# synthesize
ss = synthesize(the_voices[n_voice]['noteNumbers'],
                the_voices[n_voice]['durations'],
                the_voices[n_voice]['startPulses'],
                BPM=80,
                fs=11025,
                PPQ=4)

# play it
Audio(ss, rate=fs)

### Spectrogram

To wrap up this first part, we compute the spectrogram of the signal we just synthesized.

* Again, you may need to add a small constant to avoid warnings related to division by zero
* Setting the `NFFT` parameter to 2048 appears to work reasonably well

Try to play the signal while you're following along on the spectrogram. Can you correlate what you're hearing to what the spectrogram shows?
* do you see the pauses?
* do you hear and see passages with decreasing or increasing frequency?
* do you hear the lower frequencies towards the end of the piece?

In [None]:
plt.specgram(ss + 1e-5, NFFT=2048, Fs=fs)
plt.ylim(0, 1000)
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.colorbar()

plt.show()

## Summary

The sound we synthesized is not a bad start, but it's not going to win us a Grammy. It clearly sounds artificial.

In the second part of the lab, we will look to improve on what we have so far by:
* adding in the other two voices
* using an envelope to fade the tones in and out
* adding harmonics to each tone


### Deliverable

Submit a PDF version of this notebook. Make sure that:
* all cells are properly typeset and the "Incomplete" cell near the top is removed
* all code cells are complete (no `FILL_ME_IN` left)
* all functions are properly documented
* all plots have proper axis labels and other adornments as appropriate