<a href="https://colab.research.google.com/github/comp0161/colab/blob/main/COMP0161_lab3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Generating Music with Deep Learning (Part 3)

In this final music generation lab session, we will play with the **sound** of the music produced in Lab 2, applying different instrument sounds and audio effects.

We will again be making use of [Google's Colab computing environment](https://colab.research.google.com/#). While some of the processes used are reasonably computationally demanding, they are not GPU-dependent in the way that training even a small deep learning model is, and should be fine with a CPU virtual machine.

As in previous labs, this session's code is in [Python](https://docs.python.org/3/tutorial/index.html). You do not need to know Python to complete the lab, but there will be some optional extra tasks you can try if you are comfortable with Python coding and want to explore further.

# Background

In previous labs we have considered only a symbolic representation of the music we're dealing with. But this has to be converted into physical sounds — actual vibrations in the air — in order to be heard. And the nature of those physical sounds can vary very widely, significantly changing the listener's experience.

In today's lab we'll look at converting from the symbolic representation generated last time into a physical representation — **synthesis** — and then at modifications that may be applied directly to the physical representation in order to change how it sounds — **audio effects**.

## Synthesis

A symbolic music representation like MIDI contains only the bare outline of the sounds to be heard — things like pitch, volume and duration. In order to convert that to a physical sound, all the minutiae of vibration have to be filled in. It is possible to use MIDI to drive real instruments, in which case most of the details of the vibration arise naturally from the material structure of the instrument. But to generate the sound digitally, these details must somehow be manufactured.

There are broadly two ways of doing this. Oscillator based synthesis builds up the sound from an abstract model of its constituent parts, while sample based synthesis creates it from sound snippets recorded from real (or previously synthesised) instruments. The former is in some sense more general, but it is usually much easier to emulate any real world sound using the latter, because the samples will capture all the temporal and harmonic complexities that you'd otherwise have to put there explicitly.

For this lab we will continue to make use of the FluidSynth engine that we've used before. FluidSynth is sample based, making use of packages of samples known as soundfonts to provide the details of the sounds it creates.

## Audio effects

Effects act directly on an audio signal (or its representation in terms of physical data) in order to transform the sound, modifying its temporal or spectral properties. There are many different kinds of effect, and we'll explore the main ones below.

(It's worth noting that some kinds of effect, especially filters, also play a significant role in oscillator based synthesis, as a way of getting richer and more complex tones out of basic oscillator waveforms.)

# Setting up



## Data

Fetch some files that we'll need, including the default primer-based MIDI file generated in Lab 2:

In [None]:
!git clone https://github.com/comp0161/labs_data.git data

If you played around with the settings last week, or generated any kind of MIDI file other than the default, you should upload it using the file browser accessible via the folder icon in the sidebar. Name your file `data/music.mid`.

## Music handling

We'll use the same packages for music handling as last week: [Music21](https://web.mit.edu/music21/) for music streams and MIDI, [LilyPond](https://lilypond.org/index.html) for notation and [FluidSynth](https://www.fluidsynth.org) for audio rendering. In addition, we'll add Spotify's [pedalboard](https://github.com/spotify/pedalboard) package for audio effects.

(As before, we discard a lot of installation messages here; if there are problems, remove the `> /dev/null` to help diagnose what's going wrong.)

In [None]:
# software for rendering music notation
print('installing lilypond...')
!apt-get install lilypond > /dev/null
print('done')

# software for rendering MIDI to WAV
print('installing fluidsynth...')
!apt-get install fluidsynth > /dev/null
!cp /usr/share/sounds/sf2/FluidR3_GM.sf2 ./font.sf2
print('done')

# install the music21 package for reading and transforming the MIDI data
# (an older version seems to be already installed on Colab as of this writing, but
# we want to be up to date)
%pip install --upgrade music21

# install spotify's pedalboard package for audio effects
%pip install pedalboard


## Python library imports

In [None]:
# always useful imports
import sys, os, os.path
import copy
import numpy as np
import numpy.random
import json

# specialities
import music21 as MU
from IPython.display import Image, Audio
import pedalboard, pedalboard.io
import matplotlib.pyplot as plt


## Configuration

In [None]:
# configuration variables -- there's comparatively little to configure this time out
SEED = 9907
shared_rng = numpy.random.default_rng(seed=SEED)

# default filenames for intermediate data
MUSIC_MID = 'music.mid'
MUSIC_WAV = 'music.wav'

# dataset configuration
COMPILED_DATA = 'data'
SOURCE = os.path.join(COMPILED_DATA, MUSIC_MID)


## Display helpers

Jupyter notebook doesn't know how to show or play music, but it does support images and audio, so we'll convert to those formats for presentation.

Note: these functions have been modified a little from the versions in Labs 1 & 2. In particular, we've added the ability to specify a different soundfont for rendering the audio.

In [None]:
# helper functions for displaying music in the notebook
# note that these are probably only useful for fairly small
# music snippets -- generating a large PNG or WAV may exceed
# Colab resource limits and/or produce ugly results

def music_show(music):
  """
  Render music to a PNG and display it inline in the Jupyter notebook.
  """
  display(Image(str(music.write('lily.png'))))

def midi_play(filename, rate=22050, font='font.sf2', wav_name=MUSIC_WAV):
  """
  Render MIDI to WAV and display inline as Audio.
  """
  !fluidsynth -ni $font $filename -F $wav_name -r $rate > /dev/null
  display(Audio(wav_name))

def music_play(music, rate=22050, font='font.sf2', midi_name=MUSIC_MID, wav_name=MUSIC_WAV):
  """
  Write music to MIDI, then pass that to `midi_play`.
  """
  filename = music.write('mid')
  os.rename(filename, midi_name)
  midi_play(midi_name, rate, font, wav_name)

# Synthesis


Synthesis is a large topic and we're going to focus on only a very limited portion of it here.

It is quite easy to generate simple audio waveforms:

In [None]:
def play_sine ( hz=440, duration=1, sample_rate=22050 ):
  """
  Play a basic sine wave of specified frequency and duration.
  """
  tt = np.arange(duration * sample_rate) / sample_rate
  rr = tt * 2 * np.pi * hz
  osc = np.sin(rr)

  display(Audio(osc, rate=sample_rate))

play_sine()

However, going from that to fully rendering music from MIDI requires a fair bit of infrastructure that would take the whole lab to implement. Instead, we're going to stick with the existing FluidSynth implementation that we've been using to date and modify what we generate with it. (If you're interested in getting into the nuts and bolts there are some suggestions in the Further work section at the end of the notebook.)
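
To give a flavour of what that infrastructure would involve, here is a minimal sketch of oscillator based synthesis for a single note. It converts a MIDI note number to a frequency and sums a few harmonics under a decaying envelope. The function names, the 1/k harmonic weights and the decay rate are illustrative assumptions, not anything FluidSynth does.

```python
import numpy as np

def midi_to_hz(note):
    """Convert a MIDI note number to a frequency, taking A4 (note 69) as 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12)

def additive_note(note=69, duration=1.0, sample_rate=22050, n_harmonics=6, decay=3.0):
    """Sum harmonics with 1/k amplitudes and apply an exponential decay envelope."""
    tt = np.arange(int(duration * sample_rate)) / sample_rate
    f0 = midi_to_hz(note)
    wave = sum(np.sin(2 * np.pi * k * f0 * tt) / k for k in range(1, n_harmonics + 1))
    wave = wave * np.exp(-decay * tt)
    return wave / np.max(np.abs(wave))   # normalise into [-1, 1]

tone = additive_note()
```

In the notebook, `display(Audio(tone, rate=22050))` would play the result. Even this ignores note timing, polyphony and velocity, which is where most of the real work lies.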

## Changing instruments

FluidSynth uses the audio sample information contained in a SoundFont in order to convert the symbolic specification of the MIDI into physical audio representation. A typical SoundFont will contain samples of many different instruments — FluidSynth chooses the ones that correspond to whatever instruments are specified within the MIDI file itself.

In our case, we never actually specified an instrument explicitly, so the default is used: an Acoustic Grand Piano. (In practice, Music21 will have added a generic instrument element at the start of our stream, which is interpreted much the same as there not being one at all.)

A simple way of changing how our music sounds when rendered is to specify some other instrument to use instead. The function below does that.

(Specifically, it replaces any instruments matching the `target` parameter with instances of the class passed to `replacement`. In the MIDI file created by the default code from Lab 2 there will only be a single track with a single instrument. If you use this with some other MIDI file you might find there are multiple instruments and you can change them in a more granular way, although this function is quite simplistic.)

In [None]:
def change_instruments(music, replacement=MU.instrument.Accordion, target='Instrument', inPlace=False):
  """
  Replace specified instruments in the given music stream with the
  given instrument class. By default all instruments are replaced
  with accordions!
  """
  if not inPlace:
    music = copy.deepcopy(music)
  
  for elem in music.recurse():
    if target in elem.classes:
      elem.activeSite.replace(elem, replacement())
  
  return music

Let's load up our original MIDI source file and try out a couple of different instruments.

In [None]:
original = MU.converter.parse(SOURCE)

First, the default accordion:

In [None]:
music_play(change_instruments(original))

Or perhaps a violin:

In [None]:
music_play(change_instruments(original, MU.instrument.Violin))

Or possibly it needs a bit [more cowbell](https://en.wikipedia.org/wiki/More_Cowbell)?

In [None]:
music_play(change_instruments(original, MU.instrument.Cowbell))

Hang on, *that* doesn't sound much like our little ditty played on a different instrument!

In fact, *unpitched* percussion instruments like drums, cymbals, woodblocks and cowbells are commonly encoded as single *notes* in an aggregated percussion instrument or **drum kit**. When we specify `MU.instrument.Cowbell` above, that gets interpreted as meaning we want the whole kit and our pitches will refer to different instruments. So the *tune* doesn't really come across, it's the *rhythm* that counts.

But there are also *pitched* percussion instruments we might try.

In [None]:
music_play(change_instruments(original, MU.instrument.Timpani))

If you want to try out some other sounds, you can find information on the range of instruments available in Music21 (mostly corresponding to things you might find in a General MIDI SoundFont) in the `music21.instrument` [documentation](https://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html).

## Changing SoundFont

A variety of SoundFonts are available. (Some useful links can be found at FluidSynth's [GitHub Wiki](https://github.com/FluidSynth/fluidsynth/wiki/SoundFont).) Often they will include the same general classes of instruments but with different styles or qualities of sample. Exactly what is in them can vary significantly though. In the example below, all (non-percussion) instruments have been substituted with much simpler **sine wave** samples.

Here, we're playing the MIDI file generated as an intermediate in the last cell — where the instrument is meant to be a tuned **kettle drum**:


In [None]:
midi_play(MUSIC_MID, font='data/sines.sf2', wav_name='sines.wav')

# Audio effects

In addition to modifying the direct conversion from MIDI to audio, we can also make various changes to the audio data itself. In the remainder of the lab we will use the same baseline audio file and transform how it sounds by applying **effects** directly to the physical audio data.

The cells below will make use of whatever audio was last rendered to the intermediate `music.wav` file. For the moment, let's revert that to the default piano sound. (You're welcome to use a different sound if you prefer.)

In [None]:
midi_play(SOURCE)

We'll load that synthesised audio data into a form that we can apply effects to — basically just a big array of numbers representing amplitude at each sample time.

<details>
<summary>Side note</summary>
If you want to use the file browser to upload some other WAV file to use here instead, that's fine, but don't upload something too huge or the rest of the notebook will take forever to run and you may exceed Colab limits.
</details>

In [None]:
# load our baseline rendered music as audio data
with pedalboard.io.AudioFile(MUSIC_WAV) as f:
  audio = f.read(f.frames)
  samplerate = f.samplerate

We'll also define some functions for playing and viewing this audio data and applying effects to it.

In [None]:
# shorthand functions for applying an effect
# this is basically to wrap some settings for presentational simplicity later
def play(audio=audio, samplerate=samplerate, normalize=True):
  display(Audio(audio, rate=samplerate, normalize=normalize))

def apply_effect(effect, audio=audio, samplerate=samplerate):
  return effect.process(audio, sample_rate=samplerate)

def play_effect(effect, audio=audio, samplerate=samplerate, normalize=True):
  play(apply_effect(effect, audio, samplerate), samplerate, normalize)

def save_audio(xx, filename, samplerate=samplerate):
  with pedalboard.io.AudioFile(filename, 'w', samplerate=samplerate, num_channels=xx.shape[0]) as f:
    f.write(xx)

# function for displaying a spectrum
def plot_spectrum(xx, sr=samplerate, figsize=(14,5)):
  fig, axs = plt.subplots(figsize=figsize)
  # Fs is the sampling frequency in samples per second, so pass sr itself
  _ = axs.magnitude_spectrum(xx, Fs=sr, scale='dB')

## Filters

Spectral filters are one of the most essential building block types of audio effect. They come in a great variety of flavours, configurations and implementations, but their basic function is the same: to change the balance of the different **frequencies** in the sound. Most often, they *attenuate* some frequencies in the signal and/or *boost* others.

To illustrate this, it's useful to have some audio with a broad spread of different frequencies in it, so let's make a bit of noise.

<details>
<summary>Implementation detail</summary>
<p>By convention, audio data represented as floating point numbers should be kept in the range [-1, 1]. Many systems will cope if this is not the case, others may behave oddly or make horrible noises.
</p><p>
Here we're generating Gaussian noise with mean 0 and standard deviation 0.33, which will concentrate in the desired region but may take any value in (-∞, ∞). We could just clip to the desired range, but instead we'll apply the hyperbolic tangent function <tt>tanh</tt> to <em>compress</em> the infinite range into (-1, 1). We'll talk a bit more about compression later on.
</p>
</details>

In [None]:
# create some noise for illustration purposes
noise = np.tanh(shared_rng.normal(scale=0.33, size=int(samplerate)))

# you can listen to this, but it doesn't sound very nice
play(noise)

Let's have a look at the spectrum of this noise.

In [None]:
plot_spectrum(noise)

We can see it's pretty **white**: it contains about the same amount of every frequency. This makes it relatively easy to see what changes when we apply a filter.

### Low Pass

A **low pass filter** leaves low frequencies in the signal unchanged and attenuates high frequencies. It is usually characterised by a **cutoff frequency**, which is the frequency above which it attenuates.

Low pass filters (and all the others we'll see) are not *all or nothing* — it is practically impossible and usually also aesthetically undesirable to cleanly pass all frequencies below the exact cutoff and entirely block them above. Instead, there tends to be a gradual change of attenuation with frequency, known as the *frequency response* of the filter. This can be seen in the spectrum:

In [None]:
lpf = pedalboard.LowpassFilter(cutoff_frequency_hz=50)
plot_spectrum(apply_effect(lpf, noise))

Comparing this spectrum with the previous one, we can see that the lowest frequencies are about the same, while the higher ones are attenuated increasingly with frequency. The rate of this decrease is often described as the **roll-off** in *decibels (dB) per octave*. (An octave is basically a doubling of frequency.)
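
To make the idea of roll-off concrete, here is a sketch of about the simplest digital low pass there is: a one-pole filter that nudges its output a fraction of the way towards each new input sample. This is an illustrative assumption, not the filter that pedalboard implements, and the helper names are made up for this example. Measuring its gain for sine waves an octave apart gives an estimate of the roll-off.

```python
import numpy as np

def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """Very basic low pass: each output moves part of the way towards the input."""
    a = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / sample_rate)
    y = np.empty(len(x))
    state = 0.0
    for i, sample in enumerate(x):
        state += a * (sample - state)
        y[i] = state
    return y

def gain_db(freq_hz, cutoff_hz=500, sample_rate=22050):
    """Gain in dB that the filter applies to a one second sine of the given frequency."""
    tt = np.arange(sample_rate) / sample_rate
    tone = np.sin(2 * np.pi * freq_hz * tt)
    out = one_pole_lowpass(tone, cutoff_hz, sample_rate)
    skip = sample_rate // 10               # ignore the start-up transient
    rms_in = np.sqrt(np.mean(tone[skip:] ** 2))
    rms_out = np.sqrt(np.mean(out[skip:] ** 2))
    return 20 * np.log10(rms_out / rms_in)

passband_db = gain_db(50)                  # well below the cutoff: almost unchanged
rolloff_db = gain_db(2000) - gain_db(4000) # attenuation added by going up one octave
```

For a filter this simple the roll-off well above the cutoff approaches about 6 dB per octave; steeper filters are typically built by chaining several such stages.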

**Important:** When playing sounds that have been filtered or otherwise modified, the changed content may lead to significant differences in playback loudness. Always be cautious with your audio volume when listening to sounds after you've applied effects to them!

So, what does the music sound like after low pass filtering? Is it noticeably different from the original?

In [None]:
play_effect(lpf)

### High Pass

A **high pass filter** does the opposite of a low pass. Frequencies above the cutoff are passed, while those below are attenuated. So the spectrum curves the other way:

In [None]:
hpf = pedalboard.HighpassFilter(cutoff_frequency_hz=500)
plot_spectrum(apply_effect(hpf, noise))

Once again, the resulting sound may be significantly louder, so be cautious with your headphone volume.

How does the sound change this time? How does the filtering manifest (if at all) in the perceptual, textural, even *emotional* qualities of the music? Does it evoke a different environment from the low pass filtered version?

In [None]:
play_effect(hpf)

### Peak / Notch

Peak and notch filters may be thought of (and sometimes even implemented) as the summation of a low pass and a high pass. 

A **peak filter** boosts frequencies around the cutoff, while a **notch filter** attenuates them. Frequencies far from the cutoff, both low and high, are (more or less) passed unchanged.

As well as the cutoff frequency itself, these filters are characterised by a parameter known as **Q** (or *quality factor*), which essentially describes the breadth or **bandwidth** of the peak or notch. A high Q corresponds to a narrow peak or notch, concentrated in a narrow range of frequencies, while a low Q produces one that is broader, affecting more frequencies. This can be seen in the spectra below.

In [None]:
# note that peak and notch are functionally the same thing here
# the key difference being the direction (sign) of the gain
peak = pedalboard.PeakFilter(cutoff_frequency_hz=1000, gain_db=20, q=10)
notch = pedalboard.PeakFilter(cutoff_frequency_hz=1000, gain_db=-20, q=1)
plot_spectrum(apply_effect(peak, noise))
plot_spectrum(apply_effect(notch, noise))
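
As a rough guide to the numbers used above, Q is commonly defined as the centre frequency divided by the bandwidth. The sketch below applies that textbook relationship; exactly how pedalboard's `PeakFilter` measures bandwidth is an assumption here, so treat the figures as approximate.

```python
def bandwidth_hz(centre_hz, q):
    """Approximate width of the affected band, using Q = centre / bandwidth."""
    return centre_hz / q

for q in (10, 1):
    width = bandwidth_hz(1000, q)
    print(f"Q={q}: band roughly {width:.0f} Hz wide around 1000 Hz")
```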

Again, compare the sonic effect of the two filters.

In [None]:
play_effect(peak)
play_effect(notch)

### Ladder

We've talked about filters in terms of abstract functionality, but they also need to work in practice. In the digital domain they are usually implemented in terms of moving window weighted averages of neighbouring samples. In the era of analogue electronics it was traditionally done using variations on [RC circuits](https://en.wikipedia.org/wiki/RC_circuit). But in the mid 1960s synthesiser pioneer [Robert Moog](https://en.wikipedia.org/wiki/Robert_Moog) devised a versatile filter circuit using the recently-invented **transistor**, known as the **ladder filter**, that became extremely successful. The ladder filter was a key element of the Moog synthesiser sound and an influential component in modern music more generally. As a result, it's a popular choice for emulation in software — eg, the pedalboard library we're using here.

The ladder filter can operate in different modes — including low and high pass, but also **bandpass**, where both low and high frequencies are attenuated and those around the cutoff are passed. It can produce a steep roll-off of 12 or 24 dB per octave. It allows for a characteristic **resonance** around the cutoff frequency and also includes a capacity for soft-clipping **distortion** — we'll come back to what that means below, but it's an effect that can produce an appealing sound.

In [None]:
ladder = pedalboard.LadderFilter(mode=pedalboard.LadderFilter.Mode.BPF24, cutoff_hz=700, resonance=0.5, drive=40)
plot_spectrum(apply_effect(ladder, noise))

In [None]:
play_effect(ladder)

## Delays

Delays are the second essential building block effect type. In its simplest form, a delay simply mixes a copy of the sound as it was at some previous time into the sound now, creating an **echo** effect.

<details>
<summary>Aside</summary>
We tend to think of delays and filters as different, but they are closely related, especially in the digital domain. A filter entails some kind of weighted summation over neighbouring samples — in effect adding together delayed versions of the signal. The distinction is usually that delays are active over much longer timescales. Simple delays over very short times give rise to a kind of filter known as a <b>comb filter</b>, but we won't get into those here.
</details>

In [None]:
simple_delay = pedalboard.Delay(delay_seconds=1.0, feedback=0, mix=0.5)
play_effect(simple_delay)

In the example above there is just a *single* echo — just one delayed copy of the signal added to itself. A very common extension is for some portion of the effected signal to **feed back** into the delay line. So the first echo gets echoed itself when the delay time comes round again, and then that echo of an echo echoes, and so on.

In [None]:
feed_delay = pedalboard.Delay(delay_seconds=1.0, feedback=0.75, mix=0.5)
play_effect(feed_delay)

Usually the feedback is at a diminished level, so the echoes get quieter and quieter, gradually fading to nothing. This is not *required*, though. If you keep adding back the whole of the signal each time, eventually it all gets a bit out of hand.

In [None]:
overfeed_delay = pedalboard.Delay(delay_seconds=1.0, feedback=1, mix=0.5)
play_effect(overfeed_delay)
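
The difference between fading and runaway echoes can be seen in a bare-bones feedback delay line. This is a plain NumPy sketch under simplifying assumptions (a whole number of samples of delay, and a made-up function name), not a description of how pedalboard's `Delay` is built. Feeding it a single click shows the level of each successive echo.

```python
import numpy as np

def feedback_delay(x, delay_samples, feedback=0.0, mix=0.5):
    """Mix x with a delayed copy; a fraction `feedback` of the delayed
    signal is fed back into the delay line to be echoed again."""
    wet = np.zeros(len(x))
    for n in range(delay_samples, len(x)):
        wet[n] = x[n - delay_samples] + feedback * wet[n - delay_samples]
    return (1 - mix) * x + mix * wet

# a single click, so each echo is easy to pick out
impulse = np.zeros(17)
impulse[0] = 1.0

fading = feedback_delay(impulse, delay_samples=4, feedback=0.75)
endless = feedback_delay(impulse, delay_samples=4, feedback=1.0)

fading_echoes = fading[4::4]     # each echo is 0.75 times the previous one
endless_echoes = endless[4::4]   # with feedback of 1 the echoes never shrink
```

With real music rather than a click, every one of those undiminished echoes piles on top of the new material, which is why the last example gets out of hand.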

So far, we've kept the delay time in sync with the tempo of the music.

<details>
<summary>Notes</summary>

1. If you're using some other audio or MIDI not generated by the default mechanisms of these labs, you've probably been out of sync the whole time, in which case the above statement won't make any sense so ignore it.
2. If you are using the default mechanisms but also have been paying attention you might complain that we've never actually *set* a tempo for the music. That is absolutely true. But because we haven't set anything the MIDI rendering will have assumed a default tempo of 120 bpm, hence our 1 second delay corresponds to 2 beats and everything remains in sync.

</details>

That can still get pretty cacophonous, as in the last example, but what happens if we set the delay to something other than an integral number of beats?

In [None]:
unsync_delay = pedalboard.Delay(delay_seconds=0.9, feedback=0.5, mix=0.5)
play_effect(unsync_delay)

In this case, the (fed back) delay drifts in and out of sync — or [*phase*](https://en.wikipedia.org/wiki/Phase_music) — with the primary beat. This sort of effect has been used effectively by composers such as [Steve Reich](https://youtu.be/RTke1tQztpQ).
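
The sync arithmetic behind these delay times is easy to check. The sketch below assumes the default 120 bpm mentioned earlier and uses a hypothetical helper to express a delay time in beats; the denominator of the resulting fraction tells us how many echoes it takes to land back on a beat.

```python
from fractions import Fraction

def delay_in_beats(delay_seconds, bpm=120):
    """Express a delay time as an exact number of beats at the given tempo."""
    return Fraction(str(delay_seconds)) * bpm / 60

synced = delay_in_beats(1.0)      # 2 beats: every echo lands on a beat
unsynced = delay_in_beats(0.9)    # 9/5 of a beat

# the fed back echoes realign with a beat after this many repeats
repeats_to_realign = unsynced.denominator
```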

## Distortion


Distortion is a non-linear transformation of the waveform — and hence harmonic content — of a sound, traditionally arising from *limitations* of the reproduction process.

When a signal pathway lacks the capacity to capture the sounds it is attempting to reproduce — when they are too loud or too rich — then some of the original detail gets lost. But new qualities may be introduced as a result of that loss.

In particular, when the amplitude of a signal exceeds the range that the receiver can represent, the loudest parts have to be truncated or scaled down. For every individual frequency in the sound, this truncation changes that frequency's simple sine oscillation into something flatter, and that flatter wave must contain *other* frequencies to make it flat. The effect of distortion is thus to introduce greater harmonic complexity into the sound, making it richer, maybe *harsher* — and also often perceptually **louder**.

All sound reproduction technology has limitations, and early recording equipment was particularly constrained. But those constraints often gave rise to rich and exciting sonic textures that musicians have striven to recapture ever since. The distortion profiles — the *inadequacies* — of particular pieces of equipment can be highly sought after, even fetishised, because of the aesthetically pleasing *faults* they exhibit.

### Clipping

The flattening of the waveform that occurs when its amplitude exceeds the channel capacity is known as **clipping**. In digital representations, the most basic kind of clipping — known as **hard clipping** — just truncates the samples at the maximum of their range, leaving a completely flat plateau with sharp corners. This introduces many harsh high frequencies that can sound grating and unpleasant.

In [None]:
clip = pedalboard.Clipping(threshold_db=-40)
play_effect(clip)

Analogue audio equipment generally does not produce perfect hard clipping. Instead, amplitudes approaching the limit get progressively more attenuated, creating a less abrupt **soft clipping** without such sharp corners. This still creates greater harmonic complexity — more loudness and "warmth" — but with fewer harsh overtones, and the result tends to be a nicer sound with a bit less "fizz". Most digital distortion effects attempt to emulate some form of soft clipping.

In [None]:
distort = pedalboard.Distortion(drive_db=40)
play_effect(distort)
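
The claim that clipping adds harmonics, and that hard clipping adds harsher ones, can be checked on a single sine wave. The sketch below is a simplified stand-in for the effects above: it uses `np.clip` for hard clipping and `tanh` as one possible soft clipping curve (an assumption, since pedalboard's `Distortion` curve is not specified here), then reads off the amplitude at each multiple of the fundamental.

```python
import numpy as np

sample_rate = 8000
tt = np.arange(sample_rate) / sample_rate        # one second of time points
sine = np.sin(2 * np.pi * 100 * tt)              # 100 Hz test tone

limit = 0.5
hard = np.clip(sine, -limit, limit)              # flat plateau with sharp corners
soft = limit * np.tanh(sine / limit)             # rounded approach to the limit

def harmonic_amplitudes(x, fundamental_hz=100):
    """Amplitude at each multiple of the fundamental (index 0 is the fundamental).
    With one second of audio, FFT bin k corresponds to k Hz."""
    spectrum = np.abs(np.fft.rfft(x)) / (len(x) / 2)
    return spectrum[fundamental_hz::fundamental_hz]

clean_overtones = harmonic_amplitudes(sine)[1:].sum()   # essentially nothing
hard_high = harmonic_amplitudes(hard)[10:].sum()        # 11th harmonic and above
soft_high = harmonic_amplitudes(soft)[10:].sum()
```

Both clipped versions gain a clear third harmonic that the clean sine lacks, but the hard clipped one carries much more energy in the upper harmonics, which is the "fizz" described above.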

### Sampling distortion

A digital audio signal consists of a sequence of numeric samples measured (or synthesised) at regular intervals. The temporal detail and range of frequencies it can contain is determined by the sampling frequency. At lower sampling frequencies, higher frequency sounds cannot be correctly captured as the frequencies they are, but may be incorrectly interpreted as lower frequencies — a phenomenon known as [aliasing](https://en.wikipedia.org/wiki/Aliasing).

**Downsampling** the audio simulates what it would be like if captured at a lower sample rate.

In [None]:
down = pedalboard.Resample(2500)
play_effect(down)
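
To see where aliased frequencies end up, the sketch below samples a 2000 Hz sine directly at 2500 Hz, with no filtering beforehand, and finds the strongest frequency in the result. The helper name is made up for this example, and a real resampler may well filter the signal first, so this shows the raw phenomenon rather than what `Resample` does internally.

```python
import numpy as np

def alias_frequency(freq_hz, sample_rate):
    """Frequency at which a sampled sine appears, folded into [0, sample_rate / 2]."""
    return abs(freq_hz - sample_rate * round(freq_hz / sample_rate))

sample_rate = 2500                        # so frequencies above 1250 Hz cannot be captured
nn = np.arange(sample_rate)               # one second of sample indices
tone = np.sin(2 * np.pi * 2000 * nn / sample_rate)

# with one second of samples, FFT bin k corresponds to k Hz
peak_hz = int(np.argmax(np.abs(np.fft.rfft(tone))))
predicted_hz = alias_frequency(2000, sample_rate)
```

The sampled tone is indistinguishable from a 500 Hz sine, which is what the folding formula predicts.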

The amount of detail that can be captured by each sample depends on the numerical resolution, usually characterised in terms of the bit depth. A **bit crusher** effect reduces the numerical precision to correspondingly change the sound quality.

In [None]:
crush = pedalboard.Bitcrush(bit_depth=5)
play_effect(crush)
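
Reducing bit depth amounts to rounding every sample to the nearest of a small set of allowed values. The sketch below shows one simple way of doing that for audio in [-1, 1]; the function name and the exact rounding scheme are assumptions, and pedalboard's `Bitcrush` may quantise differently.

```python
import numpy as np

def quantise(x, bit_depth):
    """Round samples in [-1, 1] to the grid available at the given bit depth."""
    steps = 2 ** (bit_depth - 1)           # number of levels on each side of zero
    return np.round(x * steps) / steps

tt = np.arange(22050) / 22050
sine = np.sin(2 * np.pi * 220 * tt)
crushed = quantise(sine, bit_depth=5)

distinct_levels = len(np.unique(crushed))      # only a few dozen values remain
worst_error = np.max(np.abs(crushed - sine))   # at most half a step
```

The rounding error behaves like added noise that is correlated with the signal, which gives bit crushed audio its gritty character.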

## Dynamics

Dynamic range processing is related to the clipping discussed in the previous section. Dynamic range effects adjust the balance of amplitudes in the signal — the spread of loud and quiet parts.

### Compression

The principal dynamic range effect is the [compressor](https://en.wikipedia.org/wiki/Dynamic_range_compression). There are several variations, but the basic operation is to reduce the volume — or rather, the *rate at which the volume increases* — when it goes over some threshold level. The effect is to shrink the original dynamic range into a smaller one.

<details>
<summary>Note</summary>
<p>You might notice an unfamiliar <tt>normalize=False</tt> argument in the play commands below. By default Jupyter notebook normalises numeric array data for audio playback, scaling it for optimum volume. This can hide what dynamic range effects are doing, so we turn it off for this part of the lab.
</p>
</details>

In [None]:
compress = pedalboard.Compressor(ratio=5, threshold_db=-50, attack_ms=50, release_ms=200)
play_effect(compress, normalize=False)

In fact, the compressed version is so much quieter that it's hard to hear what's going on. We can add some gain to bring it back up to a reasonable volume.

In [None]:
gain = pedalboard.Gain(gain_db=12)
play(apply_effect(gain, apply_effect(compress)), normalize=False)

Compare this with the original uncompressed sound:

In [None]:
play(audio, normalize=False)

The effect is relatively subtle, but you might be able to hear that the quieter parts of the track have been amplified relative to the loud parts, making the overall sound a bit "fuller".

Compressors are often used to even out variations in a performance and create a more consistent sound.
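
The core of a compressor is its static level curve: below the threshold the level is untouched, while above it each extra decibel of input yields only 1/ratio decibels of output. The sketch below works through that arithmetic with the threshold and ratio used above. It deliberately ignores attack and release times, and the helper names are made up for this example.

```python
def compressed_level_db(level_db, threshold_db=-50.0, ratio=5.0):
    """Output level for a steady input level, ignoring attack and release."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def db_to_linear(gain_db):
    """Convert a gain in decibels to an amplitude multiplier."""
    return 10 ** (gain_db / 20)

loud_in, quiet_in = -20.0, -60.0
loud_out = compressed_level_db(loud_in)     # 30 dB over threshold becomes 6 dB over
quiet_out = compressed_level_db(quiet_in)   # below threshold, so unchanged

makeup = db_to_linear(12)                   # the 12 dB gain is roughly a factor of 4
```

The 40 dB gap between the loud and quiet inputs shrinks to 16 dB at the output, which is the narrowing of dynamic range described above.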

### Limiting

A [limiter](https://en.wikipedia.org/wiki/Limiter) is a compressor where the compression ratio — how much the above-threshold sound is scaled down — is very high, or infinite — preventing the signal amplitude from getting too high. It is closely related to the kind of clipping that produces distortion, but (at least in theory) attempts pre-emptively to avoid that.

Perhaps confusingly, limiters are commonly used to *maximise* the loudness of a track by restricting it to some lower amplitude range and then scaling the result up to fill the full range. Indeed, the limiter in pedalboard implicitly includes a compensatory gain, which is why the example below, limiting the audio to a threshold of -30 dB, actually makes the sound louder.

(See also: [loudness war](https://en.wikipedia.org/wiki/Loudness_war).)

In [None]:
limit = pedalboard.Limiter(threshold_db=-30, release_ms=100)
play_effect(limit, normalize=False)

## Modulation

In the general sense, **modulation** just refers to the use of one signal, often a low frequency oscillator or [LFO](https://en.wikipedia.org/wiki/Low-frequency_oscillation), to modify another. Audio effects often have parameters that can be varied over time in such a way to produce interesting results. But some applications of modulation are sufficiently distinctive and widely used that they have become institutionalised as named effect types in their own right.

### Chorus

Probably the most popular modulation effect is the [chorus](https://en.wikipedia.org/wiki/Chorus_(audio_effect)), in which the modulating signal is applied to one or more fairly short delays. The duplicated but not perfectly synchronised signal very crudely emulates the natural variation of multiple performers playing — or singing — together, hence the name.

In [None]:
chorus = pedalboard.Chorus(rate_hz=0.33, depth=0.25, feedback=0, mix=0.5, centre_delay_ms=8)
play_effect(chorus)
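
The idea of an LFO modulating a short delay can be sketched in a few lines of NumPy. This is a simplified illustration, not pedalboard's implementation: the function name and the `depth_ms` parameter (the size of the delay swing in milliseconds) are assumptions made for this example, and there is no feedback path.

```python
import numpy as np

def modulated_delay(x, sample_rate, rate_hz=0.33, centre_delay_ms=8.0,
                    depth_ms=2.0, mix=0.5):
    """Mix x with a copy whose delay time is swept by a sine LFO.
    depth_ms should not exceed centre_delay_ms, so the delay stays positive."""
    nn = np.arange(len(x))
    lfo = np.sin(2 * np.pi * rate_hz * nn / sample_rate)
    delay_samples = (centre_delay_ms + depth_ms * lfo) * sample_rate / 1000
    read_pos = nn - delay_samples                  # fractional read positions
    # linear interpolation between samples; anything before the start is silence
    wet = np.interp(read_pos, nn, x, left=0.0)
    return (1 - mix) * x + mix * wet

sample_rate = 22050
tt = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * tt)
chorused = modulated_delay(tone, sample_rate)
```

Because the read position moves faster or slower than real time as the delay changes, the delayed copy is slightly detuned as well as slightly late, which is what makes it sound like a second performer.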

### Flanger

[Flanging](https://en.wikipedia.org/wiki/Flanging) is similar to chorus, in that it adds a second copy of the signal at a varying offset. But the delay time is short enough that there is significant **interference** between the two signals, creating a sweeping [comb filter](https://en.wikipedia.org/wiki/Comb_filter) effect — often compared to the whoosh of a jet engine.

In [None]:
flanger = pedalboard.Chorus(rate_hz=1.0, depth=0.5, feedback=0.75, mix=0.5, centre_delay_ms=0.1)
play_effect(flanger)
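
The comb shape comes from simple interference: adding a sine to an equal-level copy of itself delayed by some time cancels completely wherever the delay equals an odd number of half periods. The sketch below computes that gain and the resulting notch frequencies. It ignores feedback, which sharpens the comb in a real flanger, and the function names are made up for this example.

```python
import math

def comb_gain(freq_hz, delay_seconds):
    """Amplitude gain when a sine is added to an equal-level delayed copy of itself."""
    return abs(2 * math.cos(math.pi * freq_hz * delay_seconds))

def notch_frequencies(delay_seconds, count=3):
    """First few frequencies at which the two copies cancel."""
    return [(2 * k + 1) / (2 * delay_seconds) for k in range(count)]

delay = 0.001                              # 1 ms, for easy numbers
notches = notch_frequencies(delay)         # around 500, 1500 and 2500 Hz
cancelled = comb_gain(500, delay)          # the copies are exactly out of phase
doubled = comb_gain(1000, delay)           # the copies are exactly in phase
```

As the LFO sweeps the delay time, all of these notches slide up and down the spectrum together, producing the characteristic whoosh.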

### Phaser

In a [phaser](https://en.wikipedia.org/wiki/Phaser_(effect)) effect, the modulation is applied to a kind of filter not mentioned before: an [all pass filter](https://en.wikipedia.org/wiki/All-pass_filter). As the name suggests, this does not reduce the amplitude of any frequencies; instead, it shifts their *phase* by different amounts, with the maximal shift occurring at the cutoff frequency. A static phase shift tends not to be very noticeable, but modulating the centre frequency makes it more obvious, producing a rhythmic textural sweep in the sound.

In [None]:
phaser = pedalboard.Phaser(rate_hz=2.0, depth=0.5, feedback=0.25, mix=0.5)
play_effect(phaser)
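
The defining property of an all pass filter, unit gain with frequency-dependent phase, can be verified numerically. The sketch below evaluates the frequency response of a textbook first-order all pass, chosen here as the simplest example; it is an assumption that this resembles the stages inside pedalboard's `Phaser`, which typically chains several of them. In this first-order form the phase shift passes through -90 degrees at the break frequency.

```python
import numpy as np

def allpass_response(freq_hz, break_hz, sample_rate):
    """Complex frequency response of H(z) = (c + z^-1) / (1 + c z^-1)."""
    t = np.tan(np.pi * break_hz / sample_rate)
    c = (t - 1) / (t + 1)
    z_inv = np.exp(-2j * np.pi * np.asarray(freq_hz) / sample_rate)
    return (c + z_inv) / (1 + c * z_inv)

freqs = np.array([100.0, 500.0, 1000.0, 2000.0, 5000.0])
response = allpass_response(freqs, break_hz=1000, sample_rate=22050)

magnitudes = np.abs(response)                   # 1.0 at every frequency
phases_deg = np.degrees(np.angle(response))     # falls steadily as frequency rises
```

Mixing this phase-shifted signal back with the original is what turns the invisible phase shift into audible notches, and sweeping `break_hz` with an LFO moves them around.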

### Tremolo

[Tremolo](https://en.wikipedia.org/wiki/Tremolo) is probably the simplest modulation effect — it just modulates the signal volume. At the time of writing, pedalboard doesn't include a tremolo effect, but we can do it explicitly by just generating a sine wave of the desired frequency and multiplying.

In [None]:
def tremolo ( audio=audio, freq=8, lo=0, hi=1, samplerate=samplerate ):
  # sine wave the length of the audio scaled into [0, 1]
  modulator = (1 + np.sin(np.arange(0, audio.shape[1]) * 2 * np.pi * freq / samplerate)) / 2

  # scale and offset into the target range
  modulator = modulator * (hi - lo) + lo

  # apply
  return audio * modulator

# tremolo is just a function, not a pedalboard effect object
# so we call directly rather than passing to play_effect
play(tremolo())

### Vibrato

Vibrato is similar to tremolo, but instead of modulating the amplitude it modulates the pitch.

<details>
<summary>Notes</summary>

Vibrato is produced on physical instruments by techniques such as wiggling the finger fretting a string. This changes the tension and makes the pitch shift up and down.

Vibrato is often confused with tremolo. For example, many electric guitars have a lever known as a **tremolo arm** (or *whammy bar*), but it actually produces a *vibrato* effect: it changes the string tension and hence the pitch.

Note that in pedalboard, vibrato (like flanging) is just a different parameterisation of the Chorus effect — the three effects are really just variations on a theme.

</details>

In [None]:
vibrato = pedalboard.Chorus(rate_hz=8.0, depth=0.1, feedback=0, mix=1.0, centre_delay_ms=0.5)
play_effect(vibrato)
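
One way to see why a modulated delay changes pitch is to build it by hand. The following is a rough sketch, assuming a mono 1-D signal and simple linear interpolation. The function `vibrato_sketch` and its parameters are hypothetical and are not part of pedalboard.

```python
import numpy as np

def vibrato_sketch(signal, samplerate, rate_hz=8.0, depth_ms=0.5):
    """Read a mono signal back through a delay that oscillates over time."""
    n = np.arange(len(signal))
    # delay in samples, swinging between 0 and 2 * depth_ms
    lfo = np.sin(2 * np.pi * rate_hz * n / samplerate)
    delay_samples = depth_ms / 1000 * samplerate * (1 + lfo)
    # fractional read positions, kept inside the signal
    read_pos = np.clip(n - delay_samples, 0, len(signal) - 1)
    # linear interpolation between neighbouring samples
    return np.interp(read_pos, n, signal)

samplerate = 44100
t = np.arange(samplerate) / samplerate
tone = np.sin(2 * np.pi * 440 * t)  # one second of a 440 Hz test tone
wobbly = vibrato_sketch(tone, samplerate)
```

While the delay is growing, the signal is read back more slowly and the pitch drops; while it is shrinking, the pitch rises. Only the delayed signal is kept here, which corresponds to the `mix=1.0` setting used above.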

## Reverb

[Reverberation](https://en.wikipedia.org/wiki/Reverberation) occurs when a sound is made in any physical environment: the sound waves bounce off the surfaces and objects in the environment — which will often not reflect all frequencies equally — and all the overlapping, differently-filtered echoes combine to create a sense of the space. Reverb effects attempt to simulate this process to imbue sound with that kind of evocative spatial feel.

Reverb is probably the most complex and varied kind of audio effect. In analogue recording studios, elaborate reverb units would physically play the sound through reverberant objects like springs or metal plates and record the results back with a microphone. In the digital domain, there is a wide range of techniques used to generate reverberation with various degrees of realism, but they fall into two basic categories: **algorithmic** and **convolution**. An algorithmic reverb implements some abstracted model of what reverberation is, typically using a network of many delay lines and filters. A convolution reverb combines the audio with a recording that captures the reverberant qualities of the desired space or piece of equipment, known as its **impulse response** (IR). Convolution reverbs can be more "authentic" sounding, but are very much at the mercy of the IR. Algorithmic reverbs are usually more tweakable and tend to have distinctive and interesting sounds of their own.

In [None]:
reverb = pedalboard.Reverb(room_size=0.75, damping=0.5, wet_level=0.5, dry_level=0.5)
play_effect(reverb)

In the convolution examples below, you can listen to the IR as well as the convolved music.

In [None]:
display(Audio('data/ir-1.wav'))
convo1 = pedalboard.Convolution('data/ir-1.wav')
play_effect(convo1)

In [None]:
display(Audio('data/ir-2.wav'))
convo2 = pedalboard.Convolution('data/ir-2.wav')
play_effect(convo2)
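
The operation behind a convolution reverb can be sketched in a few lines of NumPy. This is a simplified illustration, not pedalboard's implementation: the impulse response here is synthetic (decaying noise) rather than a recording of a real space, and the sample rate is kept low so that direct convolution runs quickly.

```python
import numpy as np

samplerate = 8000  # deliberately low to keep the direct convolution cheap
rng = np.random.default_rng(0)

# synthetic impulse response: half a second of exponentially decaying noise
ir_len = samplerate // 2
ir = rng.standard_normal(ir_len) * np.exp(-6 * np.arange(ir_len) / ir_len)
ir /= np.sqrt(np.sum(ir ** 2))  # normalise to unit energy

# dry signal: a single click at the very start, followed by silence
dry = np.zeros(samplerate)
dry[0] = 1.0

# every input sample triggers its own scaled copy of the IR
wet = np.convolve(dry, ir)
print(len(dry), len(ir), len(wet))
```

Because the dry signal is a single click, the output is simply the IR itself, which is why an IR is often described as what the space does to a click. The output is also longer than the input by the length of the IR, corresponding to the reverb tail ringing on after the sound has stopped.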

## Combining Effects

We've mostly considered audio effects in isolation so far, but in practice they are often used in combination, with the output of one fed as input to another. The guitar player's pedalboard — from which the Spotify effects package takes its name — is a classic example: individual effects are implemented as hardware pedals, and these can be wired together to produce more complex tones.

Pedalboard implements two mechanisms for easily combining effects. The first is the Pedalboard object, which simply chains a number of effects in sequence:

In [None]:
board = pedalboard.Pedalboard([
    pedalboard.Phaser(rate_hz=1.0, depth=0.5, feedback=0.25, mix=0.5),
    pedalboard.Chorus(rate_hz=0.79, depth=0.5, feedback=0.25, mix=0.5),
    pedalboard.Delay(delay_seconds=0.5, feedback=0.5, mix=0.5),
    pedalboard.Delay(delay_seconds=0.57, feedback=0.75, mix=0.25),
    pedalboard.LadderFilter(mode=pedalboard.LadderFilter.Mode.HPF12, cutoff_hz=440, resonance=0.25, drive=20),
    pedalboard.Convolution('data/ir-1.wav'),
    pedalboard.Limiter(threshold_db=-24, release_ms=100)
])

play_effect(board)

(Note that the whole board can be treated as a single effect.)

The second mechanism is the [`Mix`](https://spotify.github.io/pedalboard/reference/pedalboard.html#pedalboard.Mix) plugin, which allows effects to be applied in parallel rather than in series. Unfortunately, at the time of writing, using this seems to crash Colab. But we can implement a rudimentary substitute ourselves using a weighted sum:

In [None]:
def mix_apply ( effect_1, effect_2, mix_1=0.5, mix_2=0.5, audio=audio, samplerate=samplerate ):
  """
  Apply two different effects to the same audio and then mix
  the results with the specified weighting.
  """
  first = apply_effect( effect_1, audio, samplerate )
  second = apply_effect( effect_2, audio, samplerate )
  return mix_1 * first + mix_2 * second

In [None]:
# as with tremolo earlier, mix_apply is just a function
# so pass results to play rather than using play_effect
play(mix_apply(board,
               pedalboard.Distortion(drive_db=36),
               mix_1=0.75, mix_2=0.25))

## Playing around

Most of the examples above are illustrative rather than aesthetically pleasing. Once you feel you've got a grip on them, you should play around with effect combinations and parameters — and different source audio — to see if you can come up with something that sounds good. You might find the [pedalboard effect documentation](https://spotify.github.io/pedalboard/reference/pedalboard.html) useful.


# Discussion

* What purposes might different audio effects serve?
* Can you identify what they are doing to sounds that you hear?
* Listen to a recorded music track that you like and know well. (If appropriate to the lab context, you could even play it for your fellow students.) Can you pick out any of the effects used in its production? How are they contributing to the feel of the music? What would it sound like without them?

# Further work

If you're interested in exploring oscillator-based synthesis, combining basic waveform generation like the sine wave above with filters allows for some simple testing. Try generating square or sawtooth waves, for example.
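
As a possible starting point, here is a minimal sketch of a naive oscillator in NumPy. The function `oscillator` and its arguments are hypothetical, and it returns a mono 1-D array, so it may need reshaping before being passed to the helper functions used earlier in this lab.

```python
import numpy as np

def oscillator(shape, freq, duration, samplerate=44100):
    """Generate a basic waveform with values between -1 and 1."""
    t = np.arange(int(duration * samplerate)) / samplerate
    phase = (freq * t) % 1.0  # position within the current cycle, in [0, 1)
    if shape == 'sine':
        return np.sin(2 * np.pi * phase)
    if shape == 'square':
        # high for the first half of each cycle, low for the second
        return np.where(phase < 0.5, 1.0, -1.0)
    if shape == 'sawtooth':
        # ramps up from -1 and snaps back at the start of each cycle
        return 2.0 * phase - 1.0
    raise ValueError(f'unknown waveform shape: {shape}')

square = oscillator('square', freq=220, duration=0.1)
saw = oscillator('sawtooth', freq=220, duration=0.1)
```

These naive waveforms have sharp edges, so they contain very high harmonics that can alias at higher pitches. That is one reason dedicated audio environments tend to use more careful oscillator designs.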

However, for anything less basic you are probably better off with environments that are specifically designed for audio, rather than just generic Python running in Colab. The following all provide good playgrounds for experimentation:

* [SuperCollider](https://supercollider.github.io)
* [Pure Data](https://puredata.info)
* [Csound](https://csound.com)
* [Cmajor](https://cmajor.dev)

