# Chorale Rearranger

For this assignment, we were interested in the difference between the information offered by Midi and audio data, namely the information about a piece of music that is directly accessible in Midi data but difficult or impossible to obtain directly from an audio file. We therefore decided on a project that would use this information (the note start and end times and the pitch values) to edit an audio file.

We decided to look at the chorale, a form of music with clearly defined melodic lines that nevertheless blend together, so that it becomes difficult to make out the lines individually. In Midi data, however, all of this information is directly available for each line. So we decided to try to segment, alter, and rearrange the individual lines of a recording of the third movement of Bach's Matthäuspassion (https://www.youtube.com/watch?v=1AnEuCJpcCE) based upon a Midi file that we found (https://www.cpdl.org/wiki/images/2/26/Ws-bwv-mp03.mid).

We would slice into the audio based upon the start and end times of the Midi notes, and then apply a tight band pass filter around the pitch of each Midi note. We would repeat this process four times, each time altering the centre frequency of the band pass. For the first version, we would centre on the pitch of the corresponding voice. With each repetition we would then shift the voice from which we were taking the pitch values by one, so that, for example, the soprano would first have the soprano pitches, then the alto pitches, then the tenor pitches, and finally the bass pitches.

The idea is that the melodic lines would remain relatively similar, however the harmonies would shift around, between consonance and dissonance.

We would then generate a version of the resulting piece with sine waves and underlay the recording with it, so as to keep the tonality clear.

After that we would send the piece to Pure Data for some processing, and then return to Python to apply some reverb via convolution.

The result is an ambient version of the chorale repeated four times, with some clear melodic lines, but with a textural background that shifts between consonance and dissonance.

In [None]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import IPython.display as ipd
import librosa
import librosa.display
import math
import scipy
import time
from scipy import signal
import pretty_midi as pm
import soundfile as sf
import os
from matplotlib import colors
from scipy import interpolate

sr = 48000

# function to convert freq in Hz to normalised frequency
    
def freqNormaliser(f):
    om = f/(sr*0.5)
    return om

# function to convert from Hz to Midi note value

def herztoMidi(f):
    midi =  np.around((12*np.log2(f/440) + 69), 0)
    return midi

# function to find the index of the nearest value in an array to an input
# (from https://stackoverflow.com/questions/2566412/find-nearest-value-in-numpy-array)

def nearestValue(array, value):
    dif = (abs(array-value)).argmin()
    return dif

# function to create a window with a sigmoid ramp shape
# this is used when appending audio with overlap
# chosen over hann, as hann still led to excessive amplitude modulations
# takes two arguments (N = size of window, c = length of ramp)

# got sigmoid code courtesy of Joachim:
        #z = np.exp(-i)
        #sig = 1 / (1 + z)


def sigmoidWindow(N,c):
    
    rampup = np.zeros(1)
    
    for i in range(c):

        z = np.exp(-i)
        sig = 1 / (1 + z)
        rampup = np.append(rampup, sig)
    
    window = np.concatenate((rampup, np.ones((N-2*c)-2), np.flip(rampup)))
        

    return window

# sine wave generation function

def sin_generator(amp, f, t):
    return amp*np.sin(2*np.pi*f*t)

# tight band pass filter around a frequency
# used to cut into the audio for each segment
# q argument can be used to set pass band size
# IIR filter chosen as it runs faster than FIR and will be run a lot!

def voiceFilter(f, song, q):
    
    Nband, Wband = signal.buttord([freqNormaliser(f-1), freqNormaliser(f+1)], [freqNormaliser(f-q), freqNormaliser(f+q)], 1, 10)
    bband, aband = signal.butter(N = Nband, Wn = Wband, btype='bandpass')
    filteredaudio = signal.lfilter(bband, aband, song)
    
    return filteredaudio

# a really bad saturator

def badSaturator(sigin, level):
    
    for i in range(sigin.size):
        
        if sigin[i] >= level:
            sigin[i] = level

        if sigin[i] <= -level:
            sigin[i] = -level
            
    return sigin

# Step 1: Import the Audio File

The audio data used is taken from this YouTube video:

https://www.youtube.com/watch?v=1AnEuCJpcCE

We were not concerned about having a low bandwidth audio file with no information above 10 kHz, as this region would be filtered out anyway.

In [None]:
data = './Files/bach.wav'

bach, sr = librosa.load(data, sr = sr)

ipd.Audio(bach, rate = sr)

# Step 2: Import the Midi Data

The Midi data used is taken from:

https://www.cpdl.org/wiki/index.php/Matth%C3%A4uspassion,_BWV_244_(Johann_Sebastian_Bach)

In [None]:
midi_bach = pm.PrettyMIDI('./Files/bach.mid')

midilisten = midi_bach.synthesize(fs=sr)

ipd.Audio(midilisten, rate=sr)

In [None]:
# Now set each of the voices to be in their own variable, so that we can access data such as the starts,
# ends, and pitches of each note.

soprano = midi_bach.instruments[0]
alto = midi_bach.instruments[1]
tenor = midi_bach.instruments[2]
bass = midi_bach.instruments[3]

The biggest problem with our plan for slicing the audio between the starts and the ends of the Midi notes is that the audio and the Midi currently do not align in timing, tempo, or pitch. The first few steps are therefore to realign the audio and the Midi so that they are a lot closer together. Even though it is next to impossible to get them to align perfectly in the time given for the assignment, we were able to reach a standard that is good enough for the purposes here.

# Step 3: Reconcile the Pitches of the Audio and the Midi Data

Listening back to the audio and the synthesized Midi, it is possible to hear that they are in different keys. This needs to be reconciled. It's possible to do this by ear in a DAW, but we aimed to do it within Python. We based this reconciliation upon the fact that the absolute key of each doesn't matter, but rather the difference in pitch between them. So if we can find the interval between a pitch in the first chord of the audio and the matching pitch in the first chord of the Midi, we can then just shift all of the Midi note pitches by that interval.

The method that we are going to use is the following:

1. Find the rough duration of the first chord from the Midi data.
2. Cut this out of the audio.
3. Compute a periodogram of this audio segment and find the peak in the spectrum. This peak will most likely correspond to the fundamental of one of the voices.
4. Compare this to the four Midi pitch values of the voices in the first chord and find the nearest one.
5. Subtract the Midi value from the audio value to find the difference.
6. Add this difference to all Midi pitches.
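
The steps above can be sketched on a synthetic signal. This is only a minimal illustration and not the notebook's own code: the test tone, the values in `midi_first_chord`, and the helper `hz_to_midi` are assumptions invented for the example, and a plain NumPy FFT stands in for `signal.periodogram`.

```python
import numpy as np

sr = 48000  # sample rate, matching the notebook

def hz_to_midi(f):
    # round a frequency in Hz to the nearest Midi note number
    return int(np.round(12 * np.log2(f / 440.0) + 69))

# hypothetical "first chord" audio: one second of a 220 Hz tone (A3, Midi 57)
t = np.arange(sr) / sr
chord_audio = np.sin(2 * np.pi * 220.0 * t)

# power spectrum of the excerpt and the frequency of its strongest bin
spectrum = np.abs(np.fft.rfft(chord_audio)) ** 2
freqs = np.fft.rfftfreq(chord_audio.size, d=1 / sr)
audio_pitch = hz_to_midi(freqs[np.argmax(spectrum)])

# hypothetical Midi pitches of the first chord (soprano, alto, tenor, bass)
midi_first_chord = np.array([71, 66, 59, 47])

# nearest Midi pitch to the detected audio pitch, and the resulting offset
midi_pitch = midi_first_chord[np.abs(midi_first_chord - audio_pitch).argmin()]
offset = audio_pitch - midi_pitch

# adding the offset to every Midi pitch transposes the chord into the audio's key
transposed = midi_first_chord + offset
print(audio_pitch, midi_pitch, offset, transposed)
```

In this made-up case the Midi chord sits two semitones above the audio, so every pitch is lowered by two.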

In [None]:
# create lists to dump all of the starting and ending values of the first notes into

firstchordstartlist = []
firstchordendlist = []

# find start and end of first notes of soprano and dump into lists

firstchordstartsop = soprano.notes[0].start*sr
firstchordstartlist.append(firstchordstartsop)
firstchordendsop = soprano.notes[0].end*sr
firstchordendlist.append(firstchordendsop)

# find start and end of first notes of alto and dump into lists

firstchordstartalt = alto.notes[0].start*sr
firstchordstartlist.append(firstchordstartalt)
firstchordendalt = alto.notes[0].end*sr
firstchordendlist.append(firstchordendalt)

# find start and end of first notes of tenor and dump into lists

firstchordstartten = tenor.notes[0].start*sr
firstchordstartlist.append(firstchordstartten)
firstchordendten = tenor.notes[0].end*sr
firstchordendlist.append(firstchordendten)

# find start and end of first notes of bass and dump into lists

firstchordstartbas = bass.notes[0].start*sr
firstchordstartlist.append(firstchordstartbas)
firstchordendbas = bass.notes[0].end*sr
firstchordendlist.append(firstchordendbas)

# Now get the min from the start and max from the end to ensure we get the entirety of the first chord

firstchordstart = min(firstchordstartlist)
firstchordend = max(firstchordendlist)

In [None]:
# get spectrum of audio on the first chord

freq, psd = signal.periodogram(bach[int(firstchordstart):int(firstchordend)], sr)

# find the peak in this spectrum

peak = librosa.util.peak_pick(psd, pre_max=0, post_max=psd.size, pre_avg=0, post_avg=psd.size, delta=psd.max()*0.99, wait=10)

# convert this to a Midi value

audiofundamental = herztoMidi(freq[peak])

# Plot the Periodogram with the peak identified
# (peak is an index into the spectrum, so it is converted to a frequency for the x axis)

plt.figure(figsize = (10, 10))
plt.title('Periodogram of First Chord in Audio')
plt.xlabel('Frequency (Hz, Logarithmic)')
plt.ylabel('Power Spectral Density (V**2/Hz)')
plt.vlines(freq[peak], 0, psd[peak]+(psd[peak]*0.1), linestyles ="dashed", colors ="r")
plt.semilogx(freq, psd)


In [None]:
# create a list of all the Midi pitches of the first notes

firstnotepitches = [soprano.notes[0].pitch, alto.notes[0].pitch, tenor.notes[0].pitch, bass.notes[0].pitch]

# find the nearest value in the list to the peak in the audio

midifundamental = firstnotepitches[nearestValue(firstnotepitches, audiofundamental)]

In [None]:
# shift all of the Midi pitch values by the difference between the audio and the Midi fundamental
# (each voice is looped over separately, as zipping the voices would stop at the end of the shortest one)

pitchshift = int(np.atleast_1d(audiofundamental)[0] - midifundamental)

for voice in (soprano, alto, tenor, bass):
    for note in voice.notes:
        note.pitch = note.pitch + pitchshift

In [None]:
# now to listen to the synthesis of the new altered Midi data

synth = midi_bach.synthesize(fs = sr)
ipd.Audio(synth, rate = sr)

In [None]:
# And we can play it back on top of the audio to compare

kombi = bach + synth[:bach.size]

ipd.Audio(kombi, rate=sr)

We can hear that these are now both in the same key.

# Step 4: Remove Silence at Start

We can hear that there are also a few seconds of silence at the start of both the Midi and the audio data. We can remove this so that the music begins immediately on playback. We will first remove this from the audio and then from the Midi.

For the audio we will find the onset envelope and then the peaks of the envelope. We will then scale the peak positions from frames to samples in the audio data. As there will be a peak at the very start of the audio data, we will then remove everything before the second peak, so that the audio starts with the first note.

For the Midi, we will just subtract the start time of the first note from all start and end times.
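
A minimal sketch of the idea follows. It uses a simple amplitude threshold in place of the onset-envelope peaks used in the notebook, and the signal, the threshold value, and the note times are all invented for illustration.

```python
import numpy as np

sr = 48000

# hypothetical audio: half a second of silence followed by one second of tone
silence = np.zeros(sr // 2)
tone = 0.5 * np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
audio = np.concatenate((silence, tone))

# index of the first sample whose magnitude exceeds an (assumed) threshold
threshold = 0.01
first_sound = int(np.argmax(np.abs(audio) > threshold))
trimmed = audio[first_sound:]

# hypothetical (start, end) times in seconds for three Midi notes
note_times = np.array([[2.0, 2.5], [2.5, 3.0], [3.0, 4.0]])

# subtract the earliest start so that the first note begins at time zero
shifted = note_times - note_times[:, 0].min()
print(first_sound / sr, shifted)
```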

In [None]:
# create an onset envelope of the audio data

onset_env = librosa.onset.onset_strength(y=bach, sr=sr,hop_length=512,aggregate=np.median)
onset_env = onset_env/onset_env.max()

# find some peaks (doesn't really matter how many and where, as long as the peak at the start of the audio
# data and the peak at the start of the music are roughly caught)

peaks = librosa.util.peak_pick(onset_env, pre_max=100, post_max=100, pre_avg=100, post_avg=100, delta=0.25, wait=100)

# plot the onset envelope with the peaks

plt.figure(figsize = (14, 3))
plt.title('Onset Envelope of the Audio Data')
plt.xlabel('Frame')
plt.ylabel('Normalised Strength')
plt.vlines(peaks, 0, onset_env.max()+onset_env.max()*0.1, linestyles ="dashed", colors ="r")

plt.plot(onset_env)

In [None]:
# scale peaks to the values in the audio data

peakscaler = int(bach.size/onset_env.size)
peaksscaled = peaks*peakscaler
times = (np.arange(0, bach.size, 1)/sr)

# plot the peaks on the waveform

plt.figure(figsize=(14, 3))
plt.title('Peaks on Waveform')
plt.xlabel('Time (Seconds)')
plt.ylabel('Amplitude')
plt.vlines(peaksscaled/sr, -0.5, 0.5, linestyles ="dashed", colors ="r")
plt.plot(times, bach)


plt.show()

In [None]:
# remove the section between the first and second peak from the audio data and listen

bach = bach[peaksscaled[1]:]

ipd.Audio(bach, rate=sr)

In [None]:
# remove the silence at the beginning of the Midi data

# create a list of all start times of each voice

midistartslist = [soprano.notes[0].start, alto.notes[0].start, tenor.notes[0].start, bass.notes[0].start]

# find the start of the very first note

chopchop = min(midistartslist)

# remove from all start and end times
# (each voice is looped over separately, as zipping the voices would stop at the end of the shortest one)

for voice in (soprano, alto, tenor, bass):
    for note in voice.notes:
        note.start = note.start - chopchop
        note.end = note.end - chopchop

In [None]:
# synthesize and listen back

synth = midi_bach.synthesize(fs = sr)

ipd.Audio(synth, rate = sr)

In [None]:
# Now we can combine the audio and the synthesized Midi again and listen back

kombi = bach + synth[:bach.size]

ipd.Audio(kombi, rate=sr)

What's noticeable here is that the audio and the Midi are performed at different tempi. Reconciling this difference is what we will do next.

# Step 5: Reconcile Tempi

Similar to the pitch, the absolute tempi of the audio and the Midi are not so important here, but rather the difference between them. Both librosa and pretty_midi have a tempo estimation function, which will form the basis of the reconciliation. We will find the ratio between these estimates and then scale the durations of the Midi notes accordingly (this is a very rough method and doesn't offer perfect results, but it works well enough for the purposes here).

The method we will use is as follows:

1. Estimate the tempi of audio and Midi
2. Find a scale factor by dividing the Midi tempo by the audio tempo.
3. Find the durations of the Midi notes by subtracting end time from start time for each note and put them into an array.
4. Multiply the array by the scale factor to get the new durations of each note.
5. Go through the Midi data and change the start values and end values based upon these durations.
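
The steps above can be sketched with plain NumPy arrays. The tempo values and note times below are invented for illustration. As in the notebook, each note is placed directly after the previous one, so any rests in the Midi are closed up.

```python
import numpy as np

# hypothetical tempo estimates in beats per minute
midi_tempo = 120.0
audio_tempo = 60.0

# a factor above 1 stretches the Midi notes to match a slower recording
scale = midi_tempo / audio_tempo

# hypothetical note start and end times in seconds
starts = np.array([0.0, 0.5, 1.0, 2.0])
ends = np.array([0.5, 1.0, 2.0, 2.5])

# scale the durations
durations = (ends - starts) * scale

# rebuild the timeline: each note starts where the previous one ended
new_starts = starts[0] + np.concatenate(([0.0], np.cumsum(durations[:-1])))
new_ends = new_starts + durations
print(new_starts, new_ends)
```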

In [None]:
# estimate tempi of audio/midi

bachtempo = float(librosa.beat.tempo(y=bach, sr=sr)[0])

miditempo = midi_bach.estimate_tempo()


In [None]:
# find scale factor

scalefactor = miditempo/bachtempo

In [None]:
# get note durations

sopranodurations = np.empty(0)
altodurations = np.empty(0)
tenordurations = np.empty(0)
bassdurations = np.empty(0)

#can't zip here cause different lengths

for i in soprano.notes:
    sopranodurations = np.append(sopranodurations, (i.end - i.start))
    
for i in alto.notes:
    altodurations = np.append(altodurations, (i.end - i.start))
    
for i in tenor.notes:
    tenordurations = np.append(tenordurations, (i.end - i.start))
    
for i in bass.notes:
    bassdurations = np.append(bassdurations, (i.end - i.start))


In [None]:
# multiply  note durations by scale factor

sopranodurations = sopranodurations*scalefactor
altodurations = altodurations*scalefactor
tenordurations = tenordurations*scalefactor
bassdurations = bassdurations*scalefactor

In [None]:
# pull out start values

sopranostarts = np.empty(0)
altostarts = np.empty(0)
tenorstarts = np.empty(0)
bassstarts = np.empty(0)

for i in soprano.notes:
    sopranostarts = np.append(sopranostarts, i.start)

for i in alto.notes:
    altostarts = np.append(altostarts, i.start)

for i in tenor.notes:
    tenorstarts = np.append(tenorstarts, i.start)

for i in bass.notes:
    bassstarts = np.append(bassstarts, i.start)

In [None]:
# rescale the note lengths to change tempo

for i, j in enumerate(soprano.notes):
    j.end = sopranostarts[i]+sopranodurations[i]
    j.start = sopranostarts[i]
    if i+1 == len(soprano.notes):
        break
    sopranostarts[i+1] = sopranostarts[i]+sopranodurations[i]

for i, j in enumerate(alto.notes):
    j.end = altostarts[i]+altodurations[i]
    j.start = altostarts[i]
    if i+1 == len(alto.notes):
        break
    altostarts[i+1] = altostarts[i]+altodurations[i]

for i, j in enumerate(tenor.notes):
    j.end = tenorstarts[i]+tenordurations[i]
    j.start = tenorstarts[i]
    if i+1 == len(tenor.notes):
        break
    tenorstarts[i+1] = tenorstarts[i]+tenordurations[i]

for i, j in enumerate(bass.notes):
    j.end = bassstarts[i]+bassdurations[i]
    j.start = bassstarts[i]
    if i+1 == len(bass.notes):
        break
    bassstarts[i+1] = bassstarts[i]+bassdurations[i]

In [None]:
# Now we can synthesize the Midi and listen back

synth = midi_bach.synthesize(fs = sr)
ipd.Audio(synth, rate = sr)

In [None]:
# and listen to it compared to the audio

bachkombi = bach + (np.append(synth, np.zeros(bach.size-synth.size)))

ipd.Audio(bachkombi, rate=sr)

While this works to an extent, with the Midi and audio aligning perfectly in some sections, in others they are completely out of alignment.

# Step 6: Align the Midi Notes with the Audio Notes

There is one last method that we can attempt in order to bring them into better alignment. From listening to the audio (and examining the waveform above), we can hear that there are several expressive pauses between phrases. These are not found in the Midi data, where each phrase carries directly on into the next. We will therefore attempt to line up the start of each phrase in the Midi with the start of each phrase in the audio, so that even if they drift out of alignment over the course of a phrase, at least they will come back into alignment at the start of the next.

The method we will use is as follows:

1. Find the start times of each of the phrases. This is done through the envelope and peak analysis method used above, except that here the analysis is repeated with progressively adjusted parameters until only the start of each phrase is found. The number of phrases is set manually and the analysis runs until that many peaks are found (ideally this would also be done automatically without having to set it manually, but I don't have the time or any ideas on how to do this at the moment).

2. Scale the peak values from frames to seconds.

3. Find the nearest note in each voice to the peak and calculate the difference between them.

4. Add this difference onto the start and end values of each Midi note that comes after this peak.

5. Repeat steps 3 and 4 for each peak value.
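
Steps 3 to 5 can be sketched for a single voice as follows. The function `align_to_phrases` and all of the times are hypothetical and only illustrate the procedure. Note that the sketch updates the start times after every shift, so that each later peak is compared against the already shifted notes.

```python
import numpy as np

def align_to_phrases(starts, ends, phrase_times):
    # work on float copies so the inputs are left untouched
    starts = np.array(starts, dtype=float)
    ends = np.array(ends, dtype=float)
    for t in phrase_times:
        # index of the note whose start is nearest to this phrase onset
        idx = int(np.abs(starts - t).argmin())
        shift = t - starts[idx]
        # move this note and every later note by the same amount
        starts[idx:] += shift
        ends[idx:] += shift
    return starts, ends

# hypothetical note times (seconds) and phrase onsets found in the audio
starts = [0.0, 1.0, 2.0, 3.0]
ends = [1.0, 2.0, 3.0, 4.0]
phrase_times = [0.0, 2.4]

new_starts, new_ends = align_to_phrases(starts, ends, phrase_times)
print(new_starts, new_ends)
```

Here the third note is nearest to the second phrase onset, so it and the fourth note are delayed by 0.4 seconds.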

In [None]:
numberofphrases = 4

# find peaks in audio after silence to shift midi to

onset_env = librosa.onset.onset_strength(y=bach, sr=sr,hop_length=512,aggregate=np.median)


# pull in the amount of the envelope over which the peaks are searched for, while increasing the number of samples
# over which the analysis is performed until only (numberofphrases) peaks remain.

for i in range(onset_env.size):
    peaks = librosa.util.peak_pick(onset_env[0+i:onset_env.size-i], pre_max=i, post_max=300, pre_avg=i, post_avg=300, delta=0.5, wait=800)
    if peaks.size == numberofphrases:
        # peak_pick returns indices relative to the slice, so add the offset back on
        peaks = peaks + i
        break
        
# normalise the values in the onset envelope

onset_env = onset_env/onset_env.max()

# plot the peaks

plt.figure(figsize = (14, 3))
plt.title('Onset Envelope of the Audio Data')
plt.xlabel('Frame')
plt.ylabel('Normalised Strength')
plt.vlines(peaks, 0, onset_env.max()+onset_env.max()*0.1, linestyles ="dashed", colors ="r")
plt.plot(onset_env)


plt.show()

In [None]:
# I will also plot these on the waveform

# scale peaks to the samples in the audio data

peakscaler = int(bach.size/onset_env.size)
peaksscaled = peaks*peakscaler


plt.figure(figsize=(14, 3))
plt.title('Peaks on Waveform')
plt.xlabel('Time (Seconds)')
plt.ylabel('Amplitude')
plt.vlines(peaksscaled/sr, -0.5, 0.5, linestyles ="dashed", colors ="r")
plt.plot(np.arange(0, bach.size, 1)/sr, bach)


plt.show()

In [None]:
# convert the times of the peaks from samples to seconds

peaktimes = peaksscaled/sr

In [None]:
# for each voice, find the note nearest to each phrase peak and shift it, and every note after it,
# so that it lands on the peak. The start arrays are updated as well, so that later peaks are
# compared against the already shifted notes.

for voice, starts in ((soprano, sopranostarts), (alto, altostarts), (tenor, tenorstarts), (bass, bassstarts)):

    for peaktime in peaktimes:

        a = nearestValue(starts, peaktime)
        dif = peaktime - starts[a]

        for note in voice.notes[a:]:
            note.start = note.start + dif
            note.end = note.end + dif

        starts[a:] = starts[a:] + dif

In [None]:
# Now we can synthesize the Midi and listen back

synth = midi_bach.synthesize(fs = sr)
ipd.Audio(synth, rate = sr)

In [None]:
# and listen to it compared to the audio

bachkombi = bach + (np.append(synth, np.zeros(bach.size-synth.size)))

ipd.Audio(bachkombi, rate=sr)

In [None]:
# Now to make the audio and the Midi the same length, we will extend the duration of the final notes so that they end
# at the same point as the length of the audio data

soprano.notes[-1].end = bach.size/sr
alto.notes[-1].end = bach.size/sr
tenor.notes[-1].end = bach.size/sr
bass.notes[-1].end = bach.size/sr

In [None]:
# Now we can synthesize the Midi and listen back

synth = midi_bach.synthesize(fs = sr)
ipd.Audio(synth, rate = sr)

In [None]:
# and listen to it compared to the audio

# pad the audio to make same length

bach = np.concatenate((bach, np.zeros(synth.size-bach.size)))

bachkombi = synth + bach

ipd.Audio(bachkombi, rate=sr)

The result is still not as close as we would like, but due to time constraints we will have to move on from this.

This has been an interesting experiment. The issue of reconciling the Midi with the audio was one that we did not really think about before starting, and has turned out to be a lot more difficult than expected. This could be something to work on for the improvements.

# Step 7: Prepare the Data Needed for Rearranging Audio

Before we start rearranging the audio, there are still a few arrays to be prepared from the information contained in the Midi notes. We are doing this so that it is easier to access this data directly instead of having to dive into the Midi note information.

We will prepare arrays of start times, end times, durations, and pitches for each note in each voice.

In [None]:
# start values

sopranostarts = np.empty(0)
altostarts = np.empty(0)
tenorstarts = np.empty(0)
bassstarts = np.empty(0)

for i in soprano.notes:
    sopranostarts = np.append(sopranostarts, i.start)

for i in alto.notes:
    altostarts = np.append(altostarts, i.start)

for i in tenor.notes:
    tenorstarts = np.append(tenorstarts, i.start)

for i in bass.notes:
    bassstarts = np.append(bassstarts, i.start)
    
# end values

sopranoends = np.empty(0)
altoends = np.empty(0)
tenorends = np.empty(0)
bassends = np.empty(0)

for i in soprano.notes:
    sopranoends = np.append(sopranoends, i.end)

for i in alto.notes:
    altoends = np.append(altoends, i.end)

for i in tenor.notes:
    tenorends = np.append(tenorends, i.end)

for i in bass.notes:
    bassends = np.append(bassends, i.end)
    
# duration values

sopranodurations = sopranoends - sopranostarts
altodurations = altoends - altostarts
tenordurations = tenorends - tenorstarts
bassdurations = bassends - bassstarts

# pitch values

sopranopitches = np.empty(0)
altopitches = np.empty(0)
tenorpitches = np.empty(0)
basspitches = np.empty(0)

for i in soprano.notes:
    sopranopitches = np.append(sopranopitches, (pm.note_number_to_hz(i.pitch)))

for i in alto.notes:
    altopitches = np.append(altopitches, (pm.note_number_to_hz(i.pitch)))

for i in tenor.notes:
    tenorpitches = np.append(tenorpitches, (pm.note_number_to_hz(i.pitch)))

for i in bass.notes:
    basspitches = np.append(basspitches, (pm.note_number_to_hz(i.pitch)))

# Step 8: Rearrange the Chorale

Before we start rearranging the chorale, there is a cell that creates several nests of the pitches arrays. These are accessed for the arrangement of each voice in the rearrangement cell. Each nest lists, in order, the pitches to be used by one voice in each variation. From one variation to the next, the pitches are offset by one voice in relation to the start and end times so that, for example, when arranging the soprano start and end times in the second variation, the alto pitches are used.

The rearranging of the chorale takes place in one large cell that involves several steps.

As an example we can look at what happens for the soprano start and end values.

Firstly, there is an iteration of the whole code block between 0 and 3, corresponding to each of the pitches arrays in the nest. So the first time through, the soprano notes take the soprano pitches, the second time through the alto pitches, the third time through, the tenor pitches, and the fourth time through the bass pitches.
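
The rotation can be written down compactly. The sketch below only prints which voice supplies the pitches for each variation; the names `voices`, `pitch_source`, and `schedule` are made up for the illustration and do not appear in the notebook.

```python
voices = ["soprano", "alto", "tenor", "bass"]

def pitch_source(voice_index, variation):
    # in each variation, step one voice further along the list, wrapping around
    return voices[(voice_index + variation) % len(voices)]

# for every voice, list the voice that supplies its pitches in variations 0 to 3
schedule = {
    voice: [pitch_source(index, variation) for variation in range(4)]
    for index, voice in enumerate(voices)
}

for voice, sources in schedule.items():
    print(voice, "->", sources)
```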

Each time through, the audio is cut between the values of the start of the midi note and the end of the midi note. A very tight band pass filter is applied to this segment of the audio, focussing on the pitch of the midi note. With each variation, the pass band of this filter widens slightly. This is to allow more of the vocal to slowly come through. The audio is then passed through a saturator, and then a window function is applied. This segment is then concatenated onto the finished audio for this voice using an overlap technique.
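
The overlap technique can be sketched in isolation. The version below is an assumption-laden illustration: it uses a linear fade instead of the sigmoid window, a fixed overlap length, and two hypothetical helpers, `faded` and `overlap_append`. The point is that the tail of the existing audio and the head of the new segment are summed sample by sample.

```python
import numpy as np

def faded(segment, ramp):
    # apply a linear fade-in and fade-out of `ramp` samples
    window = np.ones(segment.size)
    window[:ramp] = np.linspace(0.0, 1.0, ramp, endpoint=False)
    window[segment.size - ramp:] = np.linspace(1.0, 0.0, ramp, endpoint=False)
    return segment * window

def overlap_append(out, segment, overlap):
    # append `segment` to `out`, summing the last `overlap` samples of `out`
    # with the first `overlap` samples of `segment`
    if out.size == 0:
        return segment.copy()
    overlap = min(overlap, out.size, segment.size)
    head = out[:out.size - overlap]
    mixed = out[out.size - overlap:] + segment[:overlap]
    return np.concatenate((head, mixed, segment[overlap:]))

overlap = 10
result = np.empty(0)

# two hypothetical constant segments stand in for the filtered note slices
for segment in (np.ones(100), np.ones(100)):
    result = overlap_append(result, faded(segment, overlap), overlap)

print(result.size)
```

With complementary linear fades the summed region stays at a constant level, which is why the joined signal shows no dip or bump at the seam.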

There is an issue that came up for which a solution had to be decided upon. As each of the voices has a different amount of notes, and the note values do not match directly to one another (and this is good, as otherwise each variation would be identical) a solution had to be found for which pitches to assign to which notes.

The solution that was settled upon is that if there were more notes in the voice than in the voice from which the pitches were to be taken, the final pitch would just be repeated for all of the excess notes. This is for a number of reasons: a) This would preserve the V-I resolution of the chorale, b) This would lead to greater variations over the course of the chorale while also preserving this resolution, c) This is the easiest to implement in code and time was short.
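
Holding the final pitch amounts to clamping the note index to the last valid index of the donor voice. A minimal sketch, with invented note counts and frequencies:

```python
import numpy as np

# hypothetical voice with six notes to be filled
note_starts = np.arange(6) * 0.5

# hypothetical donor voice that only has four pitches (in Hz)
donor_pitches = np.array([440.0, 494.0, 523.0, 392.0])

# clamp each note index to the last valid index of the donor voice
indices = np.minimum(np.arange(note_starts.size), donor_pitches.size - 1)
assigned = donor_pitches[indices]
print(indices, assigned)
```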

In [None]:
# create the nests of pitches arrays to be used by each voice

pitchesnest = [sopranopitches, altopitches, tenorpitches, basspitches]
pitchesnest2 = [altopitches, tenorpitches, basspitches, sopranopitches]
pitchesnest3 = [tenorpitches, basspitches, sopranopitches, altopitches]
pitchesnest4 = [basspitches, sopranopitches, altopitches, tenorpitches]


In [None]:
# declare the number of samples for the overlap between segments

overlap = 64

# declare the empty arrays onto which the segments will be concatenated
   
audio = np.empty(0)
audio2 = np.empty(0)
audio3 = np.empty(0)
audio4 = np.empty(0)

for z in range(0, 4):
    
    # declare the values used to ensure that the last pitch value is held for excess notes
    
    a = 1
    b = 1
    c = 1
    d = 1
    
    # create the soprano line
    
    for i, j in enumerate(sopranostarts): 
        
        # slice the audio between the start and end of each note
    
        bachcut = bach[int(j*sr):int(sopranoends[i]*sr)+1]
        
        # decide which pitch value index to take. If pitch index is above note value index it takes the final pitch
        
        if i >= len(pitchesnest[z]):
            i = i-a
            a = a+1
        else:
            i = i
            
        # apply the filter. For each variation the pass band is widened.
    
        bachfilt = voiceFilter(pitchesnest[z][i], bachcut, z+20)
    
        # apply saturation
        
        bachfilt = bachfilt + badSaturator(bachfilt, bachfilt.max()/6)
        
        # apply sigmoid window
        
        bachfilt = bachfilt*sigmoidWindow(bachfilt.size, int((overlap)*(i+1)))
        
        # append the segment using overlap. The first segment has nothing to be overlapped with, hence the
        # if statement
        
        if z == 0 and i == 0:
    
            audio = np.concatenate((audio, bachfilt))
        
        else:
        
            audio = np.concatenate((audio, np.zeros(bachfilt.size-overlap*(i+1))))
            audio = np.concatenate((audio[:(audio.size-bachfilt.size)], (audio[(audio.size-bachfilt.size):] + bachfilt)))
    
    # create the alto line
    
    for i, j in enumerate(altostarts):
        
        # slice the audio between the start and end of each note
    
        bachcut = bach[int(j*sr):int(altoends[i]*sr)+1]
        
        # decide which pitch value index to take. If pitch index is above note value index it takes the final pitch
        
        if i >= len(pitchesnest2[z]):
            i = i-b
            b = b+1
        else:
            i = i
            
        # apply the filter. For each variation the pass band is widened.
    
        bachfilt = voiceFilter(pitchesnest2[z][i], bachcut, z+20)
    
        # apply saturation
        
        bachfilt = bachfilt + badSaturator(bachfilt, bachfilt.max()/6)
        
        # apply sigmoid window
        
        bachfilt = bachfilt*sigmoidWindow(bachfilt.size, int((overlap)*(i+1)))
        
        # append the segment using overlap. The first segment has nothing to be overlapped with, hence the
        # if statement
    
        if z == 0 and i == 0:
    
            audio2 = np.concatenate((audio2, bachfilt))
        
        else:
        
            audio2 = np.concatenate((audio2, np.zeros(bachfilt.size-overlap*(i+1))))
            audio2 = np.concatenate((audio2[:(audio2.size-bachfilt.size)], (audio2[(audio2.size-bachfilt.size):] + bachfilt)))

    # create the tenor line
    
    for i, j in enumerate(tenorstarts):
        
        # slice the audio between the start and end of each note

        bachcut = bach[int(j*sr):int(tenorends[i]*sr)+1]
        
        # decide which pitch value index to take. If pitch index is above note value index it takes the final pitch
        
        if i >= len(pitchesnest3[z]):
            i = i-c
            c = c+1
        else:
            i = i
    
        # apply the filter. For each variation the pass band is widened.
        
        bachfilt = voiceFilter(pitchesnest3[z][i], bachcut, z+20)
    
        # apply saturation
        
        bachfilt = bachfilt + badSaturator(bachfilt, bachfilt.max()/6)
        
        # apply sigmoid window
        
        bachfilt = bachfilt*sigmoidWindow(bachfilt.size, int((overlap)*(i+1)))
        
        # append the segment using overlap. The first segment has nothing to be overlapped with, hence the
        # if statement
    
        if z == 0 and i == 0:
    
            audio3 = np.concatenate((audio3, bachfilt))
        
        else:
        
            audio3 = np.concatenate((audio3, np.zeros(bachfilt.size-overlap*(i+1))))
            audio3 = np.concatenate((audio3[:(audio3.size-bachfilt.size)], (audio3[(audio3.size-bachfilt.size):] + bachfilt)))

    # create the bass line
    
    for i, j in enumerate(bassstarts):
        
        # slice the audio between the start and end of each note
    
        bachcut = bach[int(j*sr):int(bassends[i]*sr)+1]
        
        # decide which pitch value index to take. If pitch index is above note value index it takes the final pitch
    
        if i >= len(pitchesnest4[z]):
            i = i-d
            d = d+1
        else:
            i = i
        
        # apply the filter. For each variation the pass band is widened.
        
        bachfilt = voiceFilter(pitchesnest4[z][i], bachcut, z+20)

        # apply saturation
        
        bachfilt = bachfilt + badSaturator(bachfilt, bachfilt.max()/6)
        
        # apply sigmoid window
        
        bachfilt = bachfilt*sigmoidWindow(bachfilt.size, int((overlap)*(i+1)))
        
        # append the segment using overlap. The first segment has nothing to be overlapped with, hence the
        # if statement
        
        if z == 0 and i == 0:
    
            audio4 = np.concatenate((audio4, bachfilt))
        
        else:
        
            audio4 = np.concatenate((audio4, np.zeros(bachfilt.size-overlap*(i+1))))
            audio4 = np.concatenate((audio4[:(audio4.size-bachfilt.size)], (audio4[(audio4.size-bachfilt.size):] + bachfilt)))

In [None]:
# Here we create the summed audio data. Because the four voices' data are not exactly the same length
# (they differ by only a few samples), we trim them all to the length of the shortest.
# Taking the minimum also covers the case where two voices tie for shortest.

minsize = min(audio.size, audio2.size, audio3.size, audio4.size)

audio = audio[:minsize]
audio2 = audio2[:minsize]
audio3 = audio3[:minsize]
audio4 = audio4[:minsize]


# sum the audio of the four voices

audiofin = audio + audio2 + audio3 + audio4

# normalise the audio to its absolute peak (covers positive and negative excursions)

audiofin = audiofin/np.max(np.abs(audiofin))

# plot the waveform and listen

plt.figure(figsize=(14, 3))
plt.title('Waveform of Rearranged Audio')
plt.xlabel('Time (Seconds)')
plt.ylabel('Amplitude')
plt.plot(np.arange(0, audiofin.size, 1)/sr, audiofin)

ipd.Audio(audiofin, rate = sr)

This is admittedly not the healthiest-looking waveform, but the sound is there. To fill things out a little, we will underlay it with sine waves, also generated from the MIDI data.

# Step 9: Create Sine Waves to Underlie the Audio

The same basic process and architecture is used to generate the sines as was used to rearrange the audio.
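As a rough illustration of what happens for a single note, the sketch below converts a MIDI note number to a frequency and renders a Hann-windowed sine of the note's duration. The helpers `midi_to_hz` and `sine_note` and the value `sr_demo` are hypothetical stand-ins, not the notebook's own `sin_generator`. The conversion assumes equal temperament with A4 = 440 Hz; the pitch lists used in this notebook may already hold frequencies, in which case that step is not needed.

```python
import numpy as np

def midi_to_hz(note, a4=440.0):
    """Equal-tempered frequency of a MIDI note number (assumes A4 = MIDI 69)."""
    return a4 * 2.0 ** ((note - 69) / 12.0)

def sine_note(amplitude, freq, duration, sr):
    """Hann-windowed sine of the given duration in seconds (hypothetical helper)."""
    n = int(round(duration * sr))
    t = np.arange(n) / sr
    return amplitude * np.sin(2 * np.pi * freq * t) * np.hanning(n)

sr_demo = 22050                      # assumed sample rate for the demo only
note = sine_note(0.25, midi_to_hz(69), 0.5, sr_demo)
```

Building the time axis from an integer sample count, rather than from `np.arange(0, duration, 1/sr)`, keeps the length of each note predictable.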

(A note on windows: When splicing the audio, the sigmoid window function that we created, combined with the overlap, both removed clicks and avoided excessive amplitude "pumping". For the sines, however, this window still resulted in clicks, even at greater overlaps. We then tried the Hann window (`np.hanning`); this removed the clicks, but excessive pumping remains. As we are running low on time, we have decided to stick with the Hann window and overlap, the lesser of two evils, so to speak. For the portfolio we will attempt to truly solve this problem.)
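One possible direction for the pumping problem, sketched under simplifying assumptions: when equal-length Hann-windowed segments overlap by exactly half their length, the windows sum to a constant, so the overall gain stays flat. The notebook's note-length segments differ in size and overlap by much less, so their windows do not sum to a constant, which is consistent with the pumping described above. The function `overlap_add` and the values of `n` and `hop` are illustrative only.

```python
import numpy as np

def overlap_add(segments, hop):
    """Sum equal-length segments, each starting `hop` samples after the previous one."""
    seg_len = len(segments[0])
    out = np.zeros(hop * (len(segments) - 1) + seg_len)
    for k, seg in enumerate(segments):
        out[k * hop:k * hop + seg_len] += seg
    return out

n = 1024
window = np.hanning(n + 1)[:-1]          # periodic Hann: w[k] + w[k + n/2] == 1
segments = [np.ones(n) * window for _ in range(8)]
envelope = overlap_add(segments, hop=n // 2)

# away from the first and last half-window, the summed gain is flat
steady = envelope[n // 2:-(n // 2)]
```

Applying this to real notes would need a different scheme (for example short fades at the note edges only), since notes are not equal in length; the sketch only shows why the choice of overlap matters.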

In [None]:
sin = np.empty(0)
sin2 = np.empty(0)
sin3 = np.empty(0)
sin4 = np.empty(0)

for z in range(0, 4):
    
    # declare the values used to ensure that the last pitch value is held for excess notes
    
    a = 1
    b = 1
    c = 1
    d = 1
    
    # create the soprano line
    
    for i, j in enumerate(sopranostarts):
        
        # if there are more notes than pitches, hold the final pitch for the excess notes
        
        if i >= len(pitchesnest[z]):
            
            # generate sine
                        
            singen = sin_generator(0.25, (pitchesnest[z][i-a]), np.arange(0, sopranodurations[i], 1/sr))
            
            a = a+1
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # apply overlap
            
            sin = np.concatenate((sin, np.zeros(singen.size-(overlap)*(i+1)))) 
            
            # add to final
            
            sin = np.concatenate((sin[:(sin.size-singen.size)], (sin[(sin.size-singen.size):] + singen)))

        # here for rest of notes
        
        else:
            
            # generate sine
            
            singen = sin_generator(0.25, (pitchesnest[z][i]), np.arange(0, sopranodurations[i], 1/sr))
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # for first sine no need for overlap
            
            if z == 0 and i == 0:
    
                sin = np.concatenate((sin, singen))
        
            else:
                
                # apply overlap
        
                sin = np.concatenate((sin, np.zeros(singen.size-(overlap*(i+1)))))
            
                # add to final
                
                sin = np.concatenate((sin[:(sin.size-singen.size)], (sin[(sin.size-singen.size):] + singen)))

    
    # create the alto line
        
    for i, j in enumerate(altostarts):
        
        # if there are more notes than pitches, hold the final pitch for the excess notes
        
        if i >= len(pitchesnest2[z]):
            
            # generate sine
                        
            singen = sin_generator(0.25, (pitchesnest2[z][i-b]), np.arange(0, altodurations[i], 1/sr))
            
            b = b+1
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # apply overlap
            
            sin2 = np.concatenate((sin2, np.zeros(singen.size-(overlap*(i+1)))))
            
            # add to final
            
            sin2 = np.concatenate((sin2[:(sin2.size-singen.size)], (sin2[(sin2.size-singen.size):] + singen)))

        # here for rest of notes
        
        else:
            
            # generate sine
            
            singen = sin_generator(0.25, (pitchesnest2[z][i]), np.arange(0, altodurations[i], 1/sr))
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # for first sine no need for overlap
            
            if z == 0 and i == 0:
    
                sin2 = np.concatenate((sin2, singen))
        
            else:
                
                # apply overlap
        
                sin2 = np.concatenate((sin2, np.zeros(singen.size-(overlap*(i+1)))))
            
                # add to final
            
                sin2 = np.concatenate((sin2[:(sin2.size-singen.size)], (sin2[(sin2.size-singen.size):] + singen)))
            
    # create the tenor line

    for i, j in enumerate(tenorstarts):
        
        # if there are more notes than pitches, hold the final pitch for the excess notes
        
        if i >= len(pitchesnest3[z]):
            
            # generate sine
                        
            singen = sin_generator(0.25, (pitchesnest3[z][i-c]), np.arange(0, tenordurations[i], 1/sr))
            
            c = c+1
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # apply overlap
            
            sin3 = np.concatenate((sin3, np.zeros(singen.size-(overlap*(i+1)))))
            
            # add to final
            
            sin3 = np.concatenate((sin3[:(sin3.size-singen.size)], (sin3[(sin3.size-singen.size):] + singen)))

        # here for rest of notes
        
        else:
            
            # generate sine
            
            singen = sin_generator(0.25, (pitchesnest3[z][i]), np.arange(0, tenordurations[i], 1/sr))
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # for first sine no need for overlap
            
            if z == 0 and i == 0:
    
                sin3 = np.concatenate((sin3, singen))
        
            else:
                
                # apply overlap
        
                sin3 = np.concatenate((sin3, np.zeros(singen.size-(overlap*(i+1)))))
            
                # add to final
            
                sin3 = np.concatenate((sin3[:(sin3.size-singen.size)], (sin3[(sin3.size-singen.size):] + singen)))
            
    # create the bass line        

    for i, j in enumerate(bassstarts):
        
        # if there are more notes than pitches, hold the final pitch for the excess notes
        
        if i >= len(pitchesnest4[z]):
            
            # generate sine
                        
            singen = sin_generator(0.25, (pitchesnest4[z][i-d]), np.arange(0, bassdurations[i], 1/sr))
            
            d = d+1
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # apply overlap
            
            sin4 = np.concatenate((sin4, np.zeros(singen.size-(overlap*(i+1))))) 
            
            # add to final
            
            sin4 = np.concatenate((sin4[:(sin4.size-singen.size)], (sin4[(sin4.size-singen.size):] + singen)))

        # here for rest of notes
        
        else:
            
            # generate sine
            
            singen = sin_generator(0.25, (pitchesnest4[z][i]), np.arange(0, bassdurations[i], 1/sr))
            
            # apply window
            
            singen = singen*np.hanning(singen.size)
            
            # for first sine no need for overlap
            
            if z == 0 and i == 0:
    
                sin4 = np.concatenate((sin4, singen))
        
            else:
                
                # apply overlap
        
                sin4 = np.concatenate((sin4, np.zeros(singen.size-(overlap*(i+1)))))
            
                # add to final
            
                sin4 = np.concatenate((sin4[:(sin4.size-singen.size)], (sin4[(sin4.size-singen.size):] + singen)))
            
          

In [None]:
# Here we create the summed sine data. Because the four voices' data are not exactly the same length
# (they differ by only a few samples), we trim them all to the length of the shortest.
# Taking the minimum also covers the case where two voices tie for shortest.

minsize = min(sin.size, sin2.size, sin3.size, sin4.size)

sin = sin[:minsize]
sin2 = sin2[:minsize]
sin3 = sin3[:minsize]
sin4 = sin4[:minsize]


# sum the sines of the four voices

sinfin = sin + sin2 + sin3 + sin4

# normalise the sines to their absolute peak

sinfin = sinfin/np.max(np.abs(sinfin))

# plot the waveform and listen

plt.figure(figsize=(14, 3))
plt.title('Waveform of Generated Sines')
plt.xlabel('Time (Seconds)')
plt.ylabel('Amplitude')
plt.plot(np.arange(0, sinfin.size, 1)/sr, sinfin)

ipd.Audio(sinfin, rate = sr)

# Step 10: Combine the Audio and the Sines

Although they differ in length by only a couple of milliseconds, the audio and the sines are not identical in length. Therefore, the longer of the two must be trimmed slightly before they are summed.
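The trim-then-sum step can be sketched generically as below. The helpers `trim_to_shortest` and `peak_normalise` are hypothetical names introduced for illustration, and the two short arrays stand in for the audio and the sines; the 0.25 weighting of the second signal mirrors the mix level used in this notebook.

```python
import numpy as np

def trim_to_shortest(*signals):
    """Cut every signal to the length of the shortest one."""
    n = min(s.size for s in signals)
    return [s[:n] for s in signals]

def peak_normalise(x):
    """Scale so that the largest absolute sample is 1 (silence is left unchanged)."""
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x

a = np.array([0.2, -0.8, 0.4, 0.1])   # stand-in for the rearranged audio
b = np.array([0.1, 0.1, 0.1])         # stand-in for the sines
a_trim, b_trim = trim_to_shortest(a, b)
mixed = peak_normalise(a_trim + 0.25 * b_trim)
```

Normalising to the absolute peak handles asymmetric waveforms in one step, because the negative side is taken into account as well as the positive.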

In [None]:
# check which is the larger data and slice

if audiofin.size > sinfin.size:
    audiofin = audiofin[:sinfin.size]
else:
    sinfin = sinfin[:audiofin.size]

In [None]:
# sum

summed = audiofin + (sinfin*0.25)

# the waveform is asymmetric after filtering, so normalise to the absolute peak,
# which covers the positive and the negative side in one step

summed = summed/np.max(np.abs(summed))

# plot the waveform

plt.figure(figsize=(14, 3))
plt.title('Waveform of Summed Audio and Sines')
plt.xlabel('Time (Seconds)')
plt.ylabel('Amplitude')
plt.plot(np.arange(0, summed.size, 1)/sr, summed)

ipd.Audio(summed, rate=sr)

Again a bit of a strange waveform due to the filtering, but the sound is alright!

# Step 11: Send to PD for Processing

Now we will send the audio to Pure Data for some processing. We will add some band pass filters with automated centre frequencies to create some movement, add some subtle reverb, and return a stereo wave file.

The audio will play as it processes.
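As an alternative way of assembling the Pd call, the command can be built as an argument list instead of one concatenated string, which avoids manual quoting of paths and of the `-send` message. This is only a sketch: `build_pd_command` is a hypothetical helper, the flags are the ones already used in this notebook, and the executable name `'pd'` and the numbers in the example are placeholders.

```python
def build_pd_command(pd_executable, pd_patch, duration_ms, sr):
    """Return the Pd invocation as an argument list (same flags as the notebook)."""
    return [
        pd_executable,
        '-open', pd_patch,
        '-send', ';duration ' + str(duration_ms),   # no shell quoting needed in list form
        '-r', str(sr),
        '-nogui',
    ]

cmd = build_pd_command('pd', './Files/ChoraleProcessor.pd', 12000.0, 44100)
print(cmd)

# To launch Pd with this list one could use, for example:
#   import subprocess
#   subprocess.Popen(cmd)
# Whether the process ends on its own depends on the patch, which is not assumed here.
```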

In [None]:
# write the audio data to a wave file to be used in PD

sf.write('./Files/audioMidiSum.wav', summed, sr)

In [None]:
# choose the executable based upon operating system. If you have a completely different file path, you should copy
# this in.

# PC
# pd_executable = '"C:\\Program Files\\Pd\\bin\\pd.exe"'

# Mac
pd_executable = '/Applications/Pd-0.51-4.app/Contents/Resources/bin/pd'

# Linux
#pd_executable = '/usr/local/bin/pd '

pd_patch = './Files/ChoraleProcessor.pd'
send = ' -send ";duration ' + str(summed.size/sr*1000) + '"'

command = pd_executable + ' -open ' + pd_patch + send + ' -r ' + str(sr) + ' -nogui'
print(command)
os.popen(command);

time.sleep((summed.size/sr)+5)

We can now take a look at the waveform of each channel.

In [None]:
# import the processed audio

dataprocessed = './Files/PDManip.wav'

processed, sr = librosa.load(dataprocessed, sr = sr, mono=False)

ipd.Audio(processed, rate = sr)

In [None]:
f, (ax1, ax2) = plt.subplots(2, 1, sharey=True,  figsize=(14, 6))

plt.subplots_adjust(hspace = .5)

ax1.set_title('Waveform of Left Channel')
ax1.set_xlabel('Time (Seconds)')
ax1.set_ylabel('Amplitude')
ax1.plot(np.arange(processed[0].size)/sr, processed[0])

ax2.set_title('Waveform of Right Channel')
ax2.set_xlabel('Time (Seconds)')
ax2.set_ylabel('Amplitude')
ax2.plot(np.arange(processed[1].size)/sr, processed[1])

# Step 12: Apply Reverb through Convolution

To fill out the frequency spectrum again and create the textural effect that we wanted, we decided to also apply reverb through convolution.

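We decided to use SciPy's `signal.convolve` rather than `np.convolve`, as it ran a lot faster: for long signals it can switch to an FFT-based method, whereas NumPy's version always computes the convolution directly.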

After testing many different impulse responses, we settled upon one that we liked.
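The principle can be tried out on synthetic data before loading a real impulse response. In the sketch below, the "impulse response" is just exponentially decaying noise and the "dry" signal is a pair of sines; both are stand-ins invented for the example, as are `sr_demo` and the decay rate. It uses `scipy.signal.fftconvolve` and, unlike the cell further down, applies one shared gain to both channels, which is a design choice that keeps the left/right balance intact.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
sr_demo = 8000                                   # assumed sample rate for the demo only

# synthetic stand-in for an impulse response: 0.5 s of decaying noise
t = np.arange(int(0.5 * sr_demo)) / sr_demo
ir = rng.standard_normal(t.size) * np.exp(-8.0 * t)

# synthetic stereo "dry" signal: two sines at different levels
n = sr_demo
dry = np.vstack((
    0.8 * np.sin(2 * np.pi * 220 * np.arange(n) / sr_demo),
    0.4 * np.sin(2 * np.pi * 330 * np.arange(n) / sr_demo),
))

# convolve each channel with the impulse response, then normalise jointly
wet = np.vstack([signal.fftconvolve(channel, ir) for channel in dry])
wet = wet / np.max(np.abs(wet))
```

The output is longer than the input by the length of the impulse response minus one sample, which is the reverb tail.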

In [None]:
# import impulse response and listen

dataimp = './Files/gozo_citadel_silo_ir_edit.wav'

imp, sr = librosa.load(dataimp, sr = sr)

ipd.Audio(imp, rate = sr)

In [None]:
# plot the impulse response

plt.figure(figsize=(14, 3))
plt.title('Waveform of Impulse Response')
plt.xlabel('Time (Seconds)')
plt.ylabel('Amplitude')
plt.plot(np.arange(0, imp.size, 1)/sr, imp)

In [None]:
# convolve each channel and normalise to the absolute peak

finalleft = signal.convolve(processed[0], imp)
finalleft = finalleft/np.max(np.abs(finalleft))
finalright = signal.convolve(processed[1], imp)
finalright = finalright/np.max(np.abs(finalright))

# rejoin left and right channel

final = np.vstack((finalleft, finalright))

# listen to the final audio

ipd.Audio(final, rate = sr)

In [None]:
# and plot the simple waveforms of the left and right channel

f, (ax1, ax2) = plt.subplots(2, 1, sharey=True,  figsize=(14, 6))

plt.subplots_adjust(hspace = .5)

ax1.set_title('Waveform of Left Channel')
ax1.set_xlabel('Time (Seconds)')
ax1.set_ylabel('Amplitude')
ax1.plot(np.arange(finalleft.size)/sr, finalleft)

ax2.set_title('Waveform of Right Channel')
ax2.set_xlabel('Time (Seconds)')
ax2.set_ylabel('Amplitude')
ax2.plot(np.arange(finalright.size)/sr, finalright)

In [None]:
# reshape for soundfile write

finalReshape = final.T

In [None]:
# write to a Wave File

sf.write('./Files/BachRearrangedTexture.wav', finalReshape, sr)

We are pretty satisfied with the results. However, one issue that we do have is that, due to the excessive filtering, there is little content above 800 Hz. For this reason we implemented the saturator, although it has to be driven extremely hard to fill the spectrum back up, and this doesn't sound very pleasant. Another way of dealing with this might be something to think about for the portfolio.

On the whole though, the results are a little different to what we were expecting when starting the project, but we like it nonetheless.

# Step 13: Waveforms

We will now plot the waveforms of the final audio in two different styles:

First, a Serato-style waveform, where the colour of the waveform represents the spectral centroid. Red sections have a lower spectral centroid, green sections a mid spectral centroid, and blue sections a higher spectral centroid.
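To make the colour assignment concrete, here is a minimal sketch of the idea for a single frame: the spectral centroid is the magnitude-weighted mean frequency of the frame's spectrum, and the colour band follows from comparing it with two thresholds. The functions `spectral_centroid` and `colour_band` are hypothetical helpers written for this example (the cell below uses librosa instead), `sr_demo` and the test tone are invented, and the 250 Hz and 350 Hz defaults are the thresholds used in this notebook.

```python
import numpy as np

def spectral_centroid(frame, sr):
    """Magnitude-weighted mean frequency of one Hann-windowed frame."""
    windowed = frame * np.hanning(frame.size)
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sr)
    return np.sum(freqs * magnitude) / np.sum(magnitude)

def colour_band(centroid, low=250, high=350):
    """Map a centroid in Hz to one of the three colour bands."""
    if centroid < low:
        return 'bass'
    if centroid < high:
        return 'mid'
    return 'high'

sr_demo = 8192                                        # assumed sample rate for the demo only
frame = np.sin(2 * np.pi * 300 * np.arange(4096) / sr_demo)
centroid = spectral_centroid(frame, sr_demo)
band = colour_band(centroid)
```

For a pure tone the centroid sits close to the tone's own frequency; for the filtered chorale it reflects which voices dominate a given frame.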

In [None]:
hopLength = 512*3
frameLength = 4096
xTicksInterval = 10 #seconds between each tick. Decrease for smaller files, increase for larger files.

colorThreshLow = 250 #in hz
colorThreshHigh = 350 #in hz


#Colors in HEX
audacityColorScheme = {
    'wave': '#312fc1',
    'rms': '#6464df',
    'bg': '#c0c1c1',
}

#Colors in RGBA
seratoColorScheme = {
    'high': (0,0,1,.5),
    'mid': (0,1,0,.5),
    'bass': (1,0,0,.5),
    'bg': '#000000'
}

#Load Audio
audioData = final

#Split into two channels, easier to keep track.
audioDataL = audioData[0]
audioDataR = audioData[1]

#RMS Left
rmsL = librosa.feature.rms(y=audioDataL, frame_length=frameLength, hop_length=hopLength)
rmsL = np.ndarray.flatten(rmsL)

#Frames advance by hopLength, so each frame value is held for hopLength samples
rmsLRepeat = np.repeat(rmsL, hopLength)
appliedRMSL = np.multiply(audioDataL, rmsLRepeat[:audioDataL.size])

#RMS Right
rmsR = librosa.feature.rms(y=audioDataR, frame_length=frameLength, hop_length=hopLength)
rmsR = np.ndarray.flatten(rmsR)

rmsRRepeat = np.repeat(rmsR, hopLength)
appliedRMSR = np.multiply(audioDataR, rmsRRepeat[:audioDataR.size])

#Spectral Centroid Left
centL = librosa.feature.spectral_centroid(y=audioDataL, n_fft=frameLength, hop_length=hopLength)
centL = np.ndarray.flatten(centL)

centLRepeat = np.repeat(centL, hopLength)
appliedCentL = np.multiply(audioDataL, centLRepeat[:audioDataL.size])

#Spectral Centroid Right
centR = librosa.feature.spectral_centroid(y=audioDataR, n_fft=frameLength, hop_length=hopLength)
centR = np.ndarray.flatten(centR)

centRRepeat = np.repeat(centR, hopLength)
appliedCentR = np.multiply(audioDataR, centRRepeat[:audioDataR.size])

lowCentL = np.zeros(audioDataL.size)
midCentL = np.zeros(audioDataL.size)
highCentL = np.zeros(audioDataL.size)
lowCentR = np.zeros(audioDataR.size)
midCentR = np.zeros(audioDataR.size)
highCentR = np.zeros(audioDataR.size)

#Isolate frequencies based on hoplength and centroid
for i, iHop in enumerate(range(0, audioDataL.size, hopLength)):
    if (centL[i] < colorThreshLow):
        lowCentL[iHop:iHop+frameLength] = audioDataL[iHop:iHop+frameLength]
        lowCentR[iHop:iHop+frameLength] = audioDataR[iHop:iHop+frameLength]
    elif (centL[i] > colorThreshLow and centL[i] < colorThreshHigh):
        midCentL[iHop:iHop+frameLength] = audioDataL[iHop:iHop+frameLength]
        midCentR[iHop:iHop+frameLength] = audioDataR[iHop:iHop+frameLength]
    else:
        highCentL[iHop:iHop+frameLength] = audioDataL[iHop:iHop+frameLength]
        highCentR[iHop:iHop+frameLength] = audioDataR[iHop:iHop+frameLength]

        
#Plot Serato Style
fig, (axL, axR) = plt.subplots(2, figsize=[15,8])
plt.xticks(np.arange(0, audioDataL.size, sr * xTicksInterval), np.arange(0, audioDataL.size/sr, xTicksInterval)) #per-channel length; audioData.size counts both channels
#Left Channel
axL.set_facecolor(seratoColorScheme["bg"])
axL.plot(lowCentL, color=seratoColorScheme["bass"])
axL.plot(midCentL, color=seratoColorScheme["mid"])
axL.plot(highCentL, color=seratoColorScheme["high"])
axL.set_ylabel("Amplitude (normalized -1 to 1)")
axL.set_xlabel("Time (sec)")
axL.set_title("Serato Style - Left Channel")
axL.set_xmargin(0)
axL.sharex(axR)

#Right Channel
axR.set_facecolor(seratoColorScheme["bg"])
axR.plot(lowCentR, color=seratoColorScheme["bass"])
axR.plot(midCentR, color=seratoColorScheme["mid"])
axR.plot(highCentR, color=seratoColorScheme["high"])
axR.set_ylabel("Amplitude (normalized -1 to 1)")
axR.set_xlabel("Time (sec)")
axR.set_title("Serato Style - Right Channel")
axR.set_xmargin(0)
fig.tight_layout() #Must be right before plt.show(), otherwise it scales the plots as well.
plt.show()

And secondly an Audacity style, where the inner waveform represents the RMS.
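The RMS envelope itself can be sketched in plain NumPy as follows. The helper `frame_rms` and the test tone are assumptions made for this example; unlike librosa's default, the frames here are not centre-padded, so the frame count differs slightly from what `librosa.feature.rms` returns. The frame and hop sizes are the ones used in this notebook.

```python
import numpy as np

def frame_rms(y, frame_length, hop_length):
    """Root-mean-square value of each (unpadded) frame of y."""
    starts = range(0, y.size - frame_length + 1, hop_length)
    return np.array([np.sqrt(np.mean(y[s:s + frame_length] ** 2)) for s in starts])

sr_demo = 22050                                   # assumed sample rate for the demo only
y = np.sin(2 * np.pi * 440 * np.arange(sr_demo) / sr_demo)
rms = frame_rms(y, frame_length=4096, hop_length=1536)

# hold each frame value for one hop so the envelope lines up with the samples
envelope = np.repeat(rms, 1536)
```

For a full-scale sine the RMS is about 0.707 of the peak, which is why the inner waveform sits visibly inside the outer one.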

In [None]:
#Plot Audacity Style

fig, (axL, axR) = plt.subplots(2, figsize=[15,6])
plt.xticks(np.arange(0, audioDataL.size, sr * xTicksInterval), np.arange(0, audioDataL.size/sr, xTicksInterval)) #per-channel length; audioData.size counts both channels
#Left Channel
axL.set_prop_cycle(color=[audacityColorScheme["wave"], audacityColorScheme["rms"]])
axL.set_facecolor(audacityColorScheme["bg"])
axL.plot(audioDataL)
axL.plot(appliedRMSL)
axL.legend(['Waveform', 'rms'], loc='upper right')
axL.set_ylabel("Amplitude (normalized -1 to 1)")
axL.set_xlabel("Time (sec)")
axL.set_title("Audacity Style - Left Channel")
axL.set_xmargin(0)
axL.sharex(axR)

#Right Channel
axR.set_prop_cycle(color=[audacityColorScheme["wave"], audacityColorScheme["rms"]])
axR.set_facecolor(audacityColorScheme["bg"])
axR.plot(audioDataR)
axR.plot(appliedRMSR)
axR.legend(['Waveform', 'rms'], loc='upper right')
axR.set_ylabel("Amplitude (normalized -1 to 1)")
axR.set_xlabel("Time (sec)")
axR.set_title("Audacity Style - Right Channel")
axR.set_xmargin(0)
fig.tight_layout() #Must be right before plt.show(), otherwise it scales the plots as well.
plt.show()