## Group D Assignment 4 Sonification of Data

In this assignment we have attempted to create a piece of music by sonifying data. We chose a set of weather data because it records occurrences over time and has enough dimensions to allow interesting overlaying of several sonified streams.

In [1]:
from matplotlib import pyplot as plt
import numpy as np
import librosa
import soundfile as sf
import IPython.display as ipd
import librosa.display
import pandas as pd
import pretty_midi

Here we define the names of the CSV columns we want to read:

In [2]:
relativeHumidityColumn = "Relative Humidity"
specificHumidityColumn = "Specific Humidity"
temperatureColumn = "Temperature"
precipitationColumn = 'Precipitation'
yearColumn = 'Year'
monthColumn = 'Month'
dayColumn = 'Day'

We define the arrayToMidi function to map any NumPy array to an array of MIDI numbers.

In [3]:
def arrayToMidi(array):
    # Treat each data value as a frequency in Hz and convert it to a (fractional) MIDI number
    return librosa.hz_to_midi(array)

We create the EnvironmentData class for later use. The class stores the temperature and relative humidity data as NumPy arrays, along with a perceived temperature array computed as the element-wise product of specific humidity and temperature.

In [4]:
class EnvironmentData:
    def __init__(self, csvFile):
        self.data = pd.read_csv(csvFile)
        self.temperatureArr = self.data[temperatureColumn].to_numpy()
        self.relativeHumidityArr = self.data[relativeHumidityColumn].to_numpy()
        self.yearArr = self.data[yearColumn]
        self.perceivedTemperature()
        
    def perceivedTemperature(self):
        humidityArr = self.data[specificHumidityColumn].to_numpy()
        temperatureArr = self.data[temperatureColumn].to_numpy()
        self.perceivedTemperatureArr = humidityArr*temperatureArr
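
As a minimal sketch of what the class computes, the snippet below builds a tiny in-memory table and derives the perceived temperature in the same way, as the element-wise product of specific humidity and temperature. The column names match the variables defined above; the three rows are invented for illustration and are not taken from the dataset.

```python
import io
import pandas as pd

# Hypothetical three-row sample; the values are invented for illustration only
csv_text = """Year,Month,Day,Specific Humidity,Relative Humidity,Temperature,Precipitation
2000,1,1,8.0,48.0,23.0,0.0
2000,2,1,9.0,50.0,25.0,0.1
2000,3,1,10.0,52.0,27.0,0.2
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Element-wise product, as in EnvironmentData.perceivedTemperature
temperature = sample["Temperature"].to_numpy()
specific_humidity = sample["Specific Humidity"].to_numpy()
perceived = specific_humidity * temperature

print(perceived)
```

Because the product is taken element by element, the perceived temperature array always has one value per row of the table.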

### Importing the rainfall data

For our sonification, we used rainfall time series data from [Kaggle](https://www.kaggle.com/datasets/poojag718/rainfall-timeseries-data).

In this step we map the temperature, relative humidity and perceived temperature data to MIDI numbers. We will later use these arrays as the starting point for our sonification.

In [5]:
csvFile = './data/Rainfall_data.csv'
dataDf = EnvironmentData(csvFile)

amountToAdd = 48
temperatureMidis = arrayToMidi(dataDf.temperatureArr).astype(int) + amountToAdd
relativeHumidityMidis = arrayToMidi(dataDf.relativeHumidityArr).astype(int) + amountToAdd
perceivedTempToMidis = arrayToMidi(dataDf.perceivedTemperatureArr).astype(int) + amountToAdd
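
librosa.hz_to_midi reads its input as a frequency in Hz and applies the standard conversion 12 * log2(f / 440) + 69. The sketch below reproduces that formula in plain NumPy so the mapping can be inspected without the dataset. The helper name `hz_to_midi_sketch` and the example values are hypothetical; the offset of 48 mirrors amountToAdd above.

```python
import numpy as np

def hz_to_midi_sketch(values):
    # Standard Hz-to-MIDI formula: 440 Hz corresponds to MIDI note 69
    values = np.asarray(values, dtype=float)
    return 12 * np.log2(values / 440.0) + 69

amount_to_add = 48  # same offset as amountToAdd in the notebook

# Hypothetical data values, read as if they were frequencies in Hz
example_values = np.array([20.0, 25.0, 30.0])

# astype(int) truncates the fractional MIDI number before the offset is added
midi_numbers = hz_to_midi_sketch(example_values).astype(int) + amount_to_add

print(hz_to_midi_sketch([440.0]))
print(midi_numbers)
```

Since the mapping is logarithmic, equal steps in the data produce smaller pitch steps as the values grow. Values at or below zero have no logarithm, so such entries would need handling before conversion.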

Here we define the functions used to slice the audio.

We use a MIDI file to decide the duration of each slice: we take the first note in the file and compute its duration as the difference between its start and end. The "noteInSamples" variable holds the number of samples in each slice.

We then load the WAV files and cut the audio into slices of "noteInSamples" samples each.

In [6]:
def sliceSignal(synthFile, sr, noteInSamples):
    synth, sr = librosa.load(synthFile, sr=sr, mono=True)

    sliceLength = round(noteInSamples)  # number of samples in each note
    numberOfSlices = len(synth) // sliceLength  # a trailing partial slice is dropped

    # One row per note, each row sliceLength samples long
    signal_slices = synth[:numberOfSlices * sliceLength].reshape(numberOfSlices, sliceLength)
    print("Slicing of " + synthFile + " is done.")
    return signal_slices
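
The slicing step amounts to cutting the signal into equal-length rows and dropping any incomplete remainder. A minimal sketch on a synthetic signal, assuming a made-up sample rate and slice length rather than the values derived from our MIDI files:

```python
import numpy as np

sr = 1000                               # hypothetical low sample rate to keep the example small
slice_length = 500                      # samples per note (0.5 s at this rate)
signal = np.arange(1750, dtype=float)   # stand-in for the loaded audio

# Only complete slices are kept; the trailing 250 samples are discarded
number_of_slices = len(signal) // slice_length
slices = signal[:number_of_slices * slice_length].reshape(number_of_slices, slice_length)

print(slices.shape)  # (3, 500)
```

Each row of `slices` can then be written to its own file, which is what saving the slices per note relies on.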

In [7]:
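def getNoteInSamples(midiFile, multiplyFactor, sr=44100):
    synthMidi = pretty_midi.PrettyMIDI(midiFile)
    firstNoteOfSynth = synthMidi.instruments[0].notes[0]

    # Duration of the first note, measured in ticks and converted back to seconds
    start = synthMidi.time_to_tick(firstNoteOfSynth.start)
    end = synthMidi.time_to_tick(firstNoteOfSynth.end)
    noteDuration = end - start
    noteInSeconds = synthMidi.tick_to_time(int(noteDuration))

    # Note: a duration in samples is normally sr * seconds. This expression divides instead,
    # and multiplyFactor scales the result to the slice length needed for each synth.
    noteInSamples = sr / noteInSeconds * multiplyFactor
    return noteInSamples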

In [8]:
def saveSlicesAsVaw(path, sr, synthAsSlices, notes):
    # Write one WAV file per note name: slice i is saved as '<path><note>.wav'
    for i, note in enumerate(notes):
        sf.write(path + note + '.wav', synthAsSlices[i], sr, subtype='PCM_24')

### Slice the synth sounds

We slice each of the three synth recordings into the seven notes listed below. Each slice is then saved under its note name in the folder "Synth x Sliced".

Afterwards we will use those sounds for our sonification.

In [9]:
sr=44100
notes = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

In [10]:
noteInSamples = getNoteInSamples('./Corpus/Synth 1 notes/Synth 1 MIDI.mid', 4)

#Slicing the synth 1 into 7 notes of the same duration. This is taken from Notebook 12 
synth1AsSlices = sliceSignal('./Corpus/Synth 1 notes/Synth 1 Amin.wav', 44100, noteInSamples)

slicesFolderOfSynth1 = './Corpus/Synth 1 Sliced/'
saveSlicesAsVaw(slicesFolderOfSynth1, sr, synth1AsSlices, notes)

Slicing of ./Corpus/Synth 1 notes/Synth 1 Amin.wavis done.


In [11]:
noteInSamples = getNoteInSamples('./Corpus/Synth 2 notes/Synth 2 MIDI.mid', 4)

#Slicing the synth 2 into 7 notes of the same duration.
synth2AsSlices = sliceSignal('./Corpus/Synth 2 notes/Synth 2 Amin.wav', 44100, noteInSamples)

slicesFolderOfSynth2 = './Corpus/Synth 2 Sliced/'
saveSlicesAsVaw(slicesFolderOfSynth2, sr, synth2AsSlices, notes)

Slicing of ./Corpus/Synth 2 notes/Synth 2 Amin.wavis done.


In [12]:
noteInSamples = getNoteInSamples('./Corpus/Synth 3 notes/Synth 3 MIDI.mid', 16)

#Slicing the synth 3 into 7 notes of the same duration.
synth3AsSlices = sliceSignal('./Corpus/Synth 3 notes/Synth 3 Chords.wav', sr, noteInSamples)

slicesFolderOfSynth3 = './Corpus/Synth 3 Sliced/'
saveSlicesAsVaw(slicesFolderOfSynth3, sr, synth3AsSlices, notes)

Slicing of ./Corpus/Synth 3 notes/Synth 3 Chords.wavis done.


### Sonification

We used the temperature, relative humidity and perceived temperature arrays as starting points for our sonification. Stepping through time, we read each MIDI number from those arrays, treat it as the pitch, and play the corresponding note slice from one of the three corpora: Synth 1, Synth 2 and Synth 3. We also use the data values to set the duration of each slice.

At the end we have three musical lines, one per synth, which we layer on top of each other to finalise our sonification.

In [None]:
def midiToAudioSlice(midiArray, durationArray, folderOfSlices, sr=48000):
    notes = []
    for midiNumber, duration in zip(midiArray, durationArray):
        duration = duration / 100  # scale the data value down to a duration in seconds
        noteName = pretty_midi.note_number_to_name(midiNumber)
        noteLetter = noteName[0]  # keep only the letter, so 'C#4' and 'C4' both map to 'C'
        sliceFile = folderOfSlices + noteLetter + '.wav'
        note, sr = librosa.load(sliceFile, sr=sr, duration=duration)
        notes.append(note)
    return np.concatenate(notes) if notes else np.array([])

def mixLines(lines):
    # The lines differ in length, so zero-pad each to the longest one before summing
    longest = max(len(line) for line in lines)
    return sum(np.pad(line, (0, longest - len(line))) for line in lines)

sr = 48000
sonification = mixLines([
    midiToAudioSlice(temperatureMidis, dataDf.temperatureArr*10, slicesFolderOfSynth1, sr),
    midiToAudioSlice(relativeHumidityMidis, dataDf.relativeHumidityArr*10, slicesFolderOfSynth2, sr),
    midiToAudioSlice(perceivedTempToMidis, dataDf.perceivedTemperatureArr, slicesFolderOfSynth3, sr),
])

sf.write('./Sonification.wav', sonification, sr, subtype='PCM_24')
ipd.display(ipd.Audio(sonification, rate=sr))
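
The lookup from a MIDI number to a slice file can be sketched without any audio. The snippet below follows the common convention in which MIDI 60 is C4, which is also what pretty_midi.note_number_to_name returns. The helper names `note_name_sketch` and `slice_file_for` are hypothetical and only illustrate the mapping.

```python
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def note_name_sketch(midi_number):
    # Pitch class from the remainder, octave from the quotient (MIDI 60 -> 'C4')
    octave = midi_number // 12 - 1
    return NOTE_NAMES[midi_number % 12] + str(octave)

def slice_file_for(midi_number, folder):
    # Only the first character is kept, so sharps and octaves are ignored
    letter = note_name_sketch(midi_number)[0]
    return folder + letter + '.wav'

print(note_name_sketch(69))
print(slice_file_for(61, './Corpus/Synth 1 Sliced/'))
```

Because only the letter is used, a note and its sharp play the same slice and every octave maps to the same seven files. The data therefore chooses among seven sounds per synth rather than among distinct pitches.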



### Visualization

First we plot a spectrogram of our sonification using librosa's specshow.

In [None]:
signal, sr = librosa.load('./Sonification.wav', sr=sr)

plt.subplot(1, 1, 1)
Signal_db = librosa.amplitude_to_db(np.abs(librosa.stft(signal)), ref=np.max)
librosa.display.specshow(Signal_db, y_axis='log', fmax=16000, x_axis='time', sr=sr)

plt.savefig('specshow.png')

Here we tried to create an animation with FuncAnimation, but we couldn't make it work.

The goal was to take noteInSamples, slice the sonification, and visualise each slice in one frame, ordered by index since the index also corresponds to time. Had it worked, we could also have added the year, month and day to the visualisation.

We would appreciate any feedback on how to make this work.

In [None]:
from matplotlib.animation import FuncAnimation
%matplotlib notebook

fig = plt.figure()
# reshape the figure with index of time in x axis and range of amplitude in y
ax = plt.axes(xlim=(0, len(perceivedTempToMidis)), ylim=(0, np.amax(signal)))
temp = ax.text(1, 1, '', ha='right', va='top', fontsize=24)
colors = plt.get_cmap('coolwarm', len(perceivedTempToMidis))

# Animation function which will be called in each frame
def animate(i):
    # for x axis, we divide between 0 and 1 by sampling number of each frame
    x = np.linspace(0, 1, noteInSamples)
    # here we slice the signal by noteInSamples and get the signal for each frame
    y = fermi(x, 0.5, [signal[i*noteInSamples : i*noteInSamples+noteInSamples]])
    f_d.set_data(x, y)
    f_d.set_color(colors(i))
    temp.set_text('S')
    temp.set_color(colors(i))
    
# Create the animation
ani = FuncAnimation(fig=fig, func=animate, frames=len(perceivedTempToMidis), interval=500, repeat=True)

fig.tight_layout()
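
A few things in the cell above are likely to stop it from running: `fermi` and `f_d` are not defined anywhere in the notebook, noteInSamples is a float while np.linspace and slice indices need integers, and a y-axis starting at 0 hides the negative half of the waveform. The sketch below shows one way the frame-per-slice idea could work. It is an assumption-laden illustration, not our working solution: it uses a synthetic signal, a made-up frame length, and the off-screen "Agg" backend so it runs outside a notebook (inside the notebook that line would be dropped in favour of `%matplotlib notebook`).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend; an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

sr = 8000                       # hypothetical sample rate
frame_length = 800              # samples per frame; must be an integer
number_of_frames = 10
t = np.arange(frame_length * number_of_frames) / sr
# Stand-in for the sonification signal: a sine wave that slowly gets louder
demo_signal = np.sin(2 * np.pi * 220 * t) * np.linspace(0.2, 1.0, t.size)

fig, ax = plt.subplots()
peak = np.max(np.abs(demo_signal))
ax.set_xlim(0, 1)
ax.set_ylim(-peak, peak)        # audio is signed, so include the negative range
line, = ax.plot([], [])         # the artist that every frame updates
label = ax.text(0.98, 0.95, '', ha='right', va='top', transform=ax.transAxes)
colors = plt.get_cmap('coolwarm', number_of_frames)
x = np.linspace(0, 1, frame_length)

def animate(i):
    # Show the i-th slice of the signal and recolour the line by frame index
    y = demo_signal[i * frame_length:(i + 1) * frame_length]
    line.set_data(x, y)
    line.set_color(colors(i))
    label.set_text('frame ' + str(i))
    return line, label

# Keep a reference to the animation so it is not garbage collected
ani = FuncAnimation(fig, animate, frames=number_of_frames, interval=500, repeat=True)
```

With the real data, `demo_signal` would be the loaded sonification, `frame_length` would be `int(round(noteInSamples))`, and the label text could be built from the year, month and day columns for frame `i`.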


## Conclusion

This project combined elements from across the MT4001 program. At a higher level, we used pandas dataframes to import our data, made extensive use of librosa functions for importing and processing audio, used several functions from the pretty_midi package, and used matplotlib for visualisation. At a lower level, we wrote functions and a class to handle repeated tasks, mapped our data values to note names and audio slices, and used general Python functionality to process our data into something more usable for the sonification process.

A general understanding of digital audio was also applied throughout, both when importing, slicing and resequencing audio slices and when choosing and processing our music corpus. Overall, we have tried to make use of all aspects of the MT4001 course in the creation of this sonification project.