# Data sonification with Python

## Instructors

- Walt Gurley
- Scott Bailey

## Workshop Description
Data visualization is the process of using graphical elements to represent data. This workshop introduces the concept of data sonification, using characteristics of sound to represent information. Sonification can provide an alternate mode for communicating data with implications for accessibility, engagement, and discovery. Participants in this workshop will get an overview of sonification techniques and tools and learn basic processes for mapping data to sound using the Python programming language.

## Notebook setup

This cell loads the necessary libraries and modules. If this notebook is run in Google Colab, it will also install and load extra dependencies to create and play audio files. If this notebook is run on a local machine, the process of creating and playing audio is simplified and does not require the creation of audio files.

Libraries imported into this notebook:
- [music21](https://web.mit.edu/music21/) - toolkit for computer-aided musicology
- pandas
- numpy
- matplotlib

Additional dependencies:
- [Fluidsynth](http://www.fluidsynth.org/) - a synthesizer for processing MIDI files
- [IPython Audio controls](https://ipython.readthedocs.io/en/stable/api/generated/IPython.display.html?highlight=audio#IPython.display.Audio) - a tool for playing audio and generating audio controls in a Jupyter notebook 

In [None]:
# Test if this notebook is running in Colab
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False
print("I am in Colab: " + str(IN_COLAB))

# If running in Colab install additional dependencies to create audio files from
# MIDI files
if IN_COLAB:
  # Install synthesizer to provide sound fonts (i.e., instruments)
  !apt install fluidsynth
  # Copy the soundfonts file to our session storage space
  !cp /usr/share/sounds/sf2/FluidR3_GM.sf2 ./font.sf2
  # Install and load midi2audio module to convert MIDI files to audio files
  !pip install midi2audio
  from midi2audio import FluidSynth

# Load modules from the music21 library for sonifying data
from music21 import instrument, note, scale, stream, midi, tempo

# Load data processing and visualization libraries
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

# Load the Audio display tool to play and control audio
from IPython.display import Audio

## A brief overview of audio properties and sonification
Sound travels through air like a wave as particles are compressed together and then stretched apart. By measuring how these particles change we can represent sound as a series of waves called a waveform. An audio waveform has two basic properties: amplitude and wavelength. Amplitude is measured as displacement over time and can be thought of as loudness. Wavelength is used to measure frequency, which is a measure of how many times the waveform repeats per second. Frequency is directly related to pitch: lower frequencies have a lower pitch and higher frequencies a higher pitch. Humans have a general hearing range of about 20 Hz to 20,000 Hz (1 Hz = one cycle per second).
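
As a rough illustration of these two properties, the sketch below builds two sine waveforms with NumPy and reports their peak amplitude and the length of one cycle. The helper names `sine_wave` and `period_seconds` are made up for this example and are not used elsewhere in the notebook.

```python
import numpy as np

sample_rate = 44100  # samples per second
t = np.arange(sample_rate) / sample_rate  # one second of sample times

def sine_wave(frequency_hz, amplitude):
    """Return one second of a sine wave with the given frequency and peak amplitude."""
    return amplitude * np.sin(2 * np.pi * frequency_hz * t)

def period_seconds(frequency_hz):
    """Duration of a single cycle: the waveform repeats frequency_hz times per second."""
    return 1 / frequency_hz

# A quieter, lower-pitched wave and a louder, higher-pitched wave
quiet_low = sine_wave(frequency_hz=220, amplitude=0.5)
loud_high = sine_wave(frequency_hz=440, amplitude=1.0)

for label, wave, freq in [("quiet, low", quiet_low, 220), ("loud, high", loud_high, 440)]:
    peak = np.max(np.abs(wave))
    cycle_ms = period_seconds(freq) * 1000
    print(f"{label}: peak amplitude ~{peak:.2f}, one cycle every {cycle_ms:.2f} ms")
```

Doubling the frequency halves the length of each cycle, while the amplitude only changes how far the wave swings away from zero.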

Here is a great [interactive guide to audio waveforms](https://pudding.cool/2018/02/waveforms/).

![An image representing a sound wave traveling through the air and as a waveform.](https://github.com/WaltGurley/rem-rem-cur/blob/gh-pages/MicIn/particleSound.png?raw=true "As a sound wave travels through the air particles are compressed and stretched apart. This can be modeled as a waveform")

Beyond the basic properties of sound waves we can also consider the audio properties of timbre (the perceived quality of a sound, e.g., how a guitar sounds different than a trumpet), tempo (the speed at which a collection of sounds are played, e.g., beats per minute), and spatial positioning (where a sound is coming from in space, e.g., panning left or right in stereo sound).

Just as we may map a data value to color, position, or size on a graph, we can use these properties of sound to represent data sonically. Sonification has the ability to represent data in a way that complements visualization and has valid application with regard to accessibility, engagement, and discovery:

- Accessibility: [SAS Graphics Accelerator](https://support.sas.com/software/products/graphics-accelerator/samples/index.html)

- Engagement: [Sounds from Around the Milky Way](https://www.nasa.gov/mission_pages/chandra/news/data-sonification-sounds-from-around-the-milky-way.html)

- Discovery: [Sonification of Cyclone Sidr](https://youtu.be/RRluA1r3rTk)


## Load and observe/clean dataset

We will be working with a dataset consisting of monthly atmospheric CO2 values measured at Mauna Loa Observatory, Hawaii from the Scripps CO2 program website.

For Scripps CO2 program data information see: https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record.html

In [None]:
# Read and format the csv file located at the provided URL into a Pandas DataFrame
co2raw = pd.read_csv(
  # The URL of the csv file
  "https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv",
  # The row position of the column headers for the dataset
  header=56,
  # Create new column header names to rename given headers
  names=["year", "month", "date", "CO2ppm", "season_adj"],
  # Only take the specified columns from the csv file
  usecols=[0,1,2,4,5],
  # Parse dates from the data given in column 2
  parse_dates=[2],
  # How to parse the date values
  date_parser=lambda x: pd.to_datetime(int(x), unit='D', origin=pd.Timestamp('01-01-1900')),
  # Set the column index of the DataFrame
  index_col=2
)

# Print the data
co2raw

In [None]:
# Replace missing values (-99.99) with NaN
co2 = co2raw.replace(-99.99, np.nan)

# Trim dataset to remove leading and trailing missing CO2 data
co2 = co2.loc[co2["CO2ppm"].first_valid_index():co2["CO2ppm"].last_valid_index()]

# Print the data
co2

In [None]:
# Configure the plot
plt.figure(figsize=[8, 5])
plt.xlabel("Time (months)")
plt.ylabel("CO2 ppm")
plt.title("Monthly atmospheric CO2 values measured at Mauna Loa Observatory, Hawaii")

# Plot the data
plt.plot(co2["CO2ppm"])

**Question:** What properties of sound could we use to sonify this dataset (e.g., frequency, amplitude, timbre, tempo, and spatial position)?

## Audification
Audification is a type of sonification in which a data series is directly translated into an audio waveform. This sonification process is applied in research ranging from medicine to seismology and astronomy.

Audification is typically suitable for large datasets that have a cyclical component.

Examples:

- [Vibrations of the Sun](https://soundcloud.com/nasa/sun-sonification)
- [Audification of brainwave data](https://youtu.be/y1Nl3De_frM)

### Audification of a sine wave
We will first demonstrate the process of audification by generating a sine wave over time and then converting that data into audio.

When creating audio we need to consider two things: the sample rate and the shape of the sound wave.

In [None]:
# How many times per second (Hz) are we sampling our data?
sampleRate = 44100

# Over how many seconds are we sampling our data?
time = 2

# Generate a series of time samples at a rate of 'sampleRate' Hz over 'time' 
# seconds (44100 Hz * 2 seconds = 88200)
timeSample = np.linspace(0, time, sampleRate * time)
timeSample

In [None]:
# Generate a sine wave with a frequency of 'frequency' Hz over 'timeSample' samples 
frequency = 220
sineWave = np.sin(2 * np.pi * frequency * timeSample)

In [None]:
# Configure the sine wave plot
plt.figure(figsize=[15, 5])
plt.xlabel('Time (seconds)')
plt.ylabel('Amplitude')
plt.title(str(frequency) + ' Hz sine wave')

# Only plot the first 0.25 seconds of the data (1/8 of the 2-second signal)
plt.plot(timeSample[1:(sampleRate // 4)], sineWave[1:(sampleRate // 4)])

In [None]:
# Generate audio from sine wave data
Audio(sineWave, rate=sampleRate)

**Question**: Would a 440 Hz sine wave have a higher or lower pitch than the 220 Hz sine wave?

### Audification of CO2 concentration data
We need to modify our data in order to create an audification. First, our data has a sample rate of 12 samples per year. That sample rate is approximately 0.0000004 Hz, and based on the lower limit of human hearing (20 Hz), this frequency is well below our ability to hear.

To bring our dataset into an audible frequency range we must speed it up considerably. We will compress the time scale of our data to a sample rate of 3000 Hz (the lowest frequency at which we can play back audio with the IPython Audio tool). This equates to a speed-up of roughly 10^10 (more precisely, about 8 × 10^9).
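
The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: it assumes an average year of 365.25 days, and the variable names are made up for this example.

```python
# Seconds in an average year (assumes 365.25 days per year)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# 12 samples per year expressed in samples per second (Hz)
original_rate_hz = 12 / SECONDS_PER_YEAR

# The playback sample rate used for the audification
playback_rate_hz = 3000

# How many times faster the data is played than it was recorded
speed_up = playback_rate_hz / original_rate_hz

print(f"Original sample rate: {original_rate_hz:.2e} Hz")  # 3.80e-07 Hz
print(f"Speed-up factor: {speed_up:.2e}")                   # 7.89e+09
```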

In [None]:
# Sample our data at a rate of 3000 Hz
dataSampleRate = 3000

# Generate a series of time samples at a rate of 'dataSampleRate' Hz over 1 second
timeSample = np.linspace(0, 1, dataSampleRate)

In [None]:
# Use a linear interpolation to fill NaN values–the audio player WILL NOT WORK
# with data that contains any NaN values
co2Int = co2["CO2ppm"].interpolate()

In [None]:
# Configure the plot
plt.figure(figsize=[8, 5])
plt.xlabel('Time (seconds)')
plt.ylabel('Amplitude')
plt.title('CO2 concentration compressed to a sample rate of 3000 Hz')

# Plot the time-compressed waveform of CO2 concentration data
plt.plot(timeSample[:len(co2Int)], co2Int)

In [None]:
# The sample rate of our data is approximately 0.0000004 Hz (12 samples / year),
# we speed it up about 10^10 times (3000 Hz)
Audio(co2Int, rate=dataSampleRate)

**Question:** Why is our audification so short?

### Audification of normalized CO2 concentration data
Even when resampling our dataset at a higher frequency, we still don't really get a useful audio representation of our waveform. We need to modify our dataset even further to establish a central value about which we can measure displacement. To do this we will normalize our dataset by removing the long-term trend from the data (i.e., subtracting the seasonally adjusted CO2 concentration values from the true CO2 concentration values).
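
To see why removing the trend matters, here is a minimal sketch on synthetic data. The trend and seasonal values are made up for illustration and are not the Scripps measurements; the final scaling step is an optional extra that the notebook itself does not perform.

```python
import numpy as np

# Ten years of made-up monthly samples: a steady rise plus a yearly cycle
months = np.arange(120)
trend = 370 + 0.17 * months                      # hypothetical long-term rise (ppm)
seasonal = 3 * np.sin(2 * np.pi * months / 12)   # hypothetical seasonal cycle (ppm)
measured = trend + seasonal

# Subtracting the trend leaves a signal that oscillates around zero
detrended = measured - trend

# Optionally scale the result so that it spans -1 to 1
scaled = detrended / np.max(np.abs(detrended))

print(f"Mean before detrending: {measured.mean():.1f}")
print(f"Mean after detrending:  {detrended.mean():.1f}")
```

Before detrending, the values sit hundreds of units away from zero and the seasonal swing is a tiny fraction of the signal; after detrending, the swing is the whole signal.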

In [None]:
# Remove the long-term trend of increasing CO2 concentration
co2["fit_removed"] = (co2["CO2ppm"] - co2["season_adj"])

In [None]:
# Use a linear interpolation to fill NaN values–the audio player WILL NOT WORK
# with data that contains any NaN values
co2FitInt = co2["fit_removed"].interpolate()

# Print the data
co2FitInt

In [None]:
# Plot CO2 concentration data
plt.figure(figsize=[15, 10])
plt.subplot(2, 2, 1)
plt.xlabel('Time (months)')
plt.ylabel('CO2 ppm')
plt.title('CO2 concentration')
plt.plot(co2["CO2ppm"])

# Plot seasonally adjusted CO2 concentration data
plt.subplot(2, 2, 2)
plt.xlabel('Time (months)')
plt.ylabel('CO2 ppm')
plt.title('Seasonally adjusted CO2 concentration')
plt.plot(co2["season_adj"])

# Plot normalized CO2 concentration data over 3000 Hz sample frequency
plt.subplot(2, 2, 3)
plt.xlabel('Time (seconds)')
plt.ylabel('Amplitude')
plt.title('Normalized CO2 concentration artificially sampled at 3000 Hz')
plt.plot(timeSample[:len(co2FitInt)], co2FitInt)

In [None]:
# The sample rate of our data is approximately 0.0000004 Hz (12 samples / year),
# we speed it up about 10^10 times (3000 Hz)
Audio(co2FitInt, rate=dataSampleRate)

**Question:** Does this audification provide you with any insight into the data or help you pick out any discernible patterns?

## Sonification
Sonification is the process of mapping data to properties of sound. As previously mentioned, these properties can include pitch (frequency), amplitude (loudness), tempo (beat speed), and timbre (quality). The spatial position of sound (for example, panning audio to the left or right channel in a stereo mix) can also be used as a mapping property.
 
As opposed to audification, sound property mapping provides more options and opportunities to represent the features of a dataset.

In this demonstration we will only explore using variations in pitch, tempo, and timbre to represent monthly atmospheric CO2 values.

### Subsample CO2 concentration data for sonification
We will subsample our data to create a smaller dataset that will be appropriate for generating shorter demo sonifications. Additionally, due to some constraints with audio playback in Google Colab we have to create audio files of our sonifications and then load them into our notebook for playback. In this case, smaller files are preferable for shorter load times.

In [None]:
# Create a subsample of the CO2 data from the year 2000 to present (co2Modern)
co2Modern = co2[co2["year"] >= 2000]

# Print the data
co2Modern

In [None]:
# We are only going to be working with the measured CO2 concentration values
# moving forward, so we will store it in a variable (co2ppm)
co2ppm = co2Modern["CO2ppm"]

# Print the data
co2ppm

In [None]:
# Plot the subsampled CO2 concentration data
plt.plot(co2ppm)

In [None]:
# Check out some basic statistics of the subsampled data
co2ppm.describe()

**Question:** Given the values of the descriptive statistics, could we directly use the CO2 concentration data as frequencies (in Hz) in a sonification?

### Functions for generating audio streams and creating audio files
The function `createAudioStream` creates a music21 stream that is playable in a local Jupyter notebook using the call `[streamName].show('midi')`. This functionality is not available in Google Colab, so we must also create an audio file for playback using the function `createAudioFile`.

`createAudioStream` takes three arguments:
- `notes` - a series of music21 Note objects
- `bpm` - beats per minute, or the speed of the sonification (default value is 120)
- `instrumentName` - the name of a synthesized instrument to play the notes (default value is "Piano"). A list of instrument names is available in the [music21 documentation](http://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html#module-music21.instrument)

These three arguments provide the data mappings to frequency (the `notes` argument), tempo (the `bpm` argument), and timbre (the `instrumentName` argument).
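
The effect of `bpm` on the length of a sonification can be estimated with simple arithmetic. The sketch below assumes four notes per beat, matching the `60 / 4 / bpm` note length used in `createAudioStream`; the helper names `note_seconds` and `total_seconds` are hypothetical and not part of music21.

```python
def note_seconds(bpm, notes_per_beat=4):
    """Length of one note in seconds when notes_per_beat notes share a beat."""
    return 60 / bpm / notes_per_beat

def total_seconds(note_count, bpm, notes_per_beat=4):
    """Approximate playing time of note_count equal-length notes."""
    return note_count * note_seconds(bpm, notes_per_beat)

print(note_seconds(120))        # 0.125
print(total_seconds(240, 120))  # 30.0
```

In other words, 240 monthly values (20 years of data) played at 120 bpm would last about 30 seconds.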

In [None]:
def createAudioStream(notes, bpm=120, instrumentName="Piano"):
  """
  This function creates and returns a new stream of notes using the music21
  Stream class given a series of notes, beats per minute (bpm), and an instrument
  name (instrumentName). For a list of valid instrument names see the music21
  Instrument module documentation https://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html
  """
  # Create a new music21 stream object to add notes to
  newStream = stream.Stream()
  # Set the tempo of the stream
  newStream.append(tempo.MetronomeMark(number=bpm))
  # Set the instrument to play the stream notes
  newStream.append(getattr(instrument, instrumentName)())
  # Iterate over the notes provided in the series
  for thisNote in notes:
    # Append the note to the new stream
    newStream.append(thisNote)
    # Set the length of the note (4 notes make a beat)
    thisNote.seconds = 60 / 4 / bpm
  # Return a music21 stream object
  return newStream

In [None]:
def createAudioFile(stream, filename):
  """
  This function writes two audio files given a music21 stream and a filename.
  It writes a MIDI file using music21 and then uses the newly written MIDI file
  to create the specified audio file using FluidSynth midi to audio conversion
  """
  # Use the music21 stream object write function to create a MIDI file
  stream.write("midi", filename.split('.')[0] + ".mid")
  # Use the FluidSynth module to call the sound font and convert the newly 
  # created MIDI file to the specified filename using the midi_to_audio function
  FluidSynth("font.sf2").midi_to_audio(filename.split('.')[0] + ".mid", filename)

### Sonification using CO2 concentration data values as frequency mapped to pitch
In this demonstration we will use a function that converts a data value interpreted as a frequency to a corresponding pitch.

For example, a data value of 440 would be interpreted as the frequency 440 Hz. This frequency would then be converted to scientific pitch notation as the pitch A4 (A in octave 4).

Moving forward we will be thinking of frequency in terms of Hz and scientific pitch notation. It might be helpful to reference this [table of piano key pitches and frequencies](https://en.wikipedia.org/wiki/Piano_key_frequencies#List).
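
To make the frequency-to-pitch conversion concrete, here is a minimal sketch that finds the nearest equal-tempered pitch without music21. It assumes the common convention that A4 is 440 Hz and MIDI note 69; the function `nearest_pitch_name` is made up for this example and is not how music21 implements the conversion.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_pitch_name(frequency_hz):
    """Return the scientific pitch notation of the note nearest to frequency_hz."""
    # Each semitone is a factor of 2**(1/12), so the distance from A4 in
    # semitones is 12 * log2(frequency / 440); A4 is MIDI note number 69
    midi_number = round(69 + 12 * math.log2(frequency_hz / 440))
    # MIDI note 60 is C4, so the octave is (midi_number // 12) - 1
    octave = midi_number // 12 - 1
    return f"{NOTE_NAMES[midi_number % 12]}{octave}"

print(nearest_pitch_name(440))  # A4
print(nearest_pitch_name(370))  # F#4
print(nearest_pitch_name(420))  # G#4
```

Note that values between roughly 370 and 420, when read directly as frequencies in Hz, span only about two semitones.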

We will use the defined function `translateValueToPitch` to translate data values as frequency to pitch. This function takes one argument:
- `valueAsFrequency` - a data value assumed to be a frequency

In [None]:
def translateValueToPitch(valueAsFrequency):
  """
  This function translates raw data values as frequency to a corresponding
  pitch. It returns a Note with the pitch of the provided frequency value if
  the value is greater than zero. Otherwise (zero, negative, or missing values)
  it returns a rest (i.e., silence).
  """
  # Test if the frequency value is greater than zero
  if (valueAsFrequency > 0):
    # Create a music21 Note object
    convNote = note.Note()
    # Set the pitch of the Note object based on the supplied frequency value
    convNote.pitch.frequency = valueAsFrequency
  else:
    # Set the Note to a Rest value (silence)
    convNote = note.Rest()
  # Return the Note object with the assigned pitch
  return convNote

In [None]:
# Test the translateValueToPitch function
testNote = translateValueToPitch(440)
testNote

The `translateValueToPitch` function has returned a [music21 Note object](http://web.mit.edu/music21/doc/moduleReference/moduleNote.html#module-music21.note). A Note object has many properties. For this workshop the most important property is the note name. We can get a full descriptive name by calling the property `fullName` on a Note object.

In [None]:
# Print out full note name information (fullName)
testNote.fullName

In [None]:
# Map CO2 concentration values directly to pitch
rawPitch = co2ppm.apply(translateValueToPitch)

# Print the data
rawPitch

In [None]:
# Create an audio stream of the rawPitch values using the createAudioStream function
# Pass in the series of notes (rawPitch), a tempo in bpm, and an instrument name
# from the list located in the right-hand column of this music21 documentation
# page: https://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html
rawPitchStream = createAudioStream(rawPitch, 100, 'Xylophone')

# If running on a local machine, uncomment the line below to play the MIDI data
# in the Jupyter notebook without having to create a file
# rawPitchStream.show('midi')

In the next cell we create an audio file from the music21 stream we created. We will create a FLAC audio file, but it is also possible to create other standard audio file formats such as mp3, mp4, and wave files by changing the file extension. When running this notebook in Colab I found that FLAC audio files tended to load faster than other file types when using the Audio tool.

In [None]:
# Create an audio file from the rawPitchStream
createAudioFile(rawPitchStream, 'rawPitch.flac')

# Load the newly created audio file for playback
Audio('rawPitch.flac')

In [None]:
# Plot the data for visual reference
plt.plot(co2ppm)

**Activity:** Create a sonification of the seasonally adjusted CO2 concentration data using the `translateValueToPitch` method.

In [None]:
# Write your code here

**Question:** Are there any shortcomings with the method we used to create this sonification?

### Sonification using values mapped to linear pitch range
In this demonstration we will use a function that maps a data value from the dataset range (dataMin - dataMax) to the pitch range of a 97 key piano (~16 Hz – ~4186 Hz).

We will use the defined function `mapValueToPitchRange` to map data values from the data domain to a pitch range. This function takes two arguments:
- `data` - the data to be mapped to the new pitch range
- `dataMinMax` - a list containing the min and max of the dataset ([min, max])
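
The linear mapping itself is a simple rescaling from one range to another. As a sketch of the underlying arithmetic without music21 (the function name `linear_map` is made up for this example):

```python
def linear_map(value, domain, target):
    """Linearly rescale value from [domain[0], domain[1]] to [target[0], target[1]]."""
    # Position of the value within the data domain, from 0 (min) to 1 (max)
    fraction = (value - domain[0]) / (domain[1] - domain[0])
    # The same relative position within the target range
    return target[0] + fraction * (target[1] - target[0])

PIANO_HZ = (16.35, 4186.01)  # approximate frequencies of C0 and C8

print(round(linear_map(0, (0, 100), PIANO_HZ), 2))    # 16.35
print(round(linear_map(50, (0, 100), PIANO_HZ), 2))   # 2101.18
print(round(linear_map(100, (0, 100), PIANO_HZ), 2))  # 4186.01
```

The midpoint of the data lands at about 2101 Hz, which is already in the top octave of the piano range, so half of the data range is squeezed into a single octave.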

In [None]:
def mapValueToPitchRange(data, dataMinMax):
  """
  This function linearly maps a data value from the dataset domain (data min -
  data max) to the frequency range of a 97 key piano (~16 Hz – ~4186 Hz). It
  returns a music21 Note object with the pitch of the mapped frequency value.
  Data values outside the dataset domain map to frequencies outside the piano
  range.
  """
  convNote = note.Note()
  dataRange = dataMinMax[1] - dataMinMax[0]
  # Frequency range of a 97 key piano: C0 (~16 Hz) to C8 (~4186 Hz)
  MIN_HZ = 16.35
  MAX_HZ = 4186.01
  convNote.pitch.frequency = (
    ((data - dataMinMax[0]) * (MAX_HZ - MIN_HZ)) / dataRange
  ) + MIN_HZ
  return convNote

In [None]:
# Test the mapValueToPitchRange function
mapValueToPitchRange(50, [0, 100]).fullName

In [None]:
# Calculate the minimum and maximum values of the CO2 concentration data
co2Range = co2ppm.agg(["min", "max"])

# Convert values to pitch range
pitchRange = co2ppm.apply(
  lambda x: mapValueToPitchRange(x, [co2Range["min"], co2Range["max"]])
)

In [None]:
# Create an audio stream of the pitchRange values using the createAudioStream function
# Pass in the series of notes (pitchRange), a tempo in bpm, and an instrument name
# from the list located in the right-hand column of this music21 documentation
# page: https://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html
pitchRangeStream = createAudioStream(pitchRange, 80, 'Glockenspiel')

# If running on a local machine, uncomment the line below to play the MIDI data
# in the Jupyter notebook without having to create a file
# pitchRangeStream.show('midi')

In [None]:
# Create an audio file from the pitchRangeStream
createAudioFile(pitchRangeStream, 'pitchRange.flac')

# Load the newly created audio file for playback
Audio('pitchRange.flac')

In [None]:
# Plot the actual concentration data and the mapped frequency data for visual reference
plt.figure(figsize=[12, 5])
plt.subplot(1, 2, 1)
plt.title("CO2 concentration over time")
plt.plot(co2ppm)

plt.subplot(1, 2, 2)
plt.title("CO2 concentration mapped to frequency over time")
plt.plot(pitchRange.apply(lambda x: x.pitch.frequency))

**Activity:** Create a sonification of the seasonally adjusted CO2 concentration data using the `mapValueToPitchRange` method.

In [None]:
# Write your code here

### Sonification using values mapped to logarithmic pitch range
The frequency steps between pitches are not linear. For example, the frequency of note A4 is 440 Hz. To get to the next octave of this note we must double the frequency of A4, giving A5 as 880 Hz (likewise, A6 = 1760 Hz, A3 = 220 Hz, and so on). In order to map our values to an appropriate pitch scale, we can apply a log base two transform to the pitch range. This makes it easier to discern pitches at lower frequencies and produces a more natural and pleasing sound.

In this demonstration we will use a function that maps a data value from the dataset range (dataMin - dataMax) to the log2 frequency range of a 97 key piano (log2(\~16 Hz) – log2(\~4186 Hz)).

We will use the defined function `mapValueToPitchRangeExpScale` to map data values from the data domain to a log scale pitch range. This function takes two arguments:
- `data` - the data to be mapped to the new pitch range
- `dataMinMax` - a list containing the min and max of the dataset ([min, max])
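
As a sketch of the underlying arithmetic without music21, the mapping interpolates linearly between the base-two logarithms of the range limits and then converts back to Hz. The function name `log_map` is made up for this example.

```python
import math

def log_map(value, domain, target_hz):
    """Map value from the data domain to target_hz so that equal data steps
    correspond to equal pitch intervals (equal frequency ratios)."""
    # Position of the value within the data domain, from 0 (min) to 1 (max)
    fraction = (value - domain[0]) / (domain[1] - domain[0])
    # Interpolate between the log2 values of the frequency limits
    log_low = math.log2(target_hz[0])
    log_high = math.log2(target_hz[1])
    # Undo the logarithm to get back to a frequency in Hz
    return 2 ** (log_low + fraction * (log_high - log_low))

PIANO_HZ = (16.35, 4186.01)  # approximate frequencies of C0 and C8

for value in (0, 25, 50, 75, 100):
    print(value, round(log_map(value, (0, 100), PIANO_HZ), 2))
```

With this mapping the midpoint of the data lands near 261.6 Hz (about middle C, four octaves above C0 and four below C8), and each quarter of the data range covers two octaves.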

In [None]:
# https://stackoverflow.com/questions/19472747/convert-linear-scale-to-logarithmic
#           x - x0
# log(y) = ------- * (log(y1) - log(y0)) + log(y0)
#          x1 - x0
def mapValueToPitchRangeExpScale(data, dataMinMax):
  """
  This function maps a data value from the dataset domain (data min - data max)
  to the log2 frequency range of a 97 key piano (~16 Hz – ~4186 Hz). It returns
  a music21 Note object with the pitch of the mapped frequency value. Data
  values outside the dataset domain map to frequencies outside the piano
  range.
  """
  convNote = note.Note()
  dataRange = dataMinMax[1] - dataMinMax[0]
  # Frequency range of a 97 key piano: C0 (~16 Hz) to C8 (~4186 Hz)
  MIN_HZ = 16.35
  MAX_HZ = 4186.01
  convNote.pitch.frequency = np.exp2(
      (data - dataMinMax[0]) / dataRange *
      (np.log2(MAX_HZ) - np.log2(MIN_HZ)) +
      np.log2(MIN_HZ)
  )
  return convNote

In [None]:
# Test the mapValueToPitchRangeExpScale function
mapValueToPitchRangeExpScale(0, [-50, 50]).fullName

In [None]:
# Convert values to pitch range
pitchRangeExp = co2ppm.apply(lambda x:
  mapValueToPitchRangeExpScale(x, [co2Range["min"], co2Range["max"]])
)

In [None]:
# Create an audio stream of the pitchRangeExp values using the createAudioStream
# function. Pass in the series of notes (pitchRangeExp), a tempo in bpm, and an
# instrument name from the list located in the right-hand column of this music21
# documentation page: https://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html
pitchRangeExpStream = createAudioStream(pitchRangeExp, 100, 'Xylophone')

# If running on a local machine, uncomment the line below to play the MIDI data
# in the Jupyter notebook without having to create a file
# pitchRangeExpStream.show('midi')

In [None]:
# Create an audio file from the pitchRangeExpStream
createAudioFile(pitchRangeExpStream, 'pitchRangeExp.flac')

# Load the newly created audio file for playback
Audio('pitchRangeExp.flac')

In [None]:
# Plot the actual concentration data and the mapped frequency data for visual reference
plt.figure(figsize=[12, 5])
plt.subplot(1, 2, 1)
plt.title("CO2 concentration over time")
plt.plot(co2ppm)

plt.subplot(1, 2, 2)
plt.title("CO2 concentration mapped to frequency over time")
plt.plot(pitchRangeExp.apply(lambda x: x.pitch.frequency))

**Activity:** Create a sonification of the seasonally adjusted CO2 concentration data using the `mapValueToPitchRangeExpScale` method.

In [None]:
# Write your code here

### Sonification using values mapped to a musical scale
In this demonstration we will use a function that maps a data value from the dataset range (dataMin - dataMax) to a specified [musical scale](https://en.wikipedia.org/wiki/Scale_(music)) (i.e., an ordered group of pitches) with a given root note and octave range.

Using a musical scale provides us with some options to make our sonification more "musical". Previously we were mapping values to frequencies and pitches without consideration for ordering of notes. Predefined musical scales can evoke culturally influenced emotions that can give our sonification an aesthetic feeling. For example, major scales are often interpreted as happy, while minor scales are interpreted as sad.

We will use the defined function `mapValueToScale` to map data values from the data domain to a musical scale. This function takes five arguments:
- `data` - the data to be mapped to the new pitch range
- `dataMinMax` - a list containing the min and max of the dataset ([min, max])
- `tonic` - a string containing a valid scientific pitch notation for the root note of the scale
- `scaleName` - a string containing a valid scale subclass name from the music21 scale class (default value is "MajorScale"). A list of scale names is available in the [music21 documentation](http://web.mit.edu/music21/doc/moduleReference/moduleScale.html#module-music21.scale)
- `octaveRange` - a list containing the min and max octaves over which to map the pitches (default value is [3, 5])
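
The idea of snapping data values to scale notes can be sketched without music21 by working with MIDI note numbers. Everything in this example (the names `scale_pitches`, `map_to_scale`, and `pitch_name`, and the restriction to a major scale) is an assumption made for illustration, not the music21 implementation.

```python
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets of a major scale from its tonic
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def scale_pitches(tonic_midi, octaves):
    """MIDI numbers of a major scale from the tonic up to the tonic `octaves` octaves higher."""
    pitches = [tonic_midi + 12 * octave + step
               for octave in range(octaves) for step in MAJOR_STEPS]
    # Close the range with the tonic of the top octave
    return pitches + [tonic_midi + 12 * octaves]

def map_to_scale(value, domain, pitches):
    """Pick the scale pitch whose position best matches the value's position in the domain."""
    fraction = (value - domain[0]) / (domain[1] - domain[0])
    return pitches[round(fraction * (len(pitches) - 1))]

def pitch_name(midi_number):
    """Scientific pitch notation for a MIDI note number (MIDI note 60 is C4)."""
    return f"{NOTE_NAMES[midi_number % 12]}{midi_number // 12 - 1}"

c3_to_c4 = scale_pitches(48, 1)  # MIDI note 48 is C3
print([pitch_name(map_to_scale(v, (0, 7), c3_to_c4)) for v in range(8)])
# ['C3', 'D3', 'E3', 'F3', 'G3', 'A3', 'B3', 'C4']
```

Because every data value is rounded to the nearest scale position, nearby values can share a note; a wider octave range gives the mapping more notes to distinguish them.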

In [None]:
def mapValueToScale(data, dataMinMax, tonic="c", scaleName="MajorScale",
  octaveRange=[3, 5]):
  """
  This function maps a data value from the dataset domain (data min - data max)
  to a musical scale with a provided tonic note over a provided octave range. It
  returns a music21 Note object with the pitch of the scale note closest to the
  mapped data value. Data values are assumed to fall within the dataset domain.
  """
  convNote = note.Note()
  sc1 = getattr(scale, scaleName)(tonic)
  dataRange = dataMinMax[1] - dataMinMax[0]
  # Get the pitches of the scale from the tonic in the lowest octave to the
  # tonic in the highest octave
  scalePitches = sc1.getPitches(tonic + str(octaveRange[0]), tonic + str(octaveRange[1]))
  SCALE_RANGE = len(scalePitches) - 1
  # Find the position in the list of pitches nearest to the scaled data value
  pos = np.round(((data - dataMinMax[0]) * SCALE_RANGE) / dataRange)
  convNote.pitch = scalePitches[int(pos)]
  return convNote

The C major scale has a root note (tonic) of C and consists of the notes: C, D, E, F, G, A, B.

In [None]:
# Test the mapValueToScale function
cMajorD = mapValueToScale(1, [0, 7], octaveRange=[3,4])
cMajorD.fullName

In [None]:
# Convert values to musical scale
musicalNotes = co2ppm.apply(lambda x:
  mapValueToScale(
    x,
    [co2Range["min"], co2Range["max"]],
    tonic="c",
    scaleName="MinorScale",
    octaveRange=[1, 4]
  )
)

In [None]:
# Create an audio stream of the musicalNotes values using the createAudioStream
# function. Pass in the series of notes (musicalNotes), a tempo in bpm, and an
# instrument name from the list located in the right-hand column of this music21
# documentation page: https://web.mit.edu/music21/doc/moduleReference/moduleInstrument.html
musicalNotesStream = createAudioStream(musicalNotes, 100, 'Kalimba')

# If running on a local machine, uncomment the line below to play the MIDI data
# in the Jupyter notebook without having to create a file
# musicalNotesStream.show('midi')

In [None]:
# Create an audio file from the musicalNotesStream
createAudioFile(musicalNotesStream, 'musicalNotes.flac')

# Load the newly created audio file for playback
Audio('musicalNotes.flac')

In [None]:
# Plot the actual concentration data and the mapped frequency data for visual reference
plt.figure(figsize=[12, 5])
plt.subplot(1, 2, 1)
plt.title("CO2 concentration over time")
plt.plot(co2ppm)

plt.subplot(1, 2, 2)
plt.title("CO2 concentration mapped to frequency over time")
plt.plot(musicalNotes.apply(lambda x: x.pitch.frequency))

**Activity:** Create a sonification of the seasonally adjusted CO2 concentration data using the `mapValueToScale` method.

In [None]:
# Write your code here

### Sonification playground
Use the cells below to generate your own sonifications of the CO2 concentration data using the provided functions.

You can call the following functions on a Pandas Series using the `apply()` function:
- `translateValueToPitch(valueAsFrequency)`
- `mapValueToPitchRange(data, dataMinMax)`
- `mapValueToPitchRangeExpScale(data, dataMinMax)`
- `mapValueToScale(data, dataMinMax, tonic="c", scaleName="MajorScale",
  octaveRange=[3, 5])`

For example, to create a sonification of the modern CO2 data you could do the following:

1. Map the data to a log scale pitch range to return a series of music21 Note objects
```python
pitchRange = co2Modern["season_adj"].apply(lambda x:
      mapValueToPitchRangeExpScale(x, [co2Range["min"], co2Range["max"]])
)
```
1. Create a music21 stream of the Note objects at 100 bpm using a Guitar sound
```python
notesStream = createAudioStream(pitchRange, 100, 'Guitar')
```
1. Create an audio file from the stream and play it back
```python
createAudioFile(notesStream, 'demoNotes.flac')
Audio('demoNotes.flac')
```

## Other sonification resources
- [Sonic Pi](https://sonic-pi.net/) - a code-based music creation and performance tool based on the Ruby programming language
- [p5.js](https://p5js.org/) - a JavaScript library for creative coding that includes a library for interfacing with web-based audio
- [Web audio API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API) - base API for controlling audio on the web
- [SAS Graphics Accelerator](https://support.sas.com/software/products/graphics-accelerator/samples/index.html) - a free Chrome extension that allows you to sonify data captured from web tables and data from web-based SAS visualizations
- [TwoTone](https://app.twotone.io/) - an interactive web application for easily creating sonifications with uploaded data
- [Loud Numbers](https://www.loudnumbers.net/) - a data sonification podcast and mailing list