# Interaction with the World Homework (#3)
Python Computing for Data Science (c) J Bloom, UC Berkeley 2018

Due Tuesday 2pm, Feb 20, 2018

# 1) Monty: The Python Siri

Let's make a Siri-like program (call it Monty!) with the following properties:
   - record your voice command
   - use a webservice to parse that sound file into text
   - based on the text, take one of three types of action:
       - send an email to yourself
       - do some math
       - tell a joke

So for example, if you say "Monty: email me with subject hello and body goodbye", it will email you with the appropriate subject and body. If you say "Monty: tell me a joke", it will go to the web, find a joke, and print it for you. If you say "Monty: calculate two times three", it should respond by printing the number 6.
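
One way to sketch the "text to action" step is a small dispatcher that turns the transcribed sentence into an `(action, payload)` pair. Everything here is hypothetical (the names `parse_command`, `to_number`, `NUMBER_WORDS`, and the exact phrasings it accepts), and it only decides what to do; sending the email or fetching the joke is left out.

```python
import operator
import re

# Hypothetical, deliberately tiny vocabulary for the math action.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
OPERATORS = {"plus": operator.add, "minus": operator.sub, "times": operator.mul}


def to_number(token):
    """Turn a spelled-out word or a digit string into a number."""
    token = token.lower()
    if token in NUMBER_WORDS:
        return NUMBER_WORDS[token]
    return float(token)  # raises ValueError for anything else


def parse_command(text):
    """Map transcribed text to an (action, payload) pair."""
    # Drop the optional "Monty" wake word and the punctuation after it.
    command = re.sub(r"^\s*monty\s*[:,]?\s*", "", text.strip(),
                     flags=re.IGNORECASE)

    email = re.match(r"email me with subject (.+?) and body (.+)$",
                     command, flags=re.IGNORECASE)
    if email:
        return "email", {"subject": email.group(1), "body": email.group(2)}

    math = re.match(r"calculate (\S+) (plus|minus|times) (\S+)$",
                    command, flags=re.IGNORECASE)
    if math:
        left, op, right = math.groups()
        try:
            return "math", OPERATORS[op.lower()](to_number(left), to_number(right))
        except ValueError:
            return "unknown", command  # operand was not a recognized number

    if "joke" in command.lower():
        return "joke", None

    return "unknown", command


print(parse_command("Monty: calculate two times three"))  # ('math', 6)
```

Keeping the parsing separate from the actions means the dispatcher can be tested on plain strings before any microphone or web service is involved.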

Hint: you can use speech-to-text apps like Houndify (or, e.g., Google Speech https://cloud.google.com/speech/) to return the text (but not do the actions). You'll need to sign up for a free API account and then follow the documentation instructions for using the service within Python.

In [4]:
import monty
%run monty.py 10  # argument after monty.py dictates how long the recording will last


* Email sent.


# 2) Write a program that identifies musical notes from sound (AIFF) files. 

  - Run it on the supplied sound files (12) and report your program’s results. 
  - Use the labeled sounds (4) to make sure it works correctly. The provided sound files contain 1-3 simultaneous notes from different organs.
  - Save copies of any example plots to illustrate how your program works.
  
  https://piazza.com/berkeley/spring2018/ay250class13410/resources -> Homeworks -> hw3_sound_files.zip

**Import the required modules.**

In [1]:
import aifc
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

from scipy.signal import spectrogram, find_peaks_cwt
from IPython.display import display, Markdown

sns.set(style="white", context="poster")
%matplotlib notebook

**First I need to know which notes correspond to which frequency. The following
data assumes $A_4$ is tuned to 440 Hz, which seems to match our example data.**

In [2]:
# Grab data from some webpage I found
notes_df = pd.read_html()[1]

# Do some formatting on the dataframe
#notes_df = notes_df[:13]
notes_df.set_index([0], inplace=True)
notes_df.rename(columns={1:"Frequency (Hz)", 2:"Wavelength (cm)"}, inplace=True)
notes_df.drop("Note", inplace=True)

notes_df.head(1)

| 0 | Frequency (Hz) | Wavelength (cm) |
|---|---|---|
| C0 | 16.35 | 2109.89 |
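
As a cross-check on the scraped table (not what the cell above does), the same frequencies can be computed directly, assuming 12-tone equal temperament with $A_4$ = 440 Hz: each semitone multiplies the frequency by $2^{1/12}$. The helper names `note_frequency` and `note_table` are made up for this sketch, and it assumes single-digit octave numbers and sharps only.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_frequency(name, a4=440.0):
    """Frequency in Hz of a note such as 'A4' or 'C#3' (equal temperament)."""
    pitch, octave = name[:-1], int(name[-1])
    # Semitone distance from A4; octave numbers change at C.
    semitones = NOTE_NAMES.index(pitch) - NOTE_NAMES.index("A") + 12 * (octave - 4)
    return a4 * 2 ** (semitones / 12)


def note_table(low_octave=0, high_octave=8):
    """Build {note name: frequency} for every note in the octave range."""
    return {
        f"{pitch}{octave}": note_frequency(f"{pitch}{octave}")
        for octave in range(low_octave, high_octave + 1)
        for pitch in NOTE_NAMES
    }


print(round(note_frequency("C0"), 2))  # 16.35
print(round(note_frequency("C4"), 2))  # 261.63
```

The rounded values agree with the `C0` row above, which supports the 440 Hz tuning assumption.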


**Now let's write the functions to actually perform the required task.**

In [3]:
def get_data(filename):
    '''
    Pass in the path to an AIFF file and this will return the frame rate
    and a numpy array (dtype=np.int32) holding roughly the first second
    of the corresponding data.
    '''
    
    with aifc.open(filename, "rb") as file:
        nframes = file.getnframes()
        framerate = file.getframerate()
        signal = file.readframes(nframes)

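    # AIFF samples are big-endian; decode them explicitly rather than
    # relying on the host byte order. Assumes 32-bit samples.
    y = np.frombuffer(signal, dtype=">i4").astype(np.int32)
    y = y[:framerate+1]  # A second of data should do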
    
    return framerate, y
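
To see why the sample width, byte order, and signedness matter when decoding the raw frames, here is a small standalone sketch using the standard-library `struct` module on eight made-up bytes (two 32-bit samples). It is only an illustration; the byte values are invented, not taken from the homework files.

```python
import struct

# Two made-up 32-bit samples, written big-endian as AIFF stores them.
raw = bytes([0x00, 0x00, 0x00, 0x01, 0xFF, 0xFF, 0xFF, 0xFE])

big_signed = struct.unpack(">2i", raw)     # correct: big-endian, signed
little_signed = struct.unpack("<2i", raw)  # wrong byte order
big_unsigned = struct.unpack(">2I", raw)   # right order, wrong signedness
big_16bit = struct.unpack(">4h", raw)      # wrong sample width

print(big_signed)     # (1, -2)
print(little_signed)  # (16777216, -16777217)
print(big_unsigned)   # (1, 4294967294)
print(big_16bit)      # (0, 1, -1, -2)
```

Only the first interpretation gives small, sensible amplitudes; the others turn a quiet signal into huge jumps, which shows up as broadband noise in the spectrum. In numpy the matching dtype strings are `">i4"`, `"<i4"`, `">u4"`, and `">i2"`.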

In [86]:
def plot_data(audio_data, framerate, title=None):
    '''
    Pass in audio_data and the corresponding frame rate and this will plot
    the data as a matplotlib power specgram.
    
    Returns the frequency array, the power spectrum (as returned by
    specgram), and the matplotlib axes.
    '''
    
    fig, ax = plt.subplots()  # plt.subplots returns (figure, axes)
    with np.errstate(divide='ignore'):  # suppress an annoying warning
        Pxx, freqs, bins, im = plt.specgram(audio_data, NFFT=2**11, 
                                            Fs=framerate, cmap=plt.cm.viridis,
                                            pad_to=int(5e4))

    plt.ylim(1,2000)
    plt.xlim(0,0.95)
    plt.xlabel("Time [sec]")
    plt.ylabel("Frequency [Hz]")
    plt.tight_layout(pad=2)  # Prevent labels getting cut off in notebook
    if title is not None:
        plt.title(title)
    plt.show()

    return freqs, Pxx, ax

In [80]:
def get_peak_freqs(freqs, Pxx, dB=-3, df=notes_df):
    '''
    Pass in frequency array and corresponding power spectrum array. 
    
    Also takes kwarg 'dB' (default: -3), which sets the cutoff, relative to
    the maximum peak, for a frequency to be registered. E.g. if dB = -10,
    then only frequency peaks with more than 10% of the power of the
    dominant frequency will be considered.
    
    And takes kwarg 'df', which corresponds to a pandas dataframe used to 
    translate frequency values to notes.
    
    Returns a list of (note name, frequency) tuples, one for each dominant
    note found.
    '''
    
    # I'm just going to average over the rows. Since I'm doing this anyway
    # it seems like it would be better to set NFFT to number of frames. But
    # it turns out that doing this seems to give bad peak fits.
    Pxx_mean = np.mean(Pxx, axis=1)
    
    # Then I grab peaks of averaged rows using scipy
    peak_inds = find_peaks_cwt(Pxx_mean, np.arange(.02,400))
    
    # Get max peak value
    max_value = np.amax(Pxx_mean[peak_inds])
    
    # Keep only the peaks greater than the cutoff value
    power_ratio = 10**(dB / 10)  # Convert dB to ratio of powers
    peak_inds = peak_inds[np.where(Pxx_mean[peak_inds] > max_value*power_ratio)[0]]
    peak_freqs = freqs[peak_inds]
    
    # Probably makes more sense just to hardcode this data into the function,
    # but since I already grabbed the dataframe...
    note_freqs = df["Frequency (Hz)"].tolist()
    note_names = list(df.index)
    
    # Now that we have a list of peak freqs from the specgram, check to see
    # if any of them match up to our musical notes
    notes = []
    for i, note in enumerate(note_freqs):
        # atol sets the tolerance for how close X needs to be to Y
        if np.isclose(peak_freqs, float(note), atol=3).any():
            
            # Only grab the lowest harmonic
            skip = False
            for freq in notes:
                r = divmod(float(note), freq[1])[1]
                if r < 5 or (freq[1]-r) < 5:
                    skip = True
                    break  # one matching fundamental is enough
                
            if not skip:
                notes.append((note_names[i], float(note)))
    
    s = ", ".join(note[0] for note in notes)
    display(Markdown("# Notes:  " + s))
    
    return notes
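
The "only grab the lowest harmonic" step can be looked at in isolation. Below is a simplified sketch of the same idea, not the function above: walk the candidate frequencies from low to high and drop any that sit within a tolerance of an integer multiple of a frequency already kept. The names `is_harmonic` and `drop_harmonics` and the 5 Hz tolerance are assumptions for illustration.

```python
def is_harmonic(freq, fundamental, tol=5.0):
    """True if freq is within tol Hz of an integer multiple (>= 2) of fundamental."""
    multiple = round(freq / fundamental)
    return multiple >= 2 and abs(freq - multiple * fundamental) <= tol


def drop_harmonics(freqs, tol=5.0):
    """Keep each frequency unless it is a harmonic of a lower one already kept."""
    kept = []
    for freq in sorted(freqs):
        if not any(is_harmonic(freq, base, tol) for base in kept):
            kept.append(freq)
    return kept


# C4 and A4 plus some of their overtones
print(drop_harmonics([440.0, 880.0, 1320.0, 261.63, 523.25]))  # [261.63, 440.0]
```

One limitation of this approach: a genuinely played note that coincides with a harmonic of a lower note (say A4 and A5 together) gets dropped as well.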

In [122]:
def get_notes(filename):
    '''
    String all the functions together. Takes as input the name of an .aif
    file, plots the spectrogram of (roughly) its first second, and returns
    the dominant notes.
    '''
    framerate, y = get_data(filename)
    freqs, Pxx, ax = plot_data(y, framerate, title=filename)
    notes = get_peak_freqs(freqs, Pxx, dB=-10)
    s = "Notes: " + ", ".join(note[0] for note in notes)
    plt.text(.65, 1850, s, bbox=dict(facecolor='red'),
              fontsize=12)
    
    # str.strip removes a set of characters, not a suffix, so split off
    # the extension instead
    plt.savefig(filename.rsplit(".", 1)[0] + "_example.png")
    plt.show()

    return notes

**Test the functions. The plots are also in the repo under [filename]_example.png.**

In [121]:
filelist = ["A4_PopOrgan.aif", "C4+A4_PopOrgan.aif", 
            "F3_PopOrgan.aif", "F4_CathedralOrgan.aif"]
for file in filelist:
    notes = get_notes(file)
    print(notes)

# Notes:  A4

[('A4', 440.0)]

# Notes:  C4, A4

[('C4', 261.63), ('A4', 440.0)]

# Notes:  F3

[('F3', 174.61)]

# Notes:  F4

[('F4', 349.23)]


**Identify the unknown notes. The plots are also in the repo under [filename]_example.png.**

In [123]:
filelist = !ls *.aif
filelist = filelist[:-4]

for file in filelist:
    get_notes(file)

# Notes:  C4, D6

# Notes:  C2

# Notes:  E2

# Notes:  B1, C5

# Notes:  F3

# Notes:  A4

# Notes:  C4

# Notes:  G2

# Notes:  C5

# Notes:  D5

# Notes:  F4

# Notes:  G3

Hints: You’ll want to decompose the sound into a frequency power spectrum. Use a Fast Fourier Transform. Be careful about “unpacking” the string hexcode into Python data structures. The sound files use 32-bit data. Play around with what happens when you convert the string data to other integer sizes, or to signed vs. unsigned integers. Also, beware of harmonics.
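
The power-spectrum hint can be tried on a synthetic signal before touching the real files. The sketch below builds one second of a made-up "organ" tone (a 440 Hz fundamental with two weaker harmonics, at an assumed 44.1 kHz sample rate), takes its FFT with numpy, and keeps the bins within 10 dB of the strongest one. It is a minimal illustration, not the spectrogram-averaging approach used in the solution above.

```python
import numpy as np

framerate = 44100                      # assumed sample rate in Hz
t = np.arange(framerate) / framerate   # one second of sample times

# Synthetic tone: 440 Hz fundamental plus two weaker harmonics.
signal = (1.0 * np.sin(2 * np.pi * 440 * t)
          + 0.5 * np.sin(2 * np.pi * 880 * t)
          + 0.25 * np.sin(2 * np.pi * 1320 * t))

spectrum = np.fft.rfft(signal)                          # FFT of a real signal
power = np.abs(spectrum) ** 2                           # power spectrum
freqs = np.fft.rfftfreq(len(signal), d=1 / framerate)   # bin frequencies in Hz

# Keep every bin within 10 dB (a factor of 10 in power) of the strongest.
threshold = power.max() * 10 ** (-10 / 10)
strong = freqs[power > threshold]

print(freqs[np.argmax(power)])  # strongest bin, at about 440 Hz
print(strong)                   # the 440 Hz and 880 Hz bins survive the cutoff
```

The 880 Hz harmonic passes the cutoff even though only one note is "played", which is exactly why the harmonics warning matters: peaks near integer multiples of a lower peak should not be reported as separate notes.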