# Lab 7 - Assignment
*This week you will augment a basic sinusoidal encoder to account for masking, using triangles centered at the strongest frequencies to remove unnecessary frequencies.*

# <span style="color:red">Your Task for This Lab</span>

You will implement a simple encoding scheme that accounts for masking. Instead of immediately choosing the top n frequencies in each frame of the STFT, we will repeat the following n times: save the strongest remaining frequency, then remove every frequency that falls under a masking triangle centered at that frequency. The hope is that the n frequencies we keep are the ones that matter most to our perception of the overall sound.
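
To see why the masking step changes which frequencies survive, here is a minimal sketch on a made-up 64-bin spectrum. The helper names `top_n_plain` and `top_n_masked` and the toy values are assumptions for illustration only; the triangle width follows the same `i/q` rule used later in this notebook, written in a compact closed form rather than with the ramp arrays you will build in the lab.

```python
import numpy as np

def top_n_plain(mag, n):
    """Indices of the n largest magnitudes, largest first (no masking)."""
    return [int(k) for k in np.argsort(mag)[::-1][:n]]

def top_n_masked(mag, n, q=5):
    """Greedy pick: take the peak, zero everything under its triangle, repeat."""
    mag = np.asarray(mag, dtype=float).copy()
    bins = np.arange(len(mag))
    picked = []
    for _ in range(n):
        i = int(np.argmax(mag))
        h = mag[i]
        half_win = i // (2 * q)          # same as floor(i / (2*q))
        if half_win > 0:
            # triangle of height h that reaches zero half_win bins from i
            tri = h * np.clip(1 - np.abs(bins - i) / half_win, 0, None)
        else:
            # zero-width triangle: only the peak bin itself is covered
            tri = np.where(bins == i, h, 0.0)
        picked.append(i)
        mag = np.where(mag <= tri, 0.0, mag)
    return picked

# toy spectrum: a strong peak at bin 40 with loud neighbors,
# plus two weaker, well-separated peaks at bins 10 and 25
spec = np.full(64, 0.01)
spec[[39, 40, 41]] = [0.6, 1.0, 0.55]
spec[10] = 0.5
spec[25] = 0.3

plain = top_n_plain(spec, 3)
masked = top_n_masked(spec, 3)
print("plain top 3: ", plain)    # [40, 39, 41]
print("masked top 3:", masked)   # [40, 10, 25]
```

Without masking, all three slots go to bin 40 and its immediate neighbors. With masking, the neighbors fall under the first triangle, so the remaining slots go to the separate peaks at bins 10 and 25.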

In the "Encode Audio" block below, fill in the lines of code that say "To Do."

## Import Libraries

In [9]:
import librosa
import numpy as np
import IPython.display as ipd

## Load Audio File

In [10]:
filename = "Voices/male1.wav"
y, sr = librosa.load(filename)

In [11]:
ipd.Audio(y, rate = sr)

## Compute STFT 

In [12]:
# compute the STFT of y and store it in D
# transpose D so that time frames are the first dimension
D = librosa.stft(y).T

## Create Encoded Array

In [13]:
# n is the number of peaks to look at
n = 50

# create a zero-filled array a to hold the encoded magnitudes
a = np.zeros(D.shape)

## Encode Audio
*We will save the magnitudes of the top n frequencies, skipping nearby frequencies that may be masked.*
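
Before filling in the block below, it may help to check what the masking triangle should look like. The sketch here builds it the way the cell below does (ramp up, peak, ramp down, then pad and slice) and compares it against a closed-form expression. The function names `triangle_by_concatenation` and `triangle_closed_form` are hypothetical helpers, not part of the lab code.

```python
import numpy as np

def triangle_by_concatenation(i, h, q, length):
    """Ramp up, peak, ramp down, then slice so the peak lands at index i."""
    half_win = int(np.floor(i / (q * 2)))
    ramp_up = np.linspace(0, h, half_win, endpoint=False)
    triangle = np.concatenate((ramp_up, [h], np.flip(ramp_up)))
    zeros = np.zeros(length)
    padded = np.concatenate((zeros, triangle, zeros))
    shift = length + half_win - i
    return padded[shift:shift + length]

def triangle_closed_form(i, h, q, length):
    """Same triangle written as h * max(0, 1 - |k - i| / half_win)."""
    half_win = i // (2 * q)
    bins = np.arange(length)
    if half_win == 0:
        return np.where(bins == i, float(h), 0.0)
    return h * np.clip(1 - np.abs(bins - i) / half_win, 0, None)

tri = triangle_by_concatenation(i=40, h=2.0, q=5, length=64)
print(tri[36:45])   # [0.  0.5 1.  1.5 2.  1.5 1.  0.5 0. ]
```

With `i = 40` and `q = 5`, `halfWin` is 4, so the triangle has `2 * 4 + 1 = 9` points and touches zero at bins 36 and 44. For small `i` (below `2 * q`), `halfWin` is 0 and the triangle covers only the peak bin.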

In [14]:
# q determines the width of the triangles
# q is the ratio of the center frequency to the frequency width
q = 5

for frame in range(len(D)):
    
    dft = D[frame]
    #print(dft)
        
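    # number of frequency bins in this frame
    length = np.shape(dft)[0]
    
    # magnitude spectrum (np.abs returns a new array, so D is untouched)
    magSpec = np.abs(dft)
    
    # center frequency of each bin in Hz (bins span 0 to sr/2 inclusive)
    freq = (np.arange(length)/(length - 1)) * (sr/2)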
    
    # n times
    for peak in range(n):
        
        ########################## To Do ##########################
        # determine the index corresponding to maximum magnitude
        # store in i
        # hint: use np.argmax()
        i = np.argmax(magSpec)
        #print(i)
        ###########################################################

        ########################## To Do ##########################
        # store the maximum magnitude value in h
        # (this will be the height of our triangle)
        h = magSpec[i]
        ###########################################################
        
        ########################## To Do ##########################
        # store the magnitude h in array a 
        # at the matching frame and frequency index
        a[frame][i] = h
        #print(a)
        ###########################################################
        
        # compute half win size
        # width is i/q
        # halfWin will be floor(width/2)
        halfWin = int(np.floor(i/(q*2)))
        
        ########################## To Do ##########################
        # generate the ramp up (first half of triangle)
        # number of points should equal halfWin
        # don't include the end point (the peak h is added separately)
        # hint: use np.linspace()
        rampUp = np.linspace(0, h, halfWin, endpoint=False)
        ###########################################################
        
        ########################## To Do ##########################
        # generate the ramp down
        # hint: use np.flip()
        rampDown = np.flip(rampUp)
        ###########################################################
        
        ########################## To Do ##########################
        # combine your ramp up, the triangle peak h, and the ramp down
        # to get array containing the masking triangle.
        # it should have length (2 * halfWin + 1)
        # hint: use np.concatenate()
        triangle = np.concatenate((rampUp, [h], rampDown))
        ###########################################################
        
        # the following code takes the triangle and places it in
        # the correct position in an array so that the peak lines
        # up with the correct magSpec index
        zeros = np.zeros(length)
        padded = np.concatenate((zeros,triangle,zeros),axis = 0)
        shift = length + halfWin - i
        tri = padded[shift:(shift + length)]
        
        # update magSpec according to the rule:
        # set the value to zero if it lies on or under the triangle,
        # otherwise keep the magSpec value
        magSpec = np.array([0 if (magSpec[idx] <= tri[idx]) else magSpec[idx] for idx in range(length)])

In [15]:
print(magSpec)
print()
print('length: ', len(magSpec))
print()
print('max: ', max(magSpec))
print()
print(np.argmax(magSpec))



[0.04179413 0.01765683 0.0130902  ... 0.00139583 0.00139581 0.0013958 ]

length:  1025

max:  0.07594695687294006

10


## Decode Audio
*Decode the signal by performing the inverse STFT on the encoded array `a`.*
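
Note that `a` is a real-valued array holding magnitudes only, so the inverse STFT treats every kept bin as having zero phase. The NumPy-only sketch below illustrates that effect on a single synthetic frame (random noise standing in for audio, not the notebook's data): inverting the complex spectrum recovers the frame, while inverting only the magnitudes gives a different waveform with the same energy.

```python
import numpy as np

rng = np.random.default_rng(0)
frame = rng.standard_normal(2048)      # stand-in for one frame of audio

spectrum = np.fft.rfft(frame)          # complex: magnitude and phase

# invert the full complex spectrum vs. the magnitudes alone
from_complex = np.fft.irfft(spectrum, n=len(frame))
from_magnitude = np.fft.irfft(np.abs(spectrum), n=len(frame))

print(np.allclose(from_complex, frame))     # True: phase kept
print(np.allclose(from_magnitude, frame))   # False: phase discarded
print(np.isclose(np.sum(from_magnitude**2), np.sum(frame**2)))  # True: same energy
```

This is one reason the decoded audio can sound different from the original even for the frequencies that were kept.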

In [16]:
yhat = librosa.istft(a.T)
ipd.Audio(yhat, rate = sr)