# Lab 1: decimation

In [8]:
def decimate(x, k):
    '''Decimate a signal x by a factor of k
    
    Parameters
    ----------
    x : of type np.ndarray
        The input signal as an array
    
        *Important*: The audio should NOT be read inside the function.
                     An array containing the audio should be the input instead.
        
    k : int > 0
        The decimation factor
        
    Returns
    -------
    y : of type np.ndarray
        The decimated signal in the form of an array
    '''
    # Write your code below this line
    # Reject invalid factors instead of silently returning None
    if k <= 0:
        raise ValueError('k must be a positive integer')

    # Keep every k-th sample, starting with the first one
    y = x[::k]

    return y

### Some tests to determine if decimate works properly

Run the following cell to check your implementation

In [9]:
import numpy as np
import soundfile as sf
from IPython.display import Audio
from scipy.io import wavfile
from scipy.fft import fft

print('Testing length of decimated signals: ', end='')
assert len(decimate(np.arange(12), 1)) == 12
assert len(decimate(np.arange(12), 2)) == 6
assert len(decimate(np.arange(12), 3)) == 4
assert len(decimate(np.arange(12), 4)) == 3
assert len(decimate(np.arange(12), 5)) == 3
assert len(decimate(np.arange(12), 20)) == 1
print('Okay!')

print('Testing values of decimated signals: ', end='')
assert np.allclose(decimate(np.arange(12), 1), np.arange(0, 12, 1))
assert np.allclose(decimate(np.arange(12), 2), np.arange(0, 12, 2))
assert np.allclose(decimate(np.arange(12), 3), np.arange(0, 12, 3))
assert np.allclose(decimate(np.arange(12), 4), np.arange(0, 12, 4))
assert np.allclose(decimate(np.arange(12), 5), np.arange(0, 12, 5))
assert np.allclose(decimate(np.arange(12), 20), [0])
print('Okay!')

Testing length of decimated signals: Okay!
Testing values of decimated signals: Okay!


In [10]:
# FULL DISCLOSURE: I was not sure how to get the highest frequency present in the sample file given with this assignment
# using what we have learned. Since this value is important for the calculations in the questions below, I derive it here
# with the help of code generated by ChatGPT. Since FFTs have not been covered, I hope this is academically acceptable.
# I am happy to discuss this if there are any issues, or if there was a better way to get this data with the techniques we have studied.

def get_highest():
    # Step 1: Read the WAV file
    sample_rate, data = wavfile.read('sweep.wav')

    # If the audio file is stereo, take one channel
    if data.ndim > 1:
        data = data[:, 0]

    # Step 2: Perform the FFT
    fft_result = fft(data)

    # Get the magnitude of the FFT
    fft_magnitude = np.abs(fft_result)

    # Step 3: Find the frequency bins
    freqs = np.fft.fftfreq(len(fft_result), 1 / sample_rate)

    # Step 4: Find the frequency bin with the largest magnitude
    # (the strongest component, used here as an estimate of the sweep's top frequency).
    # Keep only the first half of the spectrum, dropping the negative frequencies
    positive_freqs = freqs[:len(freqs)//2]
    positive_magnitude = fft_magnitude[:len(fft_magnitude)//2]

    highest_freq = positive_freqs[np.argmax(positive_magnitude)]
    return highest_freq
get_highest()
# On my computer, this produces 3,949

3949.0
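
Note that `np.argmax` returns the strongest frequency bin, which is not necessarily the highest frequency present. The following is a minimal sketch of a threshold-based alternative, shown on a synthetic two-tone signal rather than on `sweep.wav`. The helper name `highest_significant_freq` and the 10% threshold are assumptions made for illustration, not part of the lab.

```python
import numpy as np

def highest_significant_freq(x, sample_rate, rel_threshold=0.1):
    """Highest frequency whose magnitude reaches rel_threshold times the
    spectrum's peak magnitude (hypothetical helper, threshold is an assumption)."""
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / sample_rate)
    significant = np.nonzero(magnitude >= rel_threshold * magnitude.max())[0]
    return freqs[significant[-1]]

# Synthetic stand-in for the lab file: a loud 1000 Hz tone plus a quieter 3000 Hz tone
sr = 16000
t = np.arange(sr) / sr  # one second of samples
x = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

# Strongest bin (what argmax finds) versus highest significant bin
freqs = np.fft.rfftfreq(len(x), d=1 / sr)
strongest = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
highest = highest_significant_freq(x, sr)
print(strongest, highest)
```

For this synthetic signal the strongest component sits at 1000 Hz while the highest significant component sits at 3000 Hz, so the two measures can differ. Whether they differ for the sweep file depends on its spectrum.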

---
## Use the space below to import your own audio file, decimate it and export it.

#### Don't forget to normalize it before exporting it!

In [11]:
tone, srate = sf.read('tone_mono.wav')
sweep, srate2 = sf.read('sweep.wav')
print('sweep sr: ', srate2)

def export_audio(x, k):
    # Decimate by k, then normalize so the peak absolute amplitude is 1
    decimated_arr = decimate(x, k)
    abs_max = np.absolute(decimated_arr).max()
    norm_arr = decimated_arr / abs_max
    return norm_arr

# The decimated file must be written at the reduced sample rate
sf.write("output.wav", export_audio(tone, 34), int(srate / 34))


sweep sr:  16000


---
## Use the space below to call the function defined above and test different values of k on the provided file (lab1-sweep.wav)

### You can create more cells if you need them.

In [12]:
Audio(export_audio(sweep, 1), rate=srate2/1)

In [13]:
Audio(export_audio(sweep, 3), rate=srate2/3)

In [14]:
new_srate = int(srate2 / 5)
Audio(export_audio(sweep, 5), rate=new_srate)


---
Using the test file:
## Questions:

1. What's the largest value of `k` before the signal presents distortion (in this case, distortion refers to any audible change to the signal)?

2. At that point the signal presents aliasing. Explain in terms of the sampling theorem why this distortion is taking place.

3. What is the original sampling rate of the signal? What is the minimum sampling rate before it presents distortion? (Show your mathematical procedure to find the latter).

4. Given what you know about the test signal’s original sampling rate, and the value of k where distortion starts to happen, what is the frequency range of the test signal in Hertz? (Important: Your procedure must be shown to get credit)

---
Using your own audio:

5. What effect does decimation have on complex waveforms?


## Answer here:

1- A decimator effectively lowers the sample rate by a given factor, k, by keeping only every k-th sample and outputting the resulting reduced signal. Therefore, we can determine the new sample rate of the output file by dividing the original sample rate by k. Since aliasing occurs when the frequencies of a signal exceed half the sampling rate, known as the Nyquist limit, and the highest frequency present in the sweep.wav file is 3,949 Hz (see the disclosure note in the code above), we need to A) determine the lowest sample rate whose Nyquist limit is at least 3,949 Hz, and B) determine the highest value of k that produces a sample rate at or above this.

To determine A, the lowest sample rate, I use the formula for the Nyquist limit (nyqFrequency = sampleRate / 2) and set the Nyquist frequency to 3,949 Hz. This gives a sample rate of 7,898 samples per second. Now, given the original sample rate of 16,000 samples per second (printed above), I need to find the highest integer k that produces a sample rate of at least 7,898 samples per second. Since 16,000 divided by 2 gives 8,000 (just above the sampling rate limit) and 16,000 divided by 3 gives 5,333.33 (well below the sampling rate limit), the highest k that does not produce aliasing on the sweep.wav file is 2.

Short answer: largest k = 2
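
The same arithmetic can be written as a short check. This is a sketch that takes the 3,949 Hz estimate and the 16,000 Hz sample rate as given. The helper name `largest_safe_k` is hypothetical, and it treats a Nyquist limit exactly equal to the top frequency as acceptable.

```python
import math

def largest_safe_k(sample_rate, f_max):
    """Largest integer k for which (sample_rate / k) / 2 >= f_max
    (hypothetical helper, assumes f_max is the signal's top frequency)."""
    return math.floor(sample_rate / (2 * f_max))

sample_rate = 16000  # printed by the notebook above
f_max = 3949         # estimate from get_highest()

k = largest_safe_k(sample_rate, f_max)
rate_at_k = sample_rate / k          # sample rate after decimating by k
rate_at_next = sample_rate / (k + 1) # sample rate one step further
print(k, rate_at_k, rate_at_next)
```

Here `rate_at_k` stays above the 7,898 samples per second needed, while `rate_at_next` falls below it.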

2- Aliasing occurs when the highest frequency of a signal exceeds half the sample rate of the converter, known as the Nyquist frequency. Signal converters need at least two samples per cycle of a component in order to represent it accurately: any parts of the signal that exceed this frequency will be misinterpreted. Since any frequency above the Nyquist limit gets fewer than two samples per cycle, that component of the signal is misinterpreted as a lower frequency. This misinterpretation of the original signal creates the distortion known as aliasing.
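
To illustrate where a component above the Nyquist limit lands, here is a minimal sketch using a synthetic 3,949 Hz sine (not the lab file) decimated by k = 3. The helper `alias_frequency` is hypothetical. It folds a real tone's frequency back into the range from 0 to half the new sample rate.

```python
import numpy as np

def alias_frequency(f, sample_rate):
    """Apparent frequency of a real tone at f Hz once sampled at sample_rate
    (hypothetical helper for illustration)."""
    return abs(f - sample_rate * round(f / sample_rate))

sr = 16000
k = 3
f = 3949

# Three seconds of a pure tone, then keep every k-th sample
t = np.arange(3 * sr) / sr
tone = np.sin(2 * np.pi * f * t)
decimated = tone[::k]
new_rate = sr / k

# Locate the strongest frequency in the decimated signal
spectrum = np.abs(np.fft.rfft(decimated))
freqs = np.fft.rfftfreq(len(decimated), d=1 / new_rate)
measured = freqs[np.argmax(spectrum)]
predicted = alias_frequency(f, new_rate)
print(predicted, measured)
```

Both values come out near 1,384 Hz: the tone is above the new Nyquist limit of about 2,667 Hz, so it is heard at the difference between the new sample rate and its true frequency.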

3- The sample rate of the original signal is 16,000 samples per second, therefore any frequency above 8,000 Hz will produce aliasing because it will be above the Nyquist frequency, calculated by dividing the sample rate by two: nyqFreq = sampleRate / 2. Since the highest frequency of the signal is 3,949 Hz (see the disclosure note in the code above), we need to find the minimum sample rate that can represent that frequency without aliasing. Therefore, we set our Nyquist limit to 3,949 Hz and solve for the variable sampleRate: sampleRate / 2 = 3,949, so sampleRate = 2 × 3,949 = 7,898. Therefore, the lowest sample rate that could represent the sweep.wav signal would be 7,898 samples per second.

4- Given the known sample rate of 16,000 samples per second and the fact that the Nyquist limit is half the sampling rate, we can determine the range of frequencies that will not produce aliasing. We find the Nyquist limit by dividing 16,000 by two, giving us 8,000 Hz. Therefore, the range of frequencies that the file can contain without distortion is 0 to 8,000 Hz.
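
The value of k at which distortion starts also narrows down the signal's own top frequency. As a sketch, assuming k = 2 is the last clean factor and k = 3 is the first one that aliases, the top frequency must fit under the Nyquist limit at k = 2 but exceed the one at k = 3:

```python
sample_rate = 16000
k_clean = 2  # assumed: largest k with no audible distortion

# Nyquist limit at the last clean factor: the top frequency is at most this
upper_bound = sample_rate / (2 * k_clean)

# Nyquist limit at the first aliasing factor: the top frequency is above this
lower_bound = sample_rate / (2 * (k_clean + 1))

print(lower_bound, upper_bound)
```

This places the top of the sweep between roughly 2,667 Hz and 4,000 Hz, which is consistent with the 3,949 Hz estimate used in the other answers.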

5- Complex waveforms, the result of the addition of at least two waves, contain more partials and overtones than simple sine waves, which consist only of a fundamental frequency without overtones. Since complex waveforms therefore contain higher-frequency content than a sine wave of the same pitch, they are more likely to exceed the Nyquist frequency of a given system. Therefore, complex waveforms begin to alias at smaller decimation factors than sine waves of the same frequency. I noticed this very effect on my provided 'tone_mono' WAV file. The original sample contains a complex waveform created in Serum. As I decimated the signal to varying degrees, I noticed how quickly its timbre shifted compared to "sweep.wav", a result of the higher-frequency partials exceeding the Nyquist limit and producing distortion.
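
A rough way to see this numerically is to count how many harmonics of a tone end up above the Nyquist limit for each decimation factor. The sketch below uses made-up values (a 440 Hz fundamental with 10 harmonics at a 16,000 Hz sample rate), not measurements from 'tone_mono.wav'. The helper `aliased_harmonics` is hypothetical.

```python
def aliased_harmonics(f0, n_harmonics, sample_rate, k):
    """Harmonics of f0 that lie above the Nyquist limit after decimating by k
    (hypothetical helper, assumes exact integer-multiple harmonics)."""
    nyquist = sample_rate / k / 2
    return [n * f0 for n in range(1, n_harmonics + 1) if n * f0 > nyquist]

sr = 16000  # assumed sample rate for illustration
f0 = 440    # assumed fundamental for illustration

for k in (1, 2, 3, 5):
    complex_tone = aliased_harmonics(f0, 10, sr, k)  # fundamental plus 9 overtones
    sine_tone = aliased_harmonics(f0, 1, sr, k)      # fundamental only
    print(k, len(complex_tone), len(sine_tone))
```

With these values the 10-harmonic tone already has one aliased partial at k = 2 and several by k = 5, while the plain 440 Hz sine has none at any of the factors tried.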
