<nav class="navbar navbar-default">
  <div class="container-fluid">
    <div class="navbar-header" style="float: left">
        <a class="navbar-brand" href="0_Index.ipynb" target="_self"> <h2> &uarr; Back to front page</h2></a>
    </div>
  </div>
</nav>

# Exploring an audio file

When audio signals are recorded to a digital medium, the amplitude of an analog signal (e.g. the voltage output from a microphone) is sampled at regular intervals known as the *sampling period* $T_s$, resulting in a sequence of measured values. An illustration of this process is shown below.

<img src="Figures/Task1_Ts.png" style="width: 600px; margin-left: 100px" />

Closely related to the sampling period $T_s$ is the *sampling rate* or *sampling frequency* $f_s$, defined as the *number of samples per second*. The sampling frequency $f_s$ is calculated simply by inverting the sampling period $T_s$.

$$f_s =\frac{1}{T_s}$$
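
As a quick numerical check of this relation, the sketch below converts between the two quantities. The helper names `sampling_period` and `sampling_frequency` and the example rates are illustrative assumptions, not part of the exercise itself.

```python
def sampling_period(fs_hz):
    """Return the sampling period Ts in seconds for a sampling frequency in Hz."""
    return 1 / fs_hz


def sampling_frequency(ts_seconds):
    """Return the sampling frequency fs in Hz for a sampling period in seconds."""
    return 1 / ts_seconds


# Example rates chosen for illustration only
for fs_example in (8000, 22050, 44100):
    ts_example = sampling_period(fs_example)
    print(f"fs = {fs_example} Hz gives Ts = {ts_example * 1e6:.2f} microseconds")
```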

In order to reconstruct an analog voltage signal which can be sent to speakers or headphones, both knowledge of the sampling frequency used during recording *and* the measured sample values are required. Happily, all this information is contained within any audio file format.

The codecell below uses a function available through the `scipy` module to read the audio file `sample_audio.wav`, which contains some relaxing classical music. The sequence of audio amplitude measurements that makes up the actual audio signal is assigned to the variable `sampleData` in the form of a very long `array`, while the sampling frequency is an integer value assigned to the variable `fs`.

After loading the audio file, the data is passed along to an embedded audio playback widget accessible through the `IPython` module.

In [4]:
import scipy.io.wavfile as wavfile # Import module for handling of .wav audio files
from IPython.display import Audio   # For loading embedded audio player

fs, sampleData = wavfile.read("sample_audio.wav") # "fs" is sampling frequency, "sampleData" is the sequence of measurements

# Use the following lines to listen to the audio signal
Audio(sampleData, rate=fs)

## a)

Run the codecell and listen to the audio clip. We can now begin our analysis of the audio signal. 

Using the data gathered in the codecell above, write a script which prints the sampling frequency $f_s$, sampling period $T_s$ and the total duration of the audio file. <br>*extra: what is the size of the file in number of bits?*<br> *hint: the function `len()` should be useful here*. 


In [7]:
# WRITE YOUR CODE HERE:
ts = 1/fs
numSamples = len(sampleData) # Number of samples in the audio signal
totalDuration = numSamples / fs # Total duration of the audio signal in seconds

# Determine the number of bits per sample from the data type of the array
dataType = sampleData.dtype
bitsPerSample = dataType.itemsize * 8

numChannels = sampleData.shape[1] if sampleData.ndim > 1 else 1 # Check if the audio is mono or stereo
fileSizeBits = numSamples * numChannels * bitsPerSample  # Size of the raw sample data (file header not included)

print(f"Sampling Frequency (fs): {fs} Hz")
print(f"Sampling Period (Ts): {ts} seconds")
print(f"Total Duration: {totalDuration} seconds")
print(f"Bits per sample: {bitsPerSample}")
print(f"Size of the file: {fileSizeBits} bits")

Sampling Frequency (fs): 22050 Hz
Sampling Period (Ts): 4.5351473922902495e-05 seconds
Total Duration: 20.340725623582767 seconds
Bits per sample: 16
Size of the file: 7176208 bits
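
The same bookkeeping can be wrapped in a small helper that also handles stereo data. This is a minimal sketch: the function name `describe_audio` and the synthetic all-zero arrays are assumptions for illustration, and it relies on multi-channel data being stored with one row per sample and one column per channel, which is the layout `scipy.io.wavfile.read` returns.

```python
import numpy as np


def describe_audio(fs_hz, data):
    """Summarise sampled audio given its sampling frequency and sample array."""
    num_samples = data.shape[0]                            # rows = samples per channel
    num_channels = data.shape[1] if data.ndim > 1 else 1   # columns = channels
    bits_per_sample = data.dtype.itemsize * 8
    return {
        "Ts": 1 / fs_hz,
        "duration": num_samples / fs_hz,
        "channels": num_channels,
        "size_bits": num_samples * num_channels * bits_per_sample,
    }


# Synthetic stand-ins for real recordings: one second of silence at 8 kHz
mono = np.zeros(8000, dtype=np.int16)
stereo = np.zeros((8000, 2), dtype=np.int16)

mono_info = describe_audio(8000, mono)
stereo_info = describe_audio(8000, stereo)
print(mono_info)
print(stereo_info)
```

The size counts only the raw sample data; a real `.wav` file is slightly larger because it also stores a header.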


## b)

Use the Audio object to play back the audio clip once with double the sample rate $f_s$, and once with half the sample rate. What effects can you hear? Explain the cause of these effects.


In [8]:
# WRITE YOUR CODE HERE:

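from IPython.display import display # Needed to show more than one player from a single cell

display(Audio(sampleData, rate=fs))      # Original sample rate
display(Audio(sampleData, rate=2*fs))    # Double the sample rate
display(Audio(sampleData, rate=fs//2))   # Half the sample rate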

print("Doubling the sample rate plays the audio back at twice the speed, and halving it plays the audio back at half the speed.\nThe sample rate is the number of samples played per second, so the same samples take half or twice as long to play.\nThe pitch shifts as well: one octave up at double the rate, one octave down at half the rate.")

Doubling the sample rate plays the audio back at twice the speed, and halving it plays the audio back at half the speed.
The sample rate is the number of samples played per second, so the same samples take half or twice as long to play.
The pitch shifts as well: one octave up at double the rate, one octave down at half the rate.
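
The effect can also be checked numerically. The sketch below assumes a hypothetical clip of 441000 samples recorded at 22050 Hz that contains a pure tone lasting 50 samples per cycle; the function name `playback_properties` is likewise made up for illustration.

```python
def playback_properties(num_samples, samples_per_cycle, playback_rate):
    """Return (duration in seconds, tone frequency in Hz) at a given playback rate."""
    duration = num_samples / playback_rate               # same samples, different speed
    tone_frequency = playback_rate / samples_per_cycle   # cycles played per second
    return duration, tone_frequency


fs_recorded = 22050          # hypothetical recording rate in Hz
clip_samples = 441000        # hypothetical clip length (20 s at fs_recorded)
tone_samples_per_cycle = 50  # hypothetical tone: one cycle every 50 samples

for rate in (fs_recorded, 2 * fs_recorded, fs_recorded / 2):
    duration, tone = playback_properties(clip_samples, tone_samples_per_cycle, rate)
    print(f"rate = {rate:g} Hz: duration = {duration:g} s, tone = {tone:g} Hz")
```

Doubling the rate halves the duration and doubles every frequency in the signal, which is heard as a shift of one octave up; halving the rate does the opposite.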


ANSWER THEORY QUESTIONS HERE:

A crucial parameter to be aware of when working with audio is the bit rate of the digital audio signal measured in bps (bits per second).

## c) 

Given that each sample has 16-bit resolution, what is the bit rate of the audio recording, and what is the total size of the file (assuming no compression)?


In [9]:
# WRITE YOUR CODE HERE:
# From task a) we know there are 16 bits per sample, stored in the variable "bitsPerSample"
numChannels = sampleData.shape[1] if sampleData.ndim > 1 else 1 # Check if the audio is mono or stereo
bitRate = fs * numChannels * bitsPerSample # Bit rate = fs * numChannels * bitsPerSample
fullSizeBits = bitRate * totalDuration # Total size of the audio file in bits

print(f"Bit Rate: {bitRate} bits per second")
print(f"Size of the file: {fullSizeBits} bits")

Bit Rate: 352800 bits per second
Size of the file: 7176208.0 bits
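
The relation used in the cell above applies to any uncompressed PCM audio. As a worked example under assumed parameters, the sketch below evaluates it for CD-quality audio (44100 Hz, 2 channels, 16 bits per sample). The function names `bit_rate` and `uncompressed_size_bits` are hypothetical.

```python
def bit_rate(fs_hz, num_channels, bits_per_sample):
    """Bit rate in bits per second of uncompressed PCM audio."""
    return fs_hz * num_channels * bits_per_sample


def uncompressed_size_bits(fs_hz, num_channels, bits_per_sample, duration_s):
    """Size of the raw sample data in bits for a clip of the given duration."""
    return bit_rate(fs_hz, num_channels, bits_per_sample) * duration_s


# Assumed CD-quality parameters, used only as an example
cd_rate = bit_rate(44100, 2, 16)
cd_minute_bits = uncompressed_size_bits(44100, 2, 16, 60)

print(f"CD bit rate: {cd_rate} bps")
print(f"One minute of CD audio: {cd_minute_bits} bits = {cd_minute_bits // 8} bytes")
```

Calling `bit_rate(22050, 1, 16)` reproduces the 352800 bps found for the mono recording in this task.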


It is often desirable to plot the audio signal. Simply passing `sampleData` to the `plot()` function should achieve this, but the x-axis will then show the sample index `n` instead of the time `t` in seconds, which may be rather unhelpful. Based on the total number of samples in the audio clip and the sampling rate, we can adjust the plot to show signal amplitude as a function of time $t$ in seconds by creating a new array (e.g. `t`) of equal length to our audio signal which spans the duration of our audio signal.

For example, given an audio file with $N$ samples and sampling rate $f_s$, what we want is an array `t` with `N` elements linearly spaced in the time span $0 \leq t < \frac{N}{f_s}$.
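
A minimal sketch of two equivalent ways to build such an array, using small made-up values (`N_demo`, `fs_demo`) so the result is easy to inspect. Note that `numpy.linspace` includes the end point by default, so `endpoint=False` is needed to keep the spacing equal to $T_s$.

```python
import numpy as np

N_demo = 8     # hypothetical number of samples
fs_demo = 4    # hypothetical sampling rate in Hz

# n * Ts for n = 0 .. N-1
t_arange = np.arange(N_demo) / fs_demo

# Same values: N points in [0, N/fs) with the end point excluded
t_linspace = np.linspace(0, N_demo / fs_demo, N_demo, endpoint=False)

# Default behaviour includes the end point, so the spacing is no longer Ts
t_default = np.linspace(0, N_demo / fs_demo, N_demo)

print(t_arange)
print(t_linspace)
print("default spacing:", np.diff(t_default)[0], "Ts:", 1 / fs_demo)
```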

## d)

Write the code to generate this plot in the codecell below. `numpy.array`, `numpy.arange` and `numpy.linspace` can all be used here. If done correctly, the resulting figure should look something like [this](Figures/audio_plot.png).


In [None]:
# WRITE YOUR CODE HERE:
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, totalDuration, numSamples, endpoint=False) # N time stamps in 0 <= t < N/fs, spaced Ts apart

plt.figure(figsize=(12, 6))
plt.plot(t, sampleData)

plt.title("Audio Signal")
plt.xlabel("Time [s]")
plt.ylabel("Amplitude")
plt.show()

<br>
<nav class="navbar navbar-default">
  <div class="container-fluid">
    <div class="navbar-header" style="float: right">
        <a class="navbar-brand" href="2_processing_audio.ipynb" target="_self">Next page: <i>Processing audio signals</i> &gt;</a>
    </div>
  </div>
</nav>