[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/beatlab-mcmaster/workshop_signalsensorsound/blob/main/audioanalysis.ipynb)

# Signals, Sensors, Sounds - Pt. 1: Computational Audio Analysis

Welcome to Colab! Colab, or "Google Colaboratory", lets you write and execute Python code in your browser. This notebook alternates two types of blocks, called "cells": **Markdown cells** (like this one) for text, and **code cells** (with a gray background) for Python code.  

For example, here is a code cell that computes a value, stores it in a variable, and prints the result. Note that we can add text comments to Python code by using the `#` symbol. Any text between a `#` and the end of the line will be ignored by Python.

In [1]:
my_birthyear = 2008
my_age = 2024 - my_birthyear # compute my age and store it in the variable my_age
print(my_age) # print the variable my_age

16


To execute, or "run", the code in the above cell, hover over the cell and press the play button to the left of the code. You can also run the code by pressing "Ctrl+Enter" on your keyboard.  

You can edit any code cell and run it to see what happens. 

<mark>**Task:** If you are new to Python notebooks, go ahead and change the variable "my_birthyear" in the above cell to your actual birth year, and then run the code to calculate your age. </mark>

Note that your changes will not permanently alter this notebook: once you close this tab, your changes will be lost. If you want to save a personal copy of your edited notebook, use **File→Save a copy in Drive**.

The first chapter of this notebook gives a brief introduction to the foundations of Python. If you are already familiar with Python, feel free to skip ahead to [1.4 Libraries](#1-4-libraries). You still need to run the code cell in that section to import NumPy, which is used for the audio analysis later on.

## 1. Introduction to Python

Let's start with Python basics: variables, lists, functions, and libraries. 

<a id='1-1-variables'></a>
### 1.1 Variables

In the first code cell, you defined two **variables**, "my_birthyear" and "my_age", that each store a number. In any later code cell, you can use the variable name to access the stored number. You can also overwrite the variable at any time to hold a new value, by using the syntax `variable = value`:

In [2]:
print(my_age) # access the variable
my_age = 99 # overwrite the old value
print(my_age) # access the new value

16
99
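
A new value can also be computed from the old one. As a small sketch with a hypothetical variable `tempo` (not used elsewhere in this notebook), the right-hand side is evaluated first and the result is then stored under the same name:

```python
tempo = 120         # hypothetical variable: a tempo in beats per minute
tempo = tempo + 20  # read the old value, add 20, and store the result
print(tempo)        # prints 140
```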


Variables can not only store numbers (called **"integers"** or **"floats"**), but also text (called **"strings"**), and **Booleans**, which can represent one of two logical values `True` or `False`. Note that some of the operators will behave differently depending on which data type they are used on. The first code cell below assigns integers to the variables a and b, checks their types, and then adds the two integers using the `+` operator. The second code cell assigns strings instead of integers to the variables. The same `+` operator, used on strings instead of integers, concatenates the two strings:

In [3]:
a = 15 # assign variable a to be an integer
b = 27 # assign variable b to be an integer
print("a: ", type(a)) # check type of variable a
print("b: ", type(b)) # check type of variable b
print("a + b = ", a+b) # print a + b

a:  <class 'int'>
b:  <class 'int'>
a + b =  42


<mark>**Task:** Change the operator to multiply the two integers a and b. </mark>

In [13]:
a = "Hello" # assign variable a to be a string
b = "World" # assign variable b to be a string
print("a: ", type(a)) # check type of variable a
print("b: ", type(b)) # check type of variable b
print("a + b = ", a+b) # print a + b

a:  <class 'str'>
b:  <class 'str'>
a + b =  HelloWorld
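
The two remaining types mentioned above, floats and Booleans, can be inspected in the same way. The following is a minimal sketch; the variable names `c` and `d` are chosen only for illustration:

```python
c = 3.5   # a float: a number with a decimal point
d = True  # a Boolean: either True or False
print("c: ", type(c))     # check type of variable c
print("d: ", type(d))     # check type of variable d
print("c > 2 = ", c > 2)  # a comparison produces a Boolean
```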


<mark>**Task:** What happens when you try to multiply two strings instead of two integers? </mark>  

<mark>**Task:** Change the operator back to `+`. Assign an integer to variable a and a string to variable b. What happens when you try to add the variables now? </mark>

### 1.2 Lists

Variables can also store multiple values in the form of a **list**. Lists are created using square brackets `[]`:

In [7]:
rhythm = ["beat", "metre", "tempo"] # assign new list to the variable "rhythm"
print(rhythm)

['beat', 'metre', 'tempo']


Python lists are ordered, which means that each item has a fixed position in the list. The position is defined by the **index**. Note that the first item has the index 0. Using the index, you can access a specific item from the list: 

In [16]:
print(rhythm[1]) # print the list item with the index 1

metre
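
A few more ways of working with indices are sketched below, using a hypothetical list `notes` so that the rhythm list stays untouched. The `len()` function returns the number of items, and negative indices count from the end of the list:

```python
notes = ["C", "D", "E", "F"]  # hypothetical list, used only for this example
print(len(notes))  # number of items in the list: 4
print(notes[0])    # the first item has the index 0: C
print(notes[-1])   # negative indices count from the end: F
```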


<mark>**Task:** Using the `variable = value` syntax and the correct list index, replace the first item of the rhythm list with "pulse". Print your list to see whether it worked. </mark>

### 1.3 Functions

For operations on variables that go beyond the simple operators `+ - * / =`, you can use **functions**. Functions take the form `functionName(arguments)`. Python comes with handy functions, like `print()` and `type()` which you have already used above. Functions take **arguments** in the function brackets. For functions like `print()` and `type()`, the only argument you need to specify is the variable that you want to print or check the type of. Other functions take several arguments. For example, for the function `round()`, the first argument specifies the object that you want to round, and the second argument specifies the number of decimal places of the output:

In [17]:
pi = 3.14159265359 
round(pi, 2) # round the variable "pi" to two decimal places

3.14

If you want to know which arguments a function takes, or what the function does with the arguments, you can call the `help()` function to read the function documentation: 

In [18]:
help(round)

Help on built-in function round in module builtins:

round(number, ndigits=None)
    Round a number to a given precision in decimal digits.

    The return value is an integer if ndigits is omitted or None.  Otherwise
    the return value has the same type as the number.  ndigits may be negative.
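
The documentation above can be checked with a short experiment. This sketch uses illustrative variable names and simply tries the three cases that the help text describes: `ndigits` omitted, positive, and negative:

```python
rounded_int = round(2.71828)            # ndigits omitted: the result is an integer
rounded_3 = round(2.71828, 3)           # keep three decimal places
rounded_hundreds = round(1234.5678, -2) # negative ndigits rounds to the nearest hundred
print(rounded_int)       # 3
print(rounded_3)         # 2.718
print(rounded_hundreds)  # 1200.0
```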



The different variable types that were introduced in [1.1. Variables](#1-1-variables) each come with their own functions, called **methods**. To use a method on a variable, instead of passing the variable as an argument, you use the syntax `variable.method()`. For example, `list.count()` counts how often a value (passed as argument) occurs in a list: 

In [23]:
student_ages = [14, 16, 15, 16, 15, 17, 16, 18, 18, 18, 16, 17]
student_ages.count(16) # count how often 16 appears in the list

4
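
Methods are not limited to lists. As a sketch, here are three string methods applied to a hypothetical variable `song_title`; note that these methods return a new string and leave the original unchanged:

```python
song_title = "hey jude"  # hypothetical string, used only for this example
print(song_title.upper())                 # HEY JUDE
print(song_title.count("e"))              # the letter "e" occurs 2 times
print(song_title.replace("hey", "dear"))  # dear jude
print(song_title)                         # the original is unchanged: hey jude
```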

<mark>**Task:** Call `help(list)` to see all methods of the variable type `list`. Try some of them on the student_ages list. </mark> 

A function is nothing but a block of code that was assigned to a function name. When calling the function name, the code is executed. This means that you can write your own functions and execute them by calling the function name that you assigned them. For example, the following code cell defines a function called `my_mean` that computes the mean of all items in a list. Note that the code for the function is indented to delimit where the function code begins and ends. 

In [28]:
def my_mean(my_list): 
    my_sum = sum(my_list) # calculate the sum of all items in the list
    my_length = len(my_list) # count how many items are in the list 
    mean_value = my_sum / my_length # calculate the mean as sum/length
    return mean_value # return the computed mean

mean_age = my_mean(student_ages)
print(mean_age)

16.333333333333332
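
The same pattern works for other calculations. As a further sketch, the hypothetical function `my_range` below returns the difference between the largest and the smallest item of a list, using the built-in functions `min()` and `max()`:

```python
def my_range(my_list):
    smallest = min(my_list)     # find the smallest item in the list
    largest = max(my_list)      # find the largest item in the list
    return largest - smallest   # return the difference between the two

tempos = [96, 120, 88, 140]  # hypothetical tempos in beats per minute
print(my_range(tempos))      # 140 - 88 = 52
```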


<mark>**Task:** Write a function that takes two integers as arguments a and b and returns their product. </mark>

<a id='1-4-libraries'></a>
### 1.4 Libraries

Luckily, you don't have to write new functions for everything you want to do in Python. Other developers have written functions for common operations and published them in **libraries**. To use the functions in a library, you first need to import the library. For example, the library NumPy is popular for numeric operations on arrays, which are similar to lists but are typically faster and more memory-efficient for numeric data. The following code cell imports NumPy (calling it "np" as an abbreviation), creates an array, adds a constant to the array, and computes the mean of the result:

In [26]:
import numpy as np # import NumPy library and call it "np"

student_ages = np.array([14, 16, 15, 16, 15, 17, 16, 18, 18, 18, 16, 17]) # create a new array
student_ages_2030 = student_ages + 6
mean_age_2030 = student_ages_2030.mean()
print(round(mean_age_2030, 1)) # print mean age in 2030

22.3
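
One practical difference between lists and arrays is how operators act on them. In this small sketch (the variable names are illustrative), multiplying a list by 2 repeats the list, while multiplying an array by 2 multiplies every item:

```python
import numpy as np

ages_list = [14, 16, 15]          # a plain Python list
ages_array = np.array(ages_list)  # the same values as a NumPy array
print(ages_list * 2)   # the list is repeated: [14, 16, 15, 14, 16, 15]
print(ages_array * 2)  # each item is doubled: [28 32 30]
```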


<a id='2-read-audio'></a>
## 2. Read an Audio File

This part saves a file to local storage and reads it back in, so we first set the working folder using the os library:

In [11]:
import os
if os.path.isdir("workshop_signalsensorsound"): # only change folder if it exists here
    os.chdir("workshop_signalsensorsound") # set the working folder
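
If changing the folder fails, it helps to check where Python is currently working and what that folder contains. A minimal sketch using two functions from the os library:

```python
import os

print(os.getcwd())      # the folder that Python is currently working in
print(os.listdir("."))  # the files and folders inside that folder
```

If "workshop_signalsensorsound" does not appear in the listing, the folder is not available from the current location.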

Let's see what Python can do with audio! First, let's download an audio file from YouTube, load it into the Python session, and explore the file.

### 2.1 Download Audio

To download an audio file from YouTube, we will use the yt-dlp library. Since Colab does not come with an installation of yt-dlp, you need to install it first:

In [8]:
pip install yt_dlp

Collecting yt_dlp
  Downloading yt_dlp-2024.11.4-py3-none-any.whl.metadata (172 kB)
Downloading yt_dlp-2024.11.4-py3-none-any.whl (3.2 MB)
Installing collected packages: yt_dlp
Successfully installed yt_dlp-2024.11.4
Note: you may need to restart the kernel to use updated packages.


Then, import the library to your current Python session:

In [9]:
import yt_dlp

Now, go to [YouTube.com](https://www.youtube.com), search for your favourite song, copy the video URL, and paste it as the argument to the `download()` method in the code cell below. Make sure to enclose the URL in quotation marks! The code cell uses the `yt_dlp.YoutubeDL()` function to set download options (`'format': 'bestaudio'` selects the best available audio-only format, rather than video, and `'outtmpl': 'myaudio.mp3'` sets the name of the output file), and the `download()` method to download from your chosen YouTube link. When you run the code cell, your song will be downloaded and stored as "myaudio.mp3" in the temporary Colab cloud storage.

In [15]:
ydl_opts = {'format': 'bestaudio', 'outtmpl': 'myaudio.mp3'} # download options: audio only, saved as myaudio.mp3
yt_dlp.YoutubeDL(ydl_opts).download(["https://www.youtube.com/watch?v=X_5D1t8Qkus"]) # download the audio from YouTube

[youtube] Extracting URL: https://www.youtube.com/watch?v=X_5D1t8Qkus
[youtube] X_5D1t8Qkus: Downloading webpage
[youtube] X_5D1t8Qkus: Downloading ios player API JSON
[youtube] X_5D1t8Qkus: Downloading mweb player API JSON
[youtube] X_5D1t8Qkus: Downloading m3u8 information
[info] X_5D1t8Qkus: Downloading 1 format(s): 251
[download] Destination: myaudio.mp3
[download] 100% of    2.75MiB in 00:00:00 at 15.61MiB/s  


0
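
To confirm that the file is where the next steps expect it, you can check for it with the os library. This is a minimal sketch; it assumes the file name "myaudio.mp3" that was set via `'outtmpl'` above:

```python
import os

audio_path = "myaudio.mp3"  # file name set via 'outtmpl' in the download options
if os.path.exists(audio_path):
    size_mib = os.path.getsize(audio_path) / (1024 * 1024)  # convert bytes to MiB
    print(f"{audio_path} found, {size_mib:.2f} MiB")
else:
    print(f"{audio_path} not found in {os.getcwd()}")
```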

### 2.2 Load Audio in Python

If you didn't get yt-dlp to work, you can use *my* favourite song for the following steps, which is already saved as "myaudio.mp3" (unless you overwrote the file with your own song in the previous step). 

For the next steps, we will use the librosa library, which provides a wide range of functions to display and analyse audio in Python. First, import librosa:

In [15]:
import librosa

Next, load myaudio.mp3 using the `librosa.load()` function. The function returns two values, which we store in two variables: y holds the waveform as a time series of amplitudes, and sr holds the sampling rate, that is, the number of samples per second of audio.

In [16]:
y, sr = librosa.load('myaudio.mp3') # store the audio in the two variables y and sr

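
To make the relation between the waveform and the sampling rate concrete, here is a sketch with a synthetic signal that stands in for loaded audio. The names `y_demo` and `sr_demo` are hypothetical and chosen so that they do not overwrite y and sr; the value 22050 is assumed because it is librosa's default sampling rate. Dividing the number of samples by the sampling rate gives the duration in seconds:

```python
import numpy as np

sr_demo = 22050  # assumed sampling rate in samples per second
seconds = 2.0    # length of the synthetic signal
t = np.arange(int(sr_demo * seconds)) / sr_demo  # time stamp of each sample in seconds
y_demo = 0.5 * np.sin(2 * np.pi * 440 * t)       # 440 Hz sine wave with amplitude 0.5

print(y_demo.shape)           # one amplitude value per sample: (44100,)
print(len(y_demo) / sr_demo)  # duration in seconds: 2.0
```

The same calculation, `len(y) / sr`, should give the duration of your own audio once it has been loaded.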

Librosa doesn't include an audio player, so we'll import a player from IPython and display a player for myaudio.mp3:

In [None]:
from IPython.display import Audio
Audio(data=y, rate=sr)