<h1>Intro to Computer Music, Lab2</h1>
<h2>Gus Xia, NYU Shanghai</h2>

In this lab you will:

1. create simple waves from scratch 
2. reproduce the audio samples used in the class
3. learn more librosa functions

Again, here is a Jupyter notebook cheat sheet:
https://www.cheatography.com/weidadeyue/cheat-sheets/jupyter-notebook/

<h2> Load packages </h2>



In [3]:
# __future__ imports must come before any other statement when this runs as a script
from __future__ import print_function # use the print() function from Python3
# To begin using librosa we need to import it, along with other tools such as matplotlib and numpy
from pylab import *        # brings sin, pi, arange, etc. into the namespace
import librosa             # The librosa library
import librosa.display     # librosa's display module (for plotting features)
import IPython.display     # IPython's display module (for in-line audio)
import matplotlib.pyplot as plt # matplotlib plotting functions
import matplotlib.style as ms   # plotting style
import numpy as np              # numpy numerical functions
ms.use('seaborn-muted')         # fancy plot designs (named 'seaborn-v0_8-muted' in matplotlib >= 3.6)

<h2> A pure sine wave </h2>


In [4]:
# create a sine wave from scratch
# try modifying some of the parameters
A = 1        # amplitude
f = 440      # frequency in Hz
# f = 440 * 11
phi = 0      # initial phase in radians
sr = 44100   # sample rate in Hz
# sr = 4410
T = 2        # duration in seconds
y = [A * sin(2*pi*f*t + phi) for t in arange(0.,T,1./sr)]
IPython.display.Audio(data=y, rate=sr) # press the "play" button to hear audio

<h2>Amplitude and loudness</h2>

In [5]:
# different amplitudes
print("the max and min values of signal y are:", max(y), min(y))
yy = []
scalers = [1./16, 1./8, 1./4, 1./2, 1, 2, 4, 8, 16]
for s in scalers:
    new = [i * s for i in y]
    # librosa.output.write_wav was removed in librosa 0.8; soundfile.write is the usual replacement
    #librosa.output.write_wav('scale'+str(s)+'.wav', np.array(new), sr)
    yy = yy + new
IPython.display.Audio(data=yy, rate=sr) # press the "play" button to hear audio

the max and min values of signal y are: 0.999999746258 -0.999999746258


<h3>Question 1: Find the signal that is barely audible to your ears. Compared with it, how much louder is the original signal y (the difference in dB)?</h3>

In [6]:
print("The sound scaled by 1/16 is the barely audible signal")
print(20 * np.log10(16), end="")
print("dB")

The sound scaled by 1/16 is the barely audible signal
24.0823996531dB
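
The dB figure above comes from the amplitude ratio through 20 * log10(ratio). As a minimal sketch, the same formula can be applied to every scale factor from the amplitude cell. The helper name `amplitude_ratio_to_db` is hypothetical and not part of the lab.

```python
import math

def amplitude_ratio_to_db(ratio):
    """Convert an amplitude ratio to decibels (hypothetical helper): 20 * log10(ratio)."""
    return 20.0 * math.log10(ratio)

# Same scale factors as in the amplitude cell
scalers = [1./16, 1./8, 1./4, 1./2, 1, 2, 4, 8, 16]
levels_db = [amplitude_ratio_to_db(s) for s in scalers]

for s, db in zip(scalers, levels_db):
    print("scale %7.4f -> %+6.2f dB relative to y" % (s, db))
```

Each doubling of the amplitude adds about 6.02 dB, so the 1/16 signal sits about 24.08 dB below the original.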


<h2>Frequency and pitch</h2>

In [7]:
def sine_wave(f, duration, sr):
    # unit-amplitude sine at f Hz lasting `duration` seconds, sampled at sr Hz
    return [sin(2*pi*f*t) for t in arange(0., duration, 1./sr)]

f0 = 300;
ratio = 2
#ratio = 2**(1./12)
yy = []
for i in range(0,13):
    y = sine_wave(f0 * (ratio**i), 1., 44100.)
    yy = yy + y
    
IPython.display.Audio(data=yy, rate=sr) 

<h3>Question 2: What do you hear? Explain what happened.</h3>

In [8]:
print("The pitch rises by one octave per step at first,")
print("because the frequency doubles in each loop iteration.")
print("Once the frequency exceeds half the sample rate (the Nyquist frequency),")
print("aliasing occurs: the tone folds back to a lower frequency, so the pitch drops or the tone seems lost.")

The pitch rises by one octave per step at first,
because the frequency doubles in each loop iteration.
Once the frequency exceeds half the sample rate (the Nyquist frequency),
aliasing occurs: the tone folds back to a lower frequency, so the pitch drops or the tone seems lost.
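
To make the aliasing concrete, the sketch below computes which frequency is actually heard for each of the 13 requested frequencies, assuming an ideal sine sampled at 44100 Hz. The helper `aliased_frequency` is a hypothetical name introduced for illustration. It folds a frequency back into the range from 0 to sr/2.

```python
def aliased_frequency(f, sr):
    """Frequency in Hz heard when a sine at f Hz is sampled at sr Hz (hypothetical helper)."""
    # Subtract the nearest multiple of the sample rate, then take the magnitude
    return abs(f - sr * round(f / float(sr)))

sr = 44100
f0 = 300
requested = [f0 * 2**i for i in range(13)]            # 300, 600, ..., 1228800 Hz
heard = [aliased_frequency(f, sr) for f in requested]

for f, h in zip(requested, heard):
    print("requested %8d Hz -> heard %6d Hz" % (f, h))
```

The first seven tones (300 Hz to 19200 Hz) stay below the 22050 Hz Nyquist frequency and are heard as requested. The remaining six fold back to 5700, 11400, 21300, 1500, 3000, and 6000 Hz.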


<h3>Question 3: If you change the ratio to 2**(1/12), what do you hear instead? Does the relationship between frequency and pitch also follow the "log-linear" law? Briefly explain based on what you heard.</h3>

In [9]:
f0 = 300;
ratio = 2**(1./12)
yyy = []
for i in range(0,13):
    y = sine_wave(f0 * (ratio**i), 1., 44100.)
    yyy = yyy + y
    
IPython.display.Audio(data=yyy, rate=sr)

In [10]:
print("The pitch rises by one semitone in each loop")
print('It follows the "log-linear" law:')
print("At step i the frequency is f0 * 2**(i/12), and the pitch is i/12 of an octave above f0")
print("Therefore log(frequency ratio) / pitch change = (i/12 * log 2) / (i/12) = log 2")
print("which is a constant")

The pitch rises by one semitone in each loop
It follows the "log-linear" law:
At step i the frequency is f0 * 2**(i/12), and the pitch is i/12 of an octave above f0
Therefore log(frequency ratio) / pitch change = (i/12 * log 2) / (i/12) = log 2
which is a constant
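
The log-linear relationship can also be checked numerically. This is a minimal sketch in which `semitones_above` is a hypothetical helper that converts a frequency into a pitch distance of 12 * log2(f / f_ref) semitones. If the law holds, the 13 tones should land on the integers 0 through 12.

```python
import math

def semitones_above(f, f_ref):
    """Pitch distance from f_ref to f in equal-tempered semitones (hypothetical helper)."""
    return 12.0 * math.log2(f / f_ref)

f0 = 300.0
ratio = 2 ** (1.0 / 12)

# Frequencies grow geometrically, so the pitch should grow linearly
steps = [semitones_above(f0 * ratio**i, f0) for i in range(13)]

for i, s in enumerate(steps):
    print("tone %2d: %8.2f Hz, %5.2f semitones above f0" % (i, f0 * ratio**i, s))
```

Equal frequency ratios give equal pitch steps, which is why the sequence sounds like an evenly spaced chromatic scale.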


<h3>Question 4: Try to re-create the audio samples on page 41 of the lecture slides. Do a single-blind test and see whether your friend can tell the difference between 2-bit, 4-bit, 8-bit, and 16-bit encoding.</h3>

In [27]:
y = sine_wave(441, 2, 44100)
y = np.asarray(y)
import time

In [28]:
print("Level Method")
print("An n-bit code indexes one of 2**n evenly spaced levels in [-1, 1]; the code is only an index and has no sign or magnitude meaning")
def build(code):
    # 2**code evenly spaced quantization levels from -1 to 1 inclusive
    return np.linspace(-1.0, 1.0, 2**code)

def encode(array, value):
    # Snap every sample in `array` to the nearest level in `value`
    diff = np.full_like(array, 4)                # smallest distance so far (4 exceeds any distance in [-1, 1])
    idx = np.zeros(array.shape, dtype=np.int64)  # index of the nearest level so far
    for i in range(value.shape[0]):
        cur_diff = np.abs(array - value[i])
        closer = diff > cur_diff                 # samples that are closer to level i
        idx[closer] = i
        diff[closer] = cur_diff[closer]
    return np.take(value, idx)

Level Method
An n-bit code indexes one of 2**n evenly spaced levels in [-1, 1]; the code is only an index and has no sign or magnitude meaning
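
The search over all levels gets slow at 16 bits, since every sample is compared against 65536 levels. Because the levels are evenly spaced, the nearest level can instead be found in closed form by rounding. The sketch below shows this for a single sample. The function name `quantize_uniform` is hypothetical, and it assumes the same 2**bits levels spanning [-1, 1] as `build`.

```python
def quantize_uniform(x, bits):
    """Snap x in [-1, 1] to the nearest of 2**bits evenly spaced levels (hypothetical helper)."""
    n_steps = 2 ** bits - 1                   # number of intervals between the levels
    step = 2.0 / n_steps                      # spacing between adjacent levels
    idx = int(round((x + 1.0) / step))        # index of the nearest level
    idx = min(max(idx, 0), n_steps)           # guard against inputs slightly out of range
    return -1.0 + idx * step

for bits in (2, 4, 8, 16):
    print("%2d-bit: 0.2 -> %.6f" % (bits, quantize_uniform(0.2, bits)))
```

A sample that lies exactly halfway between two levels may be rounded to a different neighbour than the loop-based search would pick. Both choices are equally close.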


In [30]:
start = time.time()
encoder = build(2)
result = encode(y, encoder)
print("--- %s seconds ---" % (time.time() - start))
IPython.display.Audio(data=result, rate=sr)

--- 0.0051839351654052734 seconds ---


In [31]:
start = time.time()
encoder = build(4)
result = encode(y, encoder)
print("--- %s seconds ---" % (time.time() - start))
IPython.display.Audio(data=result, rate=sr)

--- 0.021405935287475586 seconds ---


In [32]:
start = time.time()
encoder = build(8)
result = encode(y, encoder)
print("--- %s seconds ---" % (time.time() - start))
IPython.display.Audio(data=result, rate=sr)

--- 0.2017202377319336 seconds ---


In [None]:
start = time.time()
encoder = build(16)
result = encode(y, encoder)
print("--- %s seconds ---" % (time.time() - start))
IPython.display.Audio(data=result, rate=sr)

In [49]:
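print("Encoding Method")
print("The first bit of the n-bit code is the sign: 0 means positive, 1 means negative")
print("The code 100...0 (e.g. 1000 for 4 bits) is ignored because it means negative zero")
print("The remaining n-1 bits give magnitudes 0 to 2**(n-1) - 1, so an n-bit code stores 2**n - 1 distinct values")
def encode(array, code):
    max_int = 2 ** (code - 1) - 1        # largest magnitude that fits in code-1 bits
    temp = np.around(array * max_int)    # scale to the integer range and round
    temp = np.int8(temp) if code <= 8 else np.int16(temp)  # store in a fitting integer type
    return temp / max_int                # rescale to [-1, 1] for playback

Encoding Method
The first bit of the n-bit code is the sign: 0 means positive, 1 means negative
The code 100...0 (e.g. 1000 for 4 bits) is ignored because it means negative zero
The remaining n-1 bits give magnitudes 0 to 2**(n-1) - 1, so an n-bit code stores 2**n - 1 distinct values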


In [50]:
result = encode(y, 2)
IPython.display.Audio(data=result, rate=sr)

In [51]:
result = encode(y, 4)
IPython.display.Audio(data=result, rate=sr)

In [52]:
result = encode(y, 8)
IPython.display.Audio(data=result, rate=sr)

In [53]:
result = encode(y, 16)
IPython.display.Audio(data=result, rate=sr)
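
Alongside the listening test, the four bit depths can be compared by their signal-to-quantization-noise ratio (SNR). The sketch below re-implements the rounding scheme from the encoding method in plain Python for one second of the 441 Hz sine. The helpers `encode_sample` and `snr_db` are hypothetical names, and the result is only a rough objective companion to what you hear.

```python
import math

def encode_sample(x, bits):
    """Round x in [-1, 1] to a signed integer grid with 2**(bits-1) - 1 as the maximum."""
    max_int = 2 ** (bits - 1) - 1
    return round(x * max_int) / float(max_int)

def snr_db(signal, quantized):
    """Ratio of signal power to quantization-noise power, in dB."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - q) ** 2 for s, q in zip(signal, quantized))
    return 10.0 * math.log10(signal_power / noise_power)

sr = 44100
y = [math.sin(2 * math.pi * 441 * n / float(sr)) for n in range(sr)]  # 1 second at 441 Hz

snr = {}
for bits in (2, 4, 8, 16):
    snr[bits] = snr_db(y, [encode_sample(s, bits) for s in y])
    print("%2d-bit: SNR = %.1f dB" % (bits, snr[bits]))
```

As a rule of thumb, each extra bit adds roughly 6 dB of SNR. That is consistent with 2-bit and 4-bit versions sounding clearly distorted while 8-bit and 16-bit versions are much harder to tell apart.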