Sound as linear information

Original lab written by: Emily J. King

Goals: Use code to understand digital pure tones and digital songs as vectors, and to hear how basic linear operations affect their sounds.

Additional files needed: LinearDatasound.npz, located in the same folder as this ipynb file or on the path.

NOTE: Due to multiple song files being played, do not use "run all".

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Audio # needed to play audio

We saw in a previous lab (Chapter_3_Lab) how to visualize vectors as function plots. Let's recall what this looks like because we will use this visualization a lot here.

In [None]:
a=np.array([1, 3, -1, -2, 4, 3])
plt.plot(a,'-o')

We are now about to use that same type of visualization for the first 1000 entries of a much longer vector (44100 entries).  This vector consists of the values of a sine function at evenly spaced points (taking such values is called "sampling").  Why a sine function?  Because sounds are made up of linear combinations of sine waves.

In [None]:
fs = 44100
freq = 200
pure=np.sin(2*np.pi*np.arange(1, fs + 1)/(fs/freq))
plt.plot(pure[:1000])
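
The sampling expression in the cell above can be unpacked into "time of each sample" and "sine evaluated at that time". The following is a minimal sketch of that reading; the helper name sample_sine is hypothetical and not part of the original lab.

```python
import numpy as np

def sample_sine(freq, fs, n_samples):
    """Sample sin(2*pi*freq*t) at times t = k/fs for k = 1, ..., n_samples."""
    k = np.arange(1, n_samples + 1)   # sample indices, starting at 1 as in the lab
    t = k / fs                        # time of each sample, in seconds
    return np.sin(2 * np.pi * freq * t)

fs = 44100      # samples per second
freq = 200      # oscillations per second (Hz)
tone = sample_sine(freq, fs, fs)      # fs samples = one second of sound

# The lab's expression gives the same vector, up to floating-point rounding
lab_version = np.sin(2 * np.pi * np.arange(1, fs + 1) / (fs / freq))

samples_per_cycle = fs / freq             # how many entries one oscillation spans
cycles_in_plot = 1000 / samples_per_cycle # oscillations visible in the first 1000 entries
print(samples_per_cycle, cycles_in_plot)
```

So fs/freq is the number of entries per oscillation, and a higher frequency packs more oscillations into the same 1000 entries.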

Now we are going to literally listen to that vector, which encodes a pure tone.


In [None]:
Audio(pure, normalize=False, rate=fs) 

Next, we will scale the vector by scalar multiplying with 0.25.  What do you think will happen?

In [None]:
plt.plot(0.25 * pure[:1000])
plt.ylim([-1, 1])
plt.show()

What do you think the scalar multiple will sound like?


In [None]:
Audio(0.25*pure, normalize=False, rate=fs) 

What do you think will happen if we scalar multiply by other positive scalars?
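
One way to put numbers on this question is to measure the largest absolute entry (the peak) of each scaled vector. The sketch below is an illustration only; the helper peak and the list of scalars are assumptions, and the decibel formula 20*log10(ratio) is the standard way to express an amplitude ratio.

```python
import numpy as np

fs = 44100
freq = 200
pure = np.sin(2 * np.pi * freq * np.arange(1, fs + 1) / fs)

def peak(v):
    """Largest absolute entry of the vector."""
    return np.max(np.abs(v))

for c in [0.25, 0.5, 1.0]:
    scaled = c * pure
    # Level change relative to the original vector, in decibels
    change_db = 20 * np.log10(peak(scaled) / peak(pure))
    print(f"c = {c}: peak = {peak(scaled):.3f}, change = {change_db:.1f} dB")
```

The shape of the wave is unchanged; only its height (and so its loudness) is scaled.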

Now let's create a vector of another pure tone, this one with a different frequency ("bounciness").

In [None]:
freq2 = 500
pure2=np.sin(2*np.pi*np.arange(1, fs + 1)/(fs/freq2))
plt.plot(pure2[:1000])

How does this sound different from the first pure tone?

In [None]:
Audio(pure2, normalize=False, rate=fs) 

And what happens when we add the vectors?


In [None]:
plt.plot(pure[:1000]+pure2[:1000])

What do you think the vector addition will sound like?

In [None]:
Audio(pure+pure2, rate=fs) 
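
To check that the sum really contains both tones, one option is to look at it with a discrete Fourier transform. This goes beyond what the lab itself does and is only a sketch: it relies on the vectors holding exactly one second of samples, so that bin j of the transform corresponds to j Hz.

```python
import numpy as np

fs = 44100
k = np.arange(1, fs + 1)
pure = np.sin(2 * np.pi * 200 * k / fs)
pure2 = np.sin(2 * np.pi * 500 * k / fs)
mix = pure + pure2

# Magnitude of each frequency component (bin j <-> j Hz for one second of samples)
spectrum = np.abs(np.fft.rfft(mix))
strongest = np.sort(np.argsort(spectrum)[-2:])   # indices of the two largest bins
print(strongest)

# The entries of the sum can leave the interval [-1, 1]
print(np.max(np.abs(mix)))
```

The two strongest components sit at 200 Hz and 500 Hz, the frequencies of the two summands. The peak of the sum exceeds 1, which is presumably why the Audio call above leaves normalization on.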

And now what happens if we multiply each entry by -1?

In [None]:
plt.plot(pure[:1000], label='pure')
plt.plot(-pure[:1000], label='-pure')
plt.legend()
plt.show()

What do you think the negation will sound like?


In [None]:
Audio(-pure, normalize=False, rate=fs) 

Both the pure tone and its negation should sound really similar.  This makes sense as they have the exact same "bounciness".  (If you remember trig identities, since they are coming from sine waves, they are shifts of each other.)
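
The trig identity in question is -sin(t) = sin(t + pi): negating a sine wave is the same as shifting it by half an oscillation. A quick numerical check, as a sketch using the same sampling as the pure tone above:

```python
import numpy as np

fs = 44100
freq = 200
theta = 2 * np.pi * freq * np.arange(1, fs + 1) / fs   # angle at each sample
pure = np.sin(theta)

shifted = np.sin(theta + np.pi)            # same wave, shifted by half an oscillation
max_diff = np.max(np.abs(-pure - shifted)) # largest entrywise difference
print(max_diff)                            # tiny, due only to floating-point rounding
```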

Why might negation be useful? Let's listen to what happens when we use vector addition to add the pure tone and its negation.

In [None]:
Audio(pure+(-pure), normalize=False, rate=fs) 

In [None]:
plt.plot(pure[:1000]+(-pure[:1000]))  # every entry is 0, so the plot is a flat line

What's going on?  We've already seen algebraically that for ANY vector, adding it to its scalar multiple by -1 results in the zero vector.  This is essentially how noise cancelling headphones work: the background noise is estimated, and its negation is generated to cancel it out.  Since it is easier to estimate sounds that are repeated (e.g., the constant drone inside an airplane), those sounds are more easily cancelled.
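
The sketch below illustrates why the quality of the estimate matters. It is a toy model, not how any particular headphone works: the "drone" is just the 200 Hz pure tone, and the imperfect estimate is assumed to be the exact negation arriving one sample too late.

```python
import numpy as np

fs = 44100
freq = 200
k = np.arange(1, fs + 1)
drone = np.sin(2 * np.pi * freq * k / fs)     # stand-in for a steady background hum

perfect_anti = -drone                                   # exact negation
late_anti = -np.sin(2 * np.pi * freq * (k - 1) / fs)    # negation, one sample late

residual_perfect = drone + perfect_anti       # what is left after perfect cancellation
residual_late = drone + late_anti             # what is left when the estimate is late

print(np.max(np.abs(residual_perfect)))       # exactly zero
print(np.max(np.abs(residual_late)))          # small, but not zero
```

With the exact negation the sum is the zero vector; with the slightly late one, a faint remainder of the drone survives.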

Speaking of noise, let us generate an example of what is called "noise" in signal processing.  The 0.1 factor makes the noise small enough to disrupt, but not overpower, the pure tone when we later mix them.

In [None]:
noise = 0.1 * np.random.randn(*pure.shape)
plt.plot(noise[:1000])

Now, let's mix the pure tone and the noise using vector addition.


In [None]:
plt.plot(pure[:1000]+noise[:1000])

In [None]:
Audio(pure+noise, rate=fs) 
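
How much louder is the tone than the noise? A common measure is the signal-to-noise ratio, which compares the average squared entry (the "power") of each vector. This is a minimal sketch; the seeded generator and the helper power are assumptions made so the numbers are reproducible, whereas the lab itself uses np.random.randn.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption, for reproducibility)
fs = 44100
pure = np.sin(2 * np.pi * 200 * np.arange(1, fs + 1) / fs)
noise = 0.1 * rng.standard_normal(pure.shape)

def power(v):
    """Average of the squared entries of the vector."""
    return np.mean(v ** 2)

snr_db = 10 * np.log10(power(pure) / power(noise))   # ratio in decibels
print(power(pure), power(noise), snr_db)
```

The tone has power 0.5 and the noise has power close to 0.1 squared, so the tone dominates by roughly a factor of 50 (about 17 dB). Changing the 0.1 factor changes this ratio.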

Now, we are switching from pure tones to actual song recordings.  The vectors have a million and one entries each, and we will plot them in full.

These (and many digital songs that you listen to) are actually two vectors (channels), basically one vector for your left headphone and another for your right.  We plot both vectors for the first song so you can see that they are similar; after that, we will plot only one.  However, all of the linear operations will be performed on both channels.

In [None]:
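npzfile=np.load('LinearDatasound.npz')
locals().update(npzfile) # makes a variable for each array stored in the file (x, y, z, u, v)
fs = 44100 # sampling rate
L = 1000001 # number of entries in each channel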

Song credits:

x = clip from King Gizzard and the Lizard Wizard - Flightless Records - 2017
y = clip from Björk - Hunter - OLI Records - 1998
z = clip from Black Star - Definition - Rawkus Records - 1999
u = clip from Johnny Cash - So Doggone Lonesome - Sun Records - 1955
v = clip from Vicente Fernández - Guadalajara (remasterizado) - 2006

In [None]:
plt.plot(x)   
plt.xlim([1, L])
plt.legend(['Left channel', 'Right channel'])
plt.show()

Let's listen to the song.  

In [None]:
Audio(x.T, normalize=False, rate=fs) 

Note that the rate argument of Audio tells Python how quickly to play the vector, which represents measurements over time, here at CD quality (44.1 kHz, i.e., 44100 entries per second).
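
The calls above use x.T for listening and z[:,0] for plotting, which suggests the songs are stored with one row per sample and one column per channel, while Audio expects one row per channel. The sketch below shows that layout on a synthetic stand-in; the array song_demo and its two tones are made up for illustration and are not taken from the data file.

```python
import numpy as np

fs = 44100
n = 2 * fs                                  # two seconds of samples
t = np.arange(n) / fs
left = 0.5 * np.sin(2 * np.pi * 200 * t)    # hypothetical left channel
right = 0.5 * np.sin(2 * np.pi * 300 * t)   # hypothetical right channel

song_demo = np.column_stack((left, right))  # shape (samples, channels)
channels_first = song_demo.T                # shape (channels, samples), as passed to Audio
duration_seconds = song_demo.shape[0] / fs  # number of samples divided by the rate

print(song_demo.shape, channels_first.shape, duration_seconds)
```

By the same arithmetic, a clip with 1000001 entries per channel played at 44100 entries per second lasts about 22.7 seconds.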

What do you think will happen if we scalar multiply the song vector by 0.25?

In [None]:
Audio(0.25*x.T, normalize=False, rate=fs) 

What do you think 4x will sound like?  (We can't listen to it directly here: with normalize=False, Audio requires every entry to lie between -1 and 1, and with normalization turned on, the vector is rescaled before it is played, which undoes the scaling.)
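
The range issue can be seen numerically without playing anything. The sketch below uses the 200 Hz pure tone as a stand-in for the song and shows two possible workarounds, neither of which comes from the lab: hard clipping, which keeps the loudness but distorts the wave, and rescaling by the peak, which keeps the wave but throws away the factor of 4.

```python
import numpy as np

fs = 44100
pure = np.sin(2 * np.pi * 200 * np.arange(1, fs + 1) / fs)

loud = 4 * pure
out_of_range = np.max(np.abs(loud)) > 1       # True: entries leave [-1, 1]

clipped = np.clip(loud, -1, 1)                # flatten everything beyond +-1 (distortion)
rescaled = loud / np.max(np.abs(loud))        # shrink back so that the peak is 1

print(out_of_range)
print(np.mean(np.abs(clipped) == 1))          # fraction of entries that were flattened
print(np.allclose(rescaled, pure / np.max(np.abs(pure))))
```

The last line confirms that rescaling 4 times the vector gives the same result as rescaling the vector itself, so the two would sound identical.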

Now let's listen to a song from a very different genre.

In [None]:
Audio(z.T, normalize=False, rate=fs) 

And compare the plot of the song vector.

In [None]:
plt.plot(z[:,0])

What do you think will happen if we add the two vectors? (Actually, we will add together the first channel vectors to make a new first channel and then add together the second channel vectors to make a new second channel.)  This will actually sound discordant since the songs are very different from each other.

In [None]:
Audio(x.T+z.T, rate=fs) 

Now let's try a different linear combination.

In [None]:
a=2
b=0.5
Audio(a*x.T+b*z.T, rate=fs) 
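
Because the songs have two channels, the linear combination above acts on each channel separately. The following sketch checks this on small made-up arrays; x_demo and z_demo are hypothetical stand-ins with the same (samples, channels) layout, not the actual songs.

```python
import numpy as np

rng = np.random.default_rng(1)
x_demo = rng.uniform(-1, 1, size=(5, 2))   # 5 samples, 2 channels
z_demo = rng.uniform(-1, 1, size=(5, 2))
a, b = 2, 0.5

combo = a * x_demo + b * z_demo            # combination of the two-channel arrays

# The same combination, computed one channel (column) at a time
left = a * x_demo[:, 0] + b * z_demo[:, 0]
right = a * x_demo[:, 1] + b * z_demo[:, 1]
same_per_channel = np.allclose(combo, np.column_stack((left, right)))

# Transposing before or after combining makes no difference
same_transposed = np.allclose(combo.T, a * x_demo.T + b * z_demo.T)
print(same_per_channel, same_transposed)
```

Here a = 2 and b = 0.5 mean the first song enters the mix at four times the amplitude of the second.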

Now we add the pure tone from above to the very beginning of the first song.  (We first make two copies of the pure tone so it has two channels.)

In [None]:
Audio(np.vstack((pure, pure)) + x[:len(pure), :].T, rate=fs) # np.vstack replaces np.row_stack, which is deprecated in NumPy 2.0
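
Vector addition needs both summands to have the same shape, which is why the song is cut down to the length of the pure tone above. The sketch below walks through the shapes and also shows one possible alternative, padding the tone with zeros so that the whole song is kept. The array song_demo is a made-up stand-in for x, and the padding step is not part of the lab.

```python
import numpy as np

fs = 44100
pure = np.sin(2 * np.pi * 200 * np.arange(1, fs + 1) / fs)

# Hypothetical stand-in for a song: 3 seconds of quiet noise, shape (samples, 2)
rng = np.random.default_rng(2)
song_demo = 0.05 * rng.standard_normal((3 * fs, 2))

stereo_tone = np.vstack((pure, pure))        # shape (2, fs): same tone in both channels
song_start = song_demo[:len(pure), :].T      # first second of the song, shape (2, fs)
mixed_start = stereo_tone + song_start       # shapes match, so entries add one by one

# Alternative: pad the tone with zeros (silence) out to the full song length
padded_tone = np.zeros_like(song_demo.T)     # shape (2, 3 * fs)
padded_tone[:, :len(pure)] = pure            # copy the tone into both channels
mixed_full = song_demo.T + padded_tone       # tone for one second, then the song alone

print(mixed_start.shape, mixed_full.shape)
```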

So, performing linear operations on digital sound vectors is useful (and possibly informative, depending on the context).

Summarizing:

Scalar multiplying by a scalar > 1: Makes the sound louder

Scalar multiplying by a positive scalar < 1: Makes the sound quieter

Scalar multiplying by negative 1: Makes a similar sound that can be used to cancel out the original sound

Vector addition: Mixes the sounds

Linear combination: Mixes the sounds in potentially different proportions