# Introduction to Python et al., Working with Audio Signals

[return to main page](index.ipynb)

For most of the exercises, we will use the very popular programming language [Python](https://www.python.org) together with a few external libraries from the [Scientific Python Stack](http://scipy.org).
To get started, have a look at this [Python Introduction](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/intro-python.ipynb) and this [simple signal processing example](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/simple-signals.ipynb).

Note that Python is not the only option for the kind of tasks that we'll tackle here.
If you are interested in some alternatives, have a look at [Julia](http://julialang.org/), [R](http://www.r-project.org/), [Octave](http://octave.org/) or [Scilab](http://www.scilab.org/).
All the mentioned applications are open-source software and there are of course even more alternatives (free and proprietary).

Most of the exercises in this course (including the one you're reading right now) are presented as [IPython notebooks](http://ipython.org/notebook.html).
They can be [viewed online](http://nbviewer.ipython.org/github/spatialaudio/communication-acoustics-exercises/blob/master/index.ipynb), but it makes much more sense to download them and run them with [IPython](http://ipython.org/).
To get an idea what IPython is all about, have a look at this [IPython Introduction](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/intro-ipython.ipynb).
You should check out the terminal version of IPython (use the command `ipython3`) as well as the Qt console (start it with `ipython3 qtconsole`).
But to use this very notebook, start the IPython notebook server (which will open a browser window for you) with the command:

    ipython3 notebook

ATTENTION: always make sure to run Python version 3.x and *not* Python 2.x!

## Notebook Cells

This notebook consists of so-called "cells", which can be used for normal text (see above) or for Python code (see below).
A *code cell* can be selected with a mouse click; its code can then be edited and executed by pressing *Shift+Enter* or by clicking the "play" symbol in the top part of the page.

Don't be shy, try it:

In [None]:
50 - 5 * 4 + 12

Code cells can have multiple lines (use *Enter* for line breaks).
When the code cell is executed, all lines are executed, but only the value of the last line is displayed (strictly speaking, of the last *statement*, and only if it's an *expression statement*, since that's the only kind of statement that yields a value when evaluated). Nothing is displayed if its value is `None`.

Here's another code cell for you to play with:

New cells can be inserted by pressing the *a* or *b* keys (to insert *above* or *below* the current cell) or via the menu. You should also have a look at "Help" -> "Keyboard Shortcuts".

You can step through all cells in the notebook by repeatedly pressing *Shift+Enter*.
Alternatively, you can click "Run All" in the "Cell" menu.

## Importing Modules/Packages

In order to work with numeric arrays (in our case mainly audio signals), we import the [NumPy](http://www.numpy.org) package.

In [None]:
import numpy as np

Now we can use all NumPy functions (by prefixing `np.`).

In [None]:
np.zeros(10000)

## Tab Completion

*Exercise:* Type `np.ze` and then hit the *Tab* key ...

## Array, Vector, Matrix

Audio signals can be stored in NumPy *arrays*.
Arrays can have arbitrarily many dimensions, but let's use only one-dimensional arrays for now.
Arrays can be created like this:

In [None]:
a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Note that the result is not displayed when you assign to a variable (because assignment is a *statement* and not an *expression*).
To show the data, write the variable name separately as the last (or only) line of a code cell.

In [None]:
a

BTW, there is an easier way to get this particular array:

In [None]:
b = np.arange(10)
b

Note that the range starts with `0` and ends just before the given stop value!

If you are not used to programming, this might seem strange at first sight, but you'll see that this is vastly superior to starting with `1` and including the stop value.
If you're not convinced yet, have a look at [what E. W. Dijkstra has to say](http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html).
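
A small sketch of why the half-open convention is convenient (the variable names are made up for illustration): the number of elements is simply `stop - start`, and adjacent ranges fit together without overlap or gap.

```python
import numpy as np

n = 10
first = np.arange(0, 4)    # 0, 1, 2, 3
second = np.arange(4, n)   # 4, 5, ..., 9

# The number of elements is simply stop - start ...
print(len(first), len(second))

# ... and adjacent ranges fit together without overlap or gap
joined = np.concatenate([first, second])
print(np.array_equal(joined, np.arange(n)))
```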

## Getting Help

If you want to know details about the usage of [`np.arange()`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html) and all its supported arguments, have a look at its help text.
Just append a question mark to the function name:

In [None]:
np.arange?

A help window should open in the lower part of the browser window.
This window can be closed by hitting the *q* key (like "quit").

Let's get some more help:

In [None]:
np.zeros?

You can also get help for the whole NumPy package:

In [None]:
np?

You can get help for any object in the namespace by appending a question mark to the name of the object.
Let's check what the help system can tell us about our variable `a`:

In [None]:
a?

The help system may come in handy when solving the following exercises ...

## `np.arange()`

We'll often need sequences of evenly spaced numbers, so let's create some.

*Exercise:* Create a sequence of numbers with `np.arange()`, starting with 0 and up to (but not including) 6 with a step size of 1.

*Exercise:* Create a sequence of numbers with `np.arange()`, starting with 0 and up to (but not including) 0.6 with a step size of 0.1.

*Exercise:* Create a sequence of numbers with `np.arange()`, starting with 0.5 and up to (but not including) 1.1 with a step size of 0.1.

The previous exercise is somewhat tricky.
If you got it right, have a look at [arange considered harmful](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/misc/arange.ipynb) for what you *could have* done wrong.
If you got an unexpected result, have a look at [arange considered harmful](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/misc/arange.ipynb) for an explanation.

*Exercise:* Can you fix the problem?

What do we learn from all this?
$\Rightarrow$
`np.arange()` is great, but use it only with integer step sizes!
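
One possible way to follow this rule is sketched below; it is only one of several options, and the names `risky` and `safe` are made up for illustration. The idea is to let `np.arange()` count in integers and to do the scaling afterwards.

```python
import numpy as np

# Float step: the number of elements depends on rounding errors
risky = np.arange(0.5, 1.1, 0.1)

# Integer step, scaled afterwards: the number of elements is exact
safe = np.arange(5, 11) / 10

print(len(risky), len(safe))
print(safe)
```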

## `np.linspace()`

Another, slightly different method to create a sequence of evenly spaced numbers is [`np.linspace()`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html).
Have a look at the documentation.

In [None]:
np.linspace?

*Exercise:* Create a sequence of numbers with `np.linspace()`, starting with 0 and up to (including) 6 with a step size of 1.

Note that the resulting array will have a *floating point* data type even if all inputs (and the step size) are integers.
This is not the case with `np.arange()`.
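
The difference in data types can be checked with a short sketch like this one (the values are chosen arbitrarily and differ from the exercises on purpose):

```python
import numpy as np

lin = np.linspace(10, 20, num=3)  # 3 points, both end points included
ara = np.arange(10, 21, 5)        # integer start, stop and step

print(lin.dtype)  # a floating point type
print(ara.dtype)  # an integer type (the exact width depends on the platform)
print(np.array_equal(lin, ara))  # same values, different data types
```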

*Exercise:* Create a sequence of numbers with `np.linspace()`, starting with 0 and up to (but not including) 6 with a step size of 1.

*Exercise:* Create a sequence of numbers with `np.linspace()`, starting with 0 and up to (but not including) 0.6 with a step size of 0.1.

*Exercise:* Create a sequence of numbers with `np.linspace()`, starting with 0.5 and up to (but not including) 1.1 with a step size of 0.1.

Note that `np.linspace()` doesn't have the above-mentioned problem we had with `np.arange()`.

## `np.arange()` vs. `np.linspace()`

*Exercise:* Find some examples where `np.arange()` works better and some where `np.linspace()` should be preferred.

## A Shorthand for `np.arange()` and `np.linspace()`

If you want to save a few keystrokes and make your code a little more obscure, have a look at [`np.r_[]`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html).

In [None]:
np.r_?
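
As a rough sketch of what this shorthand does (the variable names are only for illustration): a plain slice inside `np.r_[]` behaves like `np.arange()`, while an imaginary number in the "step" position is interpreted as the number of points, similar to `np.linspace()`.

```python
import numpy as np

# A plain slice behaves like np.arange(start, stop)
a = np.r_[0:6]

# An imaginary "step" means "number of points", like np.linspace()
# (the stop value is included in this case)
b = np.r_[0:1:5j]

print(a)
print(b)
```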

## Creating a Sine Tone

Now let's create our first audio signal, shall we?

Let's generate a signal using the equation $y(t) = A\sin(\omega t)$ with $\omega = 2\pi f$ and $f$ being the frequency of the sine tone.
The maximum signal amplitude is given by $A$.
The variable $t$ obviously represents time.
Let's create a digital signal with evenly spaced values for $t$.
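
Written out for a sampling rate $f_s$ and a sample index $n$ (both symbols are introduced here only for illustration), evenly spaced time values and the resulting discrete signal could look like this:

$$t_n = \frac{n}{f_s}, \qquad y[n] = y(t_n) = A\sin\left(2\pi f\,\frac{n}{f_s}\right), \qquad n = 0, 1, 2, \ldots$$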

We can use the function [`np.sin()`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html) to create a sine tone. Let's look at its help text first.

In [None]:
np.sin?

Now that we know which function to call, we need appropriate input.
And that's where our sequences of evenly spaced values from above come into play.

The nice thing about NumPy functions like `np.sin()` is that they can operate on whole arrays at once, so it is not necessary to call the function on each single value separately.
Therefore, we can store the whole range of values for our time variable $t$ in one array.

According to the equation, each value of $t$ has to be multiplied by (the constant) $\omega$.
That's another nice thing about NumPy: we don't have to multiply every value of the array $t$ separately by $\omega$. Instead, we can multiply the whole array by a scalar at once, and NumPy does the element-wise multiplication for us.
This is called ["broadcasting"](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html), in case you stumble upon that in the docs.
The array returned by `np.sin()` can (again using broadcasting) be multiplied by the constant scalar $A$ to get the final result.
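
To illustrate the idea with a different function, here is a small sketch that compares a plain Python loop with the vectorized NumPy version. It uses `np.cos()` and made-up values, so it is not the solution of the exercise below.

```python
import math
import numpy as np

x = np.array([0.0, 0.25, 0.5, 0.75])

# Plain Python: one function call per value
looped = [2 * math.cos(math.pi * value) for value in x]

# NumPy: the scalars are broadcast over the whole array
vectorized = 2 * np.cos(np.pi * x)

print(np.allclose(looped, vectorized))
```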

The only thing that's still missing is $\pi$, but that's simple:

In [None]:
np.pi

*Exercise:* Create a sine tone with a frequency of 500 Hz, a duration of 1 second and an amplitude of 0.3.
Use a sampling rate of 44.1 kHz.
The first value of $t$ should be 0.

In [None]:
dur = 1  # duration in seconds
amp = 0.3  # maximum amplitude
freq = 500  # frequency of the sine tone in Hertz
fs = 44100  # sampling frequency in Hertz

# t = ???

# y = ??? * np.sin(??? * t)

What happens if `dur` is not an integer?  
What happens if `dur` is not an integer multiple of `1/fs`?

*Exercise:* Try if your code still works in those cases.

## Array Properties

Let's have a quick look at $y$.

In [None]:
y

That's just a bunch of numbers; maybe we can get some more information ...

*Exercise:* Try these different ways to obtain the size of the resulting array:

In [None]:
len(y)

In [None]:
y.shape

In [None]:
y.size

*Exercise:* What's the difference between them?

In [None]:
len?

In [None]:
np.ndarray.shape?

In [None]:
np.ndarray.size?

*Exercise:* The array holds a lot more information. Try the following commands and find out what they mean.

In [None]:
y.ndim

In [None]:
y.dtype

In [None]:
y.itemsize

In [None]:
y.nbytes

In [None]:
y.size * y.itemsize

In [None]:
y.strides

In [None]:
y.flags

You can also get some statistical values about the data in the array.
Those are not very interesting for our sine tone, but it's still good to know that these functions exist.

In [None]:
y.max()

In [None]:
y.min()

In [None]:
np.ptp(y)  # peak-to-peak value; the y.ptp() method was removed in NumPy 2.0

In [None]:
y.mean()

In [None]:
y.std()

In [None]:
y.var()

Most of these *methods* also exist as *functions*, e.g.

In [None]:
np.max(y)
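
A short sketch of one practical difference between the two forms (using a made-up list of numbers): the function form also accepts plain Python lists, which don't have those methods.

```python
import numpy as np

values = [3, 1, 4, 1, 5]  # a plain list, not an array
arr = np.array(values)

# Method and function give the same result for an array ...
print(arr.max() == np.max(arr))

# ... but only the function form also accepts a list
print(np.mean(values))
print(hasattr(values, 'mean'))
```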

## Plotting

NumPy cannot plot by itself, it needs some help from [`matplotlib`](http://matplotlib.org/).

In [None]:
import matplotlib.pyplot as plt

Now we can plot the data from our array:

In [None]:
plt.plot(y)

The peculiar thing about `matplotlib` is that it doesn't actually show the plot!

But it's in there somewhere ... we just have to use `show()` to reveal it:

In [None]:
plt.show()

Now the plot window should be visible.
Check out [all those fancy buttons](http://matplotlib.org/users/navigation_toolbar.html)!

*Exercise:* Zoom into the plot so that only one period of the sine tone is visible.
There are different ways to do that.

Note that while the plot window is open, we cannot use any of the code cells in our notebook.
You have to close the plot window to be able to continue.
This is annoying.
As is typing `plt.show()` all the time.

That's why the IPython notebook provides this special (a.k.a. "magic") command:

In [None]:
%matplotlib

After that, you don't have to use `show()` anymore, and you can run code cells even if the plot window is open.

Try it:

In [None]:
plt.plot(y)

Nice, isn't it?

There is still something ugly going on:
The plotting functions return some strange object(s) which we don't need (for now).
Those can be suppressed by appending a semicolon to the last statement of a code cell (it has no effect after any other statement):

In [None]:
plt.plot(y);

Very nice!

The plot window is very useful for quick interactive exploration, but sometimes you might want to save the plots within the notebook.
That can be done in *inline mode*:

In [None]:
%matplotlib inline

Anything you plot now will be shown right below the code cell.
Let's only plot a part of the signal this time.

In [None]:
plt.plot(y[:300]);

Inline plots are saved (as images) within the notebook.

If you want to use the plot window again, just do:

In [None]:
%matplotlib

As always, for more info:

In [None]:
%matplotlib?

There is a similar "magic" command which may come in handy for quickly trying stuff without having to type all those imports and all the `np.` and `plt.` prefixes:

In [None]:
%pylab?

However, this is not recommended for "serious" notebooks, because it makes the code harder to read and it may lead to confusion between functions with the same name (but different semantics) from different namespaces (`sum()` vs. `np.sum()`, `max()` vs. `np.max()`, `all()` vs. `np.all()` etc.).
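
The following sketch shows how different those semantics can be for a two-dimensional array. The built-in functions are accessed explicitly via the `builtins` module, so the example behaves the same regardless of which names are currently in the namespace.

```python
import builtins
import numpy as np

x = np.array([[1, 2], [3, 4]])

# The built-in sum() iterates over the rows and adds them up,
# which yields one value per column
print(builtins.sum(x))

# np.sum() adds up all elements by default
print(np.sum(x))

# The built-in max() tries to compare whole rows and raises an error
try:
    builtins.max(x)
except ValueError as e:
    print("ValueError:", e)
```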

## Tweaking the Plot

Let's look again at our plot.

In [None]:
plt.plot(y);

Since we passed only a single array to the `plot()` function, the x-axis shows the sample index from 0 to the length of the signal in samples (minus one).
It may be more meaningful to show the time in seconds.

But let's close the previous plot first.

In [None]:
plt.close()

If we pass two arrays to the `plot()` function, the first one specifies the x values (which replace the sample indices on the x-axis) and the second one specifies the y values.

In [None]:
plt.plot(t, y);

Good, now the x-axis shows the time in seconds.
Let's create axis labels so that everyone knows.

In [None]:
plt.xlabel("Time / Seconds")
plt.ylabel("Amplitude")
plt.title("Sine Tone with {} Hz".format(freq));

For more information, have a look at [plotting with `matplotlib`](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/plotting/matplotlib.ipynb).

## Listening to the Signal

Note: This section is still work-in-progress.
Currently, there isn't yet a Python library available that makes it easy to play back a NumPy array.
For now we'll use an unofficial development version of *PySoundCard*.

Python cannot play audio on its own, but there are several external libraries available for that.
We'll be using [PySoundCard](https://github.com/bastibe/PySoundCard/), some other libraries are shown at [this overview page](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/playback-recording/index.ipynb).

To install PySoundCard, we grab the sources from [GitHub](https://github.com/), switch to the development branch "playrec" and install it:

    git clone https://github.com/bastibe/PySoundCard.git
    cd PySoundCard
    git checkout playrec
    python3 setup.py develop --user

<span style="color:gray">
Hopefully the "playrec" branch will be included in the next official release of PySoundCard, then you can just install it according to [the documentation](https://github.com/bastibe/PySoundCard/blob/master/README.md).
If you have [pip](https://pip.pypa.io/en/stable/installing.html) and the library [PortAudio](http://www.portaudio.com/) already installed, it *will then be* enough to run this command in a terminal:
</span>

<pre style="color:gray">
pip3 install PySoundCard --user
</pre>

Note: After the installation, you should restart any running IPython kernels (e.g. using the menu "Kernel" $\to$ "Restart"), otherwise they won't know about the newly installed Python module.

Once installed, you can use it like this:

In [None]:
import pysoundcard as sc

WARNING: You should turn the volume down, just to be sure not to destroy your loudspeakers/headphones/ears.

In [None]:
sc.play(y)

*Exercise:* It's possible that you hear clicks in the beginning/end. What could be the reason for that? How could that be mitigated?

## Writing the Signal to a File

It's possible to write WAV files with the [wave module](https://docs.python.org/3/library/wave.html) from Python's standard library.
Since this is quite complicated, we'll use the more convenient external library [PySoundFile](https://github.com/bastibe/PySoundFile/).
Of course there are also alternatives; have a look at [this overview page](http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/audio-files/index.ipynb).
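
To give an impression of the extra steps needed with the standard library alone, here is a minimal sketch for a mono signal. The helper `write_wav_int16()` and the file name are hypothetical and not part of any library. The sketch assumes floating point input in the range from -1 to 1 and stores it as 16-bit integers.

```python
import wave
import numpy as np

def write_wav_int16(filename, data, samplerate):
    """Write a mono signal (floats from -1 to 1) as 16-bit PCM WAV file."""
    data = np.asarray(data, dtype=float)
    # Clip to the valid range and convert to little-endian 16-bit integers
    pcm = (np.clip(data, -1, 1) * 32767).astype('<i2')
    with wave.open(filename, 'wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(samplerate)
        f.writeframes(pcm.tobytes())

# 10 ms of a slowly rising ramp, just to have some data
ramp = np.linspace(0, 0.5, num=441)
write_wav_int16('wave_module_example.wav', ramp, 44100)
```

The manual scaling, clipping and byte conversion are the kind of details that a dedicated audio file library takes care of.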

Have a look at [the documentation](http://pysoundfile.readthedocs.org/) for how to install PySoundFile.
If you have [pip](https://pip.pypa.io/en/stable/installing.html) and the library [libsndfile](http://www.mega-nerd.com/libsndfile/) already installed, it should be enough to run this command in a terminal:

    pip3 install PySoundFile --user

Again, you will have to restart the IPython kernel (e.g. using the menu "Kernel" $\to$ "Restart") to be able to import the newly installed Python module.

Once installed, you can use it like this:

In [None]:
import soundfile as sf

In [None]:
sf.write('my_first_signal.wav', y, fs)  # file name first, then data and sampling rate

Note that the sampling rate has to be passed to be stored within the file.

*Exercise:* Find the sound file that was just written and play it in an external audio player ([`play`](http://sox.sourceforge.net/), [`mplayer`](https://www.mplayerhq.hu/), [`aqualung`](http://aqualung.factorial.hu/), [`vlc`](http://www.videolan.org/vlc/), ...).

## Creating a Function

Generating our sine tone wasn't very complicated, but let's still create a function that does all the steps for us.
We can then use this function repeatedly to create several different sine tones.

*Exercise:* Create a function named `mysine()` based on the template below.

In [None]:
def mysine(frequency, amplitude, duration, samplerate):
    """Generate sine tone with given frequency/amplitude/duration."""
    
    # add your code here!

    return ...  # ???

Note the indentation of 4 spaces.
Every statement inside the function must start with the same indentation (unless you use statements which themselves need additional indentation, like a `for` loop).

The string below the first line is called a "docstring".
Every function should have one.
The docstring can be shown with `help(mysine)` or, as we've seen before, by appending a question mark to the name (the former works in all Python interpreters, the latter only in IPython):

In [None]:
mysine?
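
As a small sketch with a made-up function `halve()`: the docstring is stored in the `__doc__` attribute of the function, which is where the help system finds it.

```python
def halve(x):
    """Return half of the given number."""
    return x / 2

# The docstring is stored in the function's __doc__ attribute
print(halve.__doc__)
```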

*Exercise:* See what happens if you use two question marks.

*Exercise:* Check if your function works by calling it with a few different input values and plotting the results.

## Creating another Function (for Plotting)

*Exercise:* Create a function that takes two arguments (the signal as a NumPy array and the samplerate) and creates a plot with axis labels and with the x-axis showing the time in seconds.

## `for` Loops

In the next exercise, you'll have to do very similar things multiple times in a row.
That sounds like a job for a `for` loop.

That's how `for` loops work in Python:

In [None]:
for i in "one", 2, "III":
    # code within the loop body uses 4 spaces for indentation
    # ...
    print("i:", i)
    # ...
# de-indent to continue after the loop
print("finished!")

*Exercise:* Use a `for` loop to call the function `mysine()` three times with frequencies of your choice and store the results in three WAV files that have the respective frequencies in their file names.
Use [the format method of `str` objects](https://docs.python.org/3/library/stdtypes.html#str.format) to create the file names.

Listen to the WAV files to check if everything is OK.
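
In case the `format()` method is new to you, here is a brief sketch of how placeholders are filled in. The strings and numbers are arbitrary and deliberately not the file names asked for in the exercise.

```python
for number in 7, 42, 1234:
    # {} is replaced by the argument, {:05d} pads an integer with zeros
    print("plain: {}, padded: {:05d}".format(number, number))

# {:.1f} shows a number with one digit after the decimal point
text = "{} Hz at {:.1f} dB".format(500, -3)
print(text)
```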

## Signal Processing with the *SciPy* Library

The name "SciPy" stands for two slightly different things:

* The [*Scientific Python Ecosystem*](http://scipy.org/), consisting of [NumPy](http://numpy.scipy.org/), [matplotlib](http://matplotlib.org/), [IPython](http://ipython.org/), the [SciPy library](http://scipy.org/scipylib/) and many more libraries and tools.

* The [SciPy library](http://docs.scipy.org/doc/scipy/reference/tutorial/general.html), which in turn is part of the *Scientific Python Ecosystem*.

We were using the former already, now let's use the latter:

In [None]:
from scipy import signal

This imports SciPy's [signal processing module](http://docs.scipy.org/doc/scipy/reference/tutorial/signal.html).

Unlike "`import numpy as np`" we *never* import the whole `scipy` namespace.
The SciPy package is a collection of [many sub-packages and sub-modules](http://docs.scipy.org/doc/scipy/reference/); we only import those that we need.

You should *always* use one of these forms:

    from scipy import foobar
    import scipy.foobar as foo
    from scipy.foobar import foofun

... and *never* one of these:

    import scipy
    from scipy import foofun

... where `foobar` is the name of a sub-package/sub-module and `foofun` is the name of a function.

*Exercise:* Use the SciPy function `signal.chirp()` to create two linear sine sweeps with an initial frequency of 100 Hz, a final frequency of 5000 Hz and with two *different lengths* of your choice.
Listen to the results.

*Exercise:* Change the sweep type from `'linear'` to `'log'` and listen to the results.

We'll see more of SciPy in later exercises, stay tuned!

## Superposition of Tones

*Exercise:* Create a signal with a length of 2 s, as a superposition of five sine tones with amplitudes of $\frac{1}{5}$ and the frequencies 500 Hz, 1000 Hz, 1500 Hz, 2000 Hz and 2500 Hz.
Listen to it.

*Exercise:* Same as above, but skip one of the frequencies.
Try to hear the difference.

*Exercise:* Create a superposition of a sine tone with a frequency of 500 Hz and a sine tone with a frequency of 507 Hz -- each with an amplitude of $\frac{1}{2}$.
What do you notice when listening to the signal?

## More Than One Channel $\to$ Two-Dimensional Arrays

Up to now, we were only using audio signals with a single channel.
Those could be easily stored in a one-dimensional NumPy array.

To store more than one channel, we can use a two-dimensional array.
Two-dimensional arrays somewhat look like lists of lists, but internally, they are still stored in one contiguous area of memory.

There are several functions for creating arrays that allow you to specify the number of rows and columns, e.g.

In [None]:
np.zeros((4, 2))

In [None]:
np.ones((4, 2))

Arrays can be created from lists of lists:

In [None]:
np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

Note that the inner lists provide the individual rows of the array.
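
A quick sketch for checking this (the variable name `x` is arbitrary): the shape is given as (rows, columns), single rows and columns can be selected by indexing, and flattening the array shows the rows stored one after another.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

print(x.shape)    # (number of rows, number of columns)
print(x[0])       # first row
print(x[:, 0])    # first column
print(x.ravel())  # all values, row after row
```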

Two-dimensional arrays can also be created by concatenating a list of one-dimensional arrays (or lists) by columns:

In [None]:
a = np.column_stack([[1, 2, 3, 4], [5, 6, 7, 8]])
a

It is common to store the channels of a multi-channel signal as the columns of an array.
This is not guaranteed, though; you might also encounter functions that expect the channels to be along the rows of an array.
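
The two layouts can be sketched like this, with a few made-up sample values instead of a real audio signal (all names are chosen for illustration only):

```python
import numpy as np

left = np.array([0.1, 0.2, 0.3])
right = np.array([-0.1, -0.2, -0.3])

# One row per sample (frame), one column per channel
frames_first = np.column_stack([left, right])
print(frames_first.shape)

# Some functions may expect one row per channel instead
channels_first = frames_first.T
print(channels_first.shape)

# Selecting the second channel in both layouts
print(frames_first[:, 1])
print(channels_first[1])
```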

If you want to flip rows and columns, you can transpose the array:

In [None]:
b = a.T
b

The transposed array is *not* a copy of the original one, it's rather a different *view* into the same memory.
This means that if you change an element of the transposed array, this change will also be visible in the original array!

In [None]:
b[1, 2] = 7777
a

You can see in the array properties that the transposed array doesn't "own" the data:

In [None]:
b.flags

The transposed array has a property called `base` that holds a reference to the original array (which "owns" the memory area).

In [None]:
b.base is a
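
If an independent array is needed, an explicit copy can be made. The following sketch (with made-up variable names) contrasts a view with a copy and uses `np.shares_memory()` to check whether two arrays overlap in memory.

```python
import numpy as np

original = np.zeros((4, 2))
view = original.T          # shares memory with "original"
copy = original.T.copy()   # independent data

view[0, 0] = 1    # also changes "original"
copy[0, 1] = 99   # leaves "original" untouched

print(original[0, 0], original[1, 0])
print(np.shares_memory(original, view), np.shares_memory(original, copy))
```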

*Exercise:* Create a two-channel signal with a sine tone in each channel.
Use different frequencies for the two channels so that you can tell them apart.
Listen to the signal with headphones and check which channel is left and which is right.

*Exercise:* If you solved the previous exercise by concatenating two one-dimensional sine tones, try it now with a single call to `np.sin()`.
If you did this already, try it the other way.

## Inter-aural Time Difference

You will learn about this later in the lecture, but you can already listen to it now!

*Exercise:* Use two nested `for` loops to generate a sequence of two-channel signals and play them back immediately.
Both channels should contain a sine tone with the same frequency, but with a time lag relative to the other channel.
In the inner loop, you should play back a series of signals with a relative delay of 0.6, 0.4, 0.2, 0, -0.2, -0.4 and -0.6 ms.
The outer loop shall play this series three times, using the frequencies
500, 1000 and 2000 Hz, respectively.
Run the script and notice how you perceive (or not) the different time lags.

This exercise is meant for headphone listening!

## Appendix

If there is still time left, you can have a look at these commands:

In [None]:
%whos

In [None]:
%qtconsole

In [None]:
%run?

<p xmlns:dct="http://purl.org/dc/terms/">
  <a rel="license"
     href="http://creativecommons.org/publicdomain/zero/1.0/">
    <img src="http://i.creativecommons.org/p/zero/1.0/88x31.png" style="border-style: none;" alt="CC0" />
  </a>
  <br />
  To the extent possible under law,
  <span rel="dct:publisher" resource="[_:publisher]">the person who associated CC0</span>
  with this work has waived all copyright and related or neighboring
  rights to this work.
</p>