# CS448 - Lab 0: An Introduction

Below you will find a series of tasks that will help you get started with audio processing using Python. We will start with learning how to generate test sounds, how to read/write audio files, how to perform simple waveform editing and finally how to use real time audio I/O. Look at the provided hints and try to figure out how to do these tasks on your own. The instructions are intentionally vague so that you practice your problem solving. If you get stuck, talk to me or our TAs for more hints. Make sure that any necessary soundfiles are inside a subdirectory called "data" at the same file location as your notebook.

Have fun!

## Exercise 1. Generating and playing basic sounds

It is important to be able to generate multiple types of test sounds to test various parts of an audio processing chain. Some of the most important ones are sinusoids, chirps, and certain types of noise. For this exercise you need to generate the following test signals, at a sampling rate of 8kHz and for a duration of a second. Plot them if you like and see if they look right. You will probably have to zoom into the plots to check these waveforms. Also, play these sounds from your computer’s speakers (IMPORTANT: Turn your computer’s volume down, some of these sounds will be loud!)

Useful numpy functions: ```random.randn```, ```sin```, ```linspace```, ```logspace```, ```fromfile```


This function can help you play a sound from a notebook. The ```rate``` parameter is the *sampling rate*, i.e. how many sound samples to play per second (so if you give this function an array of length 8000 and use a rate of 8000, it will play a sound for one second). We will be talking more about this parameter during the next lecture, but for now this should be enough info to get you going.

In [1]:
# Make a sound player function that plays array "x" with a sample rate "rate", and labels it with "label"
def sound(x, rate=8000, label=''):
    from IPython.display import display, Audio, HTML
    display(
        HTML('<style> table, th, td {border: 0px; }</style> <table><tr><td>' +
             label + '</td><td>' + Audio(x, rate=rate)._repr_html_()[3:] +
             '</td></tr></table>'))


Let's start by generating white noise. Complete the code below for it.

In [2]:
# Function that returns noise samples
import numpy as np


def make_noise(duration=1, sample_rate=8000):
    # Draw all samples in one vectorized call; int() also allows fractional durations
    noise = np.random.randn(int(sample_rate * duration))
    return noise, sample_rate


# Generate it
x, sr = make_noise()

# Play the generated sound
sound(x, rate=sr, label='Noise')




Generate a sinusoid with a frequency of 440 Hz

In [4]:
# Function that returns a sinusoid
import numpy as np


def make_sine(frequency=440, duration=1, sample_rate=8000):
    samples = np.arange(duration * sample_rate) / sample_rate
    sine = np.sin(2 * np.pi * frequency * samples)

    return sine, sample_rate


# Generate it
x, sr = make_sine()

# Play the generated sound
sound(x, rate=sr, label='440Hz tone')




Generate a "linear chirp" from 0 Hz to 4 kHz (do not use a chirp function, write the code yourself). What is a chirp you ask? I don't know, google it ...

In [5]:
# Function that returns a chirp
def make_chirp(duration=1, sample_rate=8000):
    import numpy as np

    f_start = 0
    f_end = 4000

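    # Sample times spaced exactly 1/sample_rate apart (endpoint excluded)
    t = np.arange(int(duration * sample_rate)) / sample_rate
    # The phase is the integral of the linearly increasing instantaneous frequency
    sig = np.sin(2 * np.pi * ((f_start * t) + ((f_end - f_start) /
                                               (2 * duration)) * t**2))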

    return sig, sample_rate


# Generate it
x, sr = make_chirp()

# Play the generated sound
sound(x, rate=sr, label='Chirp')
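For reference, the quadratic term in the phase comes from the standard relation between instantaneous frequency and phase. The symbols below ($f_0$, $f_1$, $T$) are notation introduced here for `f_start`, `f_end` and `duration`; this is a sketch of the usual derivation, not something stated in the lab itself.

```latex
f(t) = f_0 + (f_1 - f_0)\,\frac{t}{T}, \qquad 0 \le t \le T

\phi(t) = 2\pi \int_0^t f(\tau)\, d\tau
        = 2\pi \left( f_0\, t + \frac{f_1 - f_0}{2T}\, t^2 \right)

x(t) = \sin \phi(t)
```

With $f_0 = 0$ Hz, $f_1 = 4000$ Hz and $T = 1$ s this reduces to $x(t) = \sin(2\pi \cdot 2000\, t^2)$.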




Generate a sinusoidal tone with an exponentially decreasing amplitude from 100 to 0.0001

In [26]:
# Function that returns a "ping"
import numpy as np


def make_ping(frequency=440, duration=1, sample_rate=8000):
    samples = np.arange(duration * sample_rate) / sample_rate

    # Amplitude envelope decaying exponentially from 100 (10**2) to 0.0001 (10**-4)
    envelope = np.logspace(2, -4, len(samples))
    sine = np.sin(2 * np.pi * frequency * samples) * envelope

    return sine, sample_rate


# Generate it
x, sr = make_ping(440)

# Play the generated sound
sound(x, rate=sr, label='ping')




You will now make a stereo file. This is represented as a 2d array, one part containing the left channel and the other containing the right channel. For the left channel generate a quarter-second sinusoidal tone of frequency 523.24Hz with an exponentially decaying amplitude from 100 to 0.0001. For the right channel do the same thing but use a frequency of 784Hz.  Start the right channel tone after a quarter second. Play this and verify that it sounds ok (it should sound like a video game “ping-pong” sound).

In [None]:
# Function that returns a "ping" "pong"
def make_pingpong( sample_rate=8000):
    # YOUR CODE HERE
    raise NotImplementedError()

# Generate it
x = make_pingpong()

# Play the generated sound
sound( x, rate=sr, label='ping-pong')
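Which axis of the 2D array holds the channels depends on the consumer, which is easy to get wrong. The sketch below uses two placeholder tones (illustrative frequencies, not the ping-pong sound) to show the two layouts: `IPython.display.Audio` expects shape `(channels, samples)`, while `scipy.io.wavfile.write` expects `(samples, channels)`.

```python
import numpy as np

sample_rate = 8000
t = np.arange(sample_rate // 2) / sample_rate   # half a second of time stamps

# Two placeholder channels (illustrative tones only)
left = np.sin(2 * np.pi * 300 * t)
right = np.sin(2 * np.pi * 600 * t)

# Shape (channels, samples): layout for IPython.display.Audio
stereo = np.vstack([left, right])
print(stereo.shape)

# Shape (samples, channels): layout for scipy.io.wavfile.write
print(stereo.T.shape)
```

To delay one channel, it can be padded with `np.zeros` before stacking, as long as both channels end up the same length.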

Download and load into python the file [ https://drive.google.com/uc?export=download&id=1BZ5qqH34-GCoJcSoCMxo7YWS0kwX3BSz ]. It contains a sound waveform encoded as a series of 16-bit values (it's not a soundfile, just a dump of the sample values). Find out what its sample rate is (there’s no trick here, this one is trial and error). Show some examples where it sounds wrong and explain why.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

sound( x, rate=wrong_sample_rate, label='Sounds off')
sound( x, rate=another_wrong_sample_rate, label='Sounds off too')
sound( x, rate=right_sample_rate, label='Sounds right!')
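A headerless dump carries no sample rate or sample format, so both have to be supplied by hand when reading it. The sketch below writes a synthetic dump to a temporary file (a stand-in for the downloaded one, so the path and contents are assumptions) and reads it back with `np.fromfile`. It also assumes the file's byte order matches the machine's; `'<i2'` would force little-endian explicitly.

```python
import os
import tempfile
import numpy as np

# Write a synthetic raw dump so the example is self-contained
true_rate = 8000
t = np.arange(true_rate) / true_rate
original = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
path = os.path.join(tempfile.mkdtemp(), "dump.raw")
original.tofile(path)

# Read it back: there is no header, so the dtype must be given explicitly
x = np.fromfile(path, dtype=np.int16)

# The same samples last a different amount of time at each candidate rate
for rate in (4000, 8000, 16000):
    print(rate, "Hz ->", len(x) / rate, "seconds")
```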

## Exercise 2. Saving sounds

What good are sounds if we can’t store them? For most of this class we will be using what is known as a PCM format (more on next lecture). The most popular of these formats is the WAVE file, which we will use most often. When saving a sound to a file we need to be careful and make sure we don’t lose any information.

Take the “ping-pong” sound from above and save it to a WAVE file. Play the file back, or open it with an audio editor and find out if there’s anything wrong. If so, find a way to fix it.

Useful python package: ```scipy.io.wavfile```

In [None]:
# YOUR CODE HERE
raise NotImplementedError()
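One thing worth checking when saving is the value range. `scipy.io.wavfile.write` picks the file format from the array's dtype: integer arrays become integer PCM, and float arrays become floating-point WAVE files whose samples are expected to lie in [-1, 1]. The sketch below is one possible approach, with `to_int16` and `save_wav` as hypothetical helper names: it normalizes to the peak and converts to 16-bit integers before writing.

```python
import numpy as np

def to_int16(x):
    """Scale a float waveform so its peak maps to full-scale 16-bit PCM."""
    x = np.asarray(x, dtype=np.float64)
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak                      # normalize to [-1, 1]
    return np.round(x * 32767).astype(np.int16)

def save_wav(path, x, rate):
    """Write x with scipy; multi-channel data must be shaped (samples, channels)."""
    from scipy.io import wavfile          # imported here so to_int16 works without scipy
    wavfile.write(path, rate, to_int16(x))

# A deliberately loud test tone, far outside [-1, 1]
loud = 100 * np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
pcm = to_int16(loud)
print(pcm.dtype, pcm.min(), pcm.max())
```

Peak normalization changes the absolute level of the sound, which is usually acceptable for playback but should be a deliberate choice.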

## Exercise 3. Basic Sound Editing

Here we will learn to do some simple manipulations of sounds. Ordinarily you would do this with an audio editor with a graphical interface, but hey life sucks and you have to do this with code.

Useful python commands:  ```scipy.io.wavfile.read```, ```numpy.hstack```

To make your life easy, here's a function that will load a sound given its URL. Its input is the URL of a WAVE soundfile, and its output is a tuple of the sampling rate and the sound waveform. If you prefer to minimize network traffic, you can also download the files below and just load them locally.

In [None]:
# Load a remote WAVE file given its URL, and return the sample rate and waveform
def wavreadurl( url):
    import urllib.request, io, scipy.io.wavfile
    f = urllib.request.urlopen( urllib.request.Request( url))
    sr,s = scipy.io.wavfile.read( io.BytesIO( f.read()))
    return sr, s.astype( 'float32')/32768
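Much waveform editing comes down to converting times into sample indices, slicing, and concatenating with `numpy.hstack`. A minimal sketch on a synthetic array follows; `seconds_to_slice` is a hypothetical helper, and the ramp signal stands in for a loaded waveform.

```python
import numpy as np

def seconds_to_slice(start, stop, rate):
    """Convert a time range in seconds into a slice of sample indices."""
    return slice(int(round(start * rate)), int(round(stop * rate)))

rate = 8000
x = np.arange(3 * rate, dtype=np.float32)   # stand-in for a 3-second waveform

first = x[seconds_to_slice(0, 1, rate)]
last = x[seconds_to_slice(2, 3, rate)]

# Put the last second in front of the first one
y = np.hstack([last, first])
print(len(y) / rate, "seconds")
```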

Load the file [ https://drive.google.com/uc?export=download&id=1CbCTIMNDfJUpCRpYMK9IFdb-jQJgZbmP ] and listen to it. Clearly something is wrong. Try to fix the problem using code.


In [None]:
# YOUR CODE HERE
raise NotImplementedError()

sound( x, rate=sr, label='Fixed sound')

Use the above file to create a countdown instead.



In [None]:
# YOUR CODE HERE
raise NotImplementedError()

sound( x, rate=sr, label='Countdown')

Load the file [ https://drive.google.com/uc?export=download&id=1C6xgDOS0sQd6zCNbnRVBwbZJ5qoUyETg ] There’s something wrong here too. Fix it!

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

sound( x, rate=sr, label='Fixed sound')

Load the two files [ https://drive.google.com/uc?export=download&id=1Bnlff8-cMeNRUgqtfY6nu9psEAA_-Alm ] and [ https://drive.google.com/uc?export=download&id=1BlAzhHEJCu81VybUHOKOStkVlXSWUYeO ] They are roughly at the same tempo and you want to make a music mix out of them. Play the first sound for two seconds, then fade it out over four seconds. While the first sound fades out the second one should fade in at the same speed. Congrats, you just learned how to (poorly) DJ in python!




In [None]:
# YOUR CODE HERE
raise NotImplementedError()

sound( x, rate=sr, label='Awesome mix')
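One possible reading of the crossfade is sketched below: hold the first sound, then mix the two with complementary linear ramps. The function name `crossfade`, the linear ramp shape, and starting the second sound at its own beginning are all assumptions. The inputs are constant test signals rather than the music files.

```python
import numpy as np

def crossfade(a, b, rate, hold=2.0, fade=4.0):
    """Play `a` for `hold` seconds, then fade `a` out and `b` in over `fade` seconds.

    Assumes mono arrays at the same rate; `a` must cover hold + fade seconds
    and `b` must cover at least `fade` seconds.
    """
    n_hold = int(hold * rate)
    n_fade = int(fade * rate)
    ramp = np.linspace(0.0, 1.0, n_fade)          # gain of b; a uses 1 - ramp
    head = a[:n_hold]
    mix = a[n_hold:n_hold + n_fade] * (1.0 - ramp) + b[:n_fade] * ramp
    tail = b[n_fade:]                             # rest of b at full level
    return np.hstack([head, mix, tail])

rate = 8000
a = np.ones(6 * rate)      # constant +1 stands in for the first track
b = -np.ones(8 * rate)     # constant -1 stands in for the second track
y = crossfade(a, b, rate)
print(len(y) / rate, "seconds")
```

With constant inputs the output is easy to inspect: it stays at +1 for two seconds, moves linearly to -1 over four seconds, and stays there.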

## Exercise 4. Real-time processing

In real life you can’t just load an existing soundfile, process it at your leisure and save it. You have to be able to process sound in real time. This means that you will record tiny snippets of sound, quickly process each one and then move on to the next without looking back. In this exercise we will try a couple of real-time things to get the hang of it. Interpreted languages are generally horrible for real-time systems, but we’ll stick with Python since it’s much simpler than writing low-level code. In python you can use the package ```pysoundcard``` or ```sounddevice``` to get some low-level audio control. Open an audio stream with a sample rate of 16kHz and a single channel. Use a buffer size of 1024 samples.

You will then create a loop in which you get a snippet of sound from the microphone at each pass. Inside the loop you will read from the stream (which should be taking samples off the microphone). Using this, measure the standard deviation of each incoming snippet of sound, and after eight seconds of recording plot these values as a sequence.

Useful python commands:  ```pysoundcard.Stream.read```, ```sounddevice.InputStream```

*Note: You cannot run this part on Google Colab; if you do, it will try to open the audio I/O of the remote machine. You need to run this one on your local machine.*

In [None]:
# YOUR CODE HERE
raise NotImplementedError()
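The block-by-block pattern can be tried offline before touching any audio hardware. In the sketch below, `simulated_blocks` is a hypothetical stand-in that slices a pre-made array into 1024-sample buffers, and `microphone_blocks` shows how the same loop might be fed from `sounddevice.InputStream`. The live version needs a local input device and is defined but not called here.

```python
import numpy as np

RATE = 16000      # sample rate from the exercise
BLOCK = 1024      # buffer size from the exercise

def block_std(blocks):
    """Consume sample buffers one at a time and return their standard deviations."""
    return np.array([np.std(block) for block in blocks])

def simulated_blocks(signal, block=BLOCK):
    """Offline stand-in for a microphone: yield consecutive full buffers."""
    for start in range(0, len(signal) - block + 1, block):
        yield signal[start:start + block]

def microphone_blocks(seconds=8, rate=RATE, block=BLOCK):
    """Yield live buffers via sounddevice (requires a local input device)."""
    import sounddevice as sd
    with sd.InputStream(samplerate=rate, channels=1, blocksize=block) as stream:
        for _ in range(int(seconds * rate / block)):
            data, overflowed = stream.read(block)   # data has shape (block, 1)
            yield data[:, 0]

# Simulated input: one second of quiet noise followed by one second of loud noise
rng = np.random.default_rng(0)
signal = np.concatenate([0.01 * rng.standard_normal(RATE),
                         rng.standard_normal(RATE)])
levels = block_std(simulated_blocks(signal))
print(len(levels), "blocks; first", levels[0], "last", levels[-1])
```

Swapping `simulated_blocks(signal)` for `microphone_blocks()` leaves the measuring code unchanged, which keeps the per-buffer logic easy to test.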

Now let’s try to add some output as well. We will make a robot voice effect that makes use of a ring modulator. This is the same effect that’s been used to generate robot voices for many older films and TV shows (e.g. the Daleks in Dr. Who).

We will reuse the loop that we made above, but this time we will additionally have an audio output. Do the same as above, but this time you can also write to the stream to send a buffer of samples to the speaker. For a test you can simply pass the input buffer from read to write, and this would simply play from the speakers the sounds you make to the microphone (tip #1: wear headphones to avoid a feedback loop! tip #2: Every time you put on headphones set the volume to a very low value to avoid any painful surprises). 

Once you verify that a passthrough works, multiply each input snippet with a 440Hz sine and send that to the output to create a voice transformation. If successful, it should sound robotic. Congrats, you just made your first real-time audio effect!

Useful python commands:  ```pysoundcard.Stream.write```, ```sounddevice.Stream```

In [None]:
# YOUR CODE HERE
raise NotImplementedError()
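A detail that is easy to miss when modulating buffer by buffer is that the sine must continue from where it left off in the previous buffer. Restarting it at zero phase for every buffer introduces discontinuities that can be heard as clicks. The offline sketch below uses the hypothetical name `ring_modulate_blocks` and keeps a running sample counter, then checks that the result matches modulating the whole signal at once. No audio device is involved.

```python
import numpy as np

RATE = 16000
BLOCK = 1024
CARRIER_HZ = 440

def ring_modulate_blocks(signal, rate=RATE, block=BLOCK, freq=CARRIER_HZ):
    """Ring-modulate `signal` buffer by buffer with a phase-continuous carrier."""
    out = np.empty(len(signal), dtype=float)
    n = 0                                   # running sample counter across buffers
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        idx = n + np.arange(len(chunk))     # absolute sample indices of this buffer
        out[start:start + len(chunk)] = chunk * np.sin(2 * np.pi * freq * idx / rate)
        n += len(chunk)
    return out

x = np.random.default_rng(0).standard_normal(5000)
blockwise = ring_modulate_blocks(x)
whole = x * np.sin(2 * np.pi * CARRIER_HZ * np.arange(len(x)) / RATE)
print(np.allclose(blockwise, whole))
```

In a live loop the same counter would be kept outside the loop body and advanced by the buffer length after each write.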

Optional: If you have an audio interface you can use the code above as an effects box.  E.g., if you are an electric guitar player, try transforming the audio signal with `output = tanh( a * input)`; the larger the scalar `a`, the stronger the distortion.  Or if you are an electric violin player you can try `output = 0 * input` which will make the output sound much more pleasing.  As we learn about more types of processing throughout the semester you can experiment with plugging them into this real-time loop.
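To get a feel for the scalar `a` before plugging the effect into a live loop, the waveshaper can be applied to a test tone offline. This is a small sketch; `soft_clip` and the 110Hz test tone are illustrative choices.

```python
import numpy as np

def soft_clip(x, a):
    """tanh waveshaper: larger `a` pushes the signal harder into saturation."""
    return np.tanh(a * np.asarray(x, dtype=float))

t = np.arange(16000) / 16000
clean = 0.8 * np.sin(2 * np.pi * 110 * t)

# The output never exceeds 1 in magnitude, and it flattens more as `a` grows
for a in (1, 5, 20):
    driven = soft_clip(clean, a)
    print(a, float(np.max(np.abs(driven))))
```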