# Use internal microphone to plot waveforms (Part 1: animated waveform)
This script draws a continuously updating line plot of the sound picked up by the computer's microphone.
It is derived from https://www.youtube.com/watch?v=AShHJdSIxkY
Note that if the sound is too loud (for example, the speaker volume is too high), the graph will look broken: the plotted values are signed 8-bit integers, so anything outside -128 to 127 wraps around.

It uses the Python module "pyaudio", which in turn requires the "portaudio" library.

First, as always, we import the needed packages: pyaudio to grab sound from the microphone (either internal or external), struct to convert the digital sound from packed binary to integers, numpy for array handling, and matplotlib to plot.

In [1]:
# first import numpy and matplotlib as per almost all our scripts
import numpy as np
import matplotlib.pyplot as plt

# also import the sound handling module (sound from computer microphone) and
# module for dealing with raw bit data
import pyaudio
import struct

This "backend" makes plots come up outside the Jupyter browser tab, as popup windows.

In [2]:
%matplotlib tk

Here we define the segment of sound to process at a time.  This splits the continuous time series into finite-length chunks of CHUNK samples each; here we use 1024 * 4, or 4096.  The format is 16-bit integer, the channel count is 1 because the machine's internal mic is mono (not stereo), and the rate is the fairly standard 44.1 kHz.

In [3]:
# define the number of samples per chunk, i.e., each read from the stream returns "CHUNK" samples
CHUNK = 1024 * 4

# define the sampling rate in samples per second (Hz)
RATE = 44100

# set the data format to be 16-bit integer and set the number of channels to 1 (for mono)
FORMAT = pyaudio.paInt16
CHANNELS = 1
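
As a quick sanity check on these settings, the arithmetic below works out how long one chunk lasts and how many bytes it occupies. It is only a sketch: the values are copied from the cell above, and `BYTES_PER_SAMPLE` is a name introduced here for illustration (16 bits = 2 bytes for `paInt16`).

```python
# Values copied from the configuration cell above
CHUNK = 1024 * 4      # samples per chunk
RATE = 44100          # samples per second
CHANNELS = 1          # mono
BYTES_PER_SAMPLE = 2  # illustrative name: paInt16 stores each sample in 2 bytes

chunk_seconds = CHUNK / RATE                        # duration of one chunk
chunk_bytes = CHUNK * CHANNELS * BYTES_PER_SAMPLE   # size of one chunk in bytes
chunks_per_second = RATE / CHUNK                    # how often a new chunk arrives

print(f"one chunk lasts {chunk_seconds * 1000:.1f} ms")
print(f"one chunk is {chunk_bytes} bytes")
print(f"about {chunks_per_second:.1f} chunks arrive per second")
```

So each read returns twice as many bytes as samples, and the plot can refresh at most about ten times per second with this chunk size.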

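Next, we create a stream using the variables defined above.  The stream delivers 16-bit samples (paInt16), so each sample occupies two bytes; the cells further down look at those bytes one at a time, which is where the 8-bit integers come from.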

In [4]:
# create object called "stream" to hold audio information
p = pyaudio.PyAudio()

stream = p.open(
    format = FORMAT,
    channels = CHANNELS,
    rate = RATE,
    input = True,
    output = True,
    frames_per_buffer = CHUNK
)

Finally, we read one chunk from the stream as raw bytes (note that we pass exception_on_overflow=False, otherwise the read can fail with an input overflow error).  We also have a look at the variable "data".

Note that "data" is a bytes object holding 2 * CHUNK bytes (two bytes for each 16-bit sample), which Python displays as hex escapes such as \x00.  On my system the first read is always all zeros, probably because the input device has not started delivering audio yet.  Run this block again, and again, and eventually you'll see non-zero values.
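
Rather than re-running the cell by hand, the retry can be automated. The sketch below is an assumption-laden illustration, not part of the original script: `is_silent` and `read_first_nonsilent` are hypothetical helper names, and a list of fake chunks stands in for the microphone so that it runs without audio hardware.

```python
def is_silent(chunk: bytes) -> bool:
    """Return True when every byte in the chunk is zero."""
    return not any(chunk)


def read_first_nonsilent(read_chunk, max_tries=50):
    """Call read_chunk() until it returns a chunk containing a non-zero byte.

    Returns (chunk, tries), or (None, max_tries) if every chunk was silent.
    """
    for tries in range(1, max_tries + 1):
        chunk = read_chunk()
        if not is_silent(chunk):
            return chunk, tries
    return None, max_tries


# Stand-in for the microphone: two all-zero chunks, then one with signal.
fake_chunks = iter([b"\x00" * 8, b"\x00" * 8, b"\x01\x00\xfe\xff\x00\x00\x00\x00"])
chunk, tries = read_first_nonsilent(lambda: next(fake_chunks))
print(tries, chunk)
```

With the real stream, `read_chunk` would be something like `lambda: stream.read(CHUNK, exception_on_overflow=False)`.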

In [5]:
data = stream.read(CHUNK, exception_on_overflow=False)
data

b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...

Here we convert the raw bytes into decimal unsigned 8-bit integers (0 to 255), one value per byte.  Because the stream was opened with 16-bit samples (pyaudio.paInt16), each sample is stored in two bytes, so unpacking byte by byte gives twice the number of data values (2*CHUNK).
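
To see where the factor of two comes from, the sketch below packs three made-up 16-bit values the way a little-endian machine would store them (an assumption; most desktops and laptops are little-endian) and unpacks them both per byte and per sample.

```python
import struct

samples = (1, -2, 300)                  # made-up 16-bit sample values
raw = struct.pack("<3h", *samples)      # little-endian int16, two bytes per sample

as_bytes = struct.unpack("6B", raw)     # one unsigned value per byte
as_int16 = struct.unpack("<3h", raw)    # one signed value per sample

print(len(raw))
print(as_bytes)
print(as_int16)
```

Three samples become six bytes, and only the `h` format recovers the original values; each pair in the `B` output is the low byte followed by the high byte of one sample.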

In [6]:
data_int = struct.unpack(str(2 * CHUNK) + 'B', data)
data_int

(0,
 0,
 0,
 ...


Let's make a quick plot and see what we get.

In [7]:
fig, ax = plt.subplots()
ax.plot(data_int,'-')
plt.show()

Here it gets a little confusing... we use numpy to treat the bytes as signed 8-bit integers, so valid values after the call below range from -128 to 127.  We also take every other value ([::2]): each 16-bit sample is stored as two bytes, and on a little-endian machine (most desktops and laptops) the even positions hold the low byte of each sample.  Since we keep one byte per sample, the input has (2 * CHUNK) values but the output has just (CHUNK).  Keeping only the low byte is also why loud sounds wrap around in the plot.
Again, we make a quick plot to see if it looks okay.
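
The wrap-around can be reproduced without a microphone. The sketch below uses made-up sample values and assumes little-endian storage; it keeps every other byte as a signed 8-bit integer and compares the result with the expected modulo-256 wrap.

```python
import struct

samples = (100, 127, 128, 200, 300, -129)    # made-up 16-bit sample values
raw = struct.pack("<6h", *samples)           # little-endian int16 bytes

# Signed bytes, keeping every other one: the low byte of each sample
low_bytes = struct.unpack("12b", raw)[::2]

# The same result from arithmetic: wrap each value into the range -128..127
wrapped = tuple(((s + 128) % 256) - 128 for s in samples)

print(low_bytes)
print(wrapped)
```

Values within -128 to 127 pass through unchanged, while anything larger in magnitude jumps to the other side of the axis, which matches the broken look of a loud signal.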

In [8]:
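# reinterpret the raw bytes as signed 8-bit integers and keep every other one;
# np.frombuffer avoids the out-of-range error that recent numpy versions raise
# when values above 127 are converted with dtype 'b'
data_int = np.frombuffer(data, dtype=np.int8)[::2]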
fig, ax = plt.subplots()
ax.plot(data_int,'-')
plt.show()

We now have a plot that looks okay.  The final step is to animate this in "real time".  This is a two-step process: first we make a dummy line plot with random values just to set the axis limits and such.  Next, we update the line inside a while loop using real data.

In [9]:
fig, ax = plt.subplots()

# the x-axis is time in seconds: sample n is taken at n/RATE seconds, so one chunk of
# CHUNK samples spans from 0 to (just under) CHUNK/RATE seconds

time = np.linspace(0, CHUNK/RATE, CHUNK, endpoint=False)
ax.set_xlabel('time (seconds)')
line, = ax.plot(time, np.random.rand(CHUNK))
ax.set_xlim(0,CHUNK/RATE)

# alternatively, plot against the sample index instead of time

#samples = np.arange(0,CHUNK)
#ax.set_xlabel('samples')
#line, = ax.plot(samples, np.random.rand(CHUNK))
#ax.set_xlim(0,CHUNK)

# set the rest of the plotting specs
ax.set_ylim(-150,150)
ax.set_title('Audio Waveform')
ax.set_ylabel('amplitude')

Text(0, 0.5, 'amplitude')

In [10]:
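# last step... continuously update the line with fresh audio until the plot window is closed
# (with a plain "while True:", closing the window makes the Tk backend raise
#  TclError: invalid command name "pyimage30")
while plt.fignum_exists(fig.number):
    data = stream.read(CHUNK, exception_on_overflow=False)
    # reinterpret the raw bytes as signed 8-bit integers and keep every other one
    data_int = np.frombuffer(data, dtype=np.int8)[::2]
    line.set_ydata(data_int)
    fig.canvas.draw()
    fig.canvas.flush_events()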
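
The loop above plots only one byte of each 16-bit sample, which is why loud sounds wrap. One possible alternative, sketched here rather than taken from the original video, is to decode the full 16-bit values with `np.frombuffer`. The helper name `decode_int16` is hypothetical, and a synthetic 440 Hz tone stands in for a chunk returned by `stream.read()`.

```python
import numpy as np


def decode_int16(data: bytes) -> np.ndarray:
    """Interpret raw paInt16 bytes as one signed 16-bit value per sample."""
    return np.frombuffer(data, dtype=np.int16)


# Synthetic stand-in for one chunk from stream.read(): a 440 Hz tone
RATE = 44100
CHUNK = 1024 * 4
t = np.arange(CHUNK) / RATE
tone = (20000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
data = tone.tobytes()

samples = decode_int16(data)        # CHUNK values in the range -32768..32767
normalized = samples / 32768.0      # optional: scale to the range [-1, 1)

print(len(data), samples.shape, samples.min(), samples.max())
```

In the animation loop this would presumably become `line.set_ydata(decode_int16(data))`, with the y-limits widened to match, for example `ax.set_ylim(-32768, 32767)`.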